Continued from Part 1
For real-time captioning done outside of captioning facilities, the following syntax is used:
- ‘>>’ (two prefixed greater-than signs) indicates a change in single speaker.
- Sometimes appended with the speaker’s name in alternate case, followed by a colon.
- ‘>>>’ (three prefixed greater-than signs) indicates a change in news story or multiple speakers.
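The chevron convention above lends itself to a simple line classifier. The following is a minimal sketch in Python; the names `CaptionLine` and `classify` are hypothetical, and treating any short run of text before the first colon as a speaker name is an assumption that will misfire on lines such as ">> It's 5:00 now".

```python
import re
from typing import NamedTuple, Optional

class CaptionLine(NamedTuple):
    kind: str               # "story_change", "speaker_change" or "continuation"
    speaker: Optional[str]  # name found before a colon, if any
    text: str

# ">>>" is listed before ">>" because a three-chevron line also starts with ">>".
_MARKER = re.compile(r"^(>>>|>>)\s*(?:([^:]{1,40}):\s*)?(.*)$")

def classify(line: str) -> CaptionLine:
    """Classify one real-time caption line by its leading chevrons."""
    stripped = line.strip()
    match = _MARKER.match(stripped)
    if match is None:
        return CaptionLine("continuation", None, stripped)
    marker, speaker, text = match.groups()
    kind = "story_change" if marker == ">>>" else "speaker_change"
    return CaptionLine(kind, speaker.strip() if speaker else None, text.strip())
```

For example, `classify(">> Reporter: Good evening.")` yields a speaker change attributed to "Reporter", while a line without chevrons is treated as a continuation of the current speaker.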
Styles of syntax that are used by various captioning producers:
- Capitals indicate main on-screen dialogue and the name of the speaker.
- Legacy EIA-608 home caption decoder fonts had no descenders on lowercase letters.
- Outside North America, capitals with background coloration indicate a song title or sound effect description.
- Outside North America, capitals with black or no background coloration indicate when a word is stressed or emphasized.
- Descenders indicate background sound description and off-screen dialogue.
- ‘-’ (a prefixed dash) indicates a change in single speaker (used by CaptionMax).
- Words in italics indicate when a word is stressed or emphasized and when real world names are quoted.
- Text coloration indicates captioning credits and sponsorship.
- Occasionally, it is for a karaoke effect for music videos on MTV or VH-1.
- In Ceefax/Teletext countries, it indicates a change in single speaker in place of ‘>>’.
- Some Teletext countries use coloration to indicate when a word is stressed or emphasized.
- Coloration is limited to white, green, blue, cyan, red, yellow and magenta.
- UK order of use for text is white, green, cyan, yellow; and backgrounds is black, red, blue, magenta, white.
- US order of use for text is white, yellow, cyan, green; and backgrounds is black, blue, red, magenta, white.
- Square brackets or parentheses indicate a song title or sound effect description.
- Parentheses indicate speaker’s vocal pitch e.g., (man), (woman), (boy) or (girl).
- Outside North America, parentheses indicate a silent on-screen action.
- A pair of eighth notes is used to bracket a line of lyrics to indicate singing.
- A pair of eighth notes on a line of no text are used during a section of instrumental music.
- Outside North America, a single number sign is used on a line of lyrics to indicate singing.
- An additional musical notation character is appended to the end of the last line of lyrics to indicate the song’s end.
- As the symbol is unsupported by Ceefax/Teletext, a number sign – which resembles a musical sharp – is substituted.
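As an illustration of the text color orders listed above, here is a minimal sketch that hands out colors to speakers in order of first appearance. The function name `assign_speaker_colors`, the idea that the "order of use" maps directly onto speaker order, and the wrap-around once the palette is exhausted are all assumptions for illustration, not documented practice.

```python
# Text color orders as given in the list above.
TEXT_COLOR_ORDER = {
    "UK": ["white", "green", "cyan", "yellow"],
    "US": ["white", "yellow", "cyan", "green"],
}

def assign_speaker_colors(speakers, region="UK"):
    """Map each distinct speaker to a text color, in order of first appearance."""
    order = TEXT_COLOR_ORDER[region]
    colors = {}
    for name in speakers:
        if name not in colors:
            # Assumption: reuse colors from the start once all four are taken.
            colors[name] = order[len(colors) % len(order)]
    return colors
```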
The Technical Aspects of Closed-Captioning
There were many shortcomings in the original Line 21 specification from a typographic standpoint; for example, it lacked many of the characters required for captioning in languages other than English. Since that time, the core Line 21 character set has been expanded to include quite a few more characters, handling most requirements for languages common in North and South America such as French, Spanish, and Portuguese, though those extended characters are not required in all decoders and are thus unreliable in everyday use. The problem has been almost eliminated with a market-specific full set of Western European characters and a privately adopted Norpak extension for the South Korean and Japanese markets. The full EIA-708 standard for digital television has worldwide character set support, but there has been little use of it because EBU Teletext, which has its own extended character sets, dominates DVB countries.
Captions are often edited to make them easier to read and to reduce the amount of text displayed onscreen. This editing can range from very minor, with only a few occasional unimportant lines omitted, to severe, where virtually every line spoken by the actors is condensed. The measure used to guide this editing is words per minute, commonly varying from 180 to 300, depending on the type of program. Offensive words are also captioned, but if the program is censored for TV broadcast, the broadcaster might not have arranged for the captioning to be edited or censored as well. The “TV Guardian”, a television set-top box, is available to parents who wish to censor offensive language in programs: the video signal is fed into the box, and if it detects an offensive word in the captioning, the audio signal is bleeped or muted for that period of time.
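The words-per-minute measure can be checked mechanically for each caption. This is a minimal sketch, assuming each caption has a start and end time in seconds and that words are separated by whitespace; the function names and the default limit of 180 (the low end of the range quoted above) are illustrative choices.

```python
def words_per_minute(text: str, start_s: float, end_s: float) -> float:
    """Reading rate of one caption, from its word count and on-screen time."""
    duration = end_s - start_s
    if duration <= 0:
        raise ValueError("caption must have a positive duration")
    return len(text.split()) * 60.0 / duration

def needs_condensing(text: str, start_s: float, end_s: float,
                     limit_wpm: float = 180.0) -> bool:
    """True when the caption exceeds the chosen words-per-minute limit."""
    return words_per_minute(text, start_s, end_s) > limit_wpm
```

A six-word caption shown for two seconds works out to 6 × 60 / 2 = 180 words per minute, which sits exactly at the lower limit and would not be flagged.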
The Line 21 data stream can consist of data from several data channels multiplexed together. Odd field 1 can have four data channels: two separate synchronized captions (CC1, CC2) with caption-related text, such as website URLs (T1, T2). Even field 2 can have five additional data channels: two separate synchronized captions (CC3, CC4) with caption-related text (T3, T4), and Extended Data Services (XDS) for Now/Next EPG details. The XDS data structure is defined in CEA-608. As CC1 and CC2 share bandwidth, if there is a lot of data in CC1, there will be little room for CC2 data, so the field is generally used only for the primary audio captions. Similarly, CC3 and CC4 share the even field 2 of line 21. Since some early caption decoders supported only single-field decoding of CC1 and CC2, captions for SAP in a second language were often placed in CC2. This led to bandwidth problems, however, and the current U.S. Federal Communications Commission (FCC) recommendation is that bilingual programming should have the second caption language in CC3. Many Spanish-language television networks, such as Univision and Telemundo, provide English subtitles for many of their Spanish programs in CC3. Canadian broadcasters use CC3 for French-translated SAPs, a practice also followed in South Korea and Japan. Ceefax and Teletext can carry a larger number of captions for other languages because they use multiple VBI lines. However, only European countries used a second subtitle page for second-language audio tracks where either NICAM dual mono or Zweikanalton was used.
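The channel layout described above can be written down as a small lookup table, which makes the bandwidth-sharing argument easy to check. This is only a sketch of the layout as stated in the text; the names `LINE21_CHANNELS`, `field_for` and `share_bandwidth` are hypothetical.

```python
# Data channels carried by each Line 21 field, as listed in the paragraph above.
LINE21_CHANNELS = {
    1: ["CC1", "CC2", "T1", "T2"],         # odd field
    2: ["CC3", "CC4", "T3", "T4", "XDS"],  # even field
}

def field_for(channel: str) -> int:
    """Return the field (1 or 2) that carries the given data channel."""
    for field, channels in LINE21_CHANNELS.items():
        if channel in channels:
            return field
    raise KeyError(channel)

def share_bandwidth(a: str, b: str) -> bool:
    """Two channels compete for bandwidth when they sit in the same field."""
    return field_for(a) == field_for(b)
```

Under this model, `share_bandwidth("CC1", "CC2")` is true while `share_bandwidth("CC1", "CC3")` is false, which is the reasoning behind placing a second caption language in CC3.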
Digital Television Interoperability Issues for Closed-Captioning
The US ATSC digital television system originally specified two different kinds of closed captioning datastream standards: the original analog-compatible format (available by Line 21) and the more modern digital-only CEA-708 format, both of which are delivered within the video stream. The US FCC mandates that broadcasters deliver (and generate, if necessary) both datastream formats, with the CEA-708 format merely a conversion of the Line 21 format. The Canadian CRTC has not mandated that broadcasters either broadcast both datastream formats or broadcast exclusively in one format. To avoid large conversion cost outlays, most broadcasters and networks just provide EIA-608 captions along with a transcoded CEA-708 version encapsulated within CEA-708 packets.
Incompatibility Issues with Digital TV
Many viewers find that when they acquire a digital television or set-top box they are unable to view closed caption (CC) information, even though the broadcaster is sending it and the TV is able to display it.
Originally, CC information was included in the picture (“line 21”) via a composite video input, but there is no equivalent capability in digital video interconnects (such as DVI and HDMI) between the display and a “source”. A “source”, in this case, can be a DVD player or a terrestrial or cable digital television receiver. When CC information is encoded in the MPEG-2 data stream, only the device that decodes the MPEG-2 data (a source) has access to the closed caption information; there is no standard for transmitting the CC information to a display monitor separately. Thus, if there is CC information, the source device needs to overlay the CC information on the picture prior to transmitting to the display over the interconnect’s video output.
Many source devices do not have the ability to overlay CC information, for controlling the CC overlay can be complicated. For example, the Motorola DCT-5xxx and -6xxx cable set-top receivers have the ability to decode CC information located on the MPEG-2 stream and overlay it on the picture, but turning CC on and off requires turning off the unit and going into a special setup menu (it is not on the standard configuration menu and it cannot be controlled using the remote). Historically, DVD players, VCRs and set-top tuners did not need to do this overlaying, since they simply passed this information on to the TV, and they are not mandated to perform this overlaying.
Many modern digital television receivers can be directly connected to cables, but often cannot receive scrambled channels that the user is paying for. Thus, the lack of a standard way of sending CC information between components, along with the lack of a mandate to add this information to a picture, results in CC being unavailable to many hard-of-hearing and deaf users.
Closed-Captioning in UK/Australia
The EBU Ceefax-based teletext systems are the source for closed captioning signals, thus when teletext is embedded into DVB-T or DVB-S the closed captioning signal is included. However, for DVB-T and DVB-S, it is not necessary for a teletext page signal to also be present (ITV1, for example, does not carry analogue teletext signals on Sky Digital, but does carry the embedded version, accessible from the “Services” menu of the receiver, or more recently by turning them off/on from a mini menu accessible from the “help” button).
In New Zealand, captions use an EBU Ceefax-based teletext system on DVB broadcasts via satellite and cable television, with the exception of MediaWorks New Zealand channels, which completely switched to DVB RLE subtitles in 2012 on both Freeview satellite and UHF broadcasts. This decision was based on the TVNZ practice of using this format only on DVB UHF broadcasts (aka Freeview HD). This made composite video connected TVs incapable of decoding the captions on their own. Also, these pre-rendered subtitles use classic caption-style opaque backgrounds with an overly large font size and obscure the picture more than the more modern, partially transparent backgrounds.
Digital Television Improvements for Closed-Captioning Standards
The CEA-708 specification provides for dramatically improved captioning:
- An enhanced character set with more accented letters and non-Latin letters, and more special symbols
- Viewer-adjustable text size (called the “caption volume control” in the specification), allowing individuals to adjust their TVs to display small, normal, or large captions
- More text and background colors, including both transparent and translucent backgrounds to optionally replace the big black block
- More text styles, including edged or drop shadowed text rather than the letters on a solid background
- More text fonts, including monospaced and proportional spaced, serif and sans-serif, and some playful cursive fonts
- Higher bandwidth, to allow more data per minute of video
- More language channels, to allow the encoding of more independent caption streams
As of 2009, however, most closed captioning for digital television environments is done using tools designed for analog captioning (working to the CEA-608 NTSC specification rather than the CEA-708 ATSC specification). The captions are then run through transcoders made by companies like EEG Enterprises or Evertz, which convert the analog Line 21 caption format to the digital format. This means that none of the CEA-708 features are used unless they were also contained in CEA-608.
Uses of Closed-Captioning in Other Media
Closed-Captioning in DVDs, BDs, HD DVDs
NTSC DVDs may carry closed captions in data packets of the MPEG-2 video streams inside the VIDEO_TS folder. Once played out of the analog outputs of a set-top DVD player, the caption data is converted to the Line 21 format. The captions are output by the player to the composite video (or an available RF connector) for a connected TV’s built-in decoder or a set-top decoder as usual. They cannot be output on S-Video or component video outputs due to the lack of a colorburst signal on line 21. (Regardless of this, if the DVD player is in interlaced rather than progressive mode, closed captioning will be displayed on the TV over component video input if the TV captioning is turned on and set to CC1.) When viewed on a personal computer, caption data can be viewed by software that can read and decode the caption data packets in the MPEG-2 streams of the DVD-Video disc. Windows Media Player (before Windows 7) in Vista supported only closed caption channels 1 and 2 (not 3 or 4). Apple’s DVD Player does not have the ability to read and decode Line 21 caption data recorded on a DVD made from an over-the-air broadcast. It can display some movie DVD captions.
In addition to Line 21 closed captions, video DVDs may also carry subtitles, which are generally rendered from the EIA-608 captions as a bitmap overlay that can be turned on and off via a set-top DVD player or DVD player software, just like the textual captions. This type of captioning is usually carried in a subtitle track labeled either “English for the hearing impaired” or, more recently, “SDH” (subtitled for the deaf and hard of hearing). Many popular Hollywood DVD-Videos can carry both subtitles and closed captions (e.g. Stepmom DVD by Columbia Pictures). On some DVDs, the Line 21 captions may contain the same text as the subtitles; on others, only the Line 21 captions include the additional non-speech information (even sometimes song lyrics) needed for deaf and hard-of-hearing viewers. European Region 2 DVDs do not carry Line 21 captions, and instead list the subtitle languages available. English is often listed twice: once as the representation of the dialogue alone, and once as a second subtitle set which carries additional information for the deaf and hard-of-hearing audience. (Many deaf/HOH subtitle files on DVDs are reworkings of original teletext subtitle files.)
HD DVD and Blu-ray disc media cannot carry any VBI data such as Line 21 closed captioning due to the design of the DVI-based High-Definition Multimedia Interface (HDMI) specifications, which were only extended for synchronized digital audio, replacing older analog standards such as VGA, S-Video, component video, and SCART. Both Blu-ray disc and HD DVD can use either PNG bitmap subtitles or ‘advanced subtitles’ to carry SDH-type subtitling, the latter being an XML-based textual format which includes font, styling and positioning information as well as a Unicode representation of the text. Advanced subtitling can also include additional media accessibility features such as “descriptive audio”.
There are several competing technologies used to provide captioning for movies in theaters. Cinema captioning falls into the categories of ‘open’ and ‘closed.’ The definition of “closed” captioning in this context is different from television, as it refers to any technology that allows as few as one member of the audience to view the captions.
Open captioning in a film theater can be accomplished through burned-in captions, projected text or bitmaps, or (rarely) a display located above or below the movie screen. Typically, this display is a large LED sign. In a digital theater, open caption display capability is built into the digital projector. Closed caption capability is also available, with the ability for 3rd-party closed caption devices to plug into the digital cinema server. Probably the best known closed captioning option for film theaters is the Rear Window Captioning System from the National Center for Accessible Media. Upon entering the theater, viewers requiring captions are given a panel of flat translucent glass or plastic on a gooseneck stalk, which can be mounted in front of the viewer’s seat. In the back of the theater is an LED display that shows the captions in mirror image. The panel reflects captions for the viewer but is nearly invisible to surrounding patrons. The panel can be positioned so that the viewer watches the movie through the panel, and captions appear either on or near the movie image. A company called Cinematic Captioning Systems has a similar reflective system called Bounce Back. A major problem for distributors has been that these systems are each proprietary, and require separate distributions to the theater to enable them to work. Proprietary systems also incur license fees.
For film projection systems, Digital Theater Systems, the company behind the DTS surround sound standard, has created a digital captioning device called the DTS-CSS (Cinema Subtitling System). It is a combination of a laser projector which places the captioning (words, sounds) anywhere on the screen and a thin playback device with a CD that holds many languages. If the Rear Window Captioning System is used, the DTS-CSS player is also required for sending caption text to the Rear Window sign located in the rear of the theater. Special effort has been made to build accessibility features into digital projection systems (see digital cinema). Through SMPTE, standards now exist that dictate how open and closed captions, as well as hearing-impaired and visually impaired narrative audio, are packaged with the rest of the digital movie. This eliminates the proprietary caption distributions required for film, and the associated royalties. SMPTE has also standardized the communication of closed caption content between the digital cinema server and 3rd-party closed caption systems (the CSP/RPL protocol). As a result, new, competitive closed caption systems for digital cinema are now emerging that will work with any standards-compliant digital cinema server. These newer closed caption devices include cupholder-mounted electronic displays and wireless glasses which display caption text in front of the wearer’s eyes. Bridge devices are also available to enable the use of Rear Window systems.
As of mid-2010, the remaining challenge to the wide introduction of accessibility in digital cinema is the industry-wide transition to SMPTE DCP, the standardized packaging method for very high quality, secure distribution of digital movies.
Captioning systems have also been adopted by some stadiums, typically through dedicated portions of their main scoreboards or as part of balcony fascia LED boards. These screens display captions of the public address announcer and other spoken content, such as those contained within in-game segments, public service announcements, and lyrics of songs played in-stadium. In some facilities, these systems were added as a result of discrimination lawsuits. Following a lawsuit under the Americans with Disabilities Act, FedEx Field added caption screens in 2006. After a similar lawsuit that declared special “deaf seating” areas with screen-mounted captioning, and later the use of smartphones to be insufficient due to the small size of text, University of Phoenix Stadium added dedicated caption displays in 2013.
Some stadiums utilize on-site captioners while others outsource them to external providers who caption remotely. A prominent provider of in-arena captioning systems is Good Sport Captioning, founded by Patti White of St. Louis. White had worked as a stenographer at a courthouse near where Busch Stadium was being constructed and reached a deal with the team to provide in-stadium captioning upon the stadium’s 2006 opening, conducting her work from her home. White later formed Good Sport Captioning to provide remote captioning for other teams and venues.
Closed captioning of video games is becoming more common. One of the first video game companies to feature closed captioning was Bethesda Softworks in their 1990 release of Hockey League Simulator and The Terminator 2029. Infocom also offered Zork Grand Inquisitor in 1997. Many games since then have at least offered subtitles for spoken dialog during cutscenes, and many include significant in-game dialog and sound effects in the captions as well; for example, with subtitles turned on in the Metal Gear Solid series of stealth games, not only are subtitles available during cutscenes, but any dialog spoken during real-time gameplay will be captioned as well, allowing players who can’t hear the dialog to know what enemy guards are saying and when the main character has been detected. Also, in many of developer Valve Corporation’s video games (such as Half-Life 2 or Left 4 Dead), when closed captions are activated, dialog and nearly all sound effects either made by the player or from other sources (e.g. gunfire, explosions) will be captioned. Video games don’t offer Line 21 captioning, which is decoded and displayed by the television itself, but rather a built-in subtitle display, more akin to that of a DVD. The game systems themselves have no role in the captioning either; each game must have its subtitle display programmed individually.
Reid Kimball, a game designer who is hearing impaired, is attempting to educate game developers about closed captioning for games. Kimball started the Games[CC] group to closed caption games and serve as a research and development team to aid the industry. He designed the Dynamic Closed Captioning system, writes articles and speaks at developer conferences. The first closed captioning project from Games[CC], called Doom3[CC], was nominated for an award as Best Doom3 Mod of the Year for IGDA’s Choice Awards 2006 show.
Online Video Streaming
Internet video streaming service YouTube offers captioning services in videos. The author of the video can upload a SubViewer (*.SUB), SubRip (*.SRT) or *.SBV file. As a beta feature, the site also added the ability to automatically transcribe and generate captioning on videos, with varying degrees of success based upon the content of the video. However, the automatic captioning is often inaccurate on videos with background music and exaggerated emotion in speaking. Variations in volume can also result in nonsensical machine-generated captions. Additional problems arise with strong accents, sarcasm, differing contexts, or homonyms. On June 30, 2010, YouTube announced a new “YouTube Ready” designation for professional caption vendors in the United States. The initial list included twelve companies who passed a caption quality evaluation administered by the Described and Captioned Media Project, have a website and a YouTube channel where customers can learn more about their services and have agreed to post rates for the range of services that they offer for YouTube content.
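To make the upload formats more concrete, the sketch below writes cues in the commonly used SubRip (.srt) layout: a running index, a line of start and end times in `HH:MM:SS,mmm` form joined by `-->`, the caption text, and a blank separator line. SubRip has no formal specification, so this follows the conventional layout; the function names are hypothetical and cues are assumed to be `(start_seconds, end_seconds, text)` tuples.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as the HH:MM:SS,mmm form used in .srt files."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"

def to_srt(cues) -> str:
    """Render (start, end, text) cues as SubRip text, numbering them from 1."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    # Blocks are separated by a blank line.
    return "\n".join(blocks)
```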
Flash video also supports captions via the Distribution Format Exchange Profile (DFXP) of the W3C Timed Text format. The latest Flash authoring software adds free player skins and caption components that enable viewers to turn captions on/off during playback from a web page. Previous versions of Flash relied on the Captionate third-party component and skin to caption Flash video. Custom Flash players designed in Flex can be tailored to support the timed-text exchange profile, Captionate .XML, or SAMI files (e.g. Hulu captioning). This is the preferred method for most US broadcast and cable networks that are mandated by the U.S. Federal Communications Commission to provide captioned on-demand content. The media encoding firms generally use software such as MacCaption to convert EIA-608 captions to this format. The Silverlight Media Framework also includes support for the timed-text exchange profile for both download and adaptive streaming media. Windows Media Video can support closed captions for both video-on-demand streaming and live streaming scenarios. Typically Windows Media captions use the SAMI file format but can also carry embedded closed caption data. QuickTime video supports raw 608 caption data via a proprietary closed caption track, which is just EIA-608 byte pairs wrapped in a QuickTime packet container with different IDs for the two line 21 fields. These captions can be turned on and off and appear in the same style as TV closed captions, with all the standard formatting (pop-on, roll-up, paint-on), and can be positioned and split anywhere on the video screen. QuickTime closed caption tracks can be viewed in Macintosh or Windows versions of QuickTime Player, iTunes (via QuickTime), iPod Nano, iPod Classic, iPod Touch, iPhone, and iPad.
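For readers unfamiliar with the W3C timed text format, here is a minimal sketch that builds a bare-bones document with Python's standard `xml.etree.ElementTree`. It assumes the namespace URI of the published TTML Recommendation (older DFXP drafts used different URIs), uses offset times such as `2.500s`, and omits styling and layout entirely; the function name `to_timed_text` is hypothetical, and a real player may require more than this skeleton.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace of the W3C TTML Recommendation; DFXP drafts differed.
TTML_NS = "http://www.w3.org/ns/ttml"
XML_NS = "http://www.w3.org/XML/1998/namespace"

def to_timed_text(cues, lang="en") -> str:
    """Render (start_seconds, end_seconds, text) cues as a minimal tt document."""
    ET.register_namespace("", TTML_NS)  # serialize TTML as the default namespace
    tt = ET.Element(f"{{{TTML_NS}}}tt")
    tt.set(f"{{{XML_NS}}}lang", lang)
    body = ET.SubElement(tt, f"{{{TTML_NS}}}body")
    div = ET.SubElement(body, f"{{{TTML_NS}}}div")
    for start, end, text in cues:
        p = ET.SubElement(div, f"{{{TTML_NS}}}p",
                          begin=f"{start:.3f}s", end=f"{end:.3f}s")
        p.text = text
    return ET.tostring(tt, encoding="unicode")
```

Each cue becomes one `p` element whose `begin` and `end` attributes carry the timing, which is the core of what a caption component reads back at playback time.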
Closed-Captioning in Theatre
Closed-Captioning on Telephone
A captioned telephone is a telephone that displays real-time captions of the current conversation. The captions are typically displayed on a screen embedded into the telephone base.
Closed-Captioning for Media Monitoring Services
In the United States especially, most media monitoring services capture and index closed captioning text from news and public affairs programs, allowing them to search the text for client references. The use of closed captioning for television news monitoring was pioneered by Universal Press Clipping Bureau (Universal Information Services) in 1992, and later in 1993 by Tulsa-based NewsTrak of Oklahoma (later known as Broadcast News of Mid-America, acquired by video news release pioneer Medialink Worldwide Incorporated in 1997). US patent 7,009,657 describes a “method and system for the automatic collection and conditioning of closed caption text originating from multiple geographic locations” as used by news monitoring services.
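The capture-index-search loop described above can be sketched with a small inverted index. This is an illustrative toy rather than how any named monitoring service works: the record layout `(station, timestamp, text)`, the function names, and the simple word tokenizer are all assumptions.

```python
import re
from collections import defaultdict

_WORD = re.compile(r"[a-z0-9']+")

def build_index(captions):
    """Map each lowercase word to the positions of the captions containing it."""
    index = defaultdict(set)
    for position, (_station, _timestamp, text) in enumerate(captions):
        for word in _WORD.findall(text.lower()):
            index[word].add(position)
    return index

def find_mentions(index, captions, phrase):
    """Return captions containing the phrase, using the index to narrow the scan."""
    words = _WORD.findall(phrase.lower())
    if not words:
        return []
    # Only captions containing every word of the phrase can match.
    candidates = set.intersection(*(index.get(w, set()) for w in words))
    needle = phrase.lower()
    return [captions[i] for i in sorted(candidates)
            if needle in captions[i][2].lower()]
```

Because broadcast captions are often in capitals, both the index and the query are lowercased before comparison.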
Closed-Captioning for Conversations
Software programs are now available that automatically generate closed captioning of conversations. Examples of such conversations include discussions in conference rooms, classroom lectures, and religious services.
Closed-Captioning for Non-Linear Video Editing Systems
In 2010, Vegas Pro, the professional non-linear editor, was updated to support importing, editing, and delivering CEA-608 closed captions. Vegas Pro 10, released on October 11, 2010, added several enhancements to the closed captioning support. TV-like CEA-608 closed captioning can now be displayed as an overlay when played back in the Preview and Trimmer windows, making it easy to check placement, edits, and timing of CC information. CEA-708-style closed captioning is automatically created when the CEA-608 data is created. Line 21 closed captioning is now supported, as well as HD-SDI closed captioning capture and print from AJA and Blackmagic Design cards. Line 21 support provides a workflow for existing legacy media. Other improvements include increased support for multiple closed captioning file types, as well as the ability to export closed caption data for DVD Architect, YouTube, RealPlayer, QuickTime, and Windows Media Player. In mid-2009, Apple released Final Cut Pro version 7 and began support for inserting closed caption data into SD and HD tape masters via FireWire and compatible video capture cards. Up until this time, it was not possible for video editors to insert caption data with both CEA-608 and CEA-708 into their tape masters. The typical workflow included first printing the SD or HD video to a tape and sending it to a professional closed caption service company that had a stand-alone closed caption hardware encoder.
This new closed captioning workflow, known as e-Captioning, involves making a proxy video from the non-linear system to import into third-party non-linear closed captioning software. Once the closed captioning software project is completed, it must export a closed caption file compatible with the non-linear editing system. In the case of Final Cut Pro 7, three different file formats can be accepted: a .SCC file (Scenarist Closed Caption file) for standard-definition video, a QuickTime 608 closed caption track (a special 608 coded track in the .mov file wrapper) for standard-definition video, and finally a QuickTime 708 closed caption track (a special 708 coded track in the .mov file wrapper) for high-definition video output. Alternatively, Matrox video systems devised another mechanism for inserting closed caption data by allowing the video editor to include CEA-608 and CEA-708 in a discrete audio channel on the video editing timeline. This allows real-time preview of the captions while editing and is compatible with Final Cut Pro 6 and 7. Other non-linear editing systems indirectly support closed captioning only in standard-definition line 21. Video files on the editing timeline must be composited with a line 21 VBI graphic layer known in the industry as a “blackmovie” with closed caption data. Alternatively, video editors working with the DV25 and DV50 FireWire workflows must encode their DV .avi or .mov file with VAUX data which includes CEA-608 closed caption data.
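To give a feel for what an .SCC file contains, the sketch below reads one data line. It relies on details not stated in the text above and should be read as assumptions: that each line holds a timecode followed by whitespace-separated four-digit hex words, that each word is one CEA-608 byte pair, and that each byte carries an odd-parity bit in its top position. The function names are hypothetical, and mapping bytes straight to ASCII is only approximate because a few CEA-608 character codes differ from ASCII.

```python
def has_odd_parity(byte: int) -> bool:
    """True when the byte has an odd number of 1 bits."""
    return bin(byte).count("1") % 2 == 1

def parse_scc_line(line: str):
    """Split one SCC data line into its timecode and parity-stripped byte pairs."""
    timecode, payload = line.strip().split(None, 1)
    pairs = []
    for word in payload.split():
        hi, lo = int(word[:2], 16), int(word[2:], 16)
        if not (has_odd_parity(hi) and has_odd_parity(lo)):
            raise ValueError(f"parity error in {word!r}")
        pairs.append((hi & 0x7F, lo & 0x7F))  # drop the parity bit
    return timecode, pairs

def printable_text(pairs) -> str:
    """Collect displayable characters, skipping control pairs and null padding."""
    chars = []
    for hi, lo in pairs:
        if hi >= 0x20:            # a first byte below 0x20 marks a control pair
            chars.append(chr(hi))
            if lo >= 0x20:        # 0x00 is padding after an odd-length word
                chars.append(chr(lo))
    return "".join(chars)
```

With the sample line `00:00:01:00	9420 9420 c8e5 ecec ef80 942f 942f`, the words `c8e5 ecec ef80` strip down to the bytes for "Hello", while the `94xx` words are control pairs that the text extraction skips.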
The current and most familiar logo for closed captioning consists of two Cs (for “closed captioned”) inside a television screen. It was created by WGBH. The other logo, trademarked by the National Captioning Institute, is a simple geometric rendering of a television set merged with the tail of a speech balloon; two such versions exist, one with a tail on the left and the other with a tail on the right.