
    WebVTT

    Introduced in HTML5

    WebVTT is a format for displaying timed text tracks (e.g. subtitles) with the <track> element. The primary purpose of WebVTT files is to add subtitles to a <video>.

    WebVTT is a text-based format. A WebVTT file must be encoded in UTF-8. Wherever a space is allowed, a tab may be used instead.

    The MIME type of WebVTT is text/vtt.

     

    WebVTT Body

    A WebVTT file has two required components and four optional ones, in the following order:

    • An optional byte order mark (BOM)
    • The string WEBVTT
    • An optional text header to the right of WEBVTT.
      • There must be at least one space after WEBVTT
      • You might use this to add a description to the file
      • You may use anything except newlines or the string "-->"
    • A blank line, which is equivalent to two consecutive newlines.
    • Zero or more cues or comments.
    • Zero or more blank lines.
    Example 1 - Simplest possible WebVTT file
      WEBVTT
    
    
    Example 2 - Very simple WebVTT file
      WEBVTT - This file has no cues.
    
    
    Example 3 - Common WebVTT example
      WEBVTT - This file has cues.
    
      14
      00:01:14.815 --> 00:01:18.114
      - What?
      - Where are we now?
    
      15
      00:01:18.171 --> 00:01:20.991
      - This is big bat country.
    
      16
      00:01:21.058 --> 00:01:23.868
      - [ Bats Screeching ]
      - They won't get in your hair. They're after the bugs.
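
    The rules for the first line of the file can be sketched as a small check. This is a minimal illustration that only looks at the signature line; the function name is_valid_signature is hypothetical and not part of any WebVTT library.

```python
BOM = "\ufeff"  # optional byte order mark, as it appears after UTF-8 decoding


def is_valid_signature(first_line: str) -> bool:
    """Check the first line of a WebVTT file against the rules listed above."""
    line = first_line.rstrip("\r\n")
    if line.startswith(BOM):
        line = line[len(BOM):]
    if not line.startswith("WEBVTT"):
        return False
    rest = line[len("WEBVTT"):]
    if rest == "":
        return True  # bare signature, no header text
    # Header text must be separated from WEBVTT by a space or a tab.
    if rest[0] not in (" ", "\t"):
        return False
    # The header text may contain anything except the string "-->".
    return "-->" not in rest


print(is_valid_signature("WEBVTT - This file has cues."))  # True
print(is_valid_signature("WEBVTTX"))                       # False
```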
    

     

    WebVTT Comment

    Comments are an optional component that can be used to add information to a WebVTT file. Comments are intended for those reading the file and are not seen by users. A comment may contain newlines, but it cannot contain a blank line (two consecutive newlines), because a blank line signifies the end of the comment.

    A comment cannot contain the string "-->", the ampersand character (&), or the less-than sign (<). Instead use the escape sequence "&amp;" for ampersand and "&lt;" for less-than. It is also recommended that you use the greater-than escape sequence "&gt;" instead of the greater-than character (>) to avoid confusion with tags.

    A comment consists of three parts:

    • The string NOTE
    • A space or a newline
    • Zero or more characters other than those noted above
    Example 4 - Single-line comment
      NOTE This is a comment
    
    Example 5 - Multi-line comment
      NOTE
      Another comment that is spanning
      more than one line.
    
      NOTE You can also make a comment
      across more than one line this way.
    
    Example 6 - Common comment usage
      WEBVTT - Translation of that film I like
    
      NOTE
      This translation was done by Kyle so that
      some friends can watch it with their parents.
    
      1
      00:02:15.000 --> 00:02:20.000
      - Ta en kopp varmt te.
      - Det är inte varmt.
    
      2
      00:02:20.000 --> 00:02:25.000
      - Har en kopp te.
      - Det smakar som te.  
    
      NOTE This last line may not translate well.
    
      3
      00:02:25.000 --> 00:02:30.000
      -Ta en kopp.
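
    Because blank lines separate the header, comments, and cues, a file can be split into blocks and each block classified. The sketch below assumes that a blank line is exactly an empty line (lines holding only spaces are not handled); split_blocks and is_comment are hypothetical helper names.

```python
import re
from typing import List


def split_blocks(text: str) -> List[str]:
    """Split a WebVTT document into blocks separated by one or more blank lines."""
    normalized = text.replace("\r\n", "\n").replace("\r", "\n")
    return [b for b in re.split(r"\n{2,}", normalized.strip("\n")) if b]


def is_comment(block: str) -> bool:
    """A comment is the string NOTE followed by a space, tab, newline, or nothing."""
    return re.match(r"NOTE(?:[ \t\n]|$)", block) is not None


sample = """WEBVTT - Translation of that film I like

NOTE
This translation was done by Kyle.

1
00:02:15.000 --> 00:02:20.000
- Ta en kopp varmt te.

NOTE This last line may not translate well.

2
00:02:25.000 --> 00:02:30.000
-Ta en kopp.
"""

blocks = split_blocks(sample)
body = blocks[1:]  # everything after the header block
comments = [b for b in body if is_comment(b)]
cues = [b for b in body if not is_comment(b)]
print(len(comments), len(cues))  # 2 2
```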
    

     

    WebVTT Cues

    A cue is a single subtitle block that has a single start time, end time, and textual payload. Example 6 consists of the header, a blank line, and then five blocks separated by blank lines: three cues and two comments. A cue consists of five components:

    • An optional cue identifier followed by a newline
    • Cue timings
    • Optional cue settings with at least one space before the first and between each setting
    • One or more newlines
    • The cue payload text
    Example 7 - Example of a cue
    1 - Title Crawl
    00:00:05.000 --> 00:00:10.000 line:0 position:20% size:60% align:start
    Some time ago in a place rather distant....
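
    The five components of a cue can be pulled apart with a few string operations. This is a rough sketch, not a conforming parser: it does not validate timestamps or setting values, and the Cue class and parse_cue function are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Cue:
    identifier: Optional[str]
    start: str
    end: str
    settings: Dict[str, str]
    payload: str


def parse_cue(block: str) -> Cue:
    """Split one cue block into identifier, timings, settings, and payload."""
    lines = block.strip("\n").split("\n")
    identifier = None
    if "-->" not in lines[0]:
        # A first line without "-->" is the optional cue identifier.
        identifier = lines[0]
        lines = lines[1:]
    if not lines or "-->" not in lines[0]:
        raise ValueError("cue block has no timing line")
    start, _, rest = lines[0].partition("-->")
    parts = rest.split()
    if not parts:
        raise ValueError("cue timing has no end time")
    end, raw_settings = parts[0], parts[1:]
    # Each setting is name:value; settings are separated by spaces or tabs.
    settings = dict(item.split(":", 1) for item in raw_settings)
    return Cue(identifier, start.strip(), end, settings, "\n".join(lines[1:]))


cue = parse_cue(
    "1 - Title Crawl\n"
    "00:00:05.000 --> 00:00:10.000 line:0 position:20% size:60% align:start\n"
    "Some time ago in a place rather distant...."
)
print(cue.identifier)  # 1 - Title Crawl
print(cue.settings["size"])  # 60%
```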

     

    Cue Identifier

    The identifier is a name that identifies the cue. It can be used to reference the cue from a script. It must not contain a newline and cannot contain the string "-->". It must end with a single newline. Identifiers do not have to be unique, although it is common to number them (e.g. 1, 2, 3, ...).

    Example 8 - Cue identifier from Example 7
    1 - Title Crawl
    Example 9 - Common usage of identifiers
    WEBVTT
    
    1
    00:00:22.230 --> 00:00:24.606
    This is the first subtitle.
    
    2
    00:00:30.739 --> 00:00:34.074
    This is the second.
    
    3
    00:00:34.159 --> 00:00:35.743
    Third
    

     

    Cue Timings

    A cue timing indicates when the cue is shown. It has a start and end time which are represented by timestamps. The end time must be greater than the start time, and the start time must be greater than or equal to all previous start times. Cues may have overlapping timings.

    If the WebVTT file is being used for chapters (the <track> element's kind attribute is chapters), then the file cannot have overlapping timings.

    Each cue timing contains five components:

    • Timestamp for start time
    • At least one space
    • The string "-->"
    • At least one space
    • Timestamp for end time
      • Which must be greater than the start time

    The timestamps must be in one of two formats:

    • mm:ss.ttt
    • hh:mm:ss.ttt

    Where the components are defined as follows:

    • hh is hours
      • Must be at least two digits; may be omitted when the value is 00 (the mm:ss.ttt format)
      • Hours can be greater than two digits (e.g. 9999:00:00.000)
    • mm is minutes
      • Must be between 00 and 59 inclusive
    • ss is seconds
      • Must be between 00 and 59 inclusive
    • ttt is milliseconds
      • Must be between 000 and 999 inclusive
    Example 10 - Basic cue timing examples
    00:22.230 --> 00:24.606
    00:30.739 --> 00:00:34.074
    00:00:34.159 --> 00:35.743
    00:00:35.827 --> 00:00:40.122
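
    The two timestamp formats can be matched with a single regular expression in which the hours part is optional. The sketch below converts a timestamp to integer milliseconds so that start and end times can be compared exactly; to_milliseconds is a hypothetical helper name.

```python
import re

# Optional hours (two or more digits), then mm:ss.ttt with mm and ss in 00-59.
TIMESTAMP = re.compile(r"(?:(\d{2,}):)?([0-5]\d):([0-5]\d)\.(\d{3})")


def to_milliseconds(timestamp: str) -> int:
    """Convert mm:ss.ttt or hh:mm:ss.ttt to a number of milliseconds."""
    match = TIMESTAMP.fullmatch(timestamp)
    if match is None:
        raise ValueError(f"not a valid WebVTT timestamp: {timestamp!r}")
    hours, minutes, seconds, millis = match.groups()
    total_seconds = int(hours or 0) * 3600 + int(minutes) * 60 + int(seconds)
    return total_seconds * 1000 + int(millis)


print(to_milliseconds("00:22.230"))     # 22230
print(to_milliseconds("00:01:14.815"))  # 74815
```

    Note that a one-digit seconds field such as 00:00:5.000 is rejected, because each of mm and ss must be exactly two digits.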
    Example 11 - Overlapping cue timing examples
    00:00:00.000 --> 00:00:10.000
    00:00:05.000 --> 00:01:00.000
    00:00:30.000 --> 00:00:50.000
    Example 12 - Non-overlapping cue timing examples
    00:00:00.000 --> 00:00:10.000
    00:00:10.000 --> 00:01:00.581
    00:01:00.581 --> 00:02:00.100
    00:02:01.000 --> 00:02:02.000
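
    The ordering rules for cue timings, including the stricter no-overlap rule for chapters, can be checked in one pass. This is a sketch that assumes the timings have already been converted to (start, end) pairs in milliseconds; check_timings is a hypothetical name.

```python
from typing import List, Optional, Tuple


def check_timings(timings: List[Tuple[int, int]], allow_overlap: bool = True) -> List[str]:
    """Return a list of problems found in cue timings given in file order."""
    problems = []
    latest_start: Optional[int] = None
    latest_end: Optional[int] = None
    for index, (start, end) in enumerate(timings):
        if end <= start:
            problems.append(f"cue {index}: end time is not greater than start time")
        if latest_start is not None and start < latest_start:
            problems.append(f"cue {index}: starts before an earlier cue")
        # Chapter files must not contain overlapping cues.
        if not allow_overlap and latest_end is not None and start < latest_end:
            problems.append(f"cue {index}: overlaps an earlier cue")
        latest_start = start if latest_start is None else max(latest_start, start)
        latest_end = end if latest_end is None else max(latest_end, end)
    return problems


overlapping = [(0, 10000), (5000, 60000), (30000, 50000)]
print(check_timings(overlapping))                             # []
print(len(check_timings(overlapping, allow_overlap=False)))   # 2
```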

     

    Cue Settings

    Cue settings are optional components used to position where the cue payload text will be displayed over the video. This includes whether the text is displayed horizontally or vertically. There can be zero or more of them, and they can be used in any order so long as each setting is used no more than once.

    The cue settings are added to the right of the cue timings. There must be one or more spaces between the cue timing and the first setting and between each setting. A setting's name and value are separated by a colon. The settings are case-sensitive, so use lower case as shown. There are five cue settings:

    • vertical
      • Indicates that the text will be displayed vertically rather than horizontally, such as in some Asian languages.
      Table 1 - vertical values
      vertical:rl    writing direction is right to left
      vertical:lr    writing direction is left to right
    • line
      • Specifies where text appears vertically. If vertical is set, line specifies where text appears horizontally.
      • Value can be a line number
        • The line height is the height of the first line of the cue as it appears on the video
        • Positive numbers indicate top down
        • Negative numbers indicate bottom up
      • Or value can be a percentage
        • Must be an integer (i.e. no decimals) between 0 and 100 inclusive
        • Must be followed by a percent sign (%)
      Table 2 - line examples
                   vertical omitted   vertical:rl   vertical:lr
      line:0       top                right         left
      line:-1      bottom             left          right
      line:0%      top                right         left
      line:100%    bottom             left          right
    • position
      • Specifies where the text will appear horizontally. If vertical is set, position specifies where the text will appear vertically.
      • Value is a percentage
      • Must be an integer (no decimals) between 0 and 100 inclusive
      • Must be followed by a percent sign (%)
      Table 3 - position examples
                       vertical omitted   vertical:rl   vertical:lr
      position:0%      left               top           top
      position:100%    right              bottom        bottom
    • size
      • Specifies the width of the text area. If vertical is set, size specifies the height of the text area.
      • Value is a percentage
      • Must be an integer (i.e. no decimals) between 0 and 100 inclusive
      • Must be followed by a percent sign (%)
      Table 4 - size examples
                   vertical omitted   vertical:rl   vertical:lr
      size:100%    full width         full height   full height
      size:50%     half width         half height   half height
    • align
      • Specifies the alignment of the text. Text is aligned within the space given by the size cue setting if it is set.
      Table 5 - align values
                      vertical omitted       vertical:rl          vertical:lr
      align:start     left                   top                  top
      align:middle    centred horizontally   centred vertically   centred vertically
      align:end       right                  bottom               bottom
    Example 13 - Cue setting examples

    The first line demonstrates no settings. The second line might be used to overlay text on a sign or label. The third line might be used for a title. The last line might be used for an Asian language.

    00:00:05.000 --> 00:00:10.000
    00:00:05.000 --> 00:00:10.000 line:63% position:72% align:start
    00:00:05.000 --> 00:00:10.000 line:0 position:20% size:60% align:start
    00:00:05.000 --> 00:00:10.000 vertical:rl line:-1 align:end
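
    The constraints on the five settings can be collected into a small validator. The sketch below accepts only the names and values described in this article (other values may exist elsewhere and would be rejected here); validate_settings is a hypothetical name.

```python
import re
from typing import Dict

PERCENT = re.compile(r"(\d+)%")


def _is_percentage(value: str) -> bool:
    """True for an integer from 0 to 100 followed by a percent sign."""
    match = PERCENT.fullmatch(value)
    return match is not None and 0 <= int(match.group(1)) <= 100


def validate_settings(text: str) -> Dict[str, str]:
    """Parse the settings part of a cue timing line, raising ValueError on errors."""
    settings: Dict[str, str] = {}
    for item in text.split():
        name, sep, value = item.partition(":")
        if not sep or not value:
            raise ValueError(f"malformed setting: {item!r}")
        if name in settings:
            raise ValueError(f"setting used more than once: {name!r}")
        if name == "vertical":
            ok = value in ("rl", "lr")
        elif name == "line":
            # Either a line number (may be negative) or a percentage.
            ok = _is_percentage(value) or re.fullmatch(r"-?\d+", value) is not None
        elif name in ("position", "size"):
            ok = _is_percentage(value)
        elif name == "align":
            ok = value in ("start", "middle", "end")
        else:
            raise ValueError(f"unknown setting: {name!r}")  # names are case-sensitive
        if not ok:
            raise ValueError(f"invalid value for {name}: {value!r}")
        settings[name] = value
    return settings


print(validate_settings("vertical:rl line:-1 align:end"))
```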
    

     

    Cue Payload

    The payload is where the main information or content is located. In normal usage the payload contains the subtitles to be displayed. The payload text may contain newlines but it cannot contain a blank line, which is equivalent to two consecutive newlines. A blank line signifies the end of a cue.

    A cue text payload cannot contain the string "-->", the ampersand character (&), or the less-than sign (<). Instead use the escape sequence "&amp;" for ampersand and "&lt;" for less-than. It is also recommended that you use the greater-than escape sequence "&gt;" instead of the greater-than character (>) to avoid confusion with tags. If you are using the WebVTT file for metadata these restrictions do not apply.

    In addition to the three escape sequences mentioned above, there are three others. All six are listed in the table below.

    Table 6 - Escape sequences
    Name                  Character   Escape Sequence
    Ampersand             &           &amp;
    Less-than             <           &lt;
    Greater-than          >           &gt;
    Left-to-right mark    U+200E      &lrm;
    Right-to-left mark    U+200F      &rlm;
    Non-breaking space    U+00A0      &nbsp;
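
    Escaping plain text before writing it into a cue payload is a matter of string replacement. In this sketch the ampersand is replaced first so that the ampersands introduced by the other escape sequences are not escaped again; escape_cue_text is a hypothetical name.

```python
ESCAPES = [
    ("&", "&amp;"),       # must come first to avoid double escaping
    ("<", "&lt;"),
    (">", "&gt;"),
    ("\u200e", "&lrm;"),  # left-to-right mark
    ("\u200f", "&rlm;"),  # right-to-left mark
    ("\u00a0", "&nbsp;"), # non-breaking space
]


def escape_cue_text(text: str) -> str:
    """Replace characters that have a WebVTT escape sequence."""
    for character, sequence in ESCAPES:
        text = text.replace(character, sequence)
    return text


print(escape_cue_text("Tom & Jerry <3"))  # Tom &amp; Jerry &lt;3
```

    Escaping the greater-than sign also removes any occurrence of the forbidden string "-->" from the text.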

     

    Cue Payload Text Tags

    There are a number of tags, such as <b>, that can be used in the cue payload. However, if the WebVTT file is used in a <track> element whose kind attribute is chapters, then you cannot use tags.

    • Timestamp tag
      • The timestamp must be greater than the cue's start timestamp, greater than any previous timestamp in the cue payload, and less than the cue's end timestamp. The active text is the text between the timestamp and the next timestamp, or to the end of the payload if there is no other timestamp in the payload. Any text before the active text in the payload is previous text. Any text beyond the active text is future text. This enables karaoke-style captions.
      Example 12 - Karaoke style text
      1
      00:16.500 --> 00:18.500
      When the moon <00:17.500>hits your eye
      
      1
      00:00:18.500 --> 00:00:20.500
      Like a <00:19.000>big-a <00:19.500>pizza <00:20.000>pie
      
      1
      00:00:20.500 --> 00:00:21.500
      That's <00:00:21.000>amore
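
    A payload with timestamp tags can be broken into timed segments, from which the previous, active, and future text follow for any playback time. The sketch below only does the splitting and does not check that the timestamps are increasing or lie within the cue; split_karaoke is a hypothetical name.

```python
import re
from typing import List, Optional, Tuple

# A timestamp tag is a WebVTT timestamp wrapped in angle brackets.
TIMESTAMP_TAG = re.compile(r"<((?:\d{2,}:)?[0-5]\d:[0-5]\d\.\d{3})>")


def split_karaoke(payload: str) -> List[Tuple[Optional[str], str]]:
    """Return (timestamp, text) pairs; the first timestamp is None for leading text."""
    # re.split with one capturing group yields [text, stamp, text, stamp, text, ...]
    parts = TIMESTAMP_TAG.split(payload)
    segments: List[Tuple[Optional[str], str]] = []
    if parts[0]:
        segments.append((None, parts[0]))
    for i in range(1, len(parts), 2):
        segments.append((parts[i], parts[i + 1]))
    return segments


for stamp, text in split_karaoke("Like a <00:19.000>big-a <00:19.500>pizza <00:20.000>pie"):
    print(stamp, repr(text))
```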
            

     

    The following tags require opening and closing tags (e.g. <b>text</b>).

    • Class tag (<c></c>)
      • Style the contained text using a CSS class.
      Example 14 - Class tag
      <c.classname>text</c>
    • Italics tag (<i></i>)
      • Italicize the contained text.
      Example 15 - Italics tag
      <i>text</i>
    • Bold tag (<b></b>)
      • Bold the contained text.
      Example 16 - Bold tag
      <b>text</b>
    • Underline tag (<u></u>)
      • Underline the contained text.
      Example 17 - Underline tag
      <u>text</u>
    • Ruby tag (<ruby></ruby>)
      • Used with ruby text tags to display ruby characters (i.e. small annotative characters above other characters).
      Example 18 - Ruby tag
      <ruby>WWW<rt>World Wide Web</rt>oui<rt>yes</rt></ruby>
    • Ruby text tag (<rt></rt>)
      • Used with ruby tags to display ruby characters (i.e. small annotative characters above other characters).
      Example 19 - Ruby text tag
      <ruby>WWW<rt>World Wide Web</rt>oui<rt>yes</rt></ruby>
    • Voice tag (<v></v>)
      • Similar to class tag, also used to style the contained text using CSS.
      Example 20 - Voice tag
      <v Bob>text</v>
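
    For uses such as search indexing or a transcript view, it can be handy to reduce a payload to plain text and to list the speakers named in voice tags. This is a simplified sketch based on regular expressions rather than a real tag parser, and it does not decode escape sequences; plain_text and voices are hypothetical names.

```python
import re
from typing import List

ANY_TAG = re.compile(r"</?[^>]*>")
# <v Name> or <v.class Name>: capture the annotation after the tag name.
VOICE_TAG = re.compile(r"<v(?:\.[^\s>]+)?[ \t]+([^>]+)>")


def plain_text(payload: str) -> str:
    """Remove every tag (including timestamp tags) from a cue payload."""
    return ANY_TAG.sub("", payload)


def voices(payload: str) -> List[str]:
    """List the speaker names that appear in voice tags, in order."""
    return VOICE_TAG.findall(payload)


payload = "<v Bob>Hello <i>there</i></v>"
print(plain_text(payload))  # Hello there
print(voices(payload))      # ['Bob']
```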

     

    Compatibility

    Feature         Chrome       Firefox (Gecko)            Internet Explorer   Opera          Safari
    Basic support   18 onwards   24 (disabled by default)   10 onwards          15.0 onwards   7 onwards

    Feature         Android       Firefox Mobile (Gecko)   Chrome for Mobile   Opera Mobile   Safari Mobile
    Basic support   4.4 onwards   None till now            35.0 onwards        21.0 onwards   7 onwards

     

     

    Specifications

    Document Tags and Contributors

    Last updated by: crazyformozilla01,