Codecs in common media types

At a fundamental level, you can specify the type of a media file using a simple MIME type, such as video/mp4 or audio/mpeg. However, many media types—especially those that support video tracks—can benefit from the ability to more precisely describe the format of the data within them. For instance, just describing a video in an MPEG-4 file with the MIME type video/mp4 doesn't say anything about what format the actual media within takes.

For that reason, the codecs parameter can be added to the MIME type describing media content. With it, container-specific information can be provided. This information may include things like the profile of the video codec, the type used for the audio tracks, and so forth.

This guide briefly examines the syntax of the media type codecs parameter and how it's used with the MIME type string to provide details about the contents of audio or video media beyond indicating the container type.

Container format MIME types

The MIME type for a container format is expressed by stating the type of media (audio, video, etc.), then a slash character (/), then the format used to contain the media:

audio/mpeg

An audio file using the MPEG file type, such as an MP3.

video/ogg

A video file using the Ogg file type.

video/mp4

A video file using the MPEG-4 file type.

video/quicktime

A video file in Apple's QuickTime format. As noted elsewhere, this format was once commonly used on the web but no longer is, since it required a plugin to use.

However, each of these MIME types is vague. All of these file types support a variety of codecs, and those codecs may have any number of profiles, levels, and other configuration factors. For this reason, you may want to include the codecs parameter along with the media type.

Basic syntax

You can add the codecs parameter to the media type. To do so, append a semicolon (;) followed by codecs= and then the string describing the format of the contents of the file. Some media types only let you specify the names of the codecs to use, while others allow you to specify various constraints on those codecs as well. You can specify multiple codecs by separating them with commas.

audio/ogg; codecs=vorbis

An Ogg file containing a Vorbis audio track.

video/webm; codecs="vp8, vorbis"

A WebM file containing VP8 video and/or Vorbis audio.

video/mp4; codecs="avc1.4d002a"

An MPEG-4 file containing AVC (H.264) video, Main Profile, Level 4.2.
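As a minimal sketch of the syntax described above, the following helper joins a container MIME type and a list of codec strings into one media type string. The function name `buildMediaType` is hypothetical and not part of any browser API; it assumes the codec strings need no percent-encoding.

```javascript
// Hypothetical helper (not a browser API): joins a container MIME type and
// a list of codec strings into a single media type string.
function buildMediaType(containerType, codecs = []) {
  if (codecs.length === 0) {
    // No codecs parameter: return the bare container type.
    return containerType;
  }
  // Quote the value, since a comma-separated list contains characters that
  // are not allowed in an unquoted MIME parameter value.
  return `${containerType}; codecs="${codecs.join(", ")}"`;
}

console.log(buildMediaType("video/webm", ["vp8", "vorbis"]));
// video/webm; codecs="vp8, vorbis"
console.log(buildMediaType("video/mp4", ["avc1.4d002a"]));
// video/mp4; codecs="avc1.4d002a"
```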

As is the case with any MIME type parameter, codecs must be changed to codecs* (note the asterisk character, *) if any of the properties of the codec use special characters which must be percent-encoded per RFC 2231, section 4: MIME Parameter Value and Encoded Word Extensions. You can use the JavaScript encodeURI() function to encode the parameter list; similarly, you can use decodeURI() to decode a previously encoded parameter list.

Note: When the codecs parameter is used, the specified list of codecs must include every codec used for the contents of the file. The list may also contain codecs not present in the file.

Codec options by container

The containers below support extended codec options in their codecs parameters:

3GP
AV1
ISO BMFF
MPEG-4
QuickTime
WebM

Several of these containers are covered by the same section below; that's because those media types are all based on ISO Base Media File Format (ISO BMFF), so they share the same syntax.

AV1

The syntax of the codecs parameter for AV1 is defined in the AV1 Codec ISO Media File Format Binding specification, section 5: Codecs Parameter String.

av01.P.LLT.DD[.M.CCC.cp.tc.mc.F]

Note: Chromium-based browsers will accept any subset of the optional parameters (rather than all or none, as required by the specification).

This codec parameter string's components are described in more detail in the table below. Each component is a fixed number of characters long; if the value is less than that length, it must be padded with leading zeros.

AV1 codec parameter string components
Component Details
P

The one-digit profile number:

AV1 profile numbers
Profile number Description
0 "Main" profile; supports YUV 4:2:0 or monochrome bitstreams with bit depth of 8 or 10 bits per component.
1 "High" profile adds support for 4:4:4 chroma subsampling.
2 "Professional" profile adds support for 4:2:2 chroma subsampling and 12-bit per component color.
LL The two-digit level number, which is converted to the X.Y level format, where X = 2 + (LL >> 2) and Y = LL & 3. See Appendix A, section 3 in the AV1 Specification for details.
T The one-character tier indicator. For the Main tier (seq_tier equals 0), this character is the letter M. For the High tier (seq_tier is 1), this character is the letter H. The High tier is only available for level 4.0 and up.
DD The two-digit component bit depth. This value must be one of 8, 10, or 12; which values are valid varies depending on the profile and other properties.
M The one-digit monochrome flag; if this is 0, the video includes the U and V planes in addition to the Y plane. Otherwise, the video data is entirely in the Y plane and is therefore monochromatic. See [YUV](/en-US/docs/Web/Media/Formats/Video_concepts#yuv) for details on how the YUV color system works. The default value is 0 (not monochrome).
CCC

CCC indicates the chroma subsampling as three digits. The first digit is subsampling_x, the second is subsampling_y. If both of those are 1, the third is the value of chroma_sample_position; otherwise, the third digit is always 0. This, combined with the M component, can be used to construct the chroma subsampling format:

Determining the chroma subsampling format
subsampling_x subsampling_y Monochrome flag Chroma subsampling format
0 0 0 YUV 4:4:4
1 0 0 YUV 4:2:2
1 1 0 YUV 4:2:0
1 1 1 YUV 4:0:0 (Monochrome)

The third digit in CCC indicates the chroma sample position, with a value of 0 indicating that the position is unknown and must be separately provided during decoding; a value of 1 indicating that the sample position is horizontally collocated with the (0, 0) luma sample; and a value of 2 indicating that the sample position is collocated with (0, 0) luma.

The default value is 110 (4:2:0 chroma subsampling).

cp The two-digit color_primaries value indicates the color system used by the media. For example, BT.2020/BT.2100 color, as used for HDR video, is 09. The information for this—and for each of the remaining components—is found in the Color config semantics section of the AV1 specification. The default value is 01 (ITU-R BT.709).
tc The two-digit transfer_characteristics value. This value defines the function used to map the gamma (delightfully called the "opto-electrical transfer function" in technical parlance) from the source to the display. For example, 10-bit BT.2020 is 14. The default value is 01 (ITU-R BT.709).
mc The two-digit matrix_coefficients constant selects the matrix coefficients used to convert the red, blue, and green channels into luma and chroma signals. For example, the standard coefficients used for BT.709 are indicated using the value 01. The default value is 01 (ITU-R BT.709).
F A one-digit flag indicating whether the color should be allowed to use the full range of possible values (1), or should be constrained to those values considered legal for the specified color configuration (that is, the studio swing representation). The default is 0 (use the studio swing representation).

All fields from M (monochrome flag) onward are optional; you may stop including fields at any point (but can't arbitrarily leave out fields). The default values are included in the table above. Some example AV1 codec strings:

av01.2.15M.10.0.100.09.16.09.0

AV1 Professional Profile, level 5.3, Main tier, 10 bits per color component, 4:2:2 chroma subsampling using ITU-R BT.2100 color primaries, transfer characteristics, and YCbCr color matrix. The studio swing representation is indicated.

av01.0.15M.10

AV1 Main Profile, level 5.3, Main tier, 10 bits per color component. The remaining properties are taken from the defaults: 4:2:0 chroma subsampling, BT.709 color primaries, transfer characteristics, and matrix coefficients. Studio swing representation.
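To make the field layout concrete, here is a rough sketch of a parser for AV1 codec strings. It applies the level formula and the default values from the table above. The names `parseAv1Codec` and `chromaFormat` are hypothetical, and the sketch does not check every constraint in the specification (for example, which bit depths a profile allows).

```javascript
// Defaults for the optional fields M, CCC, cp, tc, mc, F (from the table above).
const AV1_DEFAULTS = ["0", "110", "01", "01", "01", "0"];
const AV1_PROFILES = ["Main", "High", "Professional"];

// Maps the first two CCC digits plus the monochrome flag to a format name.
function chromaFormat(ccc, mono) {
  if (mono !== "0") return "4:0:0";
  const formats = { "00": "4:4:4", "10": "4:2:2", "11": "4:2:0" };
  return formats[ccc.slice(0, 2)] ?? "unknown";
}

// Hypothetical helper: parses av01.P.LLT.DD[.M.CCC.cp.tc.mc.F].
function parseAv1Codec(codec) {
  const parts = codec.split(".");
  if (parts[0] !== "av01" || parts.length < 4 || parts.length > 10) {
    throw new Error(`Not an AV1 codec string: ${codec}`);
  }
  const [, profile, levelTier, depth, ...optional] = parts;
  const match = /^(\d{2})([MH])$/.exec(levelTier);
  if (!match || !(Number(profile) in AV1_PROFILES)) {
    throw new Error(`Malformed AV1 codec string: ${codec}`);
  }
  const ll = Number(match[1]);
  // Trailing fields that were left out take their default values.
  const [mono, ccc, cp, tc, mc, range] = [
    ...optional,
    ...AV1_DEFAULTS.slice(optional.length),
  ];
  return {
    profile: AV1_PROFILES[Number(profile)],
    level: `${2 + (ll >> 2)}.${ll & 3}`, // X = 2 + (LL >> 2), Y = LL & 3
    tier: match[2] === "H" ? "High" : "Main",
    bitDepth: Number(depth),
    monochrome: mono !== "0",
    chromaSubsampling: chromaFormat(ccc, mono),
    colorPrimaries: cp,
    transferCharacteristics: tc,
    matrixCoefficients: mc,
    fullRange: range === "1",
  };
}

const full = parseAv1Codec("av01.2.15M.10.0.100.09.16.09.0");
console.log(full.level, full.chromaSubsampling); // 5.3 4:2:2
const short = parseAv1Codec("av01.0.15M.10");
console.log(short.level, short.chromaSubsampling); // 5.3 4:2:0
```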

ISO Base Media File Format: MP4, QuickTime, and 3GP

All media types based upon the ISO Base Media File Format (ISO BMFF) share the same syntax for the codecs parameter. These media types include MPEG-4 (and, in fact, the QuickTime file format upon which MPEG-4 is based) as well as 3GP. Both video and audio tracks can be described using the codecs parameter with the following MIME types:

MIME type Description
audio/3gpp 3GP audio (RFC 3839: MIME Type Registrations for 3rd generation Partnership Project (3GP) Multimedia files)
video/3gpp 3GP video (RFC 3839: MIME Type Registrations for 3rd generation Partnership Project (3GP) Multimedia files)
audio/3gpp2 3GP2 audio (RFC 4393: MIME Type Registrations for 3GPP2 Multimedia files)
video/3gpp2 3GP2 video (RFC 4393: MIME Type Registrations for 3GPP2 Multimedia files)
audio/mp4 MP4 audio (RFC 4337: MIME Type Registration for MPEG-4)
video/mp4 MP4 video (RFC 4337: MIME Type Registration for MPEG-4)
application/mp4 Non-audiovisual media encapsulated in MPEG-4

Each codec described by the codecs parameter can be specified either as the container's name (3gp, mp4, quicktime, etc.) or as the container name plus additional parameters to specify the codec and its configuration. Each entry in the codec list may contain some number of components, separated by periods (.).
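The structure described here (a comma-separated list of entries, each made of period-separated components) can be sketched as a small parser. `parseMediaType` is a hypothetical name, and the sketch assumes a simple media type string with no percent-encoded `codecs*` parameter and no semicolons inside quoted values.

```javascript
// Hypothetical helper: splits a media type string into its container type
// and the codec entries listed in the codecs parameter.
function parseMediaType(mediaType) {
  const [container, ...params] = mediaType.split(";").map((s) => s.trim());
  let codecs = [];
  for (const param of params) {
    const eq = param.indexOf("=");
    if (eq === -1) continue;
    const name = param.slice(0, eq).trim().toLowerCase();
    if (name !== "codecs") continue;
    // Strip optional surrounding double quotes, then split on commas.
    const value = param.slice(eq + 1).trim().replace(/^"|"$/g, "");
    codecs = value
      .split(",")
      .map((entry) => entry.trim())
      .filter((entry) => entry.length > 0)
      // Each entry is a period-separated list of components.
      .map((entry) => ({ entry, components: entry.split(".") }));
  }
  return { container, codecs };
}

const parsed = parseMediaType('video/mp4; codecs="avc1.4d002a, mp4a.40.2"');
console.log(parsed.container); // video/mp4
console.log(parsed.codecs.map((c) => c.components.join("|")).join(" "));
// avc1|4d002a mp4a|40|2
```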

The syntax for the value of codecs varies by codec; however, it always starts with the codec's four-character identifier, a period separator (.), followed by the Object Type Indication (OTI) value for the specific data format. For most codecs, the OTI is a two-digit hexadecimal number; however, it's six hexadecimal digits for AVC (H.264).

Thus, the syntaxes for each of the supported codecs look like this:

cccc[.pp]* (Generic ISO BMFF)

Where cccc is the four-character ID for the codec and pp is where zero or more two-character encoded property values go.

mp4a.oo[.A] (MPEG-4 audio)

Where oo is the Object Type Indication value describing the contents of the media more precisely and A is the one- or two-digit Audio Object Type. The possible values for the OTI can be found on the MP4 Registration Authority website's Object Types page. For example, Opus audio in an MP4 file is mp4a.ad. For further details, see MPEG-4 audio.

mp4v.oo[.V] (MPEG-4 video)

Here, oo is again the OTI describing the contents more precisely, while V is the one-digit video OTI.

avc1[.PPCCLL] (AVC video)

PPCCLL are six hexadecimal digits specifying the profile number (PP), constraint set flags (CC), and level (LL). See AVC profiles for the possible values of PP.

The constraint set flags byte is made up of one-bit Boolean flags, with the most significant bit being referred to as flag 0 (or constraint_set0_flag, in some resources), and each successive bit being numbered one higher. Flags 0 through 5 are currently defined; the remaining two bits are reserved and must be zero. The meanings of the flags vary depending on the profile being used.

The level is a fixed-point number, so a value of 14 (decimal 20) means level 2.0 while a value of 3D (decimal 61) means level 6.1. Generally speaking, the higher the level number, the more bandwidth the stream will use and the larger the maximum supported video dimensions.
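The following sketch decodes the three bytes of an `avc1.PPCCLL` string as described above. `parseAvcCodec` is a hypothetical name; the sketch only splits and converts the fields, and ignores any special-case levels the AVC specification may define.

```javascript
// Hypothetical helper: decodes avc1.PPCCLL into its three fields.
function parseAvcCodec(codec) {
  const match = /^avc1\.([0-9a-f]{2})([0-9a-f]{2})([0-9a-f]{2})$/i.exec(codec);
  if (!match) {
    throw new Error(`Not an avc1.PPCCLL codec string: ${codec}`);
  }
  const [profile, constraints, level] = match
    .slice(1)
    .map((hex) => parseInt(hex, 16));
  // Flag 0 is the most significant bit of the constraints byte.
  const constraintFlags = [];
  for (let flag = 0; flag < 8; flag++) {
    if (constraints & (0x80 >> flag)) constraintFlags.push(flag);
  }
  return {
    profile: profile.toString(16).toUpperCase().padStart(2, "0"),
    constraintFlags,
    // The level byte holds ten times the level: 0x2A is 42, or level 4.2.
    level: (level / 10).toFixed(1),
  };
}

const main = parseAvcCodec("avc1.4d002a");
console.log(main.profile, main.level); // 4D 4.2
const cbp = parseAvcCodec("avc1.42403D");
console.log(cbp.constraintFlags.join(","), cbp.level); // 1 6.1
```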

AVC profiles

The following are the AVC profiles and their profile numbers for use in the codecs parameter, as well as the value to specify for the constraints component, CC.

Profile Number (Hex) Constraints byte
Constrained Baseline Profile (CBP) CBP is primarily a solution for scenarios in which resources are constrained, or resource use needs to be controlled to minimize the odds of the media performing poorly. 42 40
Baseline Profile (BP) Similar to CBP but with more data loss protections and recovery capabilities. This is not as widely used as it was before CBP was introduced. All CBP streams are considered to also be BP streams. 42 00
Extended Profile (XP) Designed for streaming video over the network, with high compression capability and further improvements to data robustness and stream switching. 58 00
Main Profile (MP) The profile used for standard-definition digital television being broadcast in MPEG-4 format. Not used for high-definition television broadcasts. This profile's importance has faded since the introduction of the High Profile—which was added for HDTV use—in 2004. 4D 00
High Profile (HiP) Currently, HiP is the primary profile used for broadcast and disc-based HD video; it's used both for HD TV broadcasts and for Blu-Ray video. 64 00
Progressive High Profile (PHiP) Essentially High Profile without support for field coding. 64 08
Constrained High Profile PHiP, but without support for bi-predictive slices ("B-slices"). 64 0C
High 10 Profile (Hi10P) High Profile, but with support for up to 10 bits per color component. 6E 00
High 4:2:2 Profile (Hi422P) Expands upon Hi10P by adding support for 4:2:2 chroma subsampling along with up to 10 bits per color component. 7A 00
High 4:4:4 Predictive Profile (Hi444PP) In addition to the capabilities included in Hi422P, Hi444PP adds support for 4:4:4 chroma subsampling (in which no color information is discarded). Also includes support for up to 14 bits per color sample and efficient lossless region coding, as well as the option to encode each frame as three separate color planes (that is, each color's data is stored as if it were a single monochrome frame). F4 00
High 10 Intra Profile High 10 constrained to all-intra-frame use. Primarily used for professional apps. 6E 10
High 4:2:2 Intra Profile The Hi422 Profile with all-intra-frame use. 7A 10
High 4:4:4 Intra Profile The High 4:4:4 Profile constrained to use only intra frames. F4 10
CAVLC 4:4:4 Intra Profile The High 4:4:4 Profile constrained to all-intra use, and to using only CAVLC entropy coding. 44 00
Scalable Baseline Profile Intended for use with video conferencing as well as surveillance and mobile uses, the SVC Baseline Profile is based on AVC's Constrained Baseline profile. The base layer within the stream is provided at a high quality level, with some number of secondary substreams that offer alternative forms of the same video for use in various constrained environments. These may include any combination of reduced resolution, reduced frame rate, or increased compression levels. 53 00
Scalable Constrained Baseline Profile Primarily used for real-time communication applications. Not yet supported by WebRTC, but an extension to the WebRTC API to allow SVC is in development. 53 04
Scalable High Profile Meant mostly for use in broadcast and streaming applications. The base (or highest quality) layer must conform to the AVC High Profile. 56 00
Scalable Constrained High Profile A subset of the Scalable High Profile designed mainly for real-time communication. 56 04
Scalable High Intra Profile Primarily useful only for production applications, this profile supports only all-intra usage. 56 20
Stereo High Profile The Stereo High Profile provides stereoscopic video using two renderings of the scene (left eye and right eye). Otherwise, provides the same features as the High profile. 80 00
Multiview High Profile Supports two or more views using both temporal and MVC inter-view prediction. Does not support field pictures or macroblock-adaptive frame-field coding. 76 00
Multiview Depth High Profile Based on the High Profile, to which the main substream must adhere. The remaining substreams must match the Stereo High Profile. 8A 00
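Going the other way, a codec string can be assembled from a profile's number, its constraints byte, and a level. The sketch below copies a handful of rows from the table above into a lookup object; `AVC_PROFILES` and `avcCodecString` are hypothetical names, and the output uses lowercase hexadecimal digits.

```javascript
// A few rows from the table above: [profile number, constraints byte].
const AVC_PROFILES = {
  "Constrained Baseline": [0x42, 0x40],
  Baseline: [0x42, 0x00],
  Main: [0x4d, 0x00],
  High: [0x64, 0x00],
  "High 10": [0x6e, 0x00],
};

// Formats a number as two lowercase hexadecimal digits.
const hexByte = (n) => n.toString(16).padStart(2, "0");

// Hypothetical helper: builds avc1.PPCCLL from a profile name and a level.
function avcCodecString(profileName, level) {
  const entry = AVC_PROFILES[profileName];
  if (!entry) {
    throw new Error(`Unknown profile: ${profileName}`);
  }
  const [profile, constraints] = entry;
  // Level 4.1 is stored as decimal 41, which is 29 in hexadecimal.
  const levelByte = Math.round(level * 10);
  return `avc1.${hexByte(profile)}${hexByte(constraints)}${hexByte(levelByte)}`;
}

console.log(avcCodecString("Main", 4.2)); // avc1.4d002a
console.log(avcCodecString("High", 4.1)); // avc1.640029
```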

MPEG-4 audio

When the value of an entry in the codecs list begins with mp4a, the syntax of the value should be:

mp4a.oo[.A]

Here, oo is the two-digit hexadecimal Object Type Indication which specifies the codec class being used for the media. The OTIs are assigned by the MP4 Registration Authority, which maintains a list of the possible OTI values. A special value is 40; this indicates that the media is MPEG-4 audio (ISO/IEC 14496 Part 3). In order to be more specific still, a third component—the Audio Object Type—is added for OTI 40 to narrow the type down to a specific subtype of MPEG-4.

The Audio Object Type is specified as a one or two digit decimal value (unlike most other values in the codecs parameter, which use hexadecimal). For example, MPEG-4's AAC-LC has an audio object type number of 2, so the full codecs value representing AAC-LC is mp4a.40.2.

Similarly, ER AAC LC, whose Audio Object Type is 17, can be represented using the full codecs value mp4a.40.17. Single digit values can be given either as one digit (which is the best choice, since it will be the most broadly compatible) or with a leading zero padding it to two digits, such as mp4a.40.02.

Note: The specification originally mandated that the Audio Object Type number in the third component be only one decimal digit. However, amendments to the specification over time extended the range of these values well beyond one decimal digit, so now the third parameter may be either one or two digits. Padding values below 10 with a leading 0 is optional. Older implementations of MPEG-4 codecs may not support two-digit values, however, so using a single digit when possible will maximize compatibility.
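The following sketch shows one way to build and read `mp4a` codec strings under the rules above: a hexadecimal OTI, then an optional decimal Audio Object Type that may or may not be zero-padded. Both function names are hypothetical.

```javascript
// Hypothetical helper: builds the codec string for an MPEG-4 audio type.
function mp4aCodecString(audioObjectType) {
  if (
    !Number.isInteger(audioObjectType) ||
    audioObjectType < 1 ||
    audioObjectType > 99
  ) {
    throw new RangeError(`Invalid Audio Object Type: ${audioObjectType}`);
  }
  // 40 is the OTI for MPEG-4 audio. The Audio Object Type is decimal and is
  // written without a leading zero for the widest compatibility.
  return `mp4a.40.${audioObjectType}`;
}

// Hypothetical helper: parses mp4a.oo[.A].
function parseMp4aCodec(codec) {
  const match = /^mp4a\.([0-9a-f]{2})(?:\.(\d{1,2}))?$/i.exec(codec);
  if (!match) {
    throw new Error(`Not an mp4a codec string: ${codec}`);
  }
  return {
    oti: match[1].toLowerCase(),
    // parseInt with radix 10 reads both "2" and "02" as 2.
    audioObjectType: match[2] === undefined ? null : parseInt(match[2], 10),
  };
}

console.log(mp4aCodecString(2)); // mp4a.40.2
console.log(parseMp4aCodec("mp4a.40.02").audioObjectType); // 2
console.log(parseMp4aCodec("mp4a.40.17").audioObjectType); // 17
```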

The Audio Object Types are defined in ISO/IEC 14496-3 subpart 1, section 1.5.1. The table below provides a basic list of the Audio Object Types and, for the more common ones, the profiles that support them, but you should refer to the specification for details if you need to know more about the inner workings of any given MPEG-4 audio type.

MPEG-4 audio object types
ID Audio Object Type Profile support
0 NULL
1 AAC Main Main
2 AAC LC (Low Complexity) Main, Scalable, HQ, LD v2, AAC, HE-AAC, HE-AAC v2
3 AAC SSR (Scalable Sampling Rate) Main
4 AAC LTP (Long Term Prediction) Main, Scalable, HQ
5 SBR (Spectral Band Replication) HE-AAC, HE-AAC v2
6 AAC Scalable Main, Scalable, HQ
7 TwinVQ (Coding for ultra-low bit rates) Main, Scalable
8 CELP (Code-Excited Linear Prediction) Main, Scalable, Speech, HQ, LD
9 HVXC (Harmonic Vector Excitation Coding) Main, Scalable, Speech, LD
10-11 Reserved
12 TTSI (Text to Speech Interface) Main, Scalable, Speech, Synthetic, LD
13 Main Synthetic Main, Synthetic
14 Wavetable Synthesis
15 General MIDI
16 Algorithmic Synthesis and Audio Effects
17 ER AAC LC (Error Resilient AAC Low-Complexity) HQ, Mobile Internetworking
18 Reserved
19 ER AAC LTP (Error Resilient AAC Long Term Prediction) HQ
20 ER AAC Scalable (Error Resilient AAC Scalable) Mobile Internetworking
21 ER TwinVQ (Error Resilient TwinVQ) Mobile Internetworking
22 ER BSAC (Error Resilient Bit-Sliced Arithmetic Coding) Mobile Internetworking
23 ER AAC LD (Error Resilient AAC Low-Delay; used for two-way communication) LD, Mobile Internetworking
24 ER CELP (Error Resilient Code-Excited Linear Prediction) HQ, LD
25 ER HVXC (Error Resilient Harmonic Vector Excitation Coding) LD
26 ER HILN (Error Resilient Harmonic and Individual Line plus Noise)
27 ER Parametric (Error Resilient Parametric)
28 SSC (Sinusoidal Coding)
29 PS (Parametric Stereo) HE-AAC v2
30 MPEG Surround
31 Escape
32 MPEG-1 Layer-1
33 MPEG-1 Layer-2 (MP2)
34 MPEG-1 Layer-3 (MP3)
35 DST (Direct Stream Transfer)
36 ALS (Audio Lossless)
37 SLS (Scalable Lossless)
38 SLS Non-core (Scalable Lossless Non-core)
39 ER AAC ELD (Error Resilient AAC Enhanced Low Delay)
40 SMR Simple (Symbolic Music Representation Simple)
41 SMR Main (Symbolic Music Representation Main)
42 Reserved
43 SAOC (Spatial Audio Object Coding); defined in ISO/IEC 14496-3:2009/Amd.2:2010(E)
44 LD MPEG Surround (Low Delay MPEG Surround); defined in ISO/IEC 14496-3:2009/Amd.2:2010(E)
45 and up Reserved

WebM

The basic form for a WebM codecs parameter is to list one or more of the four WebM codecs by name, separated by commas. The table below shows some examples:

MIME type Description
video/webm;codecs="vp8" A WebM video with VP8 video in it; no audio is specified.
video/webm;codecs="vp9" A WebM video with VP9 video in it.
audio/webm;codecs="vorbis" Vorbis audio in a WebM container.
audio/webm;codecs="opus" Opus audio in a WebM container.
video/webm;codecs="vp8,vorbis" A WebM container with VP8 video and Vorbis audio.
video/webm;codecs="vp9,opus" A WebM container with VP9 video and Opus audio.

The strings vp8.0 and vp9.0 also work, but are not recommended.

ISO Base Media File Format syntax

As part of an effort to standardize on a more powerful format for the codecs parameter, WebM is moving toward describing video content using a syntax based on that defined by the ISO Base Media File Format. This syntax is defined in VP Codec ISO Media File Format Binding, in the section Codecs Parameter String. The audio codec continues to be indicated as either vorbis or opus.

In this format, the codecs parameter's value begins with a four-character code identifying the codec being used in the container, which is then followed by a series of period (.) separated two-digit values.

cccc.PP.LL.DD
cccc.PP.LL.DD.CC.cp.tc.mc.FF

The first four components are required; everything from CC (chroma subsampling) onward is optional, but all or nothing. Each of these components is described in the following table. Following the table are some examples.

WebM codecs parameter components
Component Details
cccc

A four-character code indicating which of the possible codecs is being described. Potential values are:

Four-character codes for WebM-supported codecs
Four-character code Codec
vp08 VP8
vp09 VP9
vp10 VP10
PP

The two-digit profile number, padded with leading zeroes if necessary to be exactly two digits.

WebM profile numbers
Profile Description
00 Only 4:2:0 (chroma subsampled both horizontally and vertically). Allows only 8 bits per color component.
01 All chroma subsampling formats are allowed. Allows only 8 bits per color component.
02 Only 4:2:0 (chroma subsampled both horizontally and vertically). Supports 8, 10, or 12 bits per color sample component.
03 All chroma subsampling formats are allowed. Supports 8, 10, or 12 bits per color sample component.
LL The two-digit level number. The level number is a fixed-point notation, where the first digit is the ones digit, and the second digit represents tenths. For example, level 3 is 30 and level 6.1 is 61.
DD The bit depth of the luma and color component values; permitted values are 8, 10, and 12.
CC

A two-digit value indicating which chroma subsampling format to use. The following table lists permitted values; see [Chroma subsampling](/en-US/docs/Web/Media/Formats/Video_concepts#chroma_subsampling) in our "Digital video concepts" guide for additional information about this topic and others.

WebM chroma subsampling identifiers
Value Chroma subsampling format
00 4:2:0 with the chroma samples sited interstitially between the pixels
01 4:2:0 chroma subsampling with the samples collocated with luma (0, 0)
02 4:2:2 chroma subsampling (chroma is sampled at half the horizontal resolution of luma; every pixel's luminance is retained)
03 4:4:4 chroma subsampling (every pixel's luminance and chrominance are both retained)
04 Reserved
cp

A two-digit integer specifying which of the color primaries from Section 8.1 of the ISO/IEC 23001-8:2016 standard is used. Like CC, this component and every component after it are optional.

The possible values of the color primaries component are:

ISO/IEC Color primary identifiers
Value Details
00 Reserved for future use by ITU or ISO/IEC
01 BT.709, sRGB, sYCC. BT.709 is the standard for high definition (HD) television; sRGB is the most common color space used for computer displays. Broadcast BT.709 uses 8-bit color depth with the legal range being from 16 (black) to 235 (white).
02 Image characteristics are unknown, or are to be determined by the application
03 Reserved for future use by ITU or ISO/IEC
04 BT.470 System M, NTSC (standard definition television in the United States)
05 BT.470 System B, G; BT.601; BT.1358 625; BT.1700 625 PAL and 625 SECAM
06 BT.601 525; BT.1358 525 or 625; BT.1700 NTSC; SMPTE 170M. Functionally identical to 7.
07 SMPTE 240M (historical). Functionally identical to 6.
08 Generic film
09 BT.2020; BT.2100. Used for ultra-high definition (4K) High Dynamic Range (HDR) video, these have a very wide color gamut and support 10-bit and 12-bit color component depths.
10 SMPTE ST 428 (D-Cinema Distribution Master: Image characteristics). Defines the uncompressed image characteristics for DCDM.
11 SMPTE RP 431 (D-Cinema Quality: Reference projector and environment). Describes the reference projector and environment conditions that provide a consistent film presentation experience.
12 SMPTE EG 432 (Digital Source Processing: Color Processing for D-Cinema). Engineering guideline making color signal decoding recommendations for digital movies.
13-21 Reserved for future use by ITU-T or ISO/IEC
22 EBU Tech 3213-E
23-255 Reserved for future use by ITU-T or ISO/IEC
tc A two-digit integer indicating the transferCharacteristics for the video. This value is from Section 8.2 of ISO/IEC 23001-8:2016, and indicates the transfer characteristics to be used when adapting the decoded color to the render target.
mc The two-digit value for the matrixCoefficients property. This value comes from the table in Section 8.3 of the ISO/IEC 23001-8:2016 specification. This value indicates which set of coefficients to use when mapping from the native red, blue, and green primaries to the luma and chroma signals. These coefficients are in turn used with the equations found in that same section.
FF Indicates whether to restrict the black level and color range of each color component to the legal range. For 8-bit color samples, the legal range is 16-235. A value of 00 indicates that these limitations should be enforced, while a value of 01 allows the full range of possible values for each component, even if the resulting color is out of bounds for the color system.

WebM media type examples

video/webm;codecs="vp08.00.41.08,vorbis"

VP8 video, profile 0 level 4.1, using 8-bit YUV with 4:2:0 chroma subsampling, using BT.709 color primaries, transfer function, and matrix coefficients, with the luminance and chroma values encoded within the legal ("studio") range. The audio is in Vorbis format.

video/webm;codecs="vp09.02.10.10.01.09.16.09.01,opus"

VP9 video, profile 2 level 1.0, with 10-bit YUV content using 4:2:0 chroma subsampling, BT.2020 primaries, ST 2084 EOTF (HDR SMPTE), BT.2020 non-constant luminance color matrix, and full-range chroma and luma encoding. The audio is in Opus format.
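As an illustration of the component layout, the sketch below splits a `vp08`/`vp09`/`vp10` codec string into its fields and enforces the "four components or all of them" rule. `parseVpCodec` is a hypothetical name. The optional fields are returned as their raw two-digit codes, and are simply absent when the short form is used.

```javascript
const VP_CODECS = new Map([
  ["vp08", "VP8"],
  ["vp09", "VP9"],
  ["vp10", "VP10"],
]);

// Hypothetical helper: parses cccc.PP.LL.DD[.CC.cp.tc.mc.FF].
function parseVpCodec(codec) {
  const parts = codec.split(".");
  // Either the four required components, or all nine: the optional
  // components are all-or-nothing.
  if (!VP_CODECS.has(parts[0]) || (parts.length !== 4 && parts.length !== 9)) {
    throw new Error(`Not a valid VP codec string: ${codec}`);
  }
  if (!parts.slice(1).every((part) => /^\d{2}$/.test(part))) {
    throw new Error(`Components must be two digits each: ${codec}`);
  }
  const [cccc, pp, ll, dd, cc, cp, tc, mc, ff] = parts;
  const result = {
    codec: VP_CODECS.get(cccc),
    profile: Number(pp),
    // Fixed-point level: the first digit is the ones, the second the tenths.
    level: `${ll[0]}.${ll[1]}`,
    bitDepth: Number(dd),
  };
  if (parts.length === 9) {
    Object.assign(result, {
      chromaSubsampling: cc,
      colorPrimaries: cp,
      transferCharacteristics: tc,
      matrixCoefficients: mc,
      fullRange: ff === "01",
    });
  }
  return result;
}

const hdr = parseVpCodec("vp09.02.10.10.01.09.16.09.01");
console.log(hdr.codec, hdr.level, hdr.bitDepth, hdr.fullRange); // VP9 1.0 10 true
const basic = parseVpCodec("vp08.00.41.08");
console.log(basic.codec, basic.level, basic.bitDepth); // VP8 4.1 8
```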

Using the codecs parameter

You can use the codecs parameter in a few situations. Firstly, you can use it with the <source> element when creating an <audio> or <video> element, in order to establish a group of options for the browser to choose from when selecting the format of the media to present to the user in the element.

You can also use the codecs parameter when specifying a MIME media type to the MediaSource.isTypeSupported() method; this method returns a Boolean which indicates whether or not the media is likely to work on the current device.
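A rough sketch of how this check might be used to choose among several candidate formats follows. `firstSupportedType` is a hypothetical helper; it calls `MediaSource.isTypeSupported()` when that API exists, and accepts an optional predicate so the logic can also be exercised outside a browser. The candidate list is illustrative only.

```javascript
// Hypothetical helper: returns the first media type the checker accepts,
// or null if none is accepted.
function firstSupportedType(candidates, isSupported) {
  // Use the supplied predicate if there is one; otherwise fall back to
  // MediaSource.isTypeSupported() where it is available (browsers).
  const check =
    isSupported ??
    (typeof MediaSource !== "undefined"
      ? (type) => MediaSource.isTypeSupported(type)
      : () => false);
  return candidates.find((type) => check(type)) ?? null;
}

// Candidates listed in order of preference.
const candidateTypes = [
  'video/webm; codecs="vp9, opus"',
  'video/webm; codecs="vp8, vorbis"',
  'video/mp4; codecs="avc1.4d002a"',
];

// In a browser, this logs the first type the browser reports as supported.
console.log(firstSupportedType(candidateTypes));
```

Ordering the candidates from most to least preferred mirrors how a browser walks a list of `<source>` elements and picks the first one it can play.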

See also