The "codecs" parameter in common media types
At a fundamental level, you can specify the type of a media file using a simple MIME type, such as video/mp4 or audio/mpeg. However, many media types, especially those that support video tracks, benefit from a more precise description of the format of the data they contain. For instance, describing a video in an MPEG-4 file with the MIME type video/mp4 says nothing about what format the media inside actually takes.
For that reason, the codecs parameter can be added to the MIME type describing media content, providing more detailed information about the contents. This information may include, for example, the profile of the video codec, the type used for the audio tracks, and so forth.
This guide briefly examines the syntax of the media type codecs parameter and how it is used with the MIME type string to provide details about the contents of audio or video media beyond simply indicating the container type.
General syntax
A basic media type is expressed by stating the kind of media (audio, video, and so on), followed by a slash character (/), followed by the container format used to hold the media:
audio/mpeg
-
An audio file using the MPEG file type, such as an MP3.
video/ogg
-
A video file using the Ogg file type.
video/mp4
-
A video file using the MPEG-4 file type.
video/quicktime
-
A video file in Apple's QuickTime format. This format is no longer commonly used on the web, since it requires a plugin.
However, each of these MIME types is vague. All of these file types support a variety of codecs, and those codecs may have any number of profiles, levels, and other configuration factors. For that reason, you may want to include the codecs parameter along with the media type.
To do so, append a semicolon (;) followed by codecs= and then the string that describes the format of the file's contents. Some media types only let you specify the names of the codecs in use, while others also let you specify various constraints on those codecs. You can specify multiple codecs by separating them with commas.
audio/ogg; codecs=vorbis
-
An Ogg file containing Vorbis audio.
video/webm; codecs="vp8, vorbis"
-
A WebM file containing VP8 video and Vorbis audio.
video/mp4; codecs="avc1.4d002a"
-
An MPEG-4 file containing AVC (H.264) video, Main Profile, Level 4.2.
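The syntax described above can be sketched as a small string-building helper. This is only an illustration: the function name buildMediaType and its parameters are hypothetical and do not come from any library.

```javascript
// Hypothetical helper: joins a container MIME type and a list of codec
// strings into a single media type string with a codecs parameter.
function buildMediaType(containerType, codecs = []) {
  if (codecs.length === 0) {
    // No codec details: return the bare MIME type.
    return containerType;
  }
  // Quote the value so a comma-separated list stays one parameter value.
  return `${containerType}; codecs="${codecs.join(", ")}"`;
}

console.log(buildMediaType("video/webm", ["vp8", "vorbis"]));
console.log(buildMediaType("video/mp4", ["avc1.4d002a"]));
```

Quoting the value even for a single codec keeps the output uniform; the unquoted form shown for audio/ogg above is equally valid when there is only one codec.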
As with any MIME type parameter, codecs must be changed to codecs* (note the asterisk character, *) if any of the codec properties use the special characters that carry additional information (such as language tags or bytes encoded as hexadecimal values), as described in RFC 2231, section 4: MIME Parameter Value and Encoded Word Extensions. You can use the JavaScript encodeURI() function to encode the parameter list, and decodeURI() to decode a previously encoded parameter list.
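As a minimal sketch of the encoding step only, the round trip below percent-encodes a codec list with encodeURI() and restores it with decodeURI(). The variable names are illustrative, and the example does not build a complete RFC 2231 codecs* parameter.

```javascript
// A codec list containing a space, which encodeURI() percent-encodes.
const rawCodecList = "vp09.02.10.10, opus";

const encodedCodecList = encodeURI(rawCodecList);
const decodedCodecList = decodeURI(encodedCodecList);

console.log(encodedCodecList);
console.log(decodedCodecList === rawCodecList);
```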
Note: When the codecs parameter is used, the specified list of codecs must include every codec used for the contents of the file. The list may also contain codecs that are not present in the file.
Codec options by container
The containers covered below support extended codec options in their codecs parameters. Several of them share a single section, because those media types are all based on the ISO Base Media File Format (ISO BMFF) and therefore use the same syntax.
AV1
The syntax of the codecs parameter for AV1 is defined by the AV1 Codec ISO Media File Format Binding specification, section 5: Codecs Parameter String.
av01.P.LLT.DD[.M[.CCC[.cp[.tc[.mc[.F]]]]]]
The components of the codecs parameter string are described in more detail in the table below. Each component has a fixed number of characters; if a value is shorter than that length, it must be padded with leading zeros.
Component | Description |
---|---|
P | The one-digit profile number: 0 for the Main profile, 1 for the High profile, and 2 for the Professional profile. |
LL | The two-digit level number, which is converted to the X.Y format, where X = 2 + (LL >> 2) and Y = LL & 3. See Annex A, section 3 of the AV1 specification for details. |
T | The one-character tier indicator. For the Main tier (seq_tier equals 0), this character is the letter M. For the High tier (seq_tier is 1), this character is the letter H. The High tier is only available for level 4.0 and up. |
DD | The two-digit component bit depth. This value must be one of 8, 10, or 12; which values are valid varies depending on the profile and other properties. |
M | The one-digit monochrome flag; if this is 0, the video includes the U and V planes in addition to the Y plane. Otherwise, the video data is entirely in the Y plane and is therefore monochromatic. See YUV for details on how the YUV color system works. The default value is 0 (not monochrome). |
CCC | The three-digit chroma subsampling value. The first digit is subsampling_x and the second is subsampling_y. If both are 1, the third digit is the value of chroma_sample_position; otherwise, the third digit is always 0. The default value is 110 (4:2:0 chroma subsampling). |
cp | The two-digit color_primaries value indicates the color system used by the media. For example, BT.2020/BT.2100 color, as used for HDR video, is 09. The information for this, and for each of the remaining components, is found in the Color config semantics section of the AV1 specification. The default value is 01 (ITU-R BT.709). |
tc | The two-digit transfer_characteristics value. This value defines the function used to map the gamma (called the "opto-electrical transfer function" in technical parlance) from the source to the display. For example, 10-bit BT.2020 is 14. The default value is 01 (ITU-R BT.709). |
mc | The two-digit matrix_coefficients constant selects the matrix coefficients used to convert the red, blue, and green channels into luma and chroma signals. For example, the standard coefficients used for BT.709 are indicated using the value 01. The default value is 01 (ITU-R BT.709). |
F | A one-digit flag indicating whether the color should be allowed to use the full range of possible values (1), or should be constrained to those values considered legal for the specified color configuration (that is, the studio swing representation). The default is 0 (use the studio swing representation). |
All fields from M
(monochrome flag) onward are optional; you may stop including fields at any point (but can't arbitrarily leave out fields). The default values are included in the table above. Some example AV1 codec strings:
av01.2.15M.10.0.100.09.16.09.0
-
AV1 Professional Profile, level 5.3, Main tier, 10 bits per color component, 4:2:2 chroma subsampling using ITU-R BT.2100 color primaries, transfer characteristics, and YCbCr color matrix. The studio swing representation is indicated.
av01.0.15M.10
-
AV1 Main Profile, level 5.3, Main tier, 10 bits per color component. The remaining properties are taken from the defaults: 4:2:0 chroma subsampling, BT.709 color primaries, transfer characteristics, and matrix coefficients. Studio swing representation.
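To make the field layout and the level arithmetic concrete, here is a minimal parsing sketch. The function name parseAv1Codec and the shape of its result are hypothetical. The defaults mirror those described in this section, and the 110 default for CCC is assumed to be the encoding of the 4:2:0 default named in the example above.

```javascript
// Names and defaults for the optional components, in order (M through F).
const AV1_OPTIONAL_NAMES = [
  "monochrome", "chromaSubsampling", "colorPrimaries",
  "transferCharacteristics", "matrixCoefficients", "fullRange",
];
// Assumption: "110" stands for the default 4:2:0 chroma subsampling.
const AV1_DEFAULTS = ["0", "110", "01", "01", "01", "0"];

// Hypothetical parser for strings such as "av01.0.15M.10".
function parseAv1Codec(codec) {
  const parts = codec.split(".");
  if (parts[0] !== "av01" || parts.length < 4 || parts.length > 10) {
    return null;
  }
  const [, profile, levelTier, bitDepth, ...optional] = parts;
  const levelMatch = /^(\d{2})([MH])$/.exec(levelTier);
  if (!/^\d$/.test(profile) || !levelMatch || !/^\d{2}$/.test(bitDepth)) {
    return null;
  }
  const ll = Number(levelMatch[1]);
  const result = {
    profile: Number(profile),
    // Level X.Y, where X = 2 + (LL >> 2) and Y = LL & 3.
    level: `${2 + (ll >> 2)}.${ll & 3}`,
    tier: levelMatch[2] === "M" ? "Main" : "High",
    bitDepth: Number(bitDepth),
  };
  // Fill omitted trailing components with their defaults.
  AV1_OPTIONAL_NAMES.forEach((name, index) => {
    result[name] = optional[index] ?? AV1_DEFAULTS[index];
  });
  return result;
}

console.log(parseAv1Codec("av01.0.15M.10"));
console.log(parseAv1Codec("av01.2.15M.10.0.100.09.16.09.0"));
```

The optional components are kept as strings so that their leading zeros survive; the sketch checks only the layout, not whether a given combination of values is valid for the profile.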
ISO Base Media File Format: MP4, QuickTime, and 3GP
All media types based upon the ISO Base Media File Format (ISO BMFF) share the same syntax for the codecs
parameter. These media types include MPEG-4 (and, in fact, the QuickTime file format upon which MPEG-4 is based) as well as 3GP. Both video and audio tracks can be described using the codecs
parameter with the following MIME types:
MIME type | Description |
---|---|
audio/3gpp | 3GP audio (RFC 3839: MIME Type Registrations for 3rd generation Partnership Project (3GP) Multimedia files) |
video/3gpp | 3GP video (RFC 3839: MIME Type Registrations for 3rd generation Partnership Project (3GP) Multimedia files) |
audio/3gp2 | 3GP2 audio (RFC 4393: MIME Type Registrations for 3GPP2 Multimedia files) |
video/3gp2 | 3GP2 video (RFC 4393: MIME Type Registrations for 3GPP2 Multimedia files) |
audio/mp4 | MP4 audio (RFC 4337: MIME Type Registration for MPEG-4) |
video/mp4 | MP4 video (RFC 4337: MIME Type Registration for MPEG-4) |
application/mp4 | Non-audiovisual media encapsulated in MPEG-4 |
Each codec described by the codecs
parameter can be specified either as simply the container's name (3gp
, mp4
, quicktime
, etc.) or as the container name plus additional parameters to specify the codec and its configuration. Each entry in the codec list may contain some number of components, separated by periods (.
).
The syntax for the value of codecs
varies by codec; however, it always starts with the codec's four-character identifier, a period separator (.
), followed by the Object Type Indication (OTI) value for the specific data format. For most codecs, the OTI is a two-digit hexadecimal number; however, it's six hexadecimal digits for AVC (H.264).
Thus, the syntaxes for each of the supported codecs look like this:
cccc[.pp]* (Generic ISO BMFF)
-
Where cccc is the four-character ID for the codec and pp is where zero or more two-character encoded property values go.
mp4a.oo[.A] (MPEG-4 audio)
-
Where oo is the Object Type Indication value describing the contents of the media more precisely and A is the one- or two-digit Audio Object Type. The possible values for the OTI can be found on the MP4 Registration Authority web site's Object Types page. For example, Opus audio in an MP4 file is mp4a.ad. For further details, see MPEG-4 audio.
mp4v.oo[.V] (MPEG-4 video)
-
Here, oo is again the OTI describing the contents more precisely, while V is the one-digit video OTI.
avc1[.PPCCLL] (AVC video)
-
PPCCLL is six hexadecimal digits specifying the profile number (PP), constraint set flags (CC), and level (LL). See AVC profiles for the possible values of PP.
The constraint set flags byte is made up of one-bit Boolean flags, with the most significant bit being referred to as flag 0 (or constraint_set0_flag, in some resources), and each successive bit being numbered one higher. The two least significant bits are reserved and must be zero. The meanings of the flags vary depending on the profile being used.
The level is a fixed-point number, so a value of 14 (decimal 20) means level 2.0 while a value of 3D (decimal 61) means level 6.1. Generally speaking, the higher the level number, the more bandwidth the stream will use and the larger the maximum supported video dimensions.
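The PPCCLL layout can be illustrated with a short decoding sketch that handles strings of the form used in the avc1.4d002a example earlier. The function name parseAvcCodec and the shape of the returned object are hypothetical.

```javascript
// Hypothetical decoder for AVC codec strings such as "avc1.4d002a".
function parseAvcCodec(codec) {
  const match = /^avc1\.([0-9a-f]{2})([0-9a-f]{2})([0-9a-f]{2})$/i.exec(codec);
  if (!match) {
    return null;
  }
  // Each pair of hexadecimal digits is one byte: profile, constraints, level.
  const [profile, constraints, level] = match
    .slice(1)
    .map((hex) => parseInt(hex, 16));
  return {
    profile,
    constraints,
    // Flag 0 is the most significant bit of the constraints byte.
    constraintFlags: Array.from({ length: 8 }, (_, index) =>
      Boolean(constraints & (0x80 >> index)),
    ),
    // The level is fixed-point: decimal 42 means level 4.2.
    level: `${Math.floor(level / 10)}.${level % 10}`,
  };
}

console.log(parseAvcCodec("avc1.4d002a"));
console.log(parseAvcCodec("avc1.42403D"));
```

The profile and constraints fields are left as numbers so they can be compared with the hexadecimal values listed under AVC profiles, for example 0x4d for Main Profile.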
AVC profiles
The following are the AVC profiles and their profile numbers for use in the codecs
parameter, as well as the value to specify for the constraints component, CC
.
Profile | Number (Hex) | Constraints byte |
---|---|---|
Constrained Baseline Profile (CBP): CBP is primarily a solution for scenarios in which resources are constrained, or resource use needs to be controlled to minimize the odds of the media performing poorly. | 42 | 40 |
Baseline Profile (BP): Similar to CBP but with more data loss protections and recovery capabilities. This is not as widely used as it was before CBP was introduced. All CBP streams are considered to also be BP streams. | 42 | 00 |
Extended Profile (XP): Designed for streaming video over the network, with high compression capability and further improvements to data robustness and stream switching. | 58 | 00 |
Main Profile (MP): The profile used for standard-definition digital television being broadcast in MPEG-4 format. Not used for high-definition television broadcasts. This profile's importance has faded since the High Profile, which was added for HDTV use, was introduced in 2004. | 4D | 00 |
High Profile (HiP): Currently, HiP is the primary profile used for broadcast and disc-based HD video; it's used both for HD TV broadcasts and for Blu-Ray video. | 64 | 00 |
Progressive High Profile (PHiP): Essentially High Profile without support for field coding. | 64 | 08 |
Constrained High Profile: PHiP, but without support for bi-predictive slices ("B-slices"). | 64 | 0C |
High 10 Profile (Hi10P): High Profile, but with support for up to 10 bits per color component. | 6E | 00 |
High 4:2:2 Profile (Hi422P): Expands upon Hi10P by adding support for 4:2:2 chroma subsampling along with up to 10 bits per color component. | 7A | 00 |
High 4:4:4 Predictive Profile (Hi444PP): In addition to the capabilities included in Hi422P, Hi444PP adds support for 4:4:4 chroma subsampling (in which no color information is discarded). Also includes support for up to 14 bits per color sample and efficient lossless region coding, as well as the option to encode each frame as three separate color planes (that is, each color's data is stored as if it were a single monochrome frame). | F4 | 00 |
High 10 Intra Profile: High 10 constrained to all-intra-frame use. Primarily used for professional apps. | 6E | 10 |
High 4:2:2 Intra Profile: The Hi422 Profile with all-intra-frame use. | 7A | 10 |
High 4:4:4 Intra Profile: The High 4:4:4 Profile constrained to use only intra frames. | F4 | 10 |
CAVLC 4:4:4 Intra Profile: The High 4:4:4 Profile constrained to all-intra use, and to using only CAVLC entropy coding. | 2C | 00 |
Scalable Baseline Profile: Intended for use with video conferencing as well as surveillance and mobile uses, the SVC Baseline Profile is based on AVC's Constrained Baseline profile. The base layer within the stream is provided at a high quality level, with some number of secondary substreams that offer alternative forms of the same video for use in various constrained environments. These may include any combination of reduced resolution, reduced frame rate, or increased compression levels. | 53 | 00 |
Scalable Constrained Baseline Profile: Primarily used for real-time communication applications. Not yet supported by WebRTC, but an extension to the WebRTC API to allow SVC is in development. | 53 | 04 |
Scalable High Profile: Meant mostly for use in broadcast and streaming applications. The base (or highest quality) layer must conform to the AVC High Profile. | 56 | 00 |
Scalable Constrained High Profile: A subset of the Scalable High Profile designed mainly for real-time communications. | 56 | 04 |
Scalable High Intra Profile: Primarily useful only for production applications, this profile supports only all-intra usage. | 56 | 20 |
Stereo High Profile: The Stereo High Profile provides stereoscopic video using two renderings of the scene (left eye and right eye). Otherwise, provides the same features as the High profile. | 80 | 00 |
Multiview High Profile: Supports two or more views using both temporal and MVC inter-view prediction. Does not support field pictures or macroblock-adaptive frame-field coding. | 76 | 00 |
Multiview Depth High Profile: Based on the High Profile, to which the main substream must adhere. The remaining substreams must match the Stereo High Profile. | 8A | 00 |
MPEG-4 audio
When the value of an entry in the codecs
list begins with mp4a
, the syntax of the value should be:
mp4a.oo[.A]
Here, oo
is the two-digit hexadecimal Object Type Indication which specifies the codec class being used for the media. The OTIs are assigned by the MP4 Registration Authority, which maintains a list of the possible OTI values. A special value is 40
; this indicates that the media is MPEG-4 audio (ISO/IEC 14496 Part 3). In order to be more specific still, a third component—the Audio Object Type—is added for OTI 40
to narrow the type down to a specific subtype of MPEG-4.
The Audio Object Type is specified as a one or two digit decimal value (unlike most other values in the codecs
parameter, which use hexadecimal). For example, MPEG-4's AAC-LC has an audio object type number of 2
, so the full codecs
value representing AAC-LC is mp4a.40.2
.
Thus, ER AAC LC, whose Audio Object Type is 17, can be represented using the full codecs
value mp4a.40.17
. Single digit values can be given either as one digit (which is the best choice, since it will be the most broadly compatible) or with a leading zero padding it to two digits, such as mp4a.40.02
.
Note: The specification originally mandated that the Audio Object Type number in the third component be only one decimal digit. However, amendments to the specification over time extended the range of these values well beyond one decimal digit, so now the third parameter may be either one or two digits. Padding values below 10 with a leading 0
is optional. Older implementations of MPEG-4 codecs may not support two-digit values, however, so using a single digit when possible will maximize compatibility.
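The mix of a hexadecimal OTI and a decimal Audio Object Type is easy to get wrong, so here is a minimal parsing sketch. The function name parseMp4aCodec and its return shape are hypothetical.

```javascript
// Hypothetical parser for MPEG-4 audio codec strings such as "mp4a.40.2".
function parseMp4aCodec(codec) {
  const match = /^mp4a\.([0-9a-f]{2})(?:\.(\d{1,2}))?$/i.exec(codec);
  if (!match) {
    return null;
  }
  return {
    // The Object Type Indication is two hexadecimal digits.
    objectTypeIndication: parseInt(match[1], 16),
    // The Audio Object Type is one or two decimal digits, and is optional.
    audioObjectType: match[2] === undefined ? null : parseInt(match[2], 10),
  };
}

console.log(parseMp4aCodec("mp4a.40.2"));
console.log(parseMp4aCodec("mp4a.40.02"));
console.log(parseMp4aCodec("mp4a.40.17"));
```

Because the third component is parsed as decimal, mp4a.40.2 and mp4a.40.02 yield the same Audio Object Type.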
The Audio Object Types are defined in ISO/IEC 14496-3 subpart 1, section 1.5.1. The table below provides a basic list of the Audio Object Types and, for the more common object types, lists the profiles that support them. Refer to the specification for details if you need to know more about the inner workings of any given MPEG-4 audio type.
ID | Audio Object Type | Profile support |
---|---|---|
0 | NULL | |
1 | AAC Main | Main |
2 | AAC LC (Low Complexity) | Main, Scalable, HQ, LD v2, AAC, HE-AAC, HE-AAC v2 |
3 | AAC SSR (Scalable Sampling Rate) | Main |
4 | AAC LTP (Long Term Prediction) | Main, Scalable, HQ |
5 | SBR (Spectral Band Replication) | HE-AAC, HE-AAC v2 |
6 | AAC Scalable | Main, Scalable, HQ |
7 | TwinVQ (Coding for ultra-low bit rates) | Main, Scalable |
8 | CELP (Code-Excited Linear Prediction) | Main, Scalable, Speech, HQ, LD |
9 | HVXC (Harmonic Vector Excitation Coding) | Main, Scalable, Speech, LD |
10 – 11 | Reserved | |
12 | TTSI (Text to Speech Interface) | Main, Scalable, Speech, Synthetic, LD |
13 | Main Synthetic | Main, Synthetic |
14 | Wavetable Synthesis | |
15 | General MIDI | |
16 | Algorithmic Synthesis and Audio Effects | |
17 | ER AAC LC (Error Resilient AAC Low-Complexity) | HQ, Mobile Internetworking |
18 | Reserved | |
19 | ER AAC LTP (Error Resilient AAC Long Term Prediction) | HQ |
20 | ER AAC Scalable (Error Resilient AAC Scalable) | Mobile Internetworking |
21 | ER TwinVQ (Error Resilient TwinVQ) | Mobile Internetworking |
22 | ER BSAC (Error Resilient Bit-Sliced Arithmetic Coding) | Mobile Internetworking |
23 | ER AAC LD (Error Resilient AAC Low-Delay; used for two-way communication) | LD, Mobile Internetworking |
24 | ER CELP (Error Resilient Code-Excited Linear Prediction) | HQ, LD |
25 | ER HVXC (Error Resilient Harmonic Vector Excitation Coding) | LD |
26 | ER HILN (Error Resilient Harmonic and Individual Line plus Noise) | |
27 | ER Parametric (Error Resilient Parametric) | |
28 | SSC (Sinusoidal Coding) | |
29 | PS (Parametric Stereo) | HE-AAC v2 |
30 | MPEG Surround | |
31 | Escape | |
32 | MPEG-1 Layer-1 | |
33 | MPEG-1 Layer-2 (MP2) | |
34 | MPEG-1 Layer-3 (MP3) | |
35 | DST (Direct Stream Transfer) | |
36 | ALS (Audio Lossless) | |
37 | SLS (Scalable Lossless) | |
38 | SLS Non-core (Scalable Lossless Non-core) | |
39 | ER AAC ELD (Error Resilient AAC Enhanced Low Delay) | |
40 | SMR Simple (Symbolic Music Representation Simple) | |
41 | SMR Main (Symbolic Music Representation Main) | |
42 | Reserved | |
43 | SAOC (Spatial Audio Object Coding)[1] | |
44 | LD MPEG Surround (Low Delay MPEG Surround)[1] | |
45 and up | Reserved | |
[1] SAOC and LD MPEG Surround are defined in ISO/IEC 14496-3:2009/Amd.2:2010(E).
WebM
The basic form for a WebM codecs
parameter is to simply list one or more of the four WebM codecs by name, separated by commas. The table below shows some examples:
MIME type | Description |
---|---|
video/webm;codecs="vp8" | A WebM video with VP8 video in it; no audio is specified. |
video/webm;codecs="vp9" | A WebM video with VP9 video in it. |
audio/webm;codecs="vorbis" | Vorbis audio in a WebM container. |
audio/webm;codecs="opus" | Opus audio in a WebM container. |
video/webm;codecs="vp8,vorbis" | A WebM container with VP8 video and Vorbis audio. |
video/webm;codecs="vp9,opus" | A WebM container with VP9 video and Opus audio. |
The strings vp8.0
and vp9.0
also work, but are not recommended.
ISO Base Media File Format syntax
As part of a move toward a standardized and powerful format for the codecs
parameter, WebM is moving toward describing video content using a syntax based on that defined by the ISO Base Media File Format. This syntax is defined in VP Codec ISO Media File Format Binding, in the section Codecs Parameter String. The audio codec continues to be indicated as either vorbis
or opus
.
In this format, the codecs
parameter's value begins with a four-character code identifying the codec being used in the container, which is then followed by a series of period (.
) separated two-digit values.
cccc.PP.LL.DD[.CC.cp.tc.mc.FF]
The first four components are required. The components from CC (chroma subsampling) onward are optional; the VP codec binding treats them as a group, so include either all of them or none. Each of these components is described in the following table. Following the table are some examples.
Component | Details |
---|---|
cccc | A four-character code indicating which of the possible codecs is being described, such as vp08 for VP8 or vp09 for VP9. |
PP | The two-digit profile number, padded with leading zeroes if necessary to be exactly two digits. |
LL | The two-digit level number. The level number uses fixed-point notation, where the first digit is the ones digit and the second digit represents tenths. For example, level 3 is 30 and level 6.1 is 61. |
DD | The bit depth of the luma and color component values; permitted values are 8, 10, and 12. |
CC | A two-digit value indicating which chroma subsampling format to use: 00 (4:2:0 with vertically sited chroma samples), 01 (4:2:0 with chroma samples collocated with luma (0, 0)), 02 (4:2:2), or 03 (4:4:4). See Chroma subsampling for additional information about this topic and others. This component, and every component after it, is optional. |
cp | A two-digit integer specifying which of the color primaries described in Section 8.1 of the ISO/IEC 23001-8:2016 standard the video uses. |
tc | A two-digit integer indicating the transferCharacteristics for the video. This value is from Section 8.2 of ISO/IEC 23001-8:2016, and indicates the transfer characteristics to be used when adapting the decoded color to the render target. |
mc | The two-digit value for the matrixCoefficients property. This value comes from the table in Section 8.3 of the ISO/IEC 23001-8:2016 specification. This value indicates which set of coefficients to use when mapping from the native red, blue, and green primaries to the luma and chroma signals. These coefficients are in turn used with the equations found in that same section. |
FF | Indicates whether to restrict the black level and color range of each color component to the legal range. For 8 bit color samples, the legal range is 16-235. A value of 00 indicates that these limitations should be enforced, while a value of 01 allows the full range of possible values for each component, even if the resulting color is out of bounds for the color system. |
WebM media type examples
video/webm;codecs="vp08.00.41.08,vorbis"
-
VP8 video, profile 0 level 4.1, using 8-bit YUV with 4:2:0 chroma subsampling, using BT.709 color primaries, transfer function, and matrix coefficients, with the luminance and chroma values encoded within the legal ("studio") range. The audio is Vorbis.
video/webm;codecs="vp09.02.10.10.01.09.16.09.01,opus"
-
VP9 video, profile 2 level 1.0, with 10-bit YUV content using 4:2:0 chroma subsampling, BT.2020 primaries, ST 2084 EOTF (HDR SMPTE), BT.2020 non-constant luminance color matrix, and full-range chroma and luma encoding. The audio is in Opus format.
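The two examples above can be decoded mechanically. The sketch below splits a VP codec string into named components; the function name parseVpCodec and the property names are hypothetical, and the parser is deliberately lenient about how many trailing components are present.

```javascript
// Component names in order, following the table in this section.
const VP_COMPONENT_NAMES = [
  "profile", "level", "bitDepth", "chromaSubsampling",
  "colorPrimaries", "transferCharacteristics", "matrixCoefficients",
  "fullRange",
];

// Hypothetical parser for strings such as "vp09.02.10.10.01.09.16.09.01".
function parseVpCodec(codec) {
  const [fourcc, ...values] = codec.split(".");
  const wellFormed =
    /^vp\d{2}$/.test(fourcc) &&
    values.length >= 3 &&
    values.length <= VP_COMPONENT_NAMES.length &&
    values.every((value) => /^\d{2}$/.test(value));
  if (!wellFormed) {
    return null;
  }
  const result = { codec: fourcc };
  values.forEach((value, index) => {
    result[VP_COMPONENT_NAMES[index]] = Number(value);
  });
  // The level is fixed-point: 41 means level 4.1.
  result.level = `${Math.floor(result.level / 10)}.${result.level % 10}`;
  return result;
}

console.log(parseVpCodec("vp08.00.41.08"));
console.log(parseVpCodec("vp09.02.10.10.01.09.16.09.01"));
```

Components missing from the string are simply absent from the result, leaving it to the caller to apply whatever defaults are appropriate.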
Using the codecs parameter
You can use the codecs
parameter in a few situations. Firstly, you can use it with the <source>
element when creating an <audio>
or <video>
element, in order to establish a group of options for the browser to choose from when selecting the format of the media to present to the user in the element.
You can also use the codecs parameter when specifying a MIME media type to the MediaSource.isTypeSupported()
method; this method returns a Boolean which indicates whether or not the media is likely to work on the current device.
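As a minimal sketch of the second use, the helper below walks a list of candidate media types and returns the first one that MediaSource.isTypeSupported() accepts. The names pickSupportedType and defaultIsSupported are hypothetical, and the guard for environments without MediaSource is an assumption added so the code can also run outside a browser.

```javascript
// Returns false where MediaSource is unavailable (for example, outside a browser).
function defaultIsSupported(type) {
  return (
    typeof MediaSource !== "undefined" && MediaSource.isTypeSupported(type)
  );
}

// Hypothetical helper: returns the first supported candidate, or null.
function pickSupportedType(candidates, isSupported = defaultIsSupported) {
  return candidates.find((type) => isSupported(type)) ?? null;
}

// Candidates listed from most to least preferred.
const candidateTypes = [
  'video/webm; codecs="vp09.02.10.10.01.09.16.09.01, opus"',
  'video/webm; codecs="vp9, opus"',
  'video/mp4; codecs="avc1.4d002a"',
];

console.log(pickSupportedType(candidateTypes));
```

Ordering the candidates from most to least preferred mirrors how a list of <source> elements is written: the first entry the browser can handle wins.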