Transcoding assets for Media Source Extensions

When working with Media Source Extensions (MSE), you will likely need to condition your assets before you can stream them. This article takes you through the requirements and shows you a toolchain you can use to encode your assets appropriately.

Getting started

  1. The first and most important step is to ensure that your files use a container and codec that users' browsers support.
  2. Depending on the container, you might need to fragment the file to comply with the ISO BMFF (ISO Base Media File Format) specification.
  3. (Optional) If you decide to use Dynamic Adaptive Streaming over HTTP (DASH) for adaptive bitrate streaming, you need to transcode your assets into multiple resolutions. Most DASH clients expect a corresponding Media Presentation Description (MPD) manifest file, which is typically generated along with the multiple-resolution asset files.

Below we'll cover all of these steps, but first let's look at a toolchain we can use to do this fairly easily.

Sample Media

If you'd like to follow the steps listed here but don't have any media to experiment with, you can download the trailer for Big Buck Bunny [0]. Big Buck Bunny is licensed under the Creative Commons Attribution 3.0 license. Throughout this tutorial, the input filename refers to this download.

[0] (c) Copyright 2008, Blender Foundation

Tools required

When working with MSE, the following tools are required:

  1. ffmpeg: a command-line utility used for transcoding your media into the required formats. You can download an appropriate version for your system from the Download FFmpeg page. Once downloaded, extract the archive, put the ffmpeg executable in an appropriate place, and add it to your PATH for ease of use. macOS users can also use Homebrew to install ffmpeg.
  2. Bento4: a set of command-line utilities used for getting asset metadata and creating content for DASH.

Install both tools before moving on to the next step.

Container and Codec Support

As stated in section 1.1 (Goals) of the MSE specification, MSE is designed not to require support for any particular media format or codec. While this is true on paper, browser support varies from codec to codec and from container to container.

To check whether the browser supports a particular container, pass its MIME type as a string to the static method MediaSource.isTypeSupported:

MediaSource.isTypeSupported('audio/mp3'); // false
MediaSource.isTypeSupported('video/mp4'); // true
MediaSource.isTypeSupported('video/mp4; codecs="avc1.4D4028, mp4a.40.2"'); // true

The string is the MIME type of the container, optionally followed by a list of codecs. While the MIME type is fairly simple to figure out, the codec string has to be read from the asset's metadata, for example with the Bento4 utilities listed above.
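
As a minimal sketch of how such strings can be assembled and tested, the helpers below build the MIME type from a container and a codec list, then return the first candidate the browser accepts. The function names buildMimeType and pickSupportedType are hypothetical, not part of MSE. In the codec string used above, avc1.4D4028 denotes H.264 Main profile at level 4.0 and mp4a.40.2 denotes AAC-LC; your own asset may need different values.

```javascript
// Build a MIME type string: the container type, optionally followed by codecs.
function buildMimeType(container, codecs = []) {
  if (codecs.length === 0) {
    return container;
  }
  return `${container}; codecs="${codecs.join(', ')}"`;
}

// Return the first candidate reported as supported, or null if none is.
// `mediaSourceImpl` defaults to the browser's MediaSource global when present.
function pickSupportedType(candidates, mediaSourceImpl = globalThis.MediaSource) {
  if (!mediaSourceImpl || typeof mediaSourceImpl.isTypeSupported !== 'function') {
    return null; // MSE is not available in this environment
  }
  return candidates.find((type) => mediaSourceImpl.isTypeSupported(type)) ?? null;
}

// Most specific candidate first, then the bare container type.
const candidates = [
  buildMimeType('video/mp4', ['avc1.4D4028', 'mp4a.40.2']),
  buildMimeType('video/mp4'),
];
console.log(pickSupportedType(candidates));
```

Listing the most specific string first means a match tells you that both the container and the codecs are supported, rather than the container alone.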

Currently, MP4 containers with H.264 video and AAC audio are supported across all modern browsers, while other combinations are not universally supported.

To convert our sample media from a QuickTime MOV container to an MP4 container, we can use ffmpeg. Because the audio codec in the MOV container is already AAC and the video codec is H.264, we can instruct ffmpeg to copy the audio and video tracks over as they are, which is much faster than transcoding them.

$ ffmpeg -i -c:v copy -c:a copy bunny.mp4
$ ls

Checking Fragmentation

To stream MP4 properly with MSE, the asset needs to be a fragmented ISO BMFF MP4. Without proper fragmentation, a given MP4 file is not guaranteed to work with MSE. Fragmented means that the metadata within the container is spread throughout the file rather than lumped together in one place.

To check whether an MP4 file is properly fragmented, you can use a utility that lists the atoms of the MP4.
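
As a rough sketch of what listing atoms involves, the code below walks the top-level boxes of an MP4 held in memory. It assumes the standard ISO BMFF box header: a 32-bit big-endian size followed by a four-character type, where a size of 1 means a 64-bit size follows and a size of 0 means the box runs to the end of the file. The names listTopLevelBoxes and looksFragmented are hypothetical, and the check is a heuristic based on the presence of moof (movie fragment) boxes, not a full validation.

```javascript
// List the top-level boxes (atoms) of an MP4 held in a Uint8Array.
function listTopLevelBoxes(bytes) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const boxes = [];
  let offset = 0;
  while (offset + 8 <= bytes.byteLength) {
    let size = view.getUint32(offset); // 32-bit big-endian box size
    let headerSize = 8;
    const type = String.fromCharCode(
      bytes[offset + 4], bytes[offset + 5], bytes[offset + 6], bytes[offset + 7]);
    if (size === 1) {
      // A 64-bit "largesize" follows the type field.
      if (offset + 16 > bytes.byteLength) break;
      size = Number(view.getBigUint64(offset + 8));
      headerSize = 16;
    } else if (size === 0) {
      size = bytes.byteLength - offset; // box extends to the end of the file
    }
    if (size < headerSize) break; // malformed box, stop rather than loop forever
    boxes.push({ type, offset, size });
    offset += size;
  }
  return boxes;
}

// Heuristic: a fragmented MP4 has a moov box plus at least one moof box
// at the top level.
function looksFragmented(boxes) {
  const types = boxes.map((box) => box.type);
  return types.includes('moov') && types.includes('moof');
}
```

In Node.js the bytes could come from fs.readFileSync, since a Buffer is a Uint8Array. Reading a whole file into memory is only reasonable for small assets such as a trailer.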

Note that the fragmented version is also slightly larger because of the additional metadata spread throughout the file. The increase is usually 1 percent of the file size or less.

If you have an asset that is not already an MP4, ffmpeg can handle emitting a properly fragmented MP4 during the transcode process, with the -movflags frag_keyframe+empty_moov command line flag:

$ ffmpeg -i -c:v copy -c:a copy -movflags frag_keyframe+empty_moov bunny_fragmented.mp4

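If you already have an MP4, but it's not properly fragmented, you can again use ffmpeg. Without the -c:v copy -c:a copy options used in the previous command, ffmpeg re-encodes the streams while fragmenting: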

$ ffmpeg -i non_fragmented.mp4 -movflags frag_keyframe+empty_moov fragmented.mp4
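
If you need to fragment several files from a script, the ffmpeg options shown above can be assembled programmatically. The following is a minimal sketch; fragmentArgs is a hypothetical helper, and it only uses the flags that already appear in the commands in this section.

```javascript
// Build the ffmpeg argument list for producing a fragmented MP4.
// With copyStreams set, the audio and video tracks are copied, not re-encoded.
function fragmentArgs(input, output, { copyStreams = true } = {}) {
  const args = ['-i', input];
  if (copyStreams) {
    args.push('-c:v', 'copy', '-c:a', 'copy');
  }
  args.push('-movflags', 'frag_keyframe+empty_moov', output);
  return args;
}

// Print the equivalent command line for inspection.
const args = fragmentArgs('non_fragmented.mp4', 'fragmented.mp4');
console.log(['ffmpeg', ...args].join(' '));
```

Keeping the arguments as an array means they can be handed to Node's child_process.spawn('ffmpeg', args) directly, which avoids shell quoting problems with unusual filenames.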

Having a properly fragmented MP4 file is all you need to get started. If you wish to employ adaptive bitrate streaming, you'll have to create encodings at multiple resolutions. While MSE is flexible enough to let you write your own implementation, it's highly recommended to use an existing DASH client, as DASH is a well-specified application protocol.
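
To illustrate the simplest case, here is a hedged sketch of feeding a single fragmented MP4 to MSE in one append. The helper name appendFragmentedMp4 is hypothetical; the calls it makes (addSourceBuffer, appendBuffer, endOfStream, and the sourceopen and updateend events) are part of the MSE API. It assumes the whole file fits in memory, which is not how a real adaptive player works.

```javascript
// Append one complete fragmented MP4 (ArrayBuffer or typed array) to a
// MediaSource, then signal end of stream once the append has finished.
function appendFragmentedMp4(mediaSource, mimeType, data) {
  return new Promise((resolve, reject) => {
    const start = () => {
      let sourceBuffer;
      try {
        // Throws if the MIME type is not supported.
        sourceBuffer = mediaSource.addSourceBuffer(mimeType);
      } catch (err) {
        reject(err);
        return;
      }
      sourceBuffer.addEventListener('error', () => {
        reject(new Error('SourceBuffer reported an error while appending'));
      }, { once: true });
      sourceBuffer.addEventListener('updateend', () => {
        if (mediaSource.readyState === 'open') {
          mediaSource.endOfStream();
        }
        resolve(sourceBuffer);
      }, { once: true });
      sourceBuffer.appendBuffer(data);
    };
    if (mediaSource.readyState === 'open') {
      start();
    } else {
      mediaSource.addEventListener('sourceopen', start, { once: true });
    }
  });
}

// Browser usage (assumes a <video> element and a fragmented.mp4 URL):
//   const mediaSource = new MediaSource();
//   video.src = URL.createObjectURL(mediaSource);
//   const data = await (await fetch('fragmented.mp4')).arrayBuffer();
//   await appendFragmentedMp4(
//     mediaSource, 'video/mp4; codecs="avc1.4D4028, mp4a.40.2"', data);
```

The codec string in the usage comment is the one from the support check earlier in this article and may not match your own asset.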

Creating Content for DASH

Given that you have ffmpeg and Bento4's utilities accessible through your $PATH, you can run one of Bento4's Python scripts to generate multiple encodings of your content at various resolutions. A second Bento4 Python script can then be used to generate the corresponding MPD file needed by clients.

You can temporarily (for this shell session) add the directory containing the Bento4 utilities to your path with:

export PATH=$PATH:`pwd`

Assuming your working directory (pwd) is the directory containing the tools, you'd run the following commands (shown with sample output):

$ python -b 5 -v fragmented.mp4
Encoding 5 bitrates, min bitrate = 500.0 max bitrate = 2000.0
Media Source: Video: resolution=640x360
ENCODING bitrate: 500, resolution: 256x144
ENCODING bitrate: 875, resolution: 384x216
ENCODING bitrate: 1250, resolution: 480x270
ENCODING bitrate: 1625, resolution: 560x316
ENCODING bitrate: 2000, resolution: 640x360

$ python --exec-dir=<path to Bento4 tools> video_0*
Parsing media file 1: video_00500.mp4
Parsing media file 2: video_00875.mp4
Parsing media file 3: video_01250.mp4
Parsing media file 4: video_01625.mp4
Parsing media file 5: video_02000.mp4
Splitting media file (audio) video_00500.mp4
Splitting media file (video) video_00500.mp4
Splitting media file (video) video_00875.mp4
Splitting media file (video) video_01250.mp4
Splitting media file (video) video_01625.mp4
Splitting media file (video) video_02000.mp4

$ tree -L 2 output
├── audio
│   └── und
├── stream.mpd
└── video
    ├── 1
    ├── 2
    ├── 3
    ├── 4
    └── 5

8 directories, 1 file
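
The bitrates in the sample output above are consistent with five evenly spaced values between the minimum and the maximum. The sketch below reproduces those numbers and the zero-padded filenames; it is an illustration of the arithmetic, not a description of how Bento4 computes them internally, and it does not model the resolutions. The function names are hypothetical.

```javascript
// Evenly spaced bitrates between min and max, both inclusive.
function bitrateLadder(count, minBitrate, maxBitrate) {
  if (count < 2) {
    return [minBitrate];
  }
  const step = (maxBitrate - minBitrate) / (count - 1);
  return Array.from({ length: count }, (_, i) => minBitrate + i * step);
}

// File name in the style of the sample output, e.g. video_00500.mp4.
function variantFileName(bitrate) {
  return `video_${String(Math.round(bitrate)).padStart(5, '0')}.mp4`;
}

const ladder = bitrateLadder(5, 500, 2000);
console.log(ladder); // [500, 875, 1250, 1625, 2000]
console.log(ladder.map(variantFileName));
```

With five steps between 500 and 2000, each step is (2000 - 500) / 4 = 375, which gives the 500, 875, 1250, 1625, and 2000 values shown in the encoder output.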

With your video properly encoded and adaptive bitrate media generated, you're now ready to begin adaptive bitrate streaming on the web using DASH.

Document Tags and Contributors

 Contributors to this page: gsouquet, rolfedh, NickDesaulniers, chrisdavidmills
 Last updated by: gsouquet,