Audio and Video Delivery

We can deliver audio and video on the web in a number of ways, ranging from 'static' media files to adaptive live streams. This article is intended as a starting point for exploring the various delivery mechanisms of web-based media and compatibility with popular browsers.

The Audio and Video Elements

Whether we are dealing with pre-recorded audio files or live streams, the mechanism for making them available through the browser's <audio> and <video> elements remains largely the same. Historically, supporting all browsers meant specifying two formats, although the adoption of the MP3 and MP4 formats in Firefox and Opera has reduced that need. You can find compatibility information in the Guide to media types and formats on the web.

To deliver video and audio, the general workflow is usually something like this:

  1. Check what format the browser supports via feature detection (usually a choice of two, as stated above).
  2. If the browser doesn't support playback of any of the provided formats natively, either present a still image or use a fallback technology to present the video.
  3. Identify how you want to play/instantiate the media (e.g. a <video> element, or document.createElement('video') perhaps?)
  4. Deliver the media file to the player.
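
Steps 1 and 2 of the workflow above can be sketched as a small helper. This is a minimal sketch, not a definitive implementation: pickSource and the candidates list are hypothetical names, and preferring a "probably" answer over a "maybe" answer is a design choice made here for illustration. The only browser API assumed is canPlayType(), which returns "", "maybe", or "probably".

```javascript
// Hypothetical helper: returns the best candidate the browser says it can play.
// `canPlayType` is any function with the same contract as
// HTMLMediaElement.canPlayType(): it returns "", "maybe", or "probably".
function pickSource(canPlayType, candidates) {
  let fallback = null;
  for (const candidate of candidates) {
    const answer = canPlayType(candidate.type);
    if (answer === "probably") return candidate;
    if (answer === "maybe" && fallback === null) fallback = candidate;
  }
  // null means nothing is playable: show a still image or another fallback
  return fallback;
}

// Hypothetical file names, matching the examples used in this article
const candidates = [
  { src: "videofile.mp4", type: "video/mp4" },
  { src: "videofile.webm", type: "video/webm" },
];

// Only touch the DOM when one is available
if (typeof document !== "undefined") {
  const probe = document.createElement("video");
  const chosen = pickSource((type) => probe.canPlayType(type), candidates);
  console.log(chosen ? `Would load ${chosen.src}` : "No playable format");
}
```

Passing canPlayType in as a function keeps the selection logic independent of the DOM, so it can be exercised without a browser.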

HTML Audio

html
<audio controls preload="auto">
  <source src="audiofile.mp3" type="audio/mpeg" />

  <!-- fallback for browsers that don't support mp3 -->
  <source src="audiofile.ogg" type="audio/ogg" />

  <!-- fallback for browsers that don't support audio tag -->
  <a href="audiofile.mp3">download audio</a>
</audio>

The code above will create an audio player that attempts to preload as much audio as possible for smooth playback.

Note: The preload attribute may be ignored by some mobile browsers.

For further info see Cross Browser Audio Basics (HTML Audio In Detail)

HTML Video

html
<video
  controls
  width="640"
  height="480"
  poster="initialimage.png"
  autoplay
  muted>
  <source src="videofile.mp4" type="video/mp4" />

  <!-- fallback for browsers that don't support mp4 -->
  <source src="videofile.webm" type="video/webm" />

  <!-- specifying subtitle files -->
  <track src="subtitles_en.vtt" kind="subtitles" srclang="en" label="English" />
  <track
    src="subtitles_no.vtt"
    kind="subtitles"
    srclang="no"
    label="Norwegian" />

  <!-- fallback for browsers that don't support video tag -->
  <a href="videofile.mp4">download video</a>
</video>

The code above creates a video player of dimensions 640x480 pixels, displaying a poster image until the video is played. We instruct the video to autoplay but to be muted by default.

Note: The autoplay attribute may be ignored by some mobile browsers. Also, the autoplay feature can be controversial when misused. It's strongly recommended that you read the Autoplay guide for media and Web Audio APIs to learn how to use autoplay wisely.

For further info see <video> element and Creating a cross-browser video player.

JavaScript Audio

js
const myAudio = document.createElement("audio");

if (myAudio.canPlayType("audio/mpeg")) {
  myAudio.setAttribute("src", "audiofile.mp3");
} else if (myAudio.canPlayType("audio/ogg")) {
  myAudio.setAttribute("src", "audiofile.ogg");
}

myAudio.currentTime = 5;
myAudio.play();

We set the source of the audio depending on the type of audio file the browser supports, then set the play-head 5 seconds in and attempt to play it.

Note: Most browsers ignore or reject a call to play() unless it is issued from a user-initiated event.
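
Because playback may be blocked, it can help to check whether play() actually succeeded. The following is a minimal sketch, assuming a browser in which play() returns a promise; tryPlay is a hypothetical helper name, not part of any API.

```javascript
// Attempts playback and reports whether it started.
// `media` is assumed to be an <audio> or <video> element
// (or any object with a compatible play() method).
async function tryPlay(media) {
  try {
    // In current browsers play() returns a promise that rejects
    // when playback is not allowed to start.
    await media.play();
    return true;
  } catch (err) {
    // A NotAllowedError typically means autoplay was blocked
    console.warn(`Playback did not start: ${err.name}`);
    return false;
  }
}
```

A caller could use the result to show a "tap to play" button when tryPlay(myAudio) resolves to false.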

It's also possible to feed an <audio> element a base64-encoded WAV file, which makes it possible to generate audio on the fly:

html
<audio id="player" src="data:audio/x-wav;base64,UklGRvC…"></audio>

Speak.js employs this technique.
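
To illustrate the idea of generating audio on the fly, here is a minimal sketch that builds a short sine tone as a 16-bit mono PCM WAV file and encodes it as a data: URI. The function name sineWavDataUri and its parameters are assumptions made for this example; it is not how Speak.js itself is implemented. It uses the audio/wav MIME type rather than audio/x-wav.

```javascript
// Builds a mono, 16-bit PCM WAV file holding a sine tone
// and returns it as a base64 data: URI.
function sineWavDataUri(frequencyHz, durationSec, sampleRate = 8000) {
  const sampleCount = Math.floor(durationSec * sampleRate);
  const dataSize = sampleCount * 2; // 2 bytes per 16-bit sample
  const view = new DataView(new ArrayBuffer(44 + dataSize));

  const writeAscii = (offset, text) => {
    for (let i = 0; i < text.length; i++) {
      view.setUint8(offset + i, text.charCodeAt(i));
    }
  };

  // 44-byte RIFF/WAVE header; numeric fields are little-endian
  writeAscii(0, "RIFF");
  view.setUint32(4, 36 + dataSize, true); // file size minus 8 bytes
  writeAscii(8, "WAVE");
  writeAscii(12, "fmt ");
  view.setUint32(16, 16, true); // size of the fmt chunk
  view.setUint16(20, 1, true); // format 1 = PCM
  view.setUint16(22, 1, true); // one channel
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * 2, true); // bytes per second
  view.setUint16(32, 2, true); // bytes per sample frame
  view.setUint16(34, 16, true); // bits per sample
  writeAscii(36, "data");
  view.setUint32(40, dataSize, true);

  // Sample data: a sine wave scaled to the 16-bit signed range
  for (let i = 0; i < sampleCount; i++) {
    const sample = Math.sin((2 * Math.PI * frequencyHz * i) / sampleRate);
    view.setInt16(44 + i * 2, Math.round(sample * 0x7fff), true);
  }

  // btoa() expects a string with one character per byte
  let binary = "";
  for (const byte of new Uint8Array(view.buffer)) {
    binary += String.fromCharCode(byte);
  }
  return `data:audio/wav;base64,${btoa(binary)}`;
}

const toneUri = sineWavDataUri(440, 0.25);

if (typeof document !== "undefined") {
  const tonePlayer = document.createElement("audio");
  tonePlayer.src = toneUri; // playback still needs a user gesture in most browsers
}
```

The resulting URI begins with the same "UklG" characters as the truncated example above, because that is the base64 encoding of the "RIFF" marker that starts every WAV file.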

JavaScript Video

js
const myVideo = document.createElement("video");

if (myVideo.canPlayType("video/mp4")) {
  myVideo.setAttribute("src", "videofile.mp4");
} else if (myVideo.canPlayType("video/webm")) {
  myVideo.setAttribute("src", "videofile.webm");
}

myVideo.width = 480;
myVideo.height = 320;

We set the source of the video depending on the type of video file the browser supports, then set the width and height of the video.

Web Audio API

In this example we retrieve an MP3 file using the fetch() API, load it into a source, and play it.

js
let audioCtx;
let buffer;
let source;

async function loadAudio() {
  try {
    // Load an audio file
    const response = await fetch("viper.mp3");
    // Decode it
    buffer = await audioCtx.decodeAudioData(await response.arrayBuffer());
  } catch (err) {
    console.error(`Unable to fetch the audio file. Error: ${err.message}`);
  }
}

const play = document.getElementById("play");
play.addEventListener("click", async () => {
  if (!audioCtx) {
    audioCtx = new AudioContext();
    await loadAudio();
  }
  source = audioCtx.createBufferSource();
  source.buffer = buffer;
  source.connect(audioCtx.destination);
  source.start();
  play.disabled = true;
});

You can run the full example live, or view the source.

Find more about Web Audio API basics in Using the Web Audio API.

getUserMedia / Stream API

It's also possible to retrieve a live stream from a webcam and/or microphone using getUserMedia and the Stream API. This makes up part of a wider technology known as WebRTC (Web Real-Time Communications) and is compatible with the latest versions of Chrome, Firefox and Opera.

To grab the stream from your webcam, first set up a <video> element:

html
<video id="webcam" width="480" height="360"></video>

Next, if getUserMedia is supported, connect the webcam source to the video element:

js
if (navigator.mediaDevices) {
  navigator.mediaDevices
    .getUserMedia({ video: true, audio: false })
    .then(function onSuccess(stream) {
      const video = document.getElementById("webcam");
      video.autoplay = true;
      video.srcObject = stream;
    })
    .catch(function onError() {
      alert(
        "There has been a problem retrieving the streams - are you running on file:/// or did you disallow access?",
      );
    });
} else {
  alert("getUserMedia is not supported in this browser.");
}

To find out more, read our MediaDevices.getUserMedia page.
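
The error callback above shows one generic message for every failure. As a minimal sketch, the name of the rejection can be used to give a more specific hint. The helper name describeGetUserMediaError and the message texts are assumptions made for illustration; the error names are ones that getUserMedia() can reject with.

```javascript
// Maps the `name` of a getUserMedia() rejection to a hint for the user.
// The message texts are illustrative and not part of any API.
function describeGetUserMediaError(err) {
  switch (err.name) {
    case "NotAllowedError":
      return "Access to the camera was denied by the user or the browser.";
    case "NotFoundError":
      return "No camera matching the request was found.";
    case "NotReadableError":
      return "The camera is already in use or could not be started.";
    case "SecurityError":
      return "Media access is disabled for this page.";
    default:
      return `Could not open the camera: ${err.name}`;
  }
}
```

It could be wired in with .catch((err) => alert(describeGetUserMediaError(err))) in place of the generic alert.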

Mediastream Recording

The MediaStream Recording API allows your browser to grab media from your mic or camera using getUserMedia and record it straight away. You take the stream you receive from getUserMedia, pass it to a MediaRecorder object, take the resulting output and feed it to your audio or video source.

The main mechanism is outlined below:

js
navigator.mediaDevices
  .getUserMedia({ audio: true })
  .then(function onSuccess(stream) {
    const recorder = new MediaRecorder(stream);

    const data = [];
    recorder.ondataavailable = (e) => {
      data.push(e.data);
    };
    recorder.start();
    recorder.onerror = (e) => {
      throw e.error || new Error(e.name); // e.name is FF non-spec
    };
    recorder.onstop = (e) => {
      const audio = document.createElement("audio");
      audio.src = window.URL.createObjectURL(new Blob(data));
    };
    setTimeout(() => {
      recorder.stop(); // stop recording after 5 seconds
    }, 5000);
  })
  .catch(function onError(error) {
    console.log(error.message);
  });

See MediaStream Recording API for more details.

Media Source Extensions (MSE)

Media Source Extensions is a W3C specification that extends HTMLMediaElement to allow JavaScript to generate media streams for playback. Allowing JavaScript to generate streams facilitates a variety of use cases, such as adaptive streaming and time-shifting live streams.

Encrypted Media Extensions (EME)

Encrypted Media Extensions is a W3C specification that extends HTMLMediaElement, providing APIs to control playback of protected content.

The API supports use cases ranging from simple clear key decryption to high value video (given an appropriate user agent implementation). License/key exchange is controlled by the application, facilitating the development of robust playback applications supporting a range of content decryption and protection technologies.

One of the principal uses of EME is to allow browsers to implement DRM (Digital Rights Management), which helps to prevent web-based content (especially video) from being copied.

Adaptive Streaming

New formats and protocols are being rolled out to facilitate adaptive streaming. Adaptive streaming media means that the bandwidth and typically quality of the stream can change in real-time in reaction to the user's available bandwidth. Adaptive streaming is often used in conjunction with live streaming where smooth delivery of audio or video is paramount.
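
The core idea of adapting quality to bandwidth can be sketched in a few lines. This is a simplified illustration, not the algorithm used by any particular player: real players use more elaborate heuristics, such as buffer level and throughput history. The rendition ladder, the chooseRendition name, and the 0.8 safety factor are all hypothetical.

```javascript
// Hypothetical rendition ladder; bitrates are in bits per second
const renditions = [
  { label: "240p", bitrate: 400_000 },
  { label: "480p", bitrate: 1_200_000 },
  { label: "720p", bitrate: 3_000_000 },
];

// Picks the highest-bitrate rendition that fits within a safety margin
// of the measured throughput, falling back to the lowest rendition.
function chooseRendition(ladder, measuredBps, safetyFactor = 0.8) {
  const budget = measuredBps * safetyFactor;
  const sorted = [...ladder].sort((a, b) => a.bitrate - b.bitrate);
  let choice = sorted[0];
  for (const rendition of sorted) {
    if (rendition.bitrate <= budget) choice = rendition;
  }
  return choice;
}
```

A player would typically re-run a decision like this after each downloaded segment, using the throughput it observed while fetching that segment.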

The main formats used for adaptive streaming are HLS and MPEG-DASH. MSE has been designed with DASH in mind. MSE defines byte streams according to ISOBMFF and M2TS (both supported in DASH, the latter supported in HLS). Generally speaking, if you are interested in standards, are looking for flexibility, or wish to support most modern browsers, you are probably better off with DASH.

Note: Safari does not support DASH natively, although dash.js works on versions of Safari that support MSE, which first shipped with OS X Yosemite.

DASH also provides a number of profiles including simple onDemand profiles that require no preprocessing and splitting up of media files. There are also a number of cloud based services that will convert your media to both HLS and DASH.

For further information see Live streaming web audio and video.

Customizing Your Media Player

You may decide that you want your audio or video player to have a consistent look across browsers, or just wish to tweak it to match your site. The general technique for achieving this is to omit the controls attribute so that the default browser controls are not displayed, create custom controls using HTML and CSS, then use JavaScript to link your controls to the audio/video API.

If you need something extra, it's possible to add features that are not currently present in default players, such as playback rate, quality stream switches or even audio spectrums. You can also choose how to make your player responsive — for example you might remove the progress bar under certain conditions.

You may detect click, touch and/or keyboard events to trigger actions such as play, pause and scrubbing. It's often important to remember keyboard controls for user convenience and accessibility.

A quick example — first set up your audio and custom controls in HTML:

html
<audio
  id="my-audio"
  src="http://jPlayer.org/audio/mp3/Miaow-01-Tempered-song.mp3"></audio>
<button id="my-control">play</button>

Then add a bit of JavaScript to detect events to play and pause the audio:

js
window.onload = () => {
  const myAudio = document.getElementById("my-audio");
  const myControl = document.getElementById("my-control");

  function switchState() {
    if (myAudio.paused) {
      myAudio.play();
      myControl.textContent = "pause";
    } else {
      myAudio.pause();
      myControl.textContent = "play";
    }
  }

  function checkKey(e) {
    // KeyboardEvent.code identifies the physical key; "Space" is the space bar
    if (e.code === "Space") {
      switchState();
    }
  }

  myControl.addEventListener(
    "click",
    () => {
      switchState();
    },
    false,
  );

  // "keydown" is used because the "keypress" event is deprecated
  window.addEventListener("keydown", checkKey, false);
};

You can try this example out here. For more information, see Creating your own custom audio player.

Other tips for audio/video

Stopping the download of media

While stopping the playback of media is as easy as calling the element's pause() method, the browser keeps downloading the media until the media element is disposed of through garbage collection.

Here's a trick that stops the download at once:

js
const mediaElement = document.querySelector("#myMediaElementID");
mediaElement.removeAttribute("src");
mediaElement.load();

By removing the media element's src attribute and invoking the load() method, you release the resources associated with the video, which stops the network download. You must call load() after removing the attribute, because just removing the src attribute does not invoke the load algorithm. If the <video> element also has <source> element descendants, those should also be removed before calling load().

Note that just setting the src attribute to an empty string will actually cause the browser to treat it as though you're setting a video source to a relative path. This causes the browser to attempt another download to something that is unlikely to be a valid video.
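
The snippet above does not cover the case where the element has <source> children. A minimal sketch that handles both cases might look like this; stopDownload is a hypothetical helper name, and it assumes the element is a standard <audio> or <video> element.

```javascript
// Hypothetical helper: stops playback and releases the network resources
// held by an <audio> or <video> element.
function stopDownload(mediaElement) {
  mediaElement.pause();
  // Remove <source> children first so load() has nothing left to select
  for (const source of Array.from(mediaElement.querySelectorAll("source"))) {
    source.remove();
  }
  mediaElement.removeAttribute("src");
  // load() runs the load algorithm, which aborts the pending download
  mediaElement.load();
}
```

The list of sources is copied with Array.from() before removal so that the loop does not depend on how the collection behaves while elements are being removed.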

Seeking through media

Media elements provide support for moving the current playback position to specific points in the media's content. This is done by setting the value of the currentTime property on the element; see HTMLMediaElement for further details on the element's properties. Set the value to the time, in seconds, at which you want playback to continue.

You can use the element's seekable property to determine the ranges of the media that are currently available for seeking to. This returns a TimeRanges object listing the ranges of times that you can seek to.

js
const mediaElement = document.querySelector("#mediaElementID");
mediaElement.seekable.start(0); // Returns the starting time (in seconds)
mediaElement.seekable.end(0); // Returns the ending time (in seconds)
mediaElement.currentTime = 122; // Seek to 122 seconds
mediaElement.played.end(0); // Returns the end time (in seconds) of the first range the browser has played
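
Since seekable can contain several ranges, or none at all, it can be useful to clamp a requested time to the nearest seekable position before assigning it to currentTime. The following is a minimal sketch; nearestSeekableTime is a hypothetical helper, and it only assumes the TimeRanges interface (length, start(i), end(i)).

```javascript
// Returns the seekable time closest to `target`,
// or null if nothing is seekable yet.
// `ranges` is assumed to follow the TimeRanges interface.
function nearestSeekableTime(ranges, target) {
  let best = null;
  for (let i = 0; i < ranges.length; i++) {
    // Clamp the target into the current range
    const clamped = Math.min(Math.max(target, ranges.start(i)), ranges.end(i));
    if (best === null || Math.abs(clamped - target) < Math.abs(best - target)) {
      best = clamped;
    }
  }
  return best;
}
```

For example, a caller could compute nearestSeekableTime(mediaElement.seekable, 122) and only set currentTime when the result is not null.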

Specifying playback range

When specifying the URI of media for an <audio> or <video> element, you can optionally include additional information to specify the portion of the media to play. To do this, append a hash mark ("#") followed by the media fragment description.

A time range is specified using the syntax:

#t=[starttime][,endtime]

The time can be specified as a number of seconds (as a floating-point value) or as an hours/minutes/seconds time separated with colons (such as 2:05:01 for 2 hours, 5 minutes, and 1 second).

A few examples:

http://example.com/video.ogv#t=10,20

Specifies that the video should play the range 10 seconds through 20 seconds.

http://example.com/video.ogv#t=,10.5

Specifies that the video should play from the beginning through 10.5 seconds.

http://example.com/video.ogv#t=,02:00:00

Specifies that the video should play from the beginning through two hours.

http://example.com/video.ogv#t=60

Specifies that the video should start playing at 60 seconds and play through the end of the video.
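
When the playback range is only known at runtime, the fragment can be appended in script. This is a minimal sketch with a hypothetical helper name, withTimeRange; it does no validation of the time values and replaces any fragment already present on the URL.

```javascript
// Appends a #t=[starttime][,endtime] media fragment to a URL.
// Pass undefined for `start` or `end` to leave that part out.
function withTimeRange(url, start, end) {
  const base = url.split("#")[0]; // drop any existing fragment
  const startPart = start === undefined ? "" : String(start);
  const endPart = end === undefined ? "" : `,${end}`;
  if (startPart === "" && endPart === "") return base;
  return `${base}#t=${startPart}${endPart}`;
}

// Reproduces the examples above
console.log(withTimeRange("http://example.com/video.ogv", 10, 20));
console.log(withTimeRange("http://example.com/video.ogv", undefined, 10.5));
console.log(withTimeRange("http://example.com/video.ogv", undefined, "02:00:00"));
console.log(withTimeRange("http://example.com/video.ogv", 60));
```

The result can then be assigned to the src of an <audio> or <video> element in the usual way.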

Error handling

Errors get delivered to the child <source> elements corresponding to the sources resulting in the error.

This lets you detect which sources failed to load, which may be useful. Consider this HTML:

html
<video>
  <source
    id="mp4_src"
    src="video.mp4"
    type='video/mp4; codecs="avc1.42E01E, mp4a.40.2"' />
  <source
    id="3gp_src"
    src="video.3gp"
    type='video/3gpp; codecs="mp4v.20.8, samr"' />
  <source
    id="ogg_src"
    src="video.ogv"
    type='video/ogg; codecs="theora, vorbis"' />
</video>

Since Firefox doesn't support MP4 and 3GP on some platforms due to their patent-encumbered nature, the <source> elements with the IDs "mp4_src" and "3gp_src" will receive error events before the Ogg resource is loaded. The sources are tried in the order in which they appear, and once one loads successfully, the remaining sources aren't tried at all.
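
To see which sources failed, a listener can be attached to each <source> element. The following is a minimal sketch; reportFailedSources is a hypothetical helper name. Note that the listeners must be registered before the browser tries the sources, for example from a script placed directly after the <video> element, otherwise the error events may already have fired.

```javascript
// Registers an error listener on each <source> child of `video`
// and reports the id (or src) of every source that fails to load.
function reportFailedSources(video, onFail) {
  for (const source of video.querySelectorAll("source")) {
    source.addEventListener("error", () => {
      onFail(source.id || source.src);
    });
  }
}

if (typeof document !== "undefined") {
  const firstVideo = document.querySelector("video");
  if (firstVideo) {
    reportFailedSources(firstVideo, (name) => {
      console.warn(`Source failed to load: ${name}`);
    });
  }
}
```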

Checking whether the browser supports the supplied formats

Use the following verified sources within your audio and video elements to check support.

If these don't play then the browser you are testing doesn't support the given format. Consider using a different format or using a fallback.

If these work but the files you are supplying don't, there are two possible issues:

1. The media server is not delivering the correct MIME types with the file

Although this is usually supported, you may need to add the following to your media server's .htaccess file.

# AddType TYPE/SUBTYPE EXTENSION

AddType audio/mpeg mp3
AddType audio/mp4 m4a
AddType audio/ogg ogg
AddType audio/ogg oga

AddType video/mp4 mp4
AddType video/mp4 m4v
AddType video/ogg ogv
AddType video/webm webm
AddType video/webm webmv

2. Your files have been encoded incorrectly

Your files may have been encoded incorrectly — try encoding using one of the following tools, which are proven to be pretty reliable:

  • Audacity — Free Audio Editor and Recorder
  • Miro — Free, open-source music and video player
  • Handbrake — Open Source Video Transcoder
  • Firefogg — Video and Audio encoding for Firefox
  • FFmpeg — Comprehensive command line encoder
  • Libav — Comprehensive command line encoder
  • Vid.ly — Video player, transcoding and delivery
  • Internet Archive — Free transcoding and storage

Detecting when no sources have loaded

To detect that all child <source> elements have failed to load, check the value of the media element's networkState attribute. If this is HTMLMediaElement.NETWORK_NO_SOURCE, you know that all the sources failed to load.

If at that point you add another source, by inserting a new <source> element as a child of the media element, Gecko attempts to load the specified resource.
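
The check and the retry described above can be combined in a small helper. This is a minimal sketch: addSourceIfNoneLoaded is a hypothetical name, and the numeric constant mirrors the value of HTMLMediaElement.NETWORK_NO_SOURCE so that the function does not depend on that global being defined.

```javascript
// Numeric value of HTMLMediaElement.NETWORK_NO_SOURCE
const NETWORK_NO_SOURCE = 3;

// If every source of `media` has failed, appends one more <source>
// element and returns true; otherwise does nothing and returns false.
// `doc` defaults to the global document and can be overridden.
function addSourceIfNoneLoaded(media, src, type, doc = document) {
  if (media.networkState !== NETWORK_NO_SOURCE) return false;
  const source = doc.createElement("source");
  source.src = src;
  source.type = type;
  // Inserting a new <source> lets the browser try this resource
  media.appendChild(source);
  return true;
}
```

When and how often to call such a helper is up to the application, for example from the error handler of the last <source> element.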

Showing fallback content when no source could be decoded

Another way to show the fallback content of a video, when none of the sources could be decoded in the current browser, is to add an error handler on the last source element. Then you can replace the video with its fallback content:

html
<video controls>
  <source src="dynamicsearch.mp4" type="video/mp4" />
  <a href="dynamicsearch.mp4">
    <img src="dynamicsearch.jpg" alt="Dynamic app search in Firefox OS">
  </a>
  <p>Click image to play a video demo of dynamic app search</p>
</video>

js
const v = document.querySelector("video");
const sources = v.querySelectorAll("source");
const lastsource = sources[sources.length - 1];
lastsource.addEventListener(
  "error",
  (ev) => {
    const d = document.createElement("div");
    d.innerHTML = v.innerHTML;
    v.parentNode.replaceChild(d, v);
  },
  false,
);

Audio/Video JavaScript Libraries

A number of audio and video JavaScript libraries exist. The most popular libraries allow you to choose a consistent player design over all browsers and provide a fallback for browsers that don't support audio and video natively. Fallbacks have historically used now-obsolete plugins such as Adobe Flash or Microsoft Silverlight to provide a media player in non-supporting browsers, although these are no longer supported on modern computers. Other functionality such as the <track> element for subtitles can also be provided through media libraries.

Audio only

Video only

  • flowplayer: Gratis with a flowplayer logo watermark. Open source (GPL licensed.)
  • JWPlayer: Requires registration to download. Open Source Edition (Creative Commons License.)
  • SublimeVideo: Requires registration. Form based set up with domain specific link to CDN hosted library.
  • Video.js: Gratis and Open Source (Apache 2 Licensed.)

Audio and Video

Web Audio API

Basic tutorials

Creating a cross-browser video player

A guide to creating a basic cross browser video player using the <video> element.

Video player styling basics

With the cross-browser video player put in place in the previous article, this article now looks at providing some basic, responsive styling for the player.

Cross-browser audio basics

This article provides a basic guide to creating an HTML audio player that works cross browser, with all the associated attributes, properties and events explained, and a quick guide to custom controls created using the Media API.

Media buffering, seeking, and time ranges

Sometimes it's useful to know how much <audio> or <video> has downloaded or is playable without delay — a good example of this is the buffered progress bar of an audio or video player. This article discusses how to build a buffer/seek bar using TimeRanges, and other features of the media API.

HTML playbackRate explained

The playbackRate property allows us to change the speed or rate at which a piece of web audio or video is playing. This article explains it in detail.

Using the Web Audio API

Explains the basics of using the Web Audio API to grab, manipulate and play back an audio source.

Streaming media tutorials

Live streaming web audio and video

Live streaming technology is often employed to relay live events such as sports, concerts and more generally TV and Radio programmes that are output live. Often shortened to just streaming, live streaming is the process of transmitting media 'live' to computers and devices. This is a fairly complex and nascent subject with a lot of variables, so in this article we'll introduce you to the subject and let you know how you can get started.

Setting up adaptive streaming media sources

Let's say you want to set up an adaptive streaming media source on a server, to be consumed inside an HTML media element. How would you do that? This article explains how, looking at two of the most common formats: MPEG-DASH and HLS (HTTP Live Streaming.)

DASH Adaptive Streaming for HTML 5 Video

Details how to set up adaptive streaming using DASH and WebM.

Advanced tutorials

Adding captions and subtitles to HTML video

This article explains how to add captions and subtitles to HTML <video>, using the Web Video Text Tracks Format (WebVTT) and the <track> element.

Web Audio API cross browser support

A guide to writing cross browser Web Audio API code.

Easy audio capture with the MediaRecorder API

Explains the basics of using the MediaStream Recording API to directly record a media stream.

Note: Firefox OS versions 1.3 and above support the RTSP protocol for streaming video delivery. A fallback solution for older versions would be to use <video> along with a suitable format for Gecko (such as WebM) to serve fallback content. More information will be published on this in good time.
