Визуализация с Web Audio API

Этот перевод не завершен. Пожалуйста, помогите перевести эту статью с английского

Одна из самых интересных фич Web Audio API — возможность извлекать частоту, форму волны и другие данные из звукового источника, которые могут быть использованы для создания визуализаций. Эта статья объясняет, как это можно сделать, и приводит несколько базовых юзкейсов.

Примечание: Вы можете найти рабочие примеры всех фрагментов кода в нашей демонстрации автоизменения голоса.

Основные концепции

Чтобы извлечь данные из вашего источника звука, вам понадобится AnalyserNode, созданный при помощи метода AudioContext.createAnalyser(), например:

var audioCtx = new (window.AudioContext || window.webkitAudioContext)();
var analyser = audioCtx.createAnalyser();

Затем этот узел подключается к вашему источнику звука где-то между получением звука и его обработкой, например:

source = audioCtx.createMediaStreamSource(stream);

Примечание: вам не нужно подключать вывод анализатора к другому узлу для его работы, пока его ввод подключен к источнику, либо напрямую, либо через другой узел.

Затем анализатор захватит аудиоданные, используя быстрое преобразование Фурье (БПФ) в определенной частотной области, в зависимости от того, что вы укажете как значение свойства AnalyserNode.fftSize (если свойство не задано, то значение по умолчанию равно 2048).

Примечание: Вы так же можете указать значения минимума и максимума для диапазона масштабирования данных БПФ, используя AnalyserNode.minDecibels и AnalyserNode.maxDecibels, и разные константы усреднения данных с помощью AnalyserNode.smoothingTimeConstant. Прочтите эти страницы, чтобы получить больше информации о том как их использовать.

Чтобы получить данные, Вам нужно использовать методы AnalyserNode.getFloatFrequencyData() и AnalyserNode.getByteFrequencyData() to capture frequency data, and AnalyserNode.getByteTimeDomainData() and AnalyserNode.getFloatTimeDomainData() to capture waveform data.

These methods copy data into a specified array, so you need to create a new array to receive the data before invoking one. The first one produces 32-bit floating point numbers, and the second and third ones produce 8-bit unsigned integers, therefore a standard JavaScript array won't do — you need to use a Float32Array or Uint8Array array, depending on what data you are handling.

So for example, say we are dealing with an fft size of 2048. We return the AnalyserNode.frequencyBinCount value, which is half the fft, then call Uint8Array() with the frequencyBinCount as its length argument — this is how many data points we will be collecting, for that fft size.

analyser.fftSize = 2048;
var bufferLength = analyser.frequencyBinCount;
var dataArray = new Uint8Array(bufferLength);

To actually retrieve the data and copy it into our array, we then call the data collection method we want, with the array passed as it's argument. For example:


We now have the audio data for that moment in time captured in our array, and can proceed to visualize it however we like, for example by plotting it onto an HTML5 <canvas>.

Let's go on to look at some specific examples.

Creating a waveform/oscilloscope

To create the oscilloscope visualisation (hat tip to Soledad Penadés for the original code in Voice-change-O-matic), we first follow the standard pattern described in the previous section to set up the buffer:

analyser.fftSize = 2048;
var bufferLength = analyser.frequencyBinCount;
var dataArray = new Uint8Array(bufferLength);

Next, we clear the canvas of what had been drawn on it before to get ready for the new visualization display:

canvasCtx.clearRect(0, 0, WIDTH, HEIGHT);

We now define the draw() function:

function draw() {

In here, we use requestAnimationFrame() to keep looping the drawing function once it has been started:

      drawVisual = requestAnimationFrame(draw);

Next, we grab the time domain data and copy it into our array


Next, fill the canvas with a solid colour to start

      canvasCtx.fillStyle = 'rgb(200, 200, 200)';
      canvasCtx.fillRect(0, 0, WIDTH, HEIGHT);

Set a line width and stroke colour for the wave we will draw, then begin drawing a path

      canvasCtx.lineWidth = 2;
      canvasCtx.strokeStyle = 'rgb(0, 0, 0)';


Determine the width of each segment of the line to be drawn by dividing the canvas width by the array length (equal to the FrequencyBinCount, as defined earlier on), then define an x variable to define the position to move to for drawing each segment of the line.

      var sliceWidth = WIDTH * 1.0 / bufferLength;
      var x = 0;

Now we run through a loop, defining the position of a small segment of the wave for each point in the buffer at a certain height based on the data point value form the array, then moving the line across to the place where the next wave segment should be drawn:

      for(var i = 0; i < bufferLength; i++) {
        var v = dataArray[i] / 128.0;
        var y = v * HEIGHT/2;

        if(i === 0) {
          canvasCtx.moveTo(x, y);
        } else {
          canvasCtx.lineTo(x, y);

        x += sliceWidth;

Finally, we finish the line in the middle of the right hand side of the canvas, then draw the stroke we've defined:

      canvasCtx.lineTo(canvas.width, canvas.height/2);

At the end of this section of code, we invoke the draw() function to start off the whole process:


This gives us a nice waveform display that updates several times a second:

a black oscilloscope line, showing the waveform of an audio signal

Создание частотной гистограммы

Another nice little sound visualization to create is one of those Winamp-style frequency bar graphs. We have one available in Voice-change-O-matic; let's look at how it's done.

First, we again set up our analyser and data array, then clear the current canvas display with clearRect(). The only difference from before is that we have set the fft size to be much smaller; this is so that each bar in the graph is big enough to actually look like a bar rather than a thin strand.

    analyser.fftSize = 256;
    var bufferLength = analyser.frequencyBinCount;
    var dataArray = new Uint8Array(bufferLength);

    canvasCtx.clearRect(0, 0, WIDTH, HEIGHT);

Next, we start our draw() function off, again setting up a loop with requestAnimationFrame() so that the displayed data keeps updating, and clearing the display with each animation frame.

    function draw() {
      drawVisual = requestAnimationFrame(draw);


      canvasCtx.fillStyle = 'rgb(0, 0, 0)';
      canvasCtx.fillRect(0, 0, WIDTH, HEIGHT);

Now we set our barWidth to be equal to the canvas width divided by the number of bars (the buffer length). However, we are also multiplying that width by 2.5, because most of the frequencies will come back as having no audio in them, as most of the sounds we hear every day are in a certain lower frequency range. We don't want to display loads of empty bars, therefore we simply shift the ones that will display regularly at a noticeable height across so they fill the canvas display.

We also set a barHeight variable, and an x variable to record how far across the screen to draw the current bar.

      var barWidth = (WIDTH / bufferLength) * 2.5;
      var barHeight;
      var x = 0;

As before, we now start a for loop and cycle through each value in the dataArray. For each one, we make the barHeight equal to the array value, set a fill colour based on the barHeight (taller bars are brighter), and draw a bar at x pixels across the canvas, which is barWidth wide and barHeight/2 tall (we eventually decided to cut each bar in half so they would all fit on the canvas better.)

The one value that needs explaining is the vertical offset position we are drawing each bar at: HEIGHT-barHeight/2. I am doing this because I want each bar to stick up from the bottom of the canvas, not down from the top, as it would if we set the vertical position to 0. Therefore, we instead set the vertical position each time to the height of the canvas minus barHeight/2, so therefore each bar will be drawn from partway down the canvas, down to the bottom.

      for(var i = 0; i < bufferLength; i++) {
        barHeight = dataArray[i]/2;

        canvasCtx.fillStyle = 'rgb(' + (barHeight+100) + ',50,50)';

        x += barWidth + 1;

Again, at the end of the code we invoke the draw() function to set the whole process in motion.


This code gives us a result like the following:

a series of red bars in a bar graph, showing intensity of different frequencies in an audio signal

Note: The examples listed in this article have shown usage of AnalyserNode.getByteFrequencyData() and AnalyserNode.getByteTimeDomainData(). For working examples showing AnalyserNode.getFloatFrequencyData() and AnalyserNode.getFloatTimeDomainData(), refer to our Voice-change-O-matic-float-data demo (see the source code too) — this is exactly the same as the original Voice-change-O-matic, except that it uses Float data, not unsigned byte data.