
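Although it is still experimental, the SIMD API provides vector objects for using the CPU's SIMD/SSE instructions. SIMD stands for Single Instruction/Multiple Data. A SIMD instruction applies an operation to multiple data items at once, whereas a scalar (SISD) instruction applies an operation to only one data item per instruction.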
 
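Processing multiple data items with one instruction can improve application performance, and the gain grows with the amount of data being processed. For this reason, SIMD instructions are widely used in 3D graphics, audio/video processing, physics simulation, cryptography, and other domains.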
 
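The drawback of SIMD is that algorithms have to be structured to suit it: you cannot apply a different operation to each data item in a pack. Still, it is often necessary to process individual items differently. Ways to deal with this using masks and data re-alignment are described later.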
 
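With shared memory and multiple threads there is also MIMD (multiple instructions, multiple data), in which multiple instructions operate on multiple data items. This technique makes parallel programming straightforward, and SIMD can be combined with it: for example, you can create several threads and perform SIMD vector processing on each of them. The Mandelbrot demo by Peter Jensen (Intel) uses SIMD together with Web Workers to show the performance advantage of SIMD.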

SIMD in JavaScript

The JavaScript SIMD API consists of several new data types and operations allowing you to make use of SIMD instructions from JavaScript. Browsers provide highly optimized implementations of this API depending on the underlying hardware of the user. Currently, the JS SIMD API is especially modeled for ARMv7 platforms with NEON and x86 platforms with SSE.

Let's look at a SIMD data type, for example SIMD.Float32x4. A SIMD vector consists of multiple data units, which are called lanes. A SIMD register for the current API is 128 bits wide. A vector of length 4 (x4) holds four Float32 values, and its lanes are named x, y, z, and w. Instead of performing 4 separate operations, one per lane, SIMD allows you to perform the operation on all 4 lanes simultaneously.
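To make the lane layout concrete, the following sketch uses a plain Float32Array as a stand-in for a Float32x4 value. It is an illustration only: typed arrays are not SIMD types, and the variable names (`lanes`, `laneNames`, `bitsPerLane`, `totalBits`) are hypothetical.

```javascript
// Stand-in for a 128-bit SIMD register: four 32-bit floats in one buffer.
// This is NOT the SIMD API; it only illustrates the lane layout.
var lanes = new Float32Array([1, 2, 3, 4]);

var bitsPerLane = Float32Array.BYTES_PER_ELEMENT * 8; // 32
var totalBits = lanes.byteLength * 8;                 // 128

// Lane names used in the text, in lane order.
var laneNames = ['x', 'y', 'z', 'w'];
laneNames.forEach(function (name, i) {
  console.log(name + ' = ' + lanes[i]);
});
```

Four lanes of 32 bits each fill the 128-bit width exactly, which is why the type is called Float32x4.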

In the following figure, there is only a single instruction (addition) and thus the data can be processed using SIMD:

Figures 1 and 2: SISD compared with SIMD.

The scalar (SISD) code could look like this (without any loop, just for illustration):

var a = [1, 2, 3, 4];
var b = [5, 6, 7, 8];
var c = [];

c[0] = a[0] + b[0];
c[1] = a[1] + b[1];
c[2] = a[2] + b[2];
c[3] = a[3] + b[3];
c; // Array[6, 8, 10, 12]

Now using SIMD:

var a = SIMD.Float32x4(1, 2, 3, 4);
var b = SIMD.Float32x4(5, 6, 7, 8);
var c = SIMD.Float32x4.add(a,b); // Float32x4[6, 8, 10, 12]

This adds the values in the four lanes simultaneously and returns a new SIMD.Float32x4 value containing the lane-wise sums.

As you can see in the three lines of SIMD code, a set of JavaScript functions lets you create the packaged data types and gives you access to the vectorized instructions (addition here). At the time of writing this article, there is no operator overloading (i.e. a `+` sign) implemented to ease the writing of SIMD code like this. However, the JavaScript SIMD API is not yet finished and there are plans to include operator overloading in one of the next drafts. You can follow the specification development in the ecmascript_simd GitHub repository.
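For arrays longer than four elements, the same lane-wise addition is typically applied to one group of four at a time. The sketch below emulates this with typed arrays and plain scalar arithmetic, so it runs without the SIMD API. The helpers `addFloat32x4` and `addArrays` are hypothetical names, and no hardware SIMD instruction is involved here.

```javascript
// Scalar emulation of a four-lane add; a real SIMD implementation would
// perform these four additions with a single instruction.
function addFloat32x4(a, b, out, offset) {
  out[offset]     = a[offset]     + b[offset];
  out[offset + 1] = a[offset + 1] + b[offset + 1];
  out[offset + 2] = a[offset + 2] + b[offset + 2];
  out[offset + 3] = a[offset + 3] + b[offset + 3];
}

// Adds two equally long Float32Arrays, four lanes at a time, then
// handles the leftover elements (fewer than four) one by one.
function addArrays(a, b) {
  var out = new Float32Array(a.length);
  var vectorEnd = a.length - (a.length % 4);
  for (var i = 0; i < vectorEnd; i += 4) {
    addFloat32x4(a, b, out, i);
  }
  for (var j = vectorEnd; j < a.length; j++) {
    out[j] = a[j] + b[j];
  }
  return out;
}

var sum = addArrays(new Float32Array([1, 2, 3, 4, 5, 6]),
                    new Float32Array([5, 6, 7, 8, 9, 10]));
console.log(Array.from(sum)); // [6, 8, 10, 12, 14, 16]
```

The scalar tail loop is needed because an input length is not always a multiple of the lane count.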

Re-aligning data to suit SIMD vectors

Oftentimes there are arrays that serve as input data for SIMD vectors. However, the structure of these arrays might not always be suited for SIMD operations. Let's take a look at RGBA color data in images, for example.

With the CanvasRenderingContext2D.getImageData() method of a canvas and the ImageData.data property, you can get the underlying pixel data as a one-dimensional array containing the image data in RGBA order, with integer values between 0 and 255 (inclusive):

[R, G, B, A, R, G, B, A, R, G, B, A, ...]

If we now want to process this image, for example calculating the perceived luminance/grayscale with the formula Y = 0.299r + 0.587g + 0.114b, we need to restructure the data for SIMD. The idea is to process the different weights for r, g, and b in a SIMD suitable format with one instruction per color data. This could look like this:

[R, R, R, R, R, R, ...] * 0.299 +
[G, G, G, G, G, G, ...] * 0.587 +
[B, B, B, B, B, B, ...] * 0.114 =
[Y, Y, Y, Y, Y, Y, ...]
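The re-alignment above can be sketched in plain JavaScript as follows. This is a scalar illustration under stated assumptions, not SIMD code: the function name `luminance` and the two sample pixels are hypothetical, and the input is assumed to be laid out like ImageData.data.

```javascript
// Splits interleaved RGBA bytes into one array per channel and computes
// the luminance Y = 0.299r + 0.587g + 0.114b for every pixel.
function luminance(rgba) {
  var pixelCount = rgba.length / 4;
  var r = new Float32Array(pixelCount);
  var g = new Float32Array(pixelCount);
  var b = new Float32Array(pixelCount);

  // Re-align: [R, G, B, A, R, G, B, A, ...] becomes
  // [R, R, ...], [G, G, ...], and [B, B, ...]. Alpha is ignored.
  for (var i = 0; i < pixelCount; i++) {
    r[i] = rgba[4 * i];
    g[i] = rgba[4 * i + 1];
    b[i] = rgba[4 * i + 2];
  }

  // Every pixel gets the same three weights; with SIMD, four
  // neighboring pixels could share each multiply and add.
  var y = new Float32Array(pixelCount);
  for (var k = 0; k < pixelCount; k++) {
    y[k] = r[k] * 0.299 + g[k] * 0.587 + b[k] * 0.114;
  }
  return y;
}

// Two hypothetical pixels: opaque white and opaque black.
var pixels = new Uint8ClampedArray([255, 255, 255, 255, 0, 0, 0, 255]);
var gray = luminance(pixels);
console.log(gray.length); // 2
```

Storing each channel contiguously is what allows one weight to be applied to a whole group of lanes with a single instruction.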

Parallel conditional branches

In scalar code, branches based on conditions are used to control the processing flow like in the following example:

var a = [1, 2, 3, 4];
var b = [5, 6, 7, 8];
var c = [];

for (var i = 0; i < 4; i++) {
  if (a[i] < 3) {
    c[i] = a[i] * b[i];
  } else {
    c[i] = b[i] + a[i];
  }
}

console.log(c); // [5, 12, 10, 12]

We don't want to compose SIMD vectors for every branch and execute them sequentially. SIMD provides a more efficient way using selection masks.

Branching, masking, selecting

The SIMD.%type%.select() method selects lanes from a selection mask. This allows you to create branches so that you can operate on a fraction of the data in SIMD data types. In the following image, a custom mask selects a result from the SIMD vectors a and b:

Masks can either be returned by several comparison functions, or you can create your own custom selection mask with the SIMD.%type%.bool() function.

With this technique we can rewrite the scalar branches from the last code example, using the select() function to execute the multiplication and addition in parallel:

var a = SIMD.Float32x4(1, 2, 3, 4);
var b = SIMD.Float32x4(5, 6, 7, 8);

var mask = SIMD.Float32x4.lessThan(a, SIMD.Float32x4.splat(3));
// Int32x4[-1, -1, 0, 0]

var result = SIMD.Float32x4.select(mask, 
                                   SIMD.Float32x4.mul(a, b),
                                   SIMD.Float32x4.add(a, b));

console.log(result); // Float32x4[5, 12, 10, 12]

In this SIMD version of the previous example, data is put into SIMD vectors again. Then, in order to create a branch based on a condition, the SIMD.Float32x4.lessThan() function is used. It returns a selection mask with Boolean values depending on which lane is true or false in this comparison. The first comparand is the vector a and the second comparand is created by the splat function, which sets all four lanes to 3. This makes this comparison the same as in the scalar version (a[i] < 3).

To get the actual result from the selection mask, the select function is used. It takes three parameters: the first is the mask, the second is the trueValue, and the third is the falseValue. If a lane of the selection mask is true, the corresponding lane value is picked from the trueValue. If not, the lane value is picked from the falseValue.
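The semantics of comparison and selection can be emulated lane by lane in plain JavaScript. The sketch below is an assumption-laden stand-in, not the SIMD API: `lessThan` and `select` here are hypothetical scalar helpers, and the mask is modeled as an array of four booleans.

```javascript
// Scalar stand-ins for lessThan() and select(); NOT the SIMD API.
// A mask is modeled as an array of four booleans, one per lane.
function lessThan(a, b) {
  return [a[0] < b[0], a[1] < b[1], a[2] < b[2], a[3] < b[3]];
}

function select(mask, trueValue, falseValue) {
  var out = new Float32Array(4);
  for (var i = 0; i < 4; i++) {
    out[i] = mask[i] ? trueValue[i] : falseValue[i];
  }
  return out;
}

var a = new Float32Array([1, 2, 3, 4]);
var b = new Float32Array([5, 6, 7, 8]);
var three = new Float32Array([3, 3, 3, 3]); // what splat(3) would produce

// Both branches are computed for all lanes; the mask picks per lane.
var product = a.map(function (v, i) { return v * b[i]; });
var total = a.map(function (v, i) { return v + b[i]; });

var mask = lessThan(a, three);             // [true, true, false, false]
var result = select(mask, product, total);
console.log(Array.from(result));           // [5, 12, 10, 12]
```

Note that both the product and the sum are evaluated for every lane, even though only one of them is kept per lane. That extra work is the price of avoiding a branch.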

Other SIMD algorithms and use cases

In general, data that can be processed with the same set of instructions can benefit greatly from SIMD. The following algorithms and use cases can benefit greatly from SIMD operations:

Status of SIMD in JavaScript

The SIMD API is available in recent versions of Firefox Nightly. It is "in development" in Microsoft Edge and there is an "Intent to implement" in Blink/Chromium.

SIMD is not yet part of an official standards document or draft. It is, however, proposed for ECMAScript 7/2016 and the development of the specification is actively being worked on by Intel, Google, Mozilla and other contributors.

A polyfill reference implementation based on typed arrays is available in the ecmascript_simd GitHub repository.
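Because availability varies between browsers, code may want to check for the API before using it. The following is a minimal sketch, assuming that an implementation (native or polyfill) exposes a global `SIMD` object with a callable `SIMD.Float32x4`. The variable names `hasSimd` and `c` are hypothetical.

```javascript
// True only when the experimental global `SIMD` object is present.
var hasSimd = typeof SIMD !== 'undefined' &&
              typeof SIMD.Float32x4 === 'function';

var c;
if (hasSimd) {
  // Same calls as in the SIMD addition example earlier in this article.
  c = SIMD.Float32x4.add(SIMD.Float32x4(1, 2, 3, 4),
                         SIMD.Float32x4(5, 6, 7, 8));
} else {
  // Scalar fallback producing the same four sums in a plain array.
  c = [1 + 5, 2 + 6, 3 + 7, 4 + 8];
}
console.log(hasSimd ? 'SIMD path' : 'scalar fallback');
```

The two paths yield different value types (a SIMD vector versus a plain array), so real code would need to convert them to a common representation before use.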

See also

Document tags and contributors

Contributors to this page: woodmix, lv7777
Last updated by: woodmix