Raptor: Performance tools for Gaia

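This article provides an introduction to Raptor, a command-line (CLI) performance measurement tool aimed specifically at Firefox OS. It looks at the strategy behind the tool's functionality, shows how to get started with it, and then moves on to advanced topics such as writing your own tests, visualization, and automation.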

Raptor aims to overcome many of the pitfalls encountered when performance testing with the previous tool, make test-perf:

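  • The test-perf tool relied on Marionette.js to listen for the events that applications emit at key points in their loading lifecycle. This required injecting an atom script to bind an event listener for every application event. Whenever a new de facto standard event needed to be captured, the script had to be changed. This added overhead on top of using Marionette.js itself, and meant a lot of maintenance time.
  • The API for creating performance events was inconsistent. To make the standard performance events easy to capture, test-perf had applications throw custom events, e.g. window.dispatchEvent(new CustomEvent('moz-app-visually-complete')). Unfortunately, if an application emitted its own performance events, it had to use a different API from the performance testing helper script.
  • Every application had to include the performance testing helper script. This script was needed for API access to performance events, but it carried its own overhead and maintenance cost.
  • The test-perf tool was suited to gathering performance metrics for the core Gaia apps, but was hard to extend to handle much else. Testing the Homescreen, the System, or interactions other than application launch was very difficult within this framework.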

Raptor was designed to solve these problems and to provide a testing framework that is more efficient and extensible, and that adds little overhead of its own.

Strategy

This section describes the strategies adopted in implementing Raptor's functionality.

User Timing

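The User Timing API provides web documents with a mechanism for denoting custom performance marks and measures. Using a standardized API removes the need for applications to bundle a helper script for emitting performance events. In fact, User Timing does not rely on events at all.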

// Legacy performance events
window.dispatchEvent(new CustomEvent('moz-app-visually-complete'));
PerformanceTestingHelper.dispatch('settings-load-start');

// User Timing API
performance.mark('visuallyLoaded');

performance.mark('settingsStart');
performance.mark('settingsEnd');

performance.measure('settingsLoad', 'settingsStart', 'settingsEnd');

Logging

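To capture performance entries in a way that is decoupled from the application, and so avoid affecting its performance, we chose to output performance metadata to the device's log stream, i.e. adb logcat. Raptor consumes this stream and parses the performance entries out of the log to gather its metrics.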
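As a rough illustration of this approach, the sketch below parses a single logcat line into a performance entry. The line layout (a `Performance Entry:` marker followed by pipe-separated context, entry type, name, start time, duration, and epoch) and the function name `parseEntry` are assumptions made for this example, not Raptor's documented format.

```javascript
// Assumed line layout (hypothetical, for illustration only):
//   <logcat prefix>: Performance Entry: <context>|<entryType>|<name>|<startTime>|<duration>|<epoch>
const ENTRY_MARKER = 'Performance Entry: ';

// Returns a plain object for a matching line, or null for unrelated log lines.
function parseEntry(line) {
  const index = line.indexOf(ENTRY_MARKER);
  if (index === -1) {
    return null;
  }

  const fields = line.slice(index + ENTRY_MARKER.length).trim().split('|');
  if (fields.length !== 6) {
    return null;
  }

  return {
    context: fields[0],
    entryType: fields[1],
    name: fields[2],
    startTime: parseFloat(fields[3]),
    duration: parseFloat(fields[4]),
    epoch: parseInt(fields[5], 10)
  };
}

// A made-up sample line in the assumed layout.
const sample =
  'I/PerformanceTiming( 1234): Performance Entry: ' +
  'clock.gaiamobile.org|mark|visuallyLoaded|1247.5|0|1440000000000';

console.log(parseEntry(sample));
console.log(parseEntry('I/Gecko( 1234): unrelated log line')); // null
```

Because the parsing happens on the host that reads the log stream, the application under test only pays for the performance.mark() call itself.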

 

Phases and extensibility

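Raptor introduces the concept of "phases", which lay down a framework for performing test interactions in a generic way. Raptor currently supports cold launch, reboot, and B2G restart phases, with additional phases planned. These work by placing the device into a certain phase before performance is measured, which makes the actual performance test logic simpler.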

Device interaction

Raptor works to abstract device interactions. Some of its major features are as follows:

  • Raptor uses the Marionette.js client for familiar device interactions using a high-level API. The same Marionette.js client used for writing integration tests can be used to trigger device actions that contain performance measurements.
  • For low-level interactions, Raptor relies on the Orangutan tool for triggering touch events. This works by injecting coordinate-based touch events directly into the driver interface, e.g. /dev/input/event0 on a Firefox OS Flame. This has the benefit of simulating the touch event through the OS very transparently. Touch events are also triggered through a consistent API, e.g. triggering a tap may look like: device.input.tap(300, 400, 1), which simulates a single tap at XY coordinates (300,400).
  • All calls to and from the logging interface (i.e. adb logcat) have a consistent and managed JavaScript-based API.
  • Raptor also has interfaces for pushing and pulling files to/from devices.

Getting started

NOTE: While Raptor can be run on emulators, the results should not be relied on for performance comparisons. Because of the power of desktop computers, emulator results are not comparable to the performance characteristics of real devices and end users, and should not be used for time-based decision making.

Prerequisites

You must have a copy of Gaia v2.2+ available on your system, as well as Node.js v0.12+/npm v2+ installed.

Installing Raptor

Raptor is a CLI (command-line interface) tool that can be installed from npm. To install it:

$ npm install -g @mozilla/raptor

Once installation is complete, you can run it via the raptor command on the command line:

$ raptor --help

Alternative installation

If you are not happy with how npm installs global packages into the /usr or /usr/local directories, there are a few alternatives:

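  1. Change npm's default directory to another directory. By following npm's instructions, you can change where npm installs global packages, possibly to a dedicated directory in your home folder.
  2. Install Raptor into a local directory and reference it relatively, for example: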
$ cd ~
$ mkdir raptor-cli && cd raptor-cli
$ npm install @mozilla/raptor

# Elsewhere
$ ~/raptor-cli/node_modules/@mozilla/raptor/bin/raptor --help

# Symlink or add to aliases to save on verbosity
$ cd ~
$ ln -s ~/raptor-cli/node_modules/@mozilla/raptor/bin/raptor raptor

# Now you can run it through the link (plain `raptor` works only if the link's directory is on your PATH)
$ ~/raptor --help

Installing the profile

In order to interact with the device in a predictable way, Raptor needs a few profile options and custom settings. The default make command for Raptor optimizes Gaia, disables FTU, enables User Timing to write to logcat, and resets Gaia.

# Equivalent of:
# PERF_LOGGING=1 DEVICE_DEBUG=1 GAIA_OPTIMIZE=1 NOFTU=1 SCREEN_TIMEOUT=0 make reset-gaia
make raptor

If you already have a profile on your device, at a bare minimum you need the following profile options/settings set in order to use Raptor for performance testing:

  • PERF_LOGGING=1, this sets dom.performance.enable_user_timing_logging in the profile to true.
  • NOFTU=1, this disables the First-time experience, which is only needed if you are dealing with a freshly-reset Gaia.
  • SCREEN_TIMEOUT=0, prevents the device from going to sleep and shutting off the screen.
  • NO_LOCKSCREEN=1, removes the lock screen for easy application launching from the homescreen.

Command-line interface

Raptor provides a bit of helpful information right through the command line:

$ raptor --help

Usage: raptor <command> [options]

command
  test     Run a performance test by name or path location.
  submit     Submit a Raptor metrics file to an InfluxDB database

Options:
   -v, --version               outputs the raptor cli tool version
   --config <path>             specify additional Orangutan device configuration JSON. Environment: RAPTOR_CONFIG
   --homescreen <origin>       specify the origin or gaiamobile.org prefix of an application that is the device homescreen  [verticalhome.gaiamobile.org]
   --system <origin>           specify the origin or gaiamobile.org prefix of an application that is the system application  [system.gaiamobile.org]
   --serial <serial>           target a specific device for testing. Environment: ANDROID_SERIAL
   --adb-host <host>           connect to a device on a remote host. tip: use with --adb-port. Environment: ADB_HOST
   --adb-port <port>           set port for connecting to a device on a remote host. use with --adb-host. Environment: ADB_PORT
   --marionette-host <host>    connect to marionette on a remote host. tip: use with --marionette-port. Envrionment: MARIONETTE_HOST
   --marionette-port <port>    set port for connecting to marionette on a remote host. tip: use with --marionette-host. Environment: MARIONETTE_PORT
   --forward-port <port>       forward an adb port to the --marionette-port.  [0]
   --host <host>               host for reporting metrics to InfluxDB database. Environment: RAPTOR_HOST [localhost]
   --port <port>               port for reporting metrics to InfluxDB database. Environment: RAPTOR_PORT [8086]
   --username <username>       username for reporting metrics to InfluxDB database. Environment: RAPTOR_USERNAME [root]
   --password <password>       password for reporting metrics to InfluxDB database. Environment: RAPTOR_PASSWORD [root]
   --database <database>       name of InfluxDB database for reporting metrics. Environment: RAPTOR_DATABASE
   --protocol <protocol>       Protocol used to connect to InfluxDB database for reporting metrics. Environment: RAPTOR_PROTOCOL  [http]
   --metrics <path>            path to store historical test metrics. Environment: RAPTOR_METRICS
   --output <mode>             output mode: normal or quiet. Environment: RAPTOR_OUTPUT [normal]
   --batch <count>             batch database requests to <count> number of records  [5000]

The core command to execute is the test command, which also has some helpful information:

$ raptor test --help

Usage: raptor test <nameOrPath> [options]

nameOrPath  named test or path to a particular test to run. Named tests:
   coldlaunch    cold-launch lifecycle of an application from appLaunch to fullyLoaded
   reboot        device reboot lifecycle from device power-on until System/Homescreen fullyLoaded
   restart-b2g   restart B2G lifecycle from B2G start until System/Homescreen fullyLoaded

Options:
   ...
   --runs <runs>                         number of times to run the test and aggregate results [1]
   --app <appOrigin>                     specify the origin or gaiamobile.org prefix of an application to test
   --entry-point <entryPoint>            specify an application entry point other than the default
   --timeout <milliseconds>              time to wait between runs for success to occur [60000]
   --retries <times>                     times to retry test or run if failure or timeout occurs [1]
   --launch-delay <milliseconds>         time to wait between subsequent application launches [10000]
   --memory-delay <milliseconds>         time to wait before capturing memory after application fully loaded [0]
   --script-timeout <milliseconds>       time to wait when running scripts via marionette  [10000]
   --connection-timeout <milliseconds>   marionette driver tcp connection timeout  [2000]
   --logcat <path>                       write the output from `adb logcat` to a file
   --time <epochMilliseconds>            override the start time and unique identifier for test runs

This should give us enough information to run our first performance test.

Running a performance test

Running a performance test consists of a few parts:

  • The raptor CLI command
  • A test to run, whether a named test or a path to a test
  • Any relevant test settings

For the most basic test, we can do a cold launch test against an application with a command like this:

$ raptor test coldlaunch --app clock

[Cold Launch: clock.gaiamobile.org] Preparing to start testing...
[Cold Launch: clock.gaiamobile.org] Priming application
[Cold Launch: clock.gaiamobile.org] Starting run 1
[Cold Launch: clock.gaiamobile.org] Run 1 complete
[Cold Launch: clock.gaiamobile.org] Results from clock.gaiamobile.org

| Metric                | Mean   | Median | Min    | Max    | StdDev | p95    |
| --------------------- | ------ | ------ | ------ | ------ | ------ | ------ |
| navigationLoaded      | 939    | 939    | 939    | 939    | 0      | 939    |
| navigationInteractive | 1014   | 1014   | 1014   | 1014   | 0      | 1014   |
| visuallyLoaded        | 1247   | 1247   | 1247   | 1247   | 0      | 1247   |
| contentInteractive    | 1249   | 1249   | 1249   | 1249   | 0      | 1249   |
| fullyLoaded           | 1250   | 1250   | 1250   | 1250   | 0      | 1250   |
| uss                   | 14.836 | 14.836 | 14.836 | 14.836 | 0      | 14.836 |
| pss                   | 19.137 | 19.137 | 19.137 | 19.137 | 0      | 19.137 |
| rss                   | 31.191 | 31.191 | 31.191 | 31.191 | 0      | 31.191 |

[Cold Launch: clock.gaiamobile.org] Testing complete

During the cold launch test, you'll see B2G restart; the stated application will then launch once to prime it, and a second time to measure its performance. Looking at the log output above, you can see when each application run starts and stops. When a particular application has completed its testing, you will be given a table of metrics and testing will continue, if applicable. In the metrics table you'll see statistics for each performance entry captured during the lifespan of the test: mean (average), median, minimum value, maximum value, standard deviation, and 95th percentile.
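The statistics in the table can be reproduced from the raw per-run values of a metric. The sketch below shows one way to do that. It assumes a population standard deviation and a nearest-rank 95th percentile, which may not match Raptor's own calculations, and the function name `summarize` is hypothetical.

```javascript
// Summarize the values collected for one metric across several runs.
// Assumptions: population standard deviation and nearest-rank 95th percentile;
// Raptor's own formulas may differ.
function summarize(values) {
  if (values.length === 0) {
    throw new Error('summarize() needs at least one value');
  }

  const sorted = values.slice().sort(function(a, b) { return a - b; });
  const count = sorted.length;
  const mean = sorted.reduce(function(sum, v) { return sum + v; }, 0) / count;

  const middle = Math.floor(count / 2);
  const median = count % 2 === 0
    ? (sorted[middle - 1] + sorted[middle]) / 2
    : sorted[middle];

  const variance = sorted.reduce(function(sum, v) {
    return sum + Math.pow(v - mean, 2);
  }, 0) / count;

  // Nearest rank: the smallest value with at least 95% of the data at or below it.
  const p95 = sorted[Math.ceil(0.95 * count) - 1];

  return {
    mean: mean,
    median: median,
    min: sorted[0],
    max: sorted[count - 1],
    stdDev: Math.sqrt(variance),
    p95: p95
  };
}

// A single run: every statistic equals the lone value and the deviation is 0,
// which is the shape of the one-run table above.
console.log(summarize([939]));

// Hypothetical values for five runs of one metric.
console.log(summarize([1250, 1190, 1310, 1225, 1275]));
```

With one run the summary carries no more information than the raw value, which is why a larger --runs count is needed before the deviation and percentile columns become meaningful.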

Note: One fun fact is that the table produced by Raptor is compatible with GitHub-flavored Markdown.

Note: Standard deviation and 95th percentile need a collection of runs before they output statistically-useful data.

All metrics relate to the name of the performance entry. The numbers gathered here are not just aggregations of the values produced by User Timing entries, so it's important to understand how these numbers are derived.

Metrics aggregation

While Raptor relies on the User Timing API to gather its metrics, it also makes some assumptions about measurements that are different to what's expected in the context of normal web pages. In a typical web page, a performance marker represents the High-Resolution time from the moment of navigationStart. The User Timing API still captures this data, but Raptor's calculations also include additional time depending on the type of test running. Let's compare the creation of a performance marker in the context of a typical web page versus a Firefox OS application being cold launched.

A typical web page

In any web page, Firefox OS application or not, creating a performance marker with the User Timing API is simple:

performance.mark('hello');

Now let's get the value back and inspect its contents:

performance.getEntriesByType('mark')[0];

// returns the following object
PerformanceMark { name: "hello", entryType: "mark", startTime: 5159.366323, duration: 0 }

Note the mark's startTime and duration. The startTime is nothing more than the high-resolution time elapsed since the time of performance.timing.navigationStart; in this case a little over 5,000 milliseconds. The duration is 0 because this represents a single point in time, which has no duration. The startTime simply states at what moment the marker was created. Inspecting the output of a performance marker is no different in Firefox OS.

A performance measure on the other hand does include a duration, because it is the delta between two performance markers:

performance.mark('hello');
performance.mark('goodbye');

performance.measure('greeting', 'hello', 'goodbye');

Again, let's inspect the performance entry:

performance.getEntriesByType('measure')[0];

// returns the following object
PerformanceMeasure { name: "greeting", entryType: "measure", startTime: 3528.523661, duration: 4183.291375999805 }

The duration is populated for performance measures, and in this example it took approximately 4.2 seconds to perform a greeting; going from hello to goodbye.

The Raptor context

The difference comes in the calculations that Raptor reports. Raptor assumes that every marker generated is really a performance measure, with its duration measured as the time between the application being instructed to launch and the marker being generated. For cold launch, the homescreen application (gaia_grid specifically) creates a special performance marker when an application is launching:

performance.mark('appLaunch@' + appOrigin);

In Raptor, performance markers can be given an @-directive that overrides the context of the marker. If the homescreen had instead invoked performance.mark('appLaunch'), we would normally assume it is in the homescreen's own context. With an @-directive, however, we can key the performance marker against a different application, in essence creating a performance marker for one application inside another. This would evaluate to something like:

performance.mark('appLaunch@clock.gaiamobile.org');

In this case the homescreen is generating a performance marker for the clock application denoting the time of appLaunch. Raptor then calculates the delta between appLaunch and every other performance marker, giving a more accurate picture of the user-perceived time until each marker is hit. Moving the moment of capture earlier in the loading process, as close to the icon touch as possible, makes Raptor's data much more comparable with camera-based measurements.
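The calculation described above can be sketched as follows. This is an illustration under stated assumptions rather than Raptor's implementation: the entry fields `context` (the application that created the marker) and `epoch` (an absolute timestamp in milliseconds), and the function names `resolveContext` and `timesSinceLaunch`, are all hypothetical.

```javascript
// Split a marker name such as 'appLaunch@clock.gaiamobile.org' into its name
// and the context it should be keyed to. Without an @-directive, the entry
// stays in the context that created it.
function resolveContext(entry) {
  const at = entry.name.indexOf('@');
  if (at === -1) {
    return { name: entry.name, context: entry.context, epoch: entry.epoch };
  }
  return {
    name: entry.name.slice(0, at),
    context: entry.name.slice(at + 1),
    epoch: entry.epoch
  };
}

// For one application, report each marker as milliseconds elapsed since appLaunch.
function timesSinceLaunch(entries, appOrigin) {
  const resolved = entries.map(resolveContext).filter(function(entry) {
    return entry.context === appOrigin;
  });
  const launch = resolved.find(function(entry) {
    return entry.name === 'appLaunch';
  });
  if (!launch) {
    throw new Error('No appLaunch marker found for ' + appOrigin);
  }

  const result = {};
  resolved.forEach(function(entry) {
    if (entry.name !== 'appLaunch') {
      result[entry.name] = entry.epoch - launch.epoch;
    }
  });
  return result;
}

// Hypothetical entries with made-up timestamps.
const entries = [
  { context: 'verticalhome.gaiamobile.org', name: 'appLaunch@clock.gaiamobile.org', epoch: 1000 },
  { context: 'clock.gaiamobile.org', name: 'visuallyLoaded', epoch: 2247 },
  { context: 'clock.gaiamobile.org', name: 'fullyLoaded', epoch: 2250 }
];

console.log(timesSinceLaunch(entries, 'clock.gaiamobile.org'));
// { visuallyLoaded: 1247, fullyLoaded: 1250 }
```

The sketch compares absolute timestamps rather than startTime values because markers created in different applications are each relative to their own navigationStart.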

Selecting tests

Tests are selected by changing the test name or file path passed to raptor test. For example, to run the device reboot performance test instead of a cold launch test you'd do the following:

$ raptor test reboot

More examples:

# Test Dialer cold launch
$ raptor test coldlaunch --app communications --entry-point dialer

# Change the number of runs
$ raptor test coldlaunch --app clock --runs 10

# Introduce a 1-second delay before capturing memory
$ raptor test reboot --memory-delay 1000

# Target a particular device
$ raptor test reboot --serial f30eccef
$ ANDROID_SERIAL=f30eccef raptor test reboot

# Turn on Raptor debug output, useful for bugs or problems
$ DEBUG=raptor:* raptor test reboot

# JSON mode, useful for post-processing of aggregate values
$ raptor test coldlaunch --app clock --output json

# Quiet mode, useful if you only care about the results
$ raptor test coldlaunch --app clock --output quiet

Writing tests

While Raptor currently contains a few tests for running cold launch tests, rebooting, and restarting B2G, it is possible to write tests that run custom logic.

We can inspect the contents of the current launch test to glean how we can write new tests.

// mozilla-b2g/raptor
// tests/coldlaunch.js

setup(function(options) {
  options.test = 'cold-launch';
  options.phase = 'cold-launch';
});

afterEach(function(phase) {
  return phase.closeApp();
});

First comes setting up the test. In setup(), pass a function to be executed, which will configure the test. This function will be passed all the current configuration settings. At a minimum, you will need to set the phase of the test, which determines the state the device is in when the test begins. Depending on which phase you select when setting options, you may need to pass additional information. For the launch test example, using the cold phase requires an application to be specified. This can either be set on the command line, or you can hard-code it via the app option to force the test to be specific to a certain app.

Note: If you hard-code the application to be launched, make sure you specify the origin host completely, e.g. "clock.gaiamobile.org". For entry-point-based apps, specify both the app option and the entryPoint option.
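As a sketch of hard-coding the application, a test pinned to the Dialer entry point of the Communications app might configure itself as shown below. The `setup` stub at the top is a hypothetical stand-in so that the snippet runs outside Raptor; in a real test file Raptor provides `setup` and the stub would be dropped. The option names `app` and `entryPoint` come from the note above, and the phase name `'cold'` follows the surrounding text, so it may need adjusting to match your Raptor version.

```javascript
// Hypothetical stand-in for the setup() global that Raptor provides to test files.
// Remove this stub when the file is run by Raptor itself.
const registered = [];
function setup(configure) {
  registered.push(configure);
}

setup(function(options) {
  options.phase = 'cold';
  // Specify the full origin host, not just the short app name.
  options.app = 'communications.gaiamobile.org';
  // Entry-point-based apps also need the entry point.
  options.entryPoint = 'dialer';
});

// Outside Raptor, apply the registered function to an empty options object
// to see the resulting configuration.
const options = {};
registered.forEach(function(configure) {
  configure(options);
});
console.log(options);
```

Hard-coding the application this way trades flexibility for repeatability: the test no longer depends on anyone remembering to pass --app and --entry-point on the command line.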

Important: Any test harness functions doing asynchronous work should return a Promise so Raptor can properly wait.

The afterEach() function will be called once for each run after the phase has been started. For cold launch, it is after an application in context has been primed, exited, and re-opened, and the application denotes it is ready — i.e. performance.mark('fullyLoaded'). For reboot and B2G restart, the phase will be designated as ready when the System application and the Homescreen application are marked as fully loaded.

The phase argument passed to afterEach() represents the current context instance of the phase test runner; in other words, it is specific to the current test being run. It contains methods and functionality that help you trigger device actions which will have profiled performance code. For example, you can start a Marionette.js session and trigger commands:

setup(function(options) {
  options.phase = 'cold';
});

afterEach(function(phase) {
  // Note that returning a Promise denotes that we are done running the test
  return phase.device.marionette
    .startSession()
    .then(function(client) {
      client.executeScript(function() {
        // trigger code that captures the performance.measures created
        // by the application being tested
      });
      client.deleteSession();
    });
});

The runner can also run a teardown() function when all tests are complete.

teardown(function(phase) {
  return new Promise(function(resolve) {
    // teardown the test, then resolve
    resolve();
  });
});

The Raptor Phase API has not yet been documented, so for now you'll need to read the source to see all the functionality available to you. It may be faster to ask a contributor for help getting started with a particular test.

Visualization and automation

Raptor has improved tooling available for automation and visualization. The test-perf tool used Datazilla to graph and visualize results, giving insight into possible regressions and the overall performance health of applications. For maintenance and usability reasons, Raptor has moved away from Datazilla for its visualization and instead has its own UI at https://raptor.mozilla.org. The Raptor dashboards currently group performance metrics into a few key categories per device instance (measures and memory), with more metrics planned in the future.

Raptor's front-end uses the Grafana visualization tool, and its backing store is InfluxDB, a time series database. Grafana provides Raptor UI users with the ability to carry out custom drill-downs into charts, slice time as desired, view data point revisions, and build custom charts and data queries. The default view of several charts displays the 95th percentile of many metrics, but charts can be user-edited to graph other mathematical functions.

This guide is not meant to be a tutorial on the usage of Grafana and InfluxDB, so to learn more about taking full advantage of the Raptor UI, read through the Grafana and InfluxDB documentation.

Private visualization

The Raptor dashboard visualization discussed in the previous section can also be installed and used privately. The installation is a Heroku-deployable environment for easy setup. It is also possible to run the Heroku application locally if you use Linux.

To get started with private visualization, or to learn more about its innards, see the repository: https://github.com/mozilla-b2g/raptor-dashboards.

You will also need an installation of InfluxDB 0.9.3+. You can learn more about installing it at: https://influxdb.com/docs/v0.9/introduction/installation.html. Those who are familiar with Docker can also install InfluxDB from Docker Hub: https://hub.docker.com/r/tutum/influxdb/.

Raptor needs CLI options or environment variables for creating a connection to an InfluxDB database. It would be tedious to specify these on the command line every time, so to simplify this, you can export the environment variables from your shell configuration file, e.g. ~/.bashrc, ~/.zshrc, etc.

# These settings will point to the installation and credentials of your InfluxDB instance:
export RAPTOR_HOST=localhost
export RAPTOR_USERNAME=root
export RAPTOR_PASSWORD=root
export RAPTOR_DATABASE=raptor
export RAPTOR_PROTOCOL=https
export RAPTOR_PORT=8086
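To make the relationship between these variables and their defaults concrete, here is a small sketch that resolves connection settings from an environment object. The defaults are the ones listed in the raptor --help output; the function name `resolveConnection` is hypothetical, and this is not Raptor's own code.

```javascript
// Defaults as listed in the `raptor --help` output.
// RAPTOR_DATABASE has no listed default, so it stays undefined when unset.
const DEFAULTS = {
  RAPTOR_HOST: 'localhost',
  RAPTOR_PORT: '8086',
  RAPTOR_USERNAME: 'root',
  RAPTOR_PASSWORD: 'root',
  RAPTOR_PROTOCOL: 'http'
};

function resolveConnection(env) {
  // Prefer the exported value; fall back to the documented default.
  function read(name) {
    return env[name] !== undefined ? env[name] : DEFAULTS[name];
  }

  return {
    host: read('RAPTOR_HOST'),
    port: parseInt(read('RAPTOR_PORT'), 10),
    username: read('RAPTOR_USERNAME'),
    password: read('RAPTOR_PASSWORD'),
    database: read('RAPTOR_DATABASE'),
    protocol: read('RAPTOR_PROTOCOL')
  };
}

// With only the database and protocol exported, the rest fall back to defaults.
console.log(resolveConnection({ RAPTOR_DATABASE: 'raptor', RAPTOR_PROTOCOL: 'https' }));
```

In a shell session you would pass process.env instead of a literal object.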

In addition, Raptor's database schema requires results to be tagged properly for them to be displayed in the correct categories in the dashboard UI. If these properties are not set when running performance tests, the data will not be displayed. By default, you need to persist the memory configuration of the device, the device type, and the branch the performance test is based on. For example, if you are performance testing a KitKat-based Flame set to 319MB of memory and your patch is based off of Gaia's master branch, you will set the following properties via ADB:

$ adb shell setprop persist.raptor.device flame-kk
$ adb shell setprop persist.raptor.memory 319
$ adb shell setprop persist.raptor.branch master

Note: If you are having trouble with the values being persisted or not saving at all, restart ADB as root with adb root.

If you were working on a branch that was based off of v2.5 on an Aries with 2 Gigabytes of memory, you would use the following properties:

$ adb shell setprop persist.raptor.device aries
$ adb shell setprop persist.raptor.memory 2048
$ adb shell setprop persist.raptor.branch v2.5

Important: Currently visualization is highly-dependent on the existence of these persisted properties. They are only necessary when using the local visualization tooling; if you flash your device or otherwise unset these properties, you will need to re-set them in order to visualize performance metrics.

Other than setting up the environment and device tags, Raptor can be run as normal locally. Upon each successful run, Raptor will report its metrics to the database. Once the test is complete, you can open a browser to your private visualization instance and view your own custom performance data.

Adding performance marks dynamically as required

One issue with Raptor is that since the tests require us to add performance marks into code, the Gaia codebase could quickly become littered with performance.mark() calls without any meaningful relationship between them, making the code cluttered and harder to understand. The best way to deal with this is to collect all the marks into some kind of patching files, and apply them dynamically as required when we want to run specific Raptor tests.

To this end, Greg Weng has created a code transformer tool that does just what is described above. The tool is currently a work in progress, but you can find out more about it (including how to get it running) in this newsgroup entry: Raptor: code transformer + marionette workflow now is almost ready. See also bug 1181069 for implementation specifics.

We will publish more formal instructions once the tool has stabilised.

Support

If you have questions about Raptor, visualization, or performance tooling in general, feel free to ping :Eli or :rwood in the #raptor channel on Mozilla IRC.

Document tags and contributors

 Contributors to this page: chrisdavidmills, hamasaki, Uemmra3
 Last updated by: chrisdavidmills