The Gecko Profiler is a profiler that is built into Firefox. It has tighter integration with Firefox than external profilers, and can also be used in situations where external profilers aren't available, such as on a non-developer's machine or on a locked Android device. 

The Gecko Profiler has been previously known as "SPS" and "the built-in profiler". We have changed as many references to the old names as possible, but there may still be some around.

Getting the Gecko Profiler Add-on

The Gecko Profiler has two interfaces:

  1. For web developers, a simplified profiler can be opened from the menu Tools > Web Developer > Performance.
  2. For developers of Mozilla's internals, a more advanced interface can be accessed by installing the Gecko Profiler add-on.

Reporting a Performance Problem has a step-by-step guide for obtaining a profile when requested by Firefox developers.

Reporting a Thunderbird Performance Problem has a step-by-step guide for obtaining a profile when requested by Thunderbird developers.

Understanding Profiles

See the Gecko Profiler extension page to learn more about how to read and understand performance profiles.

Profiling local Windows builds

If you built Firefox for Windows locally and you would like to use the local symbols with the profiler, you will need to run an additional tool; see Profiling with the Gecko Profiler and Local Symbols on Windows.

Profiling Firefox mobile

  1. For local builds of Fennec, you should build with optimization and STRIP_FLAGS="--strip-debug" but NOT with --enable-profiling. Nightly builds are already built with the appropriate flags.
  2. You'll need to have adb and arm-eabi-addr2line (which is part of the Android NDK) in your bash PATH. Use locate arm-eabi-addr2line (on Linux) or mdfind name:arm-eabi-addr2line (on OS X) to find it, then add an export of its location to ~/.bash_profile. The extension invokes bash to run adb and addr2line.
  3. Install the latest pre-release build of the add-on in your host machine's Firefox browser, on a machine that can reach your phone via ADB. This will add an icon in the top right of the browser.
  4. Set devtools.debugger.remote-enabled to true in about:config for Fennec.
  5. Select target Mobile USB and press Connect. The first run will take about an extra minute to pull in the required system libraries.

Profiling Firefox Startup

  1. Make sure you use the latest version of the add-on.
  2. Start your Firefox with MOZ_PROFILER_STARTUP=1 set. This way the profiler is started as early as possible during startup.
  3. Then capture the profile using the add-on as usual.

Some tips

  • If it looks like the buffer is not large enough, you can tweak the buffer size with the env var MOZ_PROFILER_STARTUP_ENTRIES. This defaults to 1000000, which is 9MB. If you want 90MB use 10000000, and 20000000 for 180MB, which are good values to debug long startups. You can also change these values in the Gecko Profiler extension UI.
  • If you'd like a coarser resolution, you can also choose a different interval using MOZ_PROFILER_STARTUP_INTERVAL, which defaults to 1 (unit is millisecond). You can't go below 1 ms but you can use e.g. 10 ms.
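
The environment variables above can be combined when launching Firefox from a script. The following is a minimal Python sketch, not part of the profiler itself: the helper names startup_profiling_env and approx_buffer_megabytes are hypothetical, and the figure of 9 bytes per entry is inferred from the sizes quoted above (1000000 entries for 9 MB) rather than taken from the profiler source.

```python
import os

# Inferred from the figures above (1000000 entries ~ 9 MB); an assumption,
# not a value read from the profiler source.
BYTES_PER_ENTRY = 9


def approx_buffer_megabytes(entries):
    """Estimate the startup buffer size in MB for a given entry count."""
    return entries * BYTES_PER_ENTRY / 1_000_000


def startup_profiling_env(entries=1_000_000, interval_ms=1, base_env=None):
    """Return a copy of the environment with startup profiling enabled."""
    if interval_ms < 1:
        # The text above says the interval cannot go below 1 ms.
        raise ValueError("interval must be at least 1 ms")
    env = dict(os.environ if base_env is None else base_env)
    env["MOZ_PROFILER_STARTUP"] = "1"
    env["MOZ_PROFILER_STARTUP_ENTRIES"] = str(entries)
    env["MOZ_PROFILER_STARTUP_INTERVAL"] = str(interval_ms)
    return env
```

For example, passing env=startup_profiling_env(entries=10_000_000) to subprocess.run when launching Firefox would request a buffer of roughly 90 MB, assuming a Firefox binary is available on your PATH.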

Profiling JS benchmark (xpcshell)

  1. To profile the script run.js with IonMonkey (-I), type inference (-n) and JägerMonkey (-m), run the following command:
    $ xpcshell -m -I -n -e '
        const Ci = Components.interfaces;
        const Cc = Components.classes;
        var profiler = Cc["@mozilla.org/tools/profiler;1"].getService(Ci.nsIProfiler);
        profiler.StartProfiler(
          10000000 /* = profiler memory */,
          1 /* = sample rate: 100µs with patch, 1ms without */,
          ["stackwalk", "js"], 2 /* = features, and number of features. */
        );
      ' -f ./run.js -e '
        var profileObj = profiler.getProfileData();
        print(JSON.stringify(profileObj));
      ' | tail -n 1 > run.cleo
    xpcshell prints all of the benchmark output and then, on its last line, the result of the profiling. You can filter it with tail -n 1 and redirect it to a file to avoid printing it in your shell. The expected size of the output is around 100 MB.
  2. To add symbols to the profile, run ./scripts/profile-symbolicate.py, which is available in the B2G repository.
    $ GECKO_OBJDIR=<objdir> PRODUCT_OUT=<objdir> TARGET_TOOLS_PREFIX= \
        ./scripts/profile-symbolicate.py -o run.symb.cleo run.cleo
  3. Clone Cleopatra and start the server with ./run_webserver.sh.
  4. Access Cleopatra from your web browser by loading the page localhost:8000, and upload run.symb.cleo to render the profile with most of the symbol information.
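
Step 1 relies on the profile being the last line that xpcshell prints. As a rough illustration of that filtering step, here is a Python sketch that does roughly what tail -n 1 does and also checks that the line parses as JSON. The function name extract_profile and the sample output are made up for the example.

```python
import json


def extract_profile(xpcshell_output):
    """Return the profile parsed from the last non-empty line of the output."""
    lines = [line for line in xpcshell_output.splitlines() if line.strip()]
    if not lines:
        raise ValueError("no output to read a profile from")
    # json.loads raises a ValueError subclass if the last line is not valid
    # JSON, which suggests the profile was not the final thing printed.
    return json.loads(lines[-1])


# Hypothetical output: benchmark results first, then the profile on one line.
sample_output = 'score: 1234\nscore: 5678\n{"threads": []}\n'
profile = extract_profile(sample_output)
```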

Native stack vs. Pseudo stack

The profiler periodically samples the stack(s) of thread(s) in Firefox, collecting a stack trace, and presents the aggregated results using the Cleopatra UI. Stack traces can be collected in two different ways: as a native stack or as a pseudostack.

Native stack

Native stacks are the normal stacks most developers are used to. They are the default.

Pseudostack

The pseudostack uses function entry/exit tags added by hand to important points in the code base. The stacks you see in the UI are chains of these tags. This is good for highlighting particularly interesting parts of the code, but it misses un-annotated areas of the code base and gives no visibility into system libraries or drivers.

Tagging is done by adding macros of the form PROFILER_LABEL("NAMESPACE", "NAME"). These add RAII helpers, which are used by the profiler to track entries/exits of the annotated functions.  For this to be effective, you need to liberally use PROFILER_LABEL throughout the code. See GeckoProfiler.h for more variations like PROFILER_LABEL_PRINTF.

Because the instrumentation has non-zero overhead, sample labels shouldn't be placed inside hot loops. A profile reporting that a large portion of time is spent in "Unknown" code indicates that the area being executed doesn't have any sample labels. As we focus on using this tool and add more sample labels, coverage should improve.
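
To make the mechanism concrete, here is a small Python analogy of label-based stacks. It is not Gecko's implementation: PROFILER_LABEL is a C++ macro, and the names profiler_label, sample and pseudostack below, as well as the way labels are joined for display, are invented for this sketch. A context manager plays the role of the RAII helper, pushing a label on entry and popping it on exit, and a sample simply copies whatever labels are active at that moment. An empty stack is modeled here as "Unknown".

```python
from contextlib import contextmanager

# Labels currently active on this (single, illustrative) thread.
pseudostack = []


@contextmanager
def profiler_label(namespace, name):
    """Push a label on entry and pop it on exit, like an RAII helper."""
    pseudostack.append(f"{namespace}::{name}")
    try:
        yield
    finally:
        pseudostack.pop()


def sample():
    """Return the chain of active labels, or "Unknown" if none is active."""
    return tuple(pseudostack) if pseudostack else ("Unknown",)


def paint():
    with profiler_label("Layout", "Paint"):
        return sample()


def refresh():
    with profiler_label("RefreshDriver", "Tick"):
        return paint()


annotated = refresh()   # sampled inside two labeled functions
unannotated = sample()  # sampled while no label is active
```

Here annotated holds the chain of both labels, outermost first, while unannotated contains only the "Unknown" placeholder, which mirrors what a profile shows for code without sample labels.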

Sharing, saving and loading profiles

After capturing and viewing a profile you will see "Share..." and "Save as file..." buttons in the top-right of the window. Sharing uploads your profile to perf-html.io and makes it public; it is limited to 10 MB of profile data. Saving the profile to a local file gives you more control over who you share it with, and lets you save arbitrarily large profiles.

If you have a profile saved as a local file you can view it using the file picker at the bottom of the perf-html.io homepage.

To host profiles on your own server, they need to be served with the HTTP header Access-Control-Allow-Origin: *. For Apache (people.mozilla.org), use $ echo "Header set Access-Control-Allow-Origin *" > .htaccess and share the URL http://people.mozilla.com/~bgirard/cleopatra/?customProfile=<URL>, replacing <URL> with the location of your profile file.
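
If you do not have an Apache server at hand, the same header can be added by any static file server. As a sketch using only Python's standard library (the names CORSRequestHandler and make_profile_server are chosen for this example, and the setup is meant for local testing rather than production hosting):

```python
import functools
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


class CORSRequestHandler(SimpleHTTPRequestHandler):
    """Static file handler that adds the CORS header to every response."""

    def end_headers(self):
        # Added just before the header block is closed, so every response
        # produced by this handler carries it.
        self.send_header("Access-Control-Allow-Origin", "*")
        super().end_headers()


def make_profile_server(directory, port=8000):
    """Create (but do not start) a server for the files in `directory`."""
    handler = functools.partial(CORSRequestHandler, directory=directory)
    return ThreadingHTTPServer(("127.0.0.1", port), handler)
```

Calling serve_forever() on the returned server would then serve the chosen directory on the given port until interrupted.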

Profiling a hung process

It is possible to get profiles from hung Firefox processes using lldb.

  1. After the process has hung, attach lldb.
  2. Type in:
    p (void)profiler_save_profile_to_file("somepath/profile.txt")
  3. Clone mstange’s handy profile analysis repository.
  4. Run the following to graft symbols into the profile:
    python symbolicate_profile.py somepath/profile.txt

    mstange’s scripts do some fairly clever things to get those symbols: if your Firefox was built by Mozilla, they will retrieve the symbols from the Mozilla symbol server. If you built Firefox yourself, they will attempt to use some cleverness to grab the symbols from your binary.

    Your profile will now, hopefully, be updated with symbols.

    Then, load up Cleopatra, and upload the profile.

    Note that the author of these steps had not yet tried them at the time of writing. Reports from anyone giving this a go are welcome, since it might be a great tool for determining what’s going on in Firefox when it’s hung.

Profiling Threads

The Gecko Profiler has rudimentary support for profiling multiple threads. To enable it, check the 'Multi-Thread' box then enter one or more thread names into the textbox beside it. Thread names are the strings passed to the base::Thread class at initialization. At present there is no central list of these thread names, but you can find them by grepping the source.

If the filter you entered is invalid, no threads will be profiled. You can identify this by hitting Analyze (Cleopatra will show you an error message). If the filter is left empty, only the main thread is captured (as if you had not enabled Multi-Thread.)

Profiler Features

The profiler supports several features. These are options to gather additional data in your profiles. Each option will increase the performance overhead of profiling so it's important to activate only options that will provide useful information for your particular problem to reduce the distortion.

Stackwalk

When taking a sample, the profiler will attempt to unwind the stack using platform-specific code appropriate for the ABI. This will provide an accurate callstack for most samples. On ABIs where frame pointers are not available, this will cause a significant performance impact.

JS Profiling

JavaScript callstacks will be generated and interleaved with the C++ callstacks. This will introduce an overhead when running JS.

GC Stats

This will embed GC stats from 'javascript.options.mem.notify' in the profile.

Main Thread IO

This will interpose file I/O calls and report them in the profile.

Multi-Thread

This will sample other threads. This field accepts a comma-separated list of thread names. A thread can only be profiled if it is registered with the profiler.

GPU

This will insert a timer query during compositing and show the result in the Frames view. This will approximate how much GPU time was spent compositing each frame.

Layers & Texture

The profiler can be used to view the layer tree at each composite, optionally with texture data. This can be used to debug correctness problems.

Viewing the Layer Tree

To view the layer tree, the layers.dump pref must be set to true in the Firefox or B2G program being profiled.

In addition, both the compositor thread and the content thread (in the case of B2G, the content thread of whichever app you're interested in) must be profiled. For example, on B2G, when profiling the Homescreen app, you might start the profiler with:

./profile.sh start -p b2g -t Compositor && ./profile.sh start -p Homescreen

Having gotten a profile this way, you can see the layer tree for a composite by clicking on that composite in the "Frames" section of Cleopatra (you may need to zoom in to a sub-range of samples to make individual composites large enough to be clicked). This will activate the "LayerTree" tab:

Screenshot of layer tree view in Cleopatra, with no textures.

In this screenshot, Composite #143 has been selected. The layer tree structure can be seen in the left panel. It contains, for each layer, the type of the layer, and various metrics about the layer, such as the visible region and any transforms. In the right panel, a visualization of the layer tree (based entirely on the aforementioned metrics) is shown. Hovering over a layer in the left panel highlights the layer in the right panel. This is useful for identifying what content each layer corresponds to. Here, I'm hovering over the last layer in the layer tree (a PaintedLayerComposite), and a strip at the top of the right panel is highlighted, telling me that this layer is for the system notification bar in B2G.

Viewing Textures

Sometimes, it's useful to see not only the structure of the layer tree for each composite, but also the rendered textures for each layer. This can be achieved by additionally setting the layers.dump-texture pref to true, or by adding -f layersdump to the profiler command line (the latter implies both the layers.dump and layers.dump-texture prefs).

Warning: Dumping texture data slows performance considerably, and requires a lot of storage for the profile files. Expect rendering to happen at a significantly reduced frame rate when profiling this way, and keep the duration of the capture short, to ensure the samples of interest aren't overwritten.
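
For a desktop Firefox profile, one way to set these prefs persistently is a user.js file in the profile directory. The Python sketch below only renders the lines such a file would contain; the helper name user_js_lines is hypothetical, and how prefs are set on a B2G device is outside what it covers.

```python
import json


def user_js_lines(prefs):
    """Render a mapping of pref names to values as user.js lines."""
    lines = []
    for name, value in prefs.items():
        # bool must be checked before int, because bool is a subclass of int.
        if isinstance(value, bool):
            rendered = "true" if value else "false"
        elif isinstance(value, int):
            rendered = str(value)
        else:
            # Strings are quoted; json.dumps takes care of escaping.
            rendered = json.dumps(str(value))
        lines.append(f'user_pref("{name}", {rendered});')
    return lines


# The two prefs named in the text above.
layer_dump_prefs = user_js_lines({"layers.dump": True, "layers.dump-texture": True})
```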

Here's how the Layer Tree view looks in Cleopatra with texture data:

Screenshot of layer tree view in Cleopatra, with textures.

This time, the visualization in the right panel shows the actual textures rather than just the outlines of the layers. This can be very useful for debugging correctness problems such as a temporary visual/rendering glitch, because it allows you to find the precise composite that shows the glitch, and look at the layer tree for that composite.

Visualizing a layer tree without a profile

If you have a layer dump from somewhere (such as from adb logcat on B2G), you can get Cleopatra to visualize it (just the structure of course, not textures) without needing a profile. To do so, paste the layer dump into the "Enter your profile data here" text field on the front page of Cleopatra:

Screenshot of front page of Cleopatra, with pasted layer dump.

The resulting "profile" will have the Layer Tree view enabled (but nothing else). This is useful in cases where you want to gain a quick visual understanding of a layer dump without having to take a profile.

On B2G, each line of a layer dump in adb logcat output is prefixed with something like I/Gecko   (30593):. Cleopatra doesn't currently understand this prefix, so it needs to be removed before pasting.
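
Removing the prefix by hand is tedious for long dumps. Here is a minimal Python sketch that strips it, assuming every prefixed line looks like the I/Gecko   (30593): example above (a one-letter log level, the Gecko tag, and a process ID in parentheses). The function name strip_logcat_prefix and the sample layer names are illustrative only; other logcat output formats would need a different pattern.

```python
import re

# One-letter log level, the "Gecko" tag, a PID in parentheses, a colon, and
# at most one following space so the dump's own indentation is preserved.
LOGCAT_PREFIX = re.compile(r"^[A-Z]/Gecko[ \t]*\([ \t]*\d+\):[ ]?", re.MULTILINE)


def strip_logcat_prefix(dump):
    """Remove the adb logcat prefix from each line of a layer dump."""
    return LOGCAT_PREFIX.sub("", dump)


# Hypothetical two-line excerpt of a layer dump copied from adb logcat.
raw = (
    "I/Gecko   (30593): LayerManager (0x1)\n"
    "I/Gecko   (30593):   ContainerLayer (0x2)\n"
)
cleaned = strip_logcat_prefix(raw)
```

Lines that do not start with the prefix are left untouched, so the function can be applied to a dump that has already been partly cleaned.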

Display List

Dump the display list after each refresh with the texture data. This can be used to debug correctness problems.

Contribute