JS::PerfMeasurement

  • Revision slug: Performance/JS::PerfMeasurement
  • Revision title: JS::PerfMeasurement
  • Revision id: 287437
  • Created:
  • Creator: Jorge.villalobos
  • Is current revision? Yes
  • Comment: Fixed typo; 1 word added, 1 word removed

Revision Content

{{ gecko_minversion_header("2.0") }}

Note: At present, JS::PerfMeasurement is functional only on Linux. Support is planned for Windows ({{ Bug(583322) }}) and Mac OS X ({{ Bug(583323) }}), and we welcome patches for other operating systems.

The JS::PerfMeasurement class, found in jsperf.h, lets you take detailed performance measurements of your code.  It is a stopwatch profiler -- that is, it counts events that occur while the code of interest is running; it does not do call tracing or sampling.  The current implementation can measure eleven types of low-level hardware and software events:

Events that can be measured

  Bitmask passed to constructor          Counter variable        What it measures
  PerfMeasurement::CPU_CYCLES            .cpu_cycles             Raw CPU clock cycles
  PerfMeasurement::INSTRUCTIONS          .instructions           Total instructions executed
  PerfMeasurement::CACHE_REFERENCES      .cache_references       Total number of memory accesses
  PerfMeasurement::CACHE_MISSES          .cache_misses           Memory accesses that missed the cache
  PerfMeasurement::BRANCH_INSTRUCTIONS   .branch_instructions    Branch instructions executed
  PerfMeasurement::BRANCH_MISSES         .branch_misses          Branch instructions that were not predicted correctly
  PerfMeasurement::BUS_CYCLES            .bus_cycles             Total memory bus cycles
  PerfMeasurement::PAGE_FAULTS           .page_faults            Total page-fault exceptions fielded by the OS
  PerfMeasurement::MAJOR_PAGE_FAULTS     .major_page_faults      Page faults that required disk access
  PerfMeasurement::CONTEXT_SWITCHES      .context_switches       Context switches involving the profiled thread
  PerfMeasurement::CPU_MIGRATIONS        .cpu_migrations         Migrations of the profiled thread from one CPU core to another

These events map directly to the "generic events" of the Linux 2.6.31+ <linux/perf_event.h> interface, so unfortunately their specifications are a little vague; for instance, we can't tell you exactly which level of cache you get misses for if you measure CACHE_MISSES.  We also can't guarantee that every platform will support every event type once this interface has more than one back end.

Here is the complete C++-level API:

PerfMeasurement::EventMask
This is an enumeration defining all of the bit mask values in the above table.  You bitwise-OR the ones you want together and pass them to the constructor.
PerfMeasurement::ALL
In a constructor call, this special value means "measure everything that can possibly be measured."
PerfMeasurement::NUM_MEASURABLE_EVENTS
This constant equals the total number of events defined by the API, not necessarily the number that a particular OS allows you to measure.  (At the time of writing, Linux allows all the above events to be measured, and we don't have back ends for any other operating system.)
static bool PerfMeasurement::canMeasureSomething()
This class method returns true if and only if some -- not necessarily all -- events can be measured by the current build of SpiderMonkey, running on the current OS.  At present, it returns true on Linux when the <linux/perf_event.h> API is available (kernel 2.6.31 or later), and false everywhere else.
PerfMeasurement::PerfMeasurement(EventMask toMeasure)
The constructor creates a new profiling object, which measures some subset of the requested events.
PerfMeasurement::~PerfMeasurement()
The destructor releases the measurement state for this object.  Take care not to leak profiling objects, as they may be holding expensive OS-level state.
EventMask eventsMeasured
Instances of PerfMeasurement expose this constant member.  It never has more bits set than the mask passed to the constructor, but if the OS cannot or will not measure all of the requested events, only those events that will actually be measured have their bits set.  The counter variables for events that are not being measured hold the fixed value (uint64)-1.
void start()
Call this on a PerfMeasurement instance to start timing.
void stop()
Call this to stop timing.  The counter variables do not update in real time; they change only when stop is called.  Counter values accumulate across many start/stop cycles, and you may modify them yourself; stop simply adds the counts read back from the OS to whatever is already in each counter.
void reset()
Resets all enabled counters to zero.
uint64 cpu_cycles, uint64 instructions, etc.
Each potentially measurable event corresponds to an ordinary instance variable, which you can read and even modify.  All presently measurable events are measured with counters, not timers; that is, there is no defined relation between the numbers this interface reports and wall-clock time.
PerfMeasurement* JS::ExtractPerfMeasurement(jsval wrapper)
If you are the C++ side of an XPCOM interface, and you want to benchmark only part of your execution time but make the results available to JavaScript, you can declare a bare jsval argument in your .idl file and have JavaScript pass a PerfMeasurement object that it created in that argument slot.  The jsval you receive points to a JavaScript wrapper object.  To extract the C++-level PerfMeasurement object, call this function.  It will return NULL if JavaScript passed in something with the wrong dynamic type.
JSObject* JS::RegisterPerfMeasurement(JSContext* cx, JSObject* global)
You shouldn't need to use this function, but we mention it for completeness.  It initializes the JavaScript wrapper interface for this API and pokes the constructor function into the global object you provide.

See also

  • PerfMeasurement.jsm
  • Measuring performance using the PerfMeasurement.jsm code module
  • Performance