Nanojit

  • Revision slug: Nanojit
  • Revision title: Nanojit
  • Revision id: 53866
  • Created:
  • Creator: paritosh1010
  • Is current revision? No
  • Comment 158 words added

Revision Content

Overview

Nanojit is a small, cross-platform C++ library that emits machine code. Both the Tamarin JIT and the SpiderMonkey JIT (a.k.a. TraceMonkey) use Nanojit as their back end.

You can get Nanojit by cloning the tamarin-redux Mercurial repository at http://hg.mozilla.org/tamarin-redux. It's in the nanojit directory.

The input for Nanojit is a stream of Nanojit LIR instructions. The term LIR is compiler jargon for a language used internally in a compiler that is usually cross-platform but very close to machine language. It is an acronym for "low-level intermediate representation". A compiler's LIR is typically one of several partly-compiled representations of a program that a compiler produces on the way from raw source code to machine code.

An application using Nanojit creates a nanojit::LirBuffer object to hold LIR instructions.  It creates a nanojit::LirBufWriter object to write instructions to the buffer.  Then it wraps the LirBufWriter in zero or more other LirWriter objects, all of which implement the same interface as LirBufWriter. This chain of LirWriter objects forms a pipeline for the instructions to pass through.  Each LirWriter can perform an optimization or other task on the program as it passes through the system and into the LirBuffer.
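The writer pipeline described above can be pictured with a small toy model. This is only a sketch: ToyIns, ToyWriter, ToyBufWriter, ToyConstFolder and buildOnePlusTwo are hypothetical names made up for illustration and are not part of the Nanojit API. The sketch only shows the shape of the idea, in which every stage implements the same interface and forwards instructions toward the buffer, possibly rewriting them on the way.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Toy model only: these types are hypothetical and are NOT the Nanojit API.
struct ToyIns {
    std::string op;  // "imm" or "add"
    int32_t imm;     // constant value, used when op == "imm"
    int lhs, rhs;    // operand indices into the buffer, used when op == "add"
};

// Common interface, playing the role of LirWriter.
class ToyWriter {
public:
    virtual ~ToyWriter() {}
    virtual int insImm(int32_t value) = 0;
    virtual int insAdd(int lhs, int rhs) = 0;
};

// End of the pipeline, playing the role of LirBufWriter: appends to the buffer
// and returns the index of the new instruction.
class ToyBufWriter : public ToyWriter {
public:
    explicit ToyBufWriter(std::vector<ToyIns> &buf) : buf_(buf) {}
    int insImm(int32_t value) override {
        buf_.push_back(ToyIns{"imm", value, -1, -1});
        return static_cast<int>(buf_.size()) - 1;
    }
    int insAdd(int lhs, int rhs) override {
        buf_.push_back(ToyIns{"add", 0, lhs, rhs});
        return static_cast<int>(buf_.size()) - 1;
    }
private:
    std::vector<ToyIns> &buf_;
};

// A filter stage: folds an add of two constants into one constant and
// forwards every other instruction unchanged to the next writer.
class ToyConstFolder : public ToyWriter {
public:
    ToyConstFolder(ToyWriter *next, const std::vector<ToyIns> &buf)
        : next_(next), buf_(buf) {
        assert(next != nullptr);
    }
    int insImm(int32_t value) override { return next_->insImm(value); }
    int insAdd(int lhs, int rhs) override {
        // Copy the operands: the buffer may grow when we forward below.
        ToyIns a = buf_[lhs];
        ToyIns b = buf_[rhs];
        if (a.op == "imm" && b.op == "imm") {
            // Add with 32-bit wraparound, avoiding signed-overflow UB.
            uint32_t sum = static_cast<uint32_t>(a.imm) + static_cast<uint32_t>(b.imm);
            return next_->insImm(static_cast<int32_t>(sum));
        }
        return next_->insAdd(lhs, rhs);
    }
private:
    ToyWriter *next_;
    const std::vector<ToyIns> &buf_;
};

// Write "1 + 2" through the pipeline and return the resulting buffer.
std::vector<ToyIns> buildOnePlusTwo() {
    std::vector<ToyIns> buf;
    ToyBufWriter bufWriter(buf);             // innermost writer
    ToyConstFolder folder(&bufWriter, buf);  // wraps the buffer writer
    ToyWriter &writer = folder;              // the application sees one interface
    int one = writer.insImm(1);
    int two = writer.insImm(2);
    writer.insAdd(one, two);                 // folded into a single "imm 3"
    return buf;
}
```

In this toy version the application only ever talks to the outermost writer, so stages can be added or removed without changing the code that emits instructions.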

Once the instructions are in the LirBuffer, the application calls nanojit::compile() to produce machine code, which is stored in a nanojit::Fragment. Inside Nanojit, another set of filters operates on the LIR as it passes from the LirBuffer toward the assembler. The result of compilation is a function that the application can call from C via a pointer to its first instruction.

Example

The sample program in this section works with SpiderMonkey's hacked version of Nanojit. Working out a general way to compile it is left as an exercise for the reader, but this command works when run in the object directory of an --enable-debug SpiderMonkey shell build:

g++ -DDEBUG -g3 -include mozilla-config.h -I dist/include/js  -I ../nanojit -o jittest ../jittest.cpp libjs_static.a

(Remove the -DDEBUG if you have not compiled SpiderMonkey with --enable-debug, and use whatever you called the sample source file in place of jittest.cpp.)

#include <stdio.h>
#include <stdint.h>
#include <string.h>  // for memset
#include "jsapi.h"
#include "nanojit.h"

using namespace nanojit;

const uint32_t CACHE_SIZE_LOG2 = 20;

int main()
{
    // Set up the basic Nanojit objects.
    avmplus::GC *gc = new avmplus::GC;
    if (!gc)
        return 1;
    avmplus::AvmCore core;
#ifdef DEBUG
    core.config.verbose = 1; // Show disassembly of generated traces.
#endif
    Fragmento *fragmento = new (gc) Fragmento(&core, CACHE_SIZE_LOG2);
    LirBuffer *buf = new (gc) LirBuffer(fragmento, NULL);

    // Create a Fragment to hold some native code.
    Fragment *f = new (gc) Fragment(NULL);
    f->lirbuf = buf;
    f->root = f;

    // Create a LIR writer, with verbose output if DEBUG.
    LirBufWriter writer0(buf);
#ifdef DEBUG
    fragmento->labels = new (gc) LabelMap(&core, NULL);
    buf->names = new (gc) LirNameMap(gc, NULL, fragmento->labels);
    VerboseWriter writer(gc, &writer0, buf->names);
#else
    LirBufWriter& writer = writer0;
#endif

    // Write a few LIR instructions to the buffer: add the first parameter
    // to the constant 2.
    writer.ins0(LIR_start);
    LIns *two = writer.insImm(2);
    LIns *firstParam = writer.insParam(0, 0);
    LIns *result = writer.ins2(LIR_add, firstParam, two);
    writer.ins1(LIR_ret, result);

    // Emit a LIR_loop instruction.  It won't be reached, but there's
    // an assertion in Nanojit that trips if a fragment doesn't end with
    // a guard (a bug in Nanojit). 
    LIns *rec_ins = writer0.skip(sizeof(GuardRecord) + sizeof(SideExit));
    GuardRecord *guard = (GuardRecord *) rec_ins->payload();
    memset(guard, 0, sizeof(*guard));
    SideExit *exit = (SideExit *)(guard + 1);
    guard->exit = exit;
    guard->exit->target = f;
    f->lastIns = writer.insGuard(LIR_loop, writer.insImm(1), rec_ins);

    // Compile the fragment.
    compile(fragmento->assm(), f);
    if (fragmento->assm()->error() != None) {
        fprintf(stderr, "error compiling fragment\n");
        return 1;
    }
    printf("Compilation successful.\n");

    // Call the compiled function.
    typedef JS_FASTCALL int32_t (*AddTwoFn)(int32_t);
    AddTwoFn fn = reinterpret_cast<AddTwoFn>(f->code());
    printf("2 + 5 = %d\n", fn(5));
    return 0;
}

 

Guards

Guards are special LIR instructions, similar to conditional branches. The difference is that when a guard is taken, execution does not go to a particular address; it leaves the JIT code entirely and stops the trace.

Need

Guards are required in a cross-platform dynamic language like JavaScript. Certain assumptions are made when a particular piece of JIT code is generated, and a guard checks at run time that such an assumption still holds.

For example, for an instruction INR x, a guard would check that x doesn't overflow the range of a 32-bit integer. The JIT code would contain a guard checking this condition (an xt guard), and would return to the interpreter if the overflow condition turns out to be true. The interpreter is then equipped to handle the overflow.

Hence, guards are needed to prevent erroneous behaviour that might otherwise result from the assumptions made while JIT code is generated.
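The overflow example can be sketched in plain C++. This is a toy model under stated assumptions, not Nanojit or SpiderMonkey code: ExitReason, TraceResult, runIncrementTrace, interpretIncrement and increment are all hypothetical names. The "trace" assumes the result fits in a 32-bit integer, the guard tests that assumption, and a failed guard hands control to a slower general-purpose path that stands in for the interpreter.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Toy model only: hypothetical names, not Nanojit or SpiderMonkey code.
enum class ExitReason { Completed, OverflowGuard };

struct TraceResult {
    ExitReason reason;
    int32_t value;  // meaningful only when reason == Completed
};

// Stand-in for a compiled trace of "x + 1". It was "compiled" under the
// assumption that the result fits in int32_t; the guard checks that
// assumption and leaves the fast path when it does not hold.
TraceResult runIncrementTrace(int32_t x) {
    bool wouldOverflow = (x == std::numeric_limits<int32_t>::max());
    if (wouldOverflow)  // guard: exit if the condition is true
        return TraceResult{ExitReason::OverflowGuard, 0};
    return TraceResult{ExitReason::Completed, x + 1};
}

// Stand-in for the interpreter: handles the general case by widening to
// double, which can represent the result exactly.
double interpretIncrement(int32_t x) {
    return static_cast<double>(x) + 1.0;
}

// Try the fast path first; fall back to the interpreter on a guard exit.
double increment(int32_t x) {
    TraceResult r = runIncrementTrace(x);
    if (r.reason == ExitReason::Completed)
        return r.value;
    assert(r.reason == ExitReason::OverflowGuard);
    return interpretIncrement(x);
}
```

For instance, increment(5) stays on the fast path, while increment(2147483647) trips the guard and is computed by the fallback as 2147483648.0.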

 

TODO: Explain guards, guard records, VMSideExit, Fragmento, VerboseWriter::formatGuard...

 

 
