LIR

  • Revision slug: Nanojit/LIR
  • Revision title: LIR
  • Revision id: 100619
  • Created:
  • Creator: Jorend
  • Is current revision? No
  • Comment 93 words added, 8 words removed

Revision Content

In Nanojit, LIR is the source language for compilation to machine code. LIR stands for low-level intermediate representation.

The LIR instruction set

Types in LIR. Values in LIR have a type: either 32-bit or 64-bit. A 32-bit value, additionally, may be a condition or not. Each instruction requires operands of a specific type and produces a result of a specific type.

LIR makes no distinction at all between pointers and integers of the same size, between signed and unsigned integer values, or between 64-bit floating-point and 64-bit integer values. It is acceptable to load a 64-bit value from memory using ldq and then use the result with a floating-point arithmetic operation such as fadd.
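To illustrate what it means for the same 64 bits to serve as either an integer or a floating-point value, here is a small standalone C++ sketch. The helper names bits_to_double and double_to_bits are invented for this example and are not part of nanojit; the sketch also assumes the host uses IEEE 754 doubles.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: view a 64-bit pattern as a double without
// changing any bits (roughly what using an ldq result with fadd implies).
inline double bits_to_double(uint64_t bits) {
    double d;
    std::memcpy(&d, &bits, sizeof d);
    return d;
}

// Hypothetical helper: the reverse view, from double to raw 64 bits.
inline uint64_t double_to_bits(double d) {
    uint64_t bits;
    std::memcpy(&bits, &d, sizeof bits);
    return bits;
}
```

No conversion takes place in either direction; only the interpretation of the bits changes.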

The names of the operands and results below indicate the types required by LIR. Names starting with i indicate 32-bit values. Results starting with b are 32-bit integers and additionally are conditions. Operand names starting with b must be conditions. Names starting with q or f indicate 64-bit values. (f and q indicate the same type as far as LIR is concerned, but f is used here for floating-point operations.) Operand names starting with p must be pointer-sized integers. That is, they must be 64 bits on a 64-bit platform and 32 bits otherwise.

Constants

The result of each of these instructions is a constant value.

On 32-bit platforms, int is used to load constant addresses; on 64-bit platforms, quad is used.  The convenience function LirWriter::insImmPtr() emits the appropriate instruction depending on the platform.
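The platform-dependent choice described above can be pictured with the following standalone C++ sketch. The enum ConstOp and the function pointer_constant_opcode are hypothetical names used only for illustration; this is not the body of LirWriter::insImmPtr().

```cpp
#include <cstddef>

// Hypothetical labels for the two constant instructions.
enum class ConstOp { Int, Quad };

// Pick the constant instruction wide enough to hold an address,
// given the pointer size in bytes.
constexpr ConstOp pointer_constant_opcode(std::size_t pointer_bytes) {
    return pointer_bytes == 8 ? ConstOp::Quad : ConstOp::Int;
}

static_assert(pointer_constant_opcode(4) == ConstOp::Int, "32-bit address uses int");
static_assert(pointer_constant_opcode(8) == ConstOp::Quad, "64-bit address uses quad");
```

On a given host, pointer_constant_opcode(sizeof(void*)) gives the choice for that platform.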

Note: These instructions are currently rendered in VerboseWriter output as lines containing only the symbolic name of the immediate value being loaded. The instruction name and numeric value are not displayed for short and int instructions. This is arguably a bug.

short

An immediate 32-bit integer that fits in a signed 16-bit integer.

i = short <number>

<number> must be an integer that fits in the range of a signed 16-bit integer (-32768 to 32767). The result of a short instruction is a 32-bit integer with the same value (sign-extended).

This instruction behaves exactly like int but takes a few bytes less to represent in the LirBuffer.

Use LirWriter::insImm(int32_t) to emit this instruction.  (If the argument fits in 16 bits, a short instruction is emitted.  Otherwise an int instruction is emitted.)
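The selection rule in the parenthetical above amounts to a range check. The sketch below is a standalone illustration; ImmOp and immediate_opcode are made-up names, not nanojit identifiers.

```cpp
#include <cstdint>

// Hypothetical labels for the two 32-bit constant instructions.
enum class ImmOp { Short, Int };

// A value that fits in a signed 16-bit integer can use the compact
// short form; anything else needs a full int instruction.
constexpr ImmOp immediate_opcode(int32_t value) {
    return (value >= -32768 && value <= 32767) ? ImmOp::Short : ImmOp::Int;
}

static_assert(immediate_opcode(32767) == ImmOp::Short, "upper bound fits");
static_assert(immediate_opcode(32768) == ImmOp::Int, "one past the bound does not");
```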

int

An immediate 32-bit value.

i = int <number>

<number> must be an integer that fits in int32_t or uint32_t.

Use LirWriter::insImm(int32_t) to emit this instruction.

quad

An immediate 64-bit value.

q = quad <number>

<number> must be either an integer that fits in a signed or unsigned 64-bit integer; or a floating-point number (containing a decimal point).

Use LirWriter::insImmq(int64_t) to emit this instruction.

Integer arithmetic

neg

32-bit integer negation.

i = neg i1

add

32-bit integer addition.

i = add i1, i2

qiadd

64-bit integer addition.

q = qiadd q1, q2

sub

32-bit integer subtraction.

i = sub i1, i2

mul

32-bit integer multiplication.

i = mul i1, i2

and

32-bit bitwise AND.

i = and i1, i2

qiand

64-bit bitwise AND.

q = qiand q1, q2

or

32-bit bitwise OR.

i = or i1, i2

qior

64-bit bitwise OR.

q = qior q1, q2

xor

32-bit bitwise XOR.

i = xor i1, i2

not

32-bit bitwise NOT.

i = not i1

Bit shifting

The bit shifting operations behave as in Java and ECMAScript. When the first operand is 32-bit, all but the five least-significant bits of the second operand are discarded. (See {{ es3_spec("11.7.1") }}.) When the first operand is 64-bit, all but the six least-significant bits are discarded.  The shift count must be a 32-bit value, even for 64-bit shift instructions.
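The masking rule can be modelled in a few lines of standalone C++. The function names lsh32, ush32, rsh32 and lsh64 are invented for this sketch; values are passed as unsigned bit patterns so that the arithmetic right shift can be written portably.

```cpp
#include <cstdint>

// 32-bit shifts keep only the low five bits of the count.
constexpr uint32_t lsh32(uint32_t a, uint32_t count) { return a << (count & 0x1f); }
constexpr uint32_t ush32(uint32_t a, uint32_t count) { return a >> (count & 0x1f); }

// Sign-extending right shift: when the sign bit is set, shift the
// complement and complement again so that ones are shifted in.
constexpr uint32_t rsh32(uint32_t a, uint32_t count) {
    return (a & 0x80000000u) ? ~(~a >> (count & 0x1f)) : a >> (count & 0x1f);
}

// 64-bit shifts keep the low six bits; the count is still 32 bits wide.
constexpr uint64_t lsh64(uint64_t a, uint32_t count) { return a << (count & 0x3f); }

static_assert(lsh32(1u, 33u) == 2u, "a count of 33 acts like a count of 1");
static_assert(rsh32(0x80000000u, 31u) == 0xFFFFFFFFu, "sign bit is replicated");
static_assert(ush32(0x80000000u, 31u) == 1u, "zeros are shifted in");
```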

lsh

32-bit left shift.

i = lsh i1, i2

The result is i1 left-shifted by i2 & 0x1f bits.

qilsh

64-bit left shift.

q = qilsh q1, i2

The result is q1 left-shifted by i2 & 0x3f bits.  q1 must be a 64-bit value; i2 must be a 32-bit value.

rsh

32-bit right shift with sign extend.

i = rsh i1, i2

The result is i1 right-shifted by i2 & 0x1f bits. New bits shifted into the result match the sign bit of i1.

ush

32-bit unsigned right shift.

i = ush i1, i2

The result is i1 right-shifted by i2 & 0x1f bits. New bits shifted into the result are zero.

Floating-point arithmetic

Any 64-bit value may be treated as a floating-point number. These operations behave according to the rules of IEEE 754 double-precision arithmetic. Some details may be found in the ECMAScript language standard, {{ Es3_spec("11.5.1") }} and subsequent sections.

fneg

Floating-point negation.

f = fneg f1

fadd

Floating-point addition.

f = fadd f1, f2

fsub

Floating-point subtraction.

f = fsub f1, f2

fmul

Floating-point multiplication.

f = fmul f1, f2

fdiv

Floating-point division.

f = fdiv f1, f2

Numeric conversions

qlo

Get the low 32 bits of a 64-bit value.

i = qlo q

qhi

Get the high 32 bits of a 64-bit value.

i = qhi q

qjoin

Join two 32-bit values to form a 64-bit value.

q = qjoin i1, i2
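The relationship between qlo, qhi, and qjoin can be sketched in standalone C++ as follows. The text above does not say which qjoin operand becomes the low half; this sketch assumes the first operand is the low 32 bits and the second the high 32 bits, and that assumption should be checked against the nanojit sources.

```cpp
#include <cstdint>

constexpr uint32_t qlo(uint64_t q) { return static_cast<uint32_t>(q); }
constexpr uint32_t qhi(uint64_t q) { return static_cast<uint32_t>(q >> 32); }

// Assumption: first operand is the low half, second is the high half.
constexpr uint64_t qjoin(uint32_t lo, uint32_t hi) {
    return (static_cast<uint64_t>(hi) << 32) | lo;
}

static_assert(qjoin(qlo(0x1122334455667788ULL), qhi(0x1122334455667788ULL))
                  == 0x1122334455667788ULL,
              "splitting and rejoining gives back the original value");
```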

i2f

Convert signed 32-bit integer to floating-point number.

f = i2f i1

u2f

Convert unsigned 32-bit integer to floating-point number.

f = u2f i1
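Because LIR does not distinguish signed from unsigned 32-bit values, the choice between i2f and u2f is what decides how the bits are read. The standalone C++ sketch below models both conversions on the same bit pattern; the functions are illustrative stand-ins, not nanojit code.

```cpp
#include <cstdint>

// Read the 32 bits as an unsigned integer.
constexpr double u2f(uint32_t bits) { return static_cast<double>(bits); }

// Read the 32 bits as a two's-complement signed integer.
// For negative patterns, negate the magnitude (~bits + 1).
constexpr double i2f(uint32_t bits) {
    return (bits & 0x80000000u) ? -static_cast<double>(~bits + 1u)
                                : static_cast<double>(bits);
}

static_assert(u2f(0xFFFFFFFFu) == 4294967295.0, "all ones read as unsigned");
static_assert(i2f(0xFFFFFFFFu) == -1.0, "all ones read as signed");
```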

Loads and stores

LIR provides a single addressing mode.  Each load or store instruction takes a pointer provided by a previous instruction and a constant offset (in bytes). The offset must fit in the range of a signed 32-bit integer, even on 64-bit platforms.

Although the ld instruction takes 2 operands, a pointer and offset, the second operand must be the result of an int instruction.  This is enforced with assertions.

The convenience functions LirWriter::insLoad(LOpcode op, LIns *base, int32_t offset) and LirWriter::insStorei(LIns *value, LIns *base, int32_t offset) should be used to emit loads and stores.
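The single addressing mode, a base pointer plus a constant byte offset, corresponds to the following standalone C++ model. load32 and store32 are hypothetical helpers that only mimic the memory access a 32-bit load or store performs; they do not emit LIR.

```cpp
#include <cstdint>
#include <cstring>

// Read 32 bits at base + offset, where offset is a signed byte count.
inline uint32_t load32(const void* base, int32_t offset) {
    uint32_t value;
    std::memcpy(&value, static_cast<const unsigned char*>(base) + offset, sizeof value);
    return value;
}

// Write 32 bits at base + offset.
inline void store32(void* base, int32_t offset, uint32_t value) {
    std::memcpy(static_cast<unsigned char*>(base) + offset, &value, sizeof value);
}
```

Negative offsets are allowed, so a pointer to the end of a structure can address fields before it.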

ld

32-bit load. This instruction is never removed by common subexpression elimination.

i = ld p1[offset]

ldq

64-bit load. This instruction is never removed by common subexpression elimination.

q = ldq p1[offset]

ldcb

8-bit load. This instruction may be removed by common subexpression elimination.

i = ldcb p1[offset]

ldcs

16-bit load. This instruction may be removed by common subexpression elimination.

i = ldcs p1[offset]

ldc

32-bit load. This instruction may be removed by common subexpression elimination.

i = ldc p1[offset]

ldqc

64-bit load. This instruction may be removed by common subexpression elimination.

q = ldqc p1[offset]

st

32-bit store.

st p1[offset] = i2

stq

64-bit store.

stq p1[offset] = q2

offset must fit in the range of a 32-bit integer, even on 64-bit platforms.

sti

32-bit store.

sti p1[offset] = i2

offset must be in the range [-128, 127]. sti is identical to the corresponding st instruction but takes a few bytes less to represent in a LirBuffer.

stqi

64-bit store.

stqi p1[offset] = q2

offset must be in the range [-128, 127]. stqi is identical to the corresponding stq instruction but takes a few bytes less to represent in a LirBuffer.

Subroutines

Use LirWriter::insCall to emit a subroutine call. Pass a CallInfo object containing metadata about the function being called, including which calling convention to use.

{{ warning("The argv array to insCall must be in reverse order.") }}
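One way to avoid mistakes with the reversed argument order is to build the argument list in natural order and reverse it just before the call. The standalone C++ sketch below shows only that step; reversed_args is a made-up helper, and the element type T stands in for whatever handle type the arguments have.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: take arguments in source order (a1, a2, ...)
// and return them in the reverse order.
template <typename T>
std::vector<T> reversed_args(std::vector<T> args) {
    std::reverse(args.begin(), args.end());
    return args;
}
```

The reversed vector's data() pointer could then be handed to the call that expects the array, assuming the element type matches.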

Note: The call instructions have changed in Adobe's branch, and the changes will be merged back to Tracemonkey soonish. Code using insCall shouldn't be affected.

TODO: Explain calling conventions.

call

Subroutine call returning a 32-bit integer value.

i = call function(a1, a2, ...)

calli

Indirect subroutine call returning a 32-bit integer value.

i = calli p1(a2, a3, ...)

fcall

Subroutine call returning a floating-point value.

f = fcall function(a1, a2, ...)

fcalli

Indirect subroutine call returning a floating-point value.

f = fcalli p1(a2, a3, ...)

callh

Access the high 32 bits of a call returning a 64-bit result as a pair of 32-bit values. On 64-bit platforms, this instruction is unused.

i = callh i1

Here i1 must be the result of an earlier call instruction.

ret

Return a pointer-sized value.

ret p1

fret

Return a floating-point value.

fret f1

The code emitted for fret typically returns f1 in an FPU register. The exact behavior depends on the platform's calling conventions.

Conditions

The result of these instructions is a 32-bit value, either 1 (true) or 0 (false). Conditions are used as operands to conditional branch, guard, and conditional move instructions.

eq

32-bit integer equality test.

b = eq i1, i2

There is no not-equal instruction. Instead, flip the instruction that uses the result, or add a not instruction.

lt

Signed 32-bit integer less-than test.

b = lt i1, i2

gt

Signed 32-bit integer greater-than test.

b = gt i1, i2

le

Signed 32-bit integer less-than-or-equals test.

b = le i1, i2

ge

Signed 32-bit integer greater-than-or-equals test.

b = ge i1, i2

ult

Unsigned 32-bit integer less-than test.

b = ult i1, i2

ugt

Unsigned 32-bit integer greater-than test.

b = ugt i1, i2

ule

Unsigned 32-bit integer less-than-or-equals test.

b = ule i1, i2

uge

Unsigned 32-bit integer greater-than-or-equals test.

b = uge i1, i2
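Since a 32-bit value carries no signedness of its own, the same pair of operands can compare differently under lt and ult. The standalone C++ sketch below models both tests on raw bit patterns; it is an illustration, not nanojit code.

```cpp
#include <cstdint>

// Unsigned less-than: compare the bit patterns directly.
constexpr bool ult(uint32_t a, uint32_t b) { return a < b; }

// Signed less-than on two's-complement patterns: flipping the sign bit
// maps the signed ordering onto the unsigned ordering.
constexpr bool lt(uint32_t a, uint32_t b) {
    return (a ^ 0x80000000u) < (b ^ 0x80000000u);
}

// 0xFFFFFFFF is -1 when read as signed, 4294967295 when read as unsigned.
static_assert(lt(0xFFFFFFFFu, 1u), "-1 < 1 as signed");
static_assert(!ult(0xFFFFFFFFu, 1u), "4294967295 is not below 1 as unsigned");
```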

feq

Floating-point equality test.

b = feq f1, f2

flt

Floating-point less-than test.

b = flt f1, f2

fgt

Floating-point greater-than test.

b = fgt f1, f2

fle

Floating-point less-than-or-equals test.

b = fle f1, f2

fge

Floating-point greater-than-or-equals test.

b = fge f1, f2

ov

Test for overflow.

b = ov i1

The result is 1 if i1 is the result of an instruction, such as add, sub, or neg, that overflowed the range of a signed 32-bit integer.

Note: nanojit may produce incorrect code if this instruction does not immediately follow the instruction that produced i1. On Intel, this reads the overflow condition flag. Other platforms have to emulate this behavior.

cs

Test for carry.

b = cs i1

The result is 1 if i1 is the result of an instruction, such as add, that overflowed the range of an unsigned 32-bit integer.

Note: nanojit may produce incorrect code if this instruction does not immediately follow the instruction that produced i1. On Intel, this reads the carry condition flag. Other platforms have to emulate this behavior.
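The difference between the overflow and carry conditions for an add can be modelled in standalone C++ as below. These functions compute the conditions from the operands; they do not read processor flags, and their names are invented for this sketch.

```cpp
#include <cstdint>

// Signed overflow of a 32-bit add: the operands have the same sign
// and the sum's sign differs from it.
constexpr bool add_sets_overflow(uint32_t a, uint32_t b) {
    return ((~(a ^ b) & (a ^ (a + b))) >> 31) != 0;
}

// Carry out of a 32-bit add: the wrapped sum is smaller than an operand.
constexpr bool add_sets_carry(uint32_t a, uint32_t b) {
    return static_cast<uint32_t>(a + b) < a;
}

// 0x7FFFFFFF + 1 overflows the signed range but produces no carry.
static_assert(add_sets_overflow(0x7FFFFFFFu, 1u) && !add_sets_carry(0x7FFFFFFFu, 1u), "");
// 0xFFFFFFFF + 1 carries, but as signed it is -1 + 1 = 0 with no overflow.
static_assert(add_sets_carry(0xFFFFFFFFu, 1u) && !add_sets_overflow(0xFFFFFFFFu, 1u), "");
```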

Guards

Note: VerboseWriter::formatGuard is left undefined in nanojit, so applications can display more information about the side exit alongside these instructions.

loop

Loop fragment.

loop

x

Exit unconditionally.

x

xt

Exit if true.

xt condition

xf

Exit if false.

xf condition

xbarrier

Do not exit, but emit writes to flush all values to the stack, just as a real guard would.

xbarrier

Forward branches

To emit forward branches in LIR, first emit a jump instruction. Later, emit a label instruction and use LIns::target to set the target of the jump instruction to the LIns * of the label.
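The emit-then-patch pattern described above can be sketched with a toy instruction type in standalone C++. The struct Ins and the function patch_target are hypothetical stand-ins for illustration; they are not nanojit's LIns class or its target method.

```cpp
// Toy instruction record: a jump carries a pointer to its label,
// which is unknown at the time the jump is emitted.
struct Ins {
    enum Kind { Jump, Label };
    Kind kind;
    Ins* target;  // only meaningful for jumps; null until patched
    explicit Ins(Kind k) : kind(k), target(nullptr) {}
};

// Once the label exists, point the earlier jump at it.
inline void patch_target(Ins& jump, Ins& label) {
    jump.target = &label;
}
```

The jump is created first with no target, the instructions in between are emitted, and the target is filled in only when the label is finally emitted.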

j

Jump unconditionally.

j label

jt

Jump if true.

jt condition, label

jf

Jump if false.

jf condition, label

label

A jump target. This LIR instruction is used to hook up jumps to their targets. No machine code is emitted.

label:

ji

Indirect jump.  Currently not implemented.  Bug 465582 proposes to replace with jtbl, a table-based indirect jump with a known set of targets. 

Conditional moves

Note: These two instructions can be written using the idiom lirwriter->ins2(LIR_cmov, b1, lirwriter->ins2(LIR_2, i2, i3)). The LIR_2 instruction serves only to group the second and third operands, since LirWriter has no ins3 method.

The LirWriter::ins_choose() convenience method can be used instead. It uses the above idiom.

cmov

Choice of two 32-bit values.

i = cmov b1, i2, i3

qcmov

Choice of two 64-bit values.

Note: This instruction currently does not work on 32-bit Intel platforms.

q = qcmov b1, q2, q3

Special operations

start

Indicates the start of a fragment.

start

nearskip

Used to skip across blobs of binary data in the LIR, such as guard records. Also used internally to allow LirBuffers to continue across multiple pages.

skip

Used to skip across blobs of binary data in the LIR, such as guard records. Also used internally to allow LirBuffers to continue across multiple pages.

Operations that are weird but don't count as special

addp

Integer addition for temporary pointer calculations.

i = addp p1, i2

(Changed in tamarin-redux.)

Like add, but the result is not subject to common subexpression elimination. (This effectively serves as a hint to nanojit that p1 should stay in a register if possible and the sum should be discarded after its use, even if the same sum is calculated again later.)

param

Load a parameter.

p = param index
p = param index, kind

kind must be 0 or 1. The default is 0.

Nanojit compiles LIR to native code that can be called using a platform-specific calling convention (fastcall on Intel x86). Only pointer-sized, non-floating-point parameters are supported. The param instruction with kind=0 (the default) loads the value of one of the arguments passed by the caller. index in this case indicates which parameter to load. 0 indicates the first parameter.

It is up to the application to determine how many parameters, and of what C/C++ types, a LIR fragment takes. Warning: On platforms other than Intel x86, there may be an undocumented limit to how many parameters the application may use. Up to 4 parameters should work fine everywhere.

With kind=1, this instruction is used to enable explicit management of callee-save registers.  If a LIR fragment uses this, it must contain exactly one param instruction with kind=1 for each callee-save register on the target architecture.  The result of each of those instructions denotes the value of one callee-save register on entry to the fragment.  That value should not be used by the fragment.  Instead, filters and particularly the register allocator use the instruction to spill and restore callee-save registers as needed.

To emit this instruction, call LirWriter::insParam(int32_t index, int32_t kind).

file

Source filename for debug symbols.

file "filename"

line

Source line number for debug symbols.

line number

file and line are used to build symbol tables for the output binary. They have no executable semantics.

alloc

Allocate a fixed amount of stack space.  The result is a pointer that can be used as a base for loads and stores, which is useful for stack-allocated structs or for variables assigned from more than one place.  This is like the C alloca() function, except that LIR_alloc does not take a runtime-computed size.

p = alloc size

size must be a multiple of 4 that does not exceed 262140 (0xffff << 2).
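The size constraint can be written as a simple predicate. The standalone C++ sketch below uses a made-up function name, valid_alloc_size; it also treats zero and negative sizes as invalid, which is an assumption the text above does not state.

```cpp
#include <cstdint>

// Largest size permitted by the text: 0xffff << 2 == 262140.
constexpr int32_t kMaxAllocSize = 0xffff << 2;

// Assumption: sizes must be positive. The multiple-of-4 and upper-bound
// rules come from the description above.
constexpr bool valid_alloc_size(int32_t size) {
    return size > 0 && size % 4 == 0 && size <= kMaxAllocSize;
}

static_assert(kMaxAllocSize == 262140, "matches the documented limit");
static_assert(valid_alloc_size(262140), "the limit itself is allowed");
static_assert(!valid_alloc_size(262144), "the next multiple of 4 is too large");
```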

live

Extend live range of reference.

live x

Here x may be the result of any previous instruction that produces a value.

Revision Source

<p>In Nanojit, <strong>LIR</strong> is the source language for compilation to machine code. LIR stands for <em>low-level intermediate representation</em>.</p>
<h2>The LIR instruction set</h2>
<p><strong>Types in LIR.</strong> Values in LIR have a type: either 32-bit or 64-bit. A 32-bit value, additionally, may be a condition or not. Each instruction requires operands of a specific type and produces a result of a specific type.</p>
<p>LIR makes no distinction at all between pointers and integers of the same size, between signed and unsigned integer values, or between 64-bit floating-point and 64-bit integer values. It is acceptable to load a 64-bit value from memory using <code>ldq</code> and then use the result with a floating-point arithmetic operation such as <code>fadd</code>.</p>
<p>The names of the operands and results below indicate the types required by LIR. Names starting with <em>i</em> indicate 32-bit values. Results starting with <em>b</em> are 32-bit integers and additionally are conditions. Operand names starting with <em>b</em> must be conditions. Names starting with <em>q</em> or <em>f</em> indicate 64-bit values. (<em>f</em> and <em>q</em> indicate the same type as far as LIR is concerned, but <em>f</em> is used here for floating-point operations.) Operand names starting with <em>p</em> must be pointer-sized integers. That is, they must be 64 bits on a 64-bit platform and 32 bits otherwise.</p>
<h3>Constants</h3>
<p>The result of each of these instructions is a constant value.</p>
<p>On 32-bit platforms, <code>int</code> is used to load constant addresses; on 64-bit platforms, <code>quad</code> is used.  The convenience function <code>LirWriter::insImmPtr()</code> emits the appropriate instruction depending on the platform.</p>
<p>Note: These instructions are currently rendered in <code>VerboseWriter</code> output as lines containing only the symbolic name of the immediate value being loaded. The instruction name and numeric value are not displayed for <code>short</code> and <code>int</code> instructions. This is arguably a bug.</p>
<h4>short</h4>
<p>An immediate 32-bit integer that fits in a signed 16-bit integer.</p>
<pre><var>i</var> = short <var>&lt;number&gt;</var></pre>
<p><code><var>&lt;number&gt;</var></code> must be an integer that fits in the range of a signed 16-bit integer (-32768 to 32767). The result of a <code>short</code> instruction is a 32-bit integer with the same value (sign-extended).</p>
<p>This instruction behaves exactly like <code>int</code> but takes a few bytes less to represent in the <code>LirBuffer</code>.</p>
<p>Use <code>LirWriter::insImm(int32_t)</code> to emit this instruction.  (If the argument fits in 16 bits, a <code>short</code> instruction is emitted.  Otherwise an <code>int</code> instruction is emitted.)</p>
<h4>int</h4>
<p>An immediate 32-bit value.</p>
<pre><var>i</var> = int <var>&lt;number&gt;</var></pre>
<p><code><var>&lt;number&gt;</var></code> must be an integer that fits in <code>int32_t</code> or <code>uint32_t</code>.</p>
<p>Use <code>LirWriter::insImm(int32_t)</code> to emit this instruction.</p>
<h4>quad</h4>
<p>An immediate 64-bit value.</p>
<pre><var>q</var> = quad <var>&lt;number&gt;</var></pre>
<p><code><var>&lt;number&gt;</var></code> must be either an integer that fits in a signed or unsigned 64-bit integer; or a floating-point number (containing a decimal point).</p>
<p>Use <code>LirWriter::insImmq(int64_t)</code> to emit this instruction.</p><h3>Integer arithmetic</h3>
<h4>neg</h4>
<p>32-bit integer negation.</p>
<pre><var>i</var> = neg <var>i1</var></pre>
<h4>add</h4>
<p>32-bit integer addition.</p>
<pre><var>i</var> = add <var>i1</var>, <var>i2</var></pre>
<h4>qiadd</h4>
<p>64-bit integer addition.</p>
<pre><var>q</var> = qiadd <var>q1</var>, <var>q2</var></pre>
<h4>sub</h4>
<p>32-bit integer subtraction.</p>
<pre><var>i</var> = sub <var>i1</var>, <var>i2</var></pre>
<h4>mul</h4>
<p>32-bit integer multiplication.</p>
<pre><var>i</var> = mul <var>i1</var>, <var>i2</var></pre>
<h4>and</h4>
<p>32-bit bitwise AND.</p>
<pre><var>i</var> = and <var>i1</var>, <var>i2</var></pre>
<h4>qiand</h4>
<p>64-bit bitwise AND.</p>
<pre><var>q</var> = qiand <var>q1</var>, <var>q2</var></pre>
<h4>or</h4>
<p>32-bit bitwise OR.</p>
<pre><var>i</var> = or <var>i1</var>, <var>i2</var></pre>
<h4>qior</h4>
<p>64-bit bitwise OR.</p>
<pre><var>q</var> = qior <var>q1</var>, <var>q2</var></pre>
<h4>xor</h4>
<p>32-bit bitwise XOR.</p>
<pre><var>i</var> = xor <var>i1</var>, <var>i2</var></pre>
<h4>not</h4>
<p>32-bit bitwise NOT.</p>
<pre><var>i</var> = not <var>i1</var></pre>
<h3>Bit shifting</h3>
<p>The bit shifting operations behave as in Java and ECMAScript. When the first operand is 32-bit, all but the five least-significant bits of the second operand are discarded. (See {{ es3_spec("11.7.1") }}.) When the first operand is 64-bit, all but the six least-significant bits are discarded.  The shift count must be a 32-bit value, even for 64-bit shift instructions.</p>
<h4>lsh</h4>
<p>32-bit left shift.</p>
<pre><var>i</var> = lsh <var>i1</var>, <var>i2</var></pre>
<p>The result is <code><var>i1</var></code> left-shifted by <code><var>i2</var> &amp; 0x1f</code> bits.</p>
<h4>qilsh</h4>
<p>64-bit left shift.</p>
<pre><var>q</var> = qilsh <var>q1</var>, <var>i2</var></pre>
<p>The result is <code><var>q1</var></code> left-shifted by <code><var>i2</var> &amp; 0x3f</code> bits.  <code><var>q1</var></code> must be a 64-bit value; <code><var>i2</var></code> must be a 32-bit value.</p>
<h4>rsh</h4>
<p>32-bit right shift with sign extend.</p>
<pre><var>i</var> = rsh <var>i1</var>, <var>i2</var></pre>
<p>The result is <code><var>i1</var></code> right-shifted by <code><var>i2</var> &amp; 0x1f</code> bits. New bits shifted into the result match the sign bit of <code><var>i1</var></code>.</p>
<h4>ush</h4>
<p>32-bit unsigned right shift.</p>
<pre><var>i</var> = ush <var>i1</var>, <var>i2</var></pre>
<p>The result is <code><var>i1</var></code> right-shifted by <code><var>i2</var> &amp; 0x1f</code> bits. New bits shifted into the result are zero.</p><h3>Floating-point arithmetic</h3>
<p>Any 64-bit value may be treated as a floating-point number. These operations behave according to the rules of IEEE 754 double-precision arithmetic. Some details may be found in the ECMAScript language standard, {{ Es3_spec("11.5.1") }} and subsequent sections.</p>
<h4>fneg</h4>
<p>Floating-point negation.</p>
<pre><var>f</var> = fneg <var>f1</var></pre>
<h4>fadd</h4>
<p>Floating-point addition.</p>
<pre><var>f</var> = fadd <var>f1</var>, <var>f2</var></pre>
<h4>fsub</h4>
<p>Floating-point subtraction.</p>
<pre><var>f</var> = fsub <var>f1</var>, <var>f2</var></pre>
<h4>fmul</h4>
<p>Floating-point multiplication.</p>
<pre><var>f</var> = fmul <var>f1</var>, <var>f2</var></pre>
<h4>fdiv</h4>
<p>Floating-point division.</p>
<pre><var>f</var> = fdiv <var>f1</var>, <var>f2</var></pre>
<h3>Numeric conversions</h3>
<h4>qlo</h4>
<p>Get the low 32 bits of a 64-bit value.</p>
<pre><var>i</var> = qlo <var>q</var></pre>
<h4>qhi</h4>
<p>Get the high 32 bits of a 64-bit value.</p>
<pre><var>i</var> = qhi <var>q</var></pre>
<h4>qjoin</h4>
<p>Join two 32-bit values to form a 64-bit value.</p>
<pre><var>q</var> = qjoin <var>i1</var>, <var>i2</var></pre>
<h4>i2f</h4>
<p>Convert signed 32-bit integer to floating-point number.</p>
<pre><var>f</var> = i2f <var>i1</var></pre>
<h4>u2f</h4>
<p>Convert unsigned 32-bit integer to floating-point number.</p>
<pre><var>f</var> = u2f <var>i1</var></pre>
<h3>Loads and stores</h3>
<p>LIR provides a single addressing mode.  Each load or store instruction takes a pointer provided by a previous instruction and a constant offset (in bytes). The <code><var>offset</var></code> must fit in the range of a signed 32-bit integer, even on 64-bit platforms.</p>
<p>Although the <code>ld</code> instruction takes 2 operands, a pointer and offset, the second operand must be the result of an <code>int</code> instruction.  This is enforced with assertions.</p>
<p>The convenience functions <code>LirWriter::insLoad(LOpcode op, LIns *base, int32_t offset)</code> and <code>LirWriter::insStorei(LIns *value, LIns *base, int32_t offset)</code> should be used to emit loads and stores.</p>
<h4>ld</h4>
<p>32-bit load. This instruction is never removed by common subexpression elimination.</p>
<pre><var>i</var> = ld <var>p1</var>[<em>offset</em>]</pre>
<h4>ldq</h4>
<p>64-bit load. This instruction is never removed by common subexpression elimination.</p>
<pre><var>q</var> = ldq <var>p1</var>[<var>offset</var>]</pre>
<h4>ldcb</h4>
<p>8-bit load. This instruction may be removed by common subexpression elimination.</p>
<pre><var>i</var> = ldcb <var>p1</var>[<var>offset</var>]</pre>
<h4>ldcs</h4>
<p>16-bit load. This instruction may be removed by common subexpression elimination.</p>
<pre><var>i</var> = ldcs <var>p1</var>[<var>offset</var>]</pre>
<h4>ldc</h4>
<p>32-bit load. This instruction may be removed by common subexpression elimination.</p>
<pre><var>i</var> = ldc <var>p1</var>[<var>offset</var>]</pre>
<h4>ldqc</h4>
<p>64-bit load. This instruction may be removed by common subexpression elimination.</p>
<pre><var>q</var> = ldqc <var>p1</var>[<var>offset</var>]</pre>
<h4>st</h4>
<p>32-bit store.</p>
<pre>st <var>p1</var>[<var>offset</var>] = <var>i2</var></pre>
<h4>stq</h4>
<p>64-bit store.</p>
<pre>stq <var>p1</var>[<var>offset</var>] = <var>q2</var></pre>
<p><code><var>offset</var></code> must fit in the range of a 32-bit integer, even on 64-bit platforms.</p>
<h4>sti</h4>
<p>32-bit store.</p>
<pre>sti <var>p1</var>[<var>offset</var>] = <var>i2</var></pre>
<p><code><var>offset</var></code> must be in the range [-128, 127]. <code>sti</code> is identical to the corresponding <code>st</code> instruction but takes a few bytes less to represent in a <code>LirBuffer</code>.</p>
<h4>stqi</h4>
<p>64-bit store.</p>
<pre>stqi <var>p1</var>[<var>offset</var>] = <var>q2</var></pre>
<p><code><var>offset</var></code> must be in the range [-128, 127]. <code>stqi</code> is identical to the corresponding <code>stq</code> instruction but takes a few bytes less to represent in a <code>LirBuffer</code>.</p>
<h3>Subroutines</h3>
<p>Use <code>LirWriter::insCall</code> to emit a subroutine call. Pass a <code>CallInfo</code> object containing metadata about the function being called, including which calling convention to use.</p>
<p>{{ warning("The argv array to <code>insCall</code> must be in reverse order.") }}</p>
<p>Note: The <code>call</code> instructions have changed in Adobe's branch, and the changes will be merged back to Tracemonkey soonish. Code using <code>insCall</code> shouldn't be affected.</p>
<p><strong>TODO:</strong> Explain calling conventions.</p>
<h4>call</h4>
<p>Subroutine call returning a 32-bit integer value.</p>
<pre><em>i</em> = call <var>function</var>(<var>a1</var>, <var>a2</var>, ...)</pre>
<h4>calli</h4>
<p>Indirect subroutine call returning a 32-bit integer value.</p>
<pre><em>i</em> = calli <var>p1</var>(<var>a2</var>, <var>a3</var>, ...)</pre>
<h4>fcall</h4>
<p>Subroutine call returning a floating-point value.</p>
<pre><em>f</em> = fcall <var>function</var>(<var>a1</var>, <var>a2</var>, ...)</pre>
<h4>fcalli</h4>
<p>Indirect subroutine call returning a floating-point value.</p>
<pre><em>f</em> = fcalli <var>p1</var>(<var>a2</var>, <var>a3</var>, ...)</pre>
<h4>callh</h4>
<p>Access the high 32 bits of a call returning a 64-bit result as a pair of 32-bit values. On 64-bit platforms, this instruction is unused.</p>
<pre><em>i</em> = callh <var>i1</var></pre>
<p>Here <code><var>i1</var></code> must be the result of an earlier call instruction.</p>
<h4>ret</h4>
<p>Return a pointer-sized value.</p>
<pre>ret <var>p1</var></pre>
<h4>fret</h4>
<p>Return a floating-point value.</p>
<pre>fret <var>f1</var></pre>
<p>The code emitted for <code>fret</code> typically returns <code><var>f1</var></code> in an FPU register. The exact behavior depends on the platform's calling conventions.</p><h3>Conditions</h3>
<p>The result of these instructions is a 32-bit value, either 1 (true) or 0 (false). Conditions are used as operands to conditional branch, guard, and conditional move instructions.</p>
<h4>eq</h4>
<p>32-bit integer equality test.</p>
<pre><var>b</var> = eq <var>i1</var>, <var>i2</var></pre>
<p>There is no not-equal instruction. Instead, flip the instruction that uses the result, or add a <code>not</code> instruction.</p>
<h4>lt</h4>
<p>Signed 32-bit integer less-than test.</p>
<pre><var>b</var> = lt <var>i1</var>, <var>i2</var></pre>
<h4>gt</h4>
<p>Signed 32-bit integer greater-than test.</p>
<pre><var>b</var> = gt <var>i1</var>, <var>i2</var></pre>
<h4>le</h4>
<p>Signed 32-bit integer less-than-or-equals test.</p>
<pre><var>b</var> = le <var>i1</var>, <var>i2</var></pre>
<h4>ge</h4>
<p>Signed 32-bit integer greater-than-or-equals test.</p>
<pre><var>b</var> = ge <var>i1</var>, <var>i2</var></pre>
<h4>ult</h4>
<p>Unsigned 32-bit integer less-than test.</p>
<pre><var>b</var> = ult <var>i1</var>, <var>i2</var></pre>
<h4>ugt</h4>
<p>Unsigned 32-bit integer greater-than test.</p>
<pre><var>b</var> = ugt <var>i1</var>, <var>i2</var></pre>
<h4>ule</h4>
<p>Unsigned 32-bit integer less-than-or-equals test.</p>
<pre><var>b</var> = ule <var>i1</var>, <var>i2</var></pre>
<h4>uge</h4>
<p>Unsigned 32-bit integer greater-than-or-equals test.</p>
<pre><var>b</var> = uge <var>i1</var>, <var>i2</var></pre>
<h4>feq</h4>
<p>Floating-point equality test.</p>
<pre><var>b</var> = feq <var>f1</var>, <var>f2</var></pre>
<h4>flt</h4>
<p>Floating-point less-than test.</p>
<pre><var>b</var> = flt <var>f1</var>, <var>f2</var></pre>
<h4>fgt</h4>
<p>Floating-point greater-than test.</p>
<pre><var>b</var> = fgt <var>f1</var>, <var>f2</var></pre>
<h4>fle</h4>
<p>Floating-point less-than-or-equals test.</p>
<pre><var>b</var> = fle <var>f1</var>, <var>f2</var></pre>
<h4>fge</h4>
<p>Floating-point greater-than-or-equals test.</p>
<pre><var>b</var> = fge <var>f1</var>, <var>f2</var></pre>
<h4>ov</h4>
<p>Test for overflow.</p>
<pre><var>b</var> = ov <var>i1</var></pre>
<p>The result is <code>1</code> if <code><var>i1</var></code> is the result of an instruction, such as <code>add</code>, <code>sub</code>, or <code>neg</code>, that overflowed the range of a signed 32-bit integer.</p>
<p>Note: nanojit may produce incorrect code if this instruction does not immediately follow the instruction that produced <code><var>i1</var></code>. On Intel, this reads the overflow condition flag. Other platforms have to emulate this behavior.</p>
<h4>cs</h4>
<p>Test for carry.</p>
<pre><var>b</var> = cs <var>i1</var></pre>
<p>The result is <code>1</code> if <code><var>i1</var></code> is the result of an instruction, such as <code>add</code>, that overflowed the range of an unsigned 32-bit integer.</p>
<p>Note: nanojit may produce incorrect code if this instruction does not immediately follow the instruction that produced <code><var>i1</var></code>. On Intel, this reads the carry condition flag. Other platforms have to emulate this behavior.</p>
<h3>Guards</h3>
<p>Note: <code>VerboseWriter::formatGuard</code> is left undefined in nanojit, so applications can display more information about the side exit alongside these instructions.</p>
<h4>loop</h4>
<p>Loop fragment.</p>
<pre>loop</pre>
<h4>x</h4>
<p>Exit unconditionally.</p>
<pre>x</pre>
<h4>xt</h4>
<p>Exit if true.</p>
<pre>xt <var>condition</var></pre>
<h4>xf</h4>
<p>Exit if false.</p>
<pre>xf <var>condition</var></pre>
<h4>xbarrier</h4>
<p>Do not exit, but emit writes to flush all values to the stack, just as a real guard would.</p>
<pre>xbarrier</pre><h3>Forward branches</h3>
<p>To emit forward branches in LIR, first emit a jump instruction. Later, emit a <code>label</code> instruction and use <code>LIns::target</code> to set the target of the jump instruction to the <code>LIns *</code> of the <code>label</code>.</p>
<h4>j</h4>
<p>Jump unconditionally.</p>
<pre>j <var>label</var></pre>
<h4>jt</h4>
<p>Jump if true.</p>
<pre>jt <var>condition</var>, <var>label</var></pre>
<h4>jf</h4>
<p>Jump if false.</p>
<pre>jf <var>condition</var>, <var>label</var></pre>
<h4>label</h4>
<p>A jump target. This LIR instruction is used to hook up jumps to their targets. No machine code is emitted.</p>
<pre><var>label</var>:</pre>
<h4>ji</h4>
<p>Indirect jump.  Currently not implemented.  <a class="link-https" href="https://bugzilla.mozilla.org/show_bug.cgi?id=465582" title="https://bugzilla.mozilla.org/show_bug.cgi?id=465582">Bug 465582</a> proposes to replace with <code>jtbl</code>, a table-based indirect jump with a known set of targets.  </p><h3>Conditional moves</h3>
<p>Note: These two instructions can be written using the idiom <code>lirwriter-&gt;ins2(LIR_cmov, b1, lirwriter-&gt;ins2(LIR_2, i2, i3))</code>. The <code>LIR_2</code> instruction serves only to group the second and third operands, since <code>LirWriter</code> has no <code>ins3</code> method.</p>
<p>The <code>LirWriter::ins_choose()</code> convenience method can be used instead. It uses the above idiom.</p>
<h4>cmov</h4>
<p>Choice of two 32-bit values.</p>
<pre><var>i</var> = cmov <var>b1</var>, <var>i2</var>, <var>i3</var></pre>
<h4>qcmov</h4>
<p>Choice of two 64-bit values.</p>
<p>Note: This instruction currently does not work on 32-bit Intel platforms.</p>
<pre><var>q</var> = qcmov <var>b1</var>, <var>q2</var>, <var>q3</var></pre>
<h3>Special operations</h3>
<h4>start</h4>
<p>Indicates the start of a fragment.</p>
<pre>start</pre>
<h4>nearskip</h4>
<p>Used to skip across blobs of binary data in the LIR, such as guard records. Also used internally to allow <code>LirBuffer</code>s to continue across multiple pages.</p>
<h4>skip</h4>
<p>Used to skip across blobs of binary data in the LIR, such as guard records. Also used internally to allow <code>LirBuffer</code>s to continue across multiple pages.</p>
<h3>Operations that are weird but don't count as special</h3>
<h4>addp</h4>
<p>Integer addition for temporary pointer calculations.</p>
<pre><var>i</var> = addp <var>p1</var>, <var>i2</var></pre>
<p><strong>(Changed in tamarin-redux.)</strong></p>
<p>Like <code>add</code>, but the result is not subject to common subexpression elimination. (This effectively serves as a hint to nanojit that <code>p1</code> should stay in a register if possible and the sum should be discarded after its use, even if the same sum is calculated again later.)</p>
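<p>The difference from <code>add</code> can be illustrated with a toy common-subexpression filter. This is a hedged sketch and not nanojit's CSE implementation: <code>CseWriter</code>, its integer instruction ids, and <code>emitTwice</code> are all hypothetical. The only behavior it models is that identical <code>add</code> instructions are merged while identical <code>addp</code> instructions are not.</p>

```cpp
#include <cassert>
#include <map>
#include <tuple>

// Hypothetical toy, not nanojit's CSE filter.
enum class Op { Add, Addp };

struct CseWriter {
    typedef std::tuple<int, int> Key;  // operand ids of an Add
    std::map<Key, int> seen;           // results of previously emitted Adds
    int nextId;                        // id handed to the next instruction
    int emitted;                       // instructions actually emitted

    CseWriter() : nextId(100), emitted(0) {}

    // Returns the id of the instruction that holds the sum.
    int ins2(Op op, int a, int b) {
        if (op == Op::Add) {
            std::map<Key, int>::iterator it = seen.find(Key(a, b));
            if (it != seen.end())
                return it->second;  // reuse the earlier, identical Add
        }
        int id = nextId++;
        ++emitted;
        if (op == Op::Add)
            seen[Key(a, b)] = id;   // only Add results are remembered
        return id;
    }
};

// Requests the same sum twice with the given opcode and reports how many
// instructions were really emitted.
int emitTwice(Op op) {
    CseWriter w;
    int first = w.ins2(op, 1, 2);
    int second = w.ins2(op, 1, 2);
    // Add must be merged into one instruction; Addp must not be.
    assert((first == second) == (op == Op::Add));
    (void)first;
    (void)second;
    return w.emitted;
}
```

<p>Requesting the same <code>Add</code> twice emits one instruction in this sketch, while requesting the same <code>Addp</code> twice emits two, so each <code>Addp</code> result can be discarded right after its use.</p>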
<h4>param</h4>
<p>Load a parameter.</p>
<pre><var>p</var> = param <var>index</var>
<var>p</var> = param <var>index</var>, <var>kind</var>
</pre>
<p><code><var>kind</var></code> must be 0 or 1. The default is 0.</p>
<p>Nanojit compiles LIR to native code that can be called using a platform-specific calling convention (fastcall on Intel x86). Only pointer-sized, non-floating-point parameters are supported. The <code>param</code> instruction with <code><var>kind</var></code>=0 (the default) loads the value of one of the arguments passed by the caller. <code><var>index</var></code> in this case indicates which parameter to load. 0 indicates the first parameter.</p>
<p>It is up to the application to determine how many parameters, and of what C/C++ types, a LIR fragment takes. <strong>Warning:</strong> On platforms other than Intel x86, there may be an undocumented limit to how many parameters the application may use. Up to 4 parameters should work fine everywhere.</p>
<p>With <code><var>kind</var></code>=1, this instruction is used to enable explicit management of callee-save registers. If a LIR fragment uses this, it must contain exactly one <code>param</code> instruction with <code><var>kind</var></code>=1 for each callee-save register on the target architecture. The result of each of those instructions denotes the value of one callee-save register on entry to the fragment. That value should not be used by the fragment. Instead, filters and particularly the register allocator use the instruction to spill and restore callee-save registers as needed.</p>
<p>To emit this instruction, call <code>LirWriter::insParam(int32_t index, int32_t kind)</code>.</p>
<h4>file</h4>
<p>Source filename for debug symbols.</p>
<pre>file "<var>filename</var>"</pre>
<h4>line</h4>
<p>Source line number for debug symbols.</p>
<pre>line <var>number</var></pre>
<p><code>file</code> and <code>line</code> are used to build symbol tables for the output binary. They have no executable semantics.</p>
<h4>alloc</h4>
<p>Allocate a fixed amount of stack space. The result is a pointer that can be used as a base for loads and stores, which is useful for stack-allocated structs or for variables assigned from more than one place. This is like the C <code>alloca()</code> function, except that <code>LIR_alloc</code> does not take a runtime-computed size.</p>
<pre><var>p</var> = alloc <var>size</var></pre>
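<p>An application may want to check a requested size before emitting <code>alloc</code>. The helpers below are a minimal sketch of the size rule for this instruction (a multiple of 4, no larger than <code>0xffff &lt;&lt; 2</code>). The names <code>kMaxAllocSize</code>, <code>isValidAllocSize</code>, and <code>roundUpAllocSize</code> are hypothetical and are not part of nanojit.</p>

```cpp
#include <cassert>
#include <cstdint>

// Largest size the text allows: 0xffff << 2 == 262140 bytes.
const uint32_t kMaxAllocSize = 0xffffu << 2;

// Hypothetical helper (not part of nanojit): true if `size` is a multiple
// of 4 that does not exceed the limit. The text does not say whether a
// size of 0 is meaningful; this sketch does not reject it.
bool isValidAllocSize(uint32_t size) {
    return size % 4 == 0 && size <= kMaxAllocSize;
}

// Hypothetical helper: rounds a byte count up to the next multiple of 4.
// The caller is expected to pass a size within the limit.
uint32_t roundUpAllocSize(uint32_t size) {
    assert(size <= kMaxAllocSize);
    return (size + 3u) & ~3u;
}
```

<p>For example, an 18-byte struct would be rounded up to 20 bytes before being passed as <code><var>size</var></code>.</p>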
<p><code><var>size</var></code> must be a multiple of 4 that does not exceed 262140 (<code>0xffff &lt;&lt; 2</code>).</p>
<h4>live</h4>
<p>Extend live range of reference.</p>
<pre>live <var>x</var></pre>
<p>Here <code><var>x</var></code> may be the result of any previous instruction that produces a value.</p>