Parser API

  • Revision slug: SpiderMonkey/Parser_API
  • Revision title: Parser API
  • Revision id: 35993
  • Created:
  • Creator: Dherman
  • Is current revision? No
  • Comment 25 words added

Revision Content

Upcoming builds of the standalone SpiderMonkey shell will include a reflection of the SpiderMonkey parser, made available as an in-JavaScript API. This makes it easier to write tools in JavaScript that manipulate JavaScript source programs, such as syntax highlighters or static analyzers.

Example:

> var expr = Reflect.parse("obj.foo + 42").body[0].expression
> expr.left.property
({loc:null, type:"Identifier", name:"foo"})
> expr.right
({loc:{start:{line:1, column:10}, end:{line:1, column:12}}, type:"Literal", value:42})

Built-in objects

In the SpiderMonkey shell, the global object includes a singleton Reflect object, which currently contains just the parse method.

Reflect methods

The Reflect object currently consists of a single method.

parse(string)

Parses the input string as a JavaScript program and returns a Program object (see below) representing the parsed abstract syntax tree (AST).

Node objects

Parsing produces Node objects, which are plain JavaScript objects (i.e., their prototype derives from the standard Object prototype). All node types implement the following interface:

interface Node {
    type: string;
    loc: { start: Position, end: Position } | null;
}

The type field is a string representing the AST variant type. Each subtype of Node is documented below with the specific string of its type field. You can use this field to determine which interface a node implements.
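
As a minimal sketch of using the type field, the following walks a tree and counts nodes by type. The helper names walk and countTypes and the hand-built sample tree (for obj.foo + 42) are illustrative assumptions, not part of the API; the walker treats any object with a string type field as a node.

```javascript
// Visit every node reachable from `node`, calling `visit` on each one.
// A value counts as a node if it is an object with a string `type` field.
function walk(node, visit) {
    if (node === null || typeof node !== "object") return;
    if (Array.isArray(node)) {
        node.forEach(function (child) { walk(child, visit); });
        return;
    }
    if (typeof node.type === "string") visit(node);
    Object.keys(node).forEach(function (key) {
        // Skip source locations; they contain positions, not nodes.
        if (key !== "loc") walk(node[key], visit);
    });
}

// Build a map from type string to the number of nodes of that type.
function countTypes(root) {
    var counts = {};
    walk(root, function (node) {
        counts[node.type] = (counts[node.type] || 0) + 1;
    });
    return counts;
}

// Hand-built tree for `obj.foo + 42`, following the interfaces in this document.
var sampleTree = {
    type: "ExpressionStatement", loc: null,
    expression: {
        type: "BinaryExpression", loc: null,
        operator: { type: "BinaryOperator", loc: null, token: "+" },
        left: {
            type: "MemberExpression", loc: null, computed: false,
            object: { type: "Identifier", loc: null, name: "obj" },
            property: { type: "Identifier", loc: null, name: "foo" }
        },
        right: { type: "Literal", loc: null, value: 42 }
    }
};

var typeCounts = countTypes(sampleTree);
```

Because the walker relies only on the type field, it keeps working for node types it has never seen.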

The loc field represents the source location information of the node. If the parser produced no information about the node's source location, the field is null; otherwise it is an object consisting of a start position (the position of the first character of the parsed source region) and an end position (the position of the first character after the parsed source region).

Each Position object consists of a line number (1-indexed) and a column number (0-indexed):

interface Position {
    line: uint32 >= 1;
    column: uint32 >= 0;
}
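
The indexing conventions above (1-indexed lines, 0-indexed columns, exclusive end position) can be sketched as a helper that recovers the source text of a node. The function name sourceOf is hypothetical, and the sketch assumes lines are separated by "\n" only.

```javascript
// Return the source text covered by `loc`, or null when `loc` is null.
// Lines are 1-indexed, columns are 0-indexed, and the end position is exclusive.
function sourceOf(source, loc) {
    if (loc === null) return null;
    var lines = source.split("\n");   // assumption: "\n" is the only line terminator
    var startLine = loc.start.line - 1;
    var endLine = loc.end.line - 1;
    if (startLine === endLine) {
        return lines[startLine].slice(loc.start.column, loc.end.column);
    }
    var parts = [lines[startLine].slice(loc.start.column)];
    for (var i = startLine + 1; i < endLine; i++) {
        parts.push(lines[i]);
    }
    // The end position may sit just past the final line of the source.
    parts.push((lines[endLine] || "").slice(0, loc.end.column));
    return parts.join("\n");
}

// The location reported for the literal 42 in the example at the top of this page.
var literalLoc = { start: { line: 1, column: 10 }, end: { line: 1, column: 12 } };
var literalText = sourceOf("obj.foo + 42", literalLoc);
```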

Programs

interface Program <: Node {
    type: "Program";
    body: [ Statement ];
}

A complete program source tree.

Statements

interface Statement <: Node { }

Any statement.

interface EmptyStatement <: Statement {
    type: "EmptyStatement";
}

An empty statement, i.e., a solitary semicolon.

interface BlockStatement <: Statement {
    type: "BlockStatement";
    body: [ Statement ];
}

A block statement, i.e., a sequence of statements surrounded by braces.

interface ExpressionStatement <: Statement {
    type: "ExpressionStatement";
    expression: Expression;
}

An expression statement, i.e., a statement consisting of a single expression.

interface IfStatement <: Statement {
    type: "IfStatement";
    test: Expression;
    consequent: Statement;
    alternate: Statement | null;
}

An if statement.

interface LabelledStatement <: Statement {
    type: "LabelledStatement";
    label: Identifier;
    body: Statement;
}

A labelled statement, i.e., a statement prefixed by a break/continue label.

interface BreakStatement <: Statement {
    type: "BreakStatement";
    label: Identifier | null;
}

A break statement.

interface ContinueStatement <: Statement {
    type: "ContinueStatement";
    label: Identifier | null;
}

A continue statement.

interface WithStatement <: Statement {
    type: "WithStatement";
    object: Expression;
    body: Statement;
}

A with statement.

interface SwitchStatement <: Statement {
    type: "SwitchStatement";
    test: Expression;
    cases: [ SwitchCase ];
}

A switch statement.

interface ReturnStatement <: Statement {
    type: "ReturnStatement";
    argument: Expression | null;
}

A return statement.

interface ThrowStatement <: Statement {
    type: "ThrowStatement";
    argument: Expression;
}

A throw statement.

interface TryStatement <: Statement {
    type: "TryStatement";
    block: BlockStatement;
    handler: CatchClause | [ CatchClause ] | null;
    finalizer: BlockStatement | null;
}

A try statement. If the source contains more than one catch clause, the handler property is an array.
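
Since handler may be a single clause, an array, or null, consumers may find it convenient to normalize it first. A minimal sketch, with the hypothetical helper name catchClauses and hand-built sample nodes:

```javascript
// Return the catch clauses of a TryStatement as an array (possibly empty).
function catchClauses(tryNode) {
    var handler = tryNode.handler;
    if (handler === null) return [];
    return Array.isArray(handler) ? handler : [handler];
}

// Hand-built helpers and nodes for illustration only.
function emptyBlock() {
    return { type: "BlockStatement", loc: null, body: [] };
}
function clause(name) {
    return {
        type: "CatchClause", loc: null,
        param: { type: "Identifier", loc: null, name: name },
        guard: null,
        body: emptyBlock()
    };
}

var tryFinallyOnly = { type: "TryStatement", loc: null, block: emptyBlock(),
                       handler: null, finalizer: emptyBlock() };
var tryOneCatch = { type: "TryStatement", loc: null, block: emptyBlock(),
                    handler: clause("e"), finalizer: null };
var tryTwoCatches = { type: "TryStatement", loc: null, block: emptyBlock(),
                      handler: [clause("e1"), clause("e2")], finalizer: null };
```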

Note: multiple catch clauses are SpiderMonkey-specific.

interface WhileStatement <: Statement {
    type: "WhileStatement";
    test: Expression;
    body: Statement;
}

A while statement.

interface DoWhileStatement <: Statement {
    type: "DoWhileStatement";
    body: Statement;
    test: Expression;
}

A do/while statement.

interface ForStatement <: Statement {
    type: "ForStatement";
    init: VariableDeclaration | Expression | null;
    test: Expression | null;
    update: Expression | null;
    body: Statement;
}

A for statement.

interface ForInStatement <: Statement {
    type: "ForInStatement";
    left: VariableDeclaration | Expression;
    right: Expression;
    body: Statement;
    each: boolean;
}

A for/in statement, or, if each is true, a for each/in statement.

Note: the for each form is SpiderMonkey-specific.

interface DebuggerStatement <: Statement {
    type: "DebuggerStatement";
}

A debugger statement.

Note: the debugger statement is standardized in ECMAScript 5th edition; SpiderMonkey supported it before then.

Declarations

In SpiderMonkey, declarations can appear in any statement context, so the parser API treats function and variable declarations as statements.

Note: declarations in arbitrary nested scopes are SpiderMonkey-specific.

interface Declaration <: Statement { }

Any declaration node.

interface FunctionDeclaration <: Declaration {
    type: "FunctionDeclaration";
    id: Identifier;
    params: [ Identifier ];
    body: BlockStatement;
}

A function declaration.

interface VariableDeclaration <: Declaration {
    type: "VariableDeclaration";
    declarations: [ { id: Identifier, init: Expression | null } ];
    kind: "var" | "let" | "const";
}

A variable declaration, via one of var, let, or const.

Note: let and const are SpiderMonkey-specific.
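
Because declarations appear directly in statement lists, a tool can collect declared names by scanning a list of statements. The sketch below uses a hypothetical helper, declaredNames, and a hand-built statement list; it only inspects the statements it is given and does not descend into nested blocks or function bodies.

```javascript
// Collect the names declared directly in a list of statements.
function declaredNames(statements) {
    var names = [];
    statements.forEach(function (stmt) {
        if (stmt.type === "FunctionDeclaration") {
            names.push(stmt.id.name);
        } else if (stmt.type === "VariableDeclaration") {
            stmt.declarations.forEach(function (decl) {
                names.push(decl.id.name);
            });
        }
    });
    return names;
}

// Hand-built statement list for: var a = 1, b; function f() {}
var sampleStatements = [
    {
        type: "VariableDeclaration", loc: null, kind: "var",
        declarations: [
            { id: { type: "Identifier", loc: null, name: "a" },
              init: { type: "Literal", loc: null, value: 1 } },
            { id: { type: "Identifier", loc: null, name: "b" }, init: null }
        ]
    },
    {
        type: "FunctionDeclaration", loc: null,
        id: { type: "Identifier", loc: null, name: "f" },
        params: [],
        body: { type: "BlockStatement", loc: null, body: [] }
    }
];

var sampleNames = declaredNames(sampleStatements);
```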

Expressions

interface Expression <: Node, Pattern { }

Any expression node. Since the left-hand side of an assignment may be any expression in general, an expression can also be a pattern.

interface ThisExpression <: Expression {
    type: "ThisExpression";
}

A this expression.

interface ArrayExpression <: Expression {
    type: "ArrayExpression";
    elements: [ Expression | null ];
}

An array expression.

interface ObjectExpression <: Expression {
    type: "ObjectExpression";
    properties: [ { key: StringLiteral | Identifier | IntegerLiteral, value: Expression | Getter | Setter } ];
}

An object expression.

interface FunctionExpression <: Expression {
    type: "FunctionExpression";
    id: Identifier | null;
    params: [ Identifier ];
    body: BlockStatement;
}

A function expression.

interface SequenceExpression <: Expression {
    type: "SequenceExpression";
    expressions: [ Expression ];
}

A sequence expression, i.e., a comma-separated sequence of expressions.

interface UnaryExpression <: Expression {
    type: "UnaryExpression";
    operator: UnaryOperator;
    prefix: boolean;
    argument: Expression;
}

A unary operator expression.

interface BinaryExpression <: Expression {
    type: "BinaryExpression";
    operator: BinaryOperator;
    left: Expression;
    right: Expression;
}

A binary operator expression.
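
As an illustration of consuming BinaryExpression nodes, here is a minimal sketch of a constant evaluator for arithmetic on literals. The name evaluate is hypothetical, and the sketch assumes the operator is a BinaryOperator node carrying its symbol in a token field, as described under Miscellaneous.

```javascript
// Evaluate a tree made only of Literal nodes and arithmetic BinaryExpressions.
function evaluate(node) {
    if (node.type === "Literal") return node.value;
    if (node.type === "BinaryExpression") {
        var left = evaluate(node.left);
        var right = evaluate(node.right);
        switch (node.operator.token) {
        case "+": return left + right;
        case "-": return left - right;
        case "*": return left * right;
        case "/": return left / right;
        case "%": return left % right;
        }
        throw new Error("unsupported operator: " + node.operator.token);
    }
    throw new Error("unsupported node type: " + node.type);
}

// Small constructors for hand-built sample nodes (illustration only).
function lit(value) {
    return { type: "Literal", loc: null, value: value };
}
function bin(token, left, right) {
    return {
        type: "BinaryExpression", loc: null,
        operator: { type: "BinaryOperator", loc: null, token: token },
        left: left, right: right
    };
}

// (1 + 2) * 3
var product = evaluate(bin("*", bin("+", lit(1), lit(2)), lit(3)));
```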

interface AssignmentExpression <: Expression {
    type: "AssignmentExpression";
    operator: AssignmentOperator;
    left: Expression;
    right: Expression;
}

An assignment operator expression.

interface UpdateExpression <: Expression {
    type: "UpdateExpression";
    operator: UpdateOperator;
    argument: Expression;
    prefix: boolean;
}

An update (increment or decrement) operator expression.
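
The prefix flag determines on which side of the operand the operator appears. A minimal sketch, using the hypothetical helper updateToString and assuming the argument is a simple Identifier:

```javascript
// Render an UpdateExpression whose argument is an Identifier back to source.
function updateToString(node) {
    var target = node.argument.name;   // assumption: argument is an Identifier
    var token = node.operator.token;   // "++" or "--"
    return node.prefix ? token + target : target + token;
}

// Hand-built sample nodes for `x++` and `--y`.
var postIncrement = {
    type: "UpdateExpression", loc: null, prefix: false,
    operator: { type: "UpdateOperator", loc: null, token: "++" },
    argument: { type: "Identifier", loc: null, name: "x" }
};
var preDecrement = {
    type: "UpdateExpression", loc: null, prefix: true,
    operator: { type: "UpdateOperator", loc: null, token: "--" },
    argument: { type: "Identifier", loc: null, name: "y" }
};
```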

interface LogicalExpression <: Expression {
    type: "LogicalExpression";
    operator: LogicalOperator;
    left: Expression;
    right: Expression;
}

A logical operator expression.

interface ConditionalExpression <: Expression {
    type: "ConditionalExpression";
    test: Expression;
    alternate: Expression;
    consequent: Expression;
}

A conditional expression, i.e., a ternary ?/: expression.

interface NewExpression <: Expression {
    type: "NewExpression";
    constructor: Expression;
    arguments: [ Expression ] | null;
}

A new expression.

interface CallExpression <: Expression {
    type: "CallExpression";
    callee: Expression;
    arguments: [ Expression ];
}

A function or method call expression.

interface MemberExpression <: Expression {
    type: "MemberExpression";
    object: Expression;
    property: Identifier | Expression;
    computed: boolean;
}

A member expression. If computed === true, the node corresponds to a computed e1[e2] expression and property is an Expression. If computed === false, the node corresponds to a static e1.x expression and property is an Identifier.
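
The distinction between computed and static member access can be sketched as a small printer. The function name memberToString is hypothetical; the sketch handles only identifiers, this, and string, number, boolean, or null literals.

```javascript
// Render a chain of member expressions back to source text.
function memberToString(node) {
    if (node.type === "Identifier") return node.name;
    if (node.type === "ThisExpression") return "this";
    if (node.type === "Literal") return JSON.stringify(node.value);
    if (node.type === "MemberExpression") {
        var object = memberToString(node.object);
        return node.computed
            ? object + "[" + memberToString(node.property) + "]"   // e1[e2]
            : object + "." + node.property.name;                    // e1.x
    }
    throw new Error("unsupported node type: " + node.type);
}

// Small constructors for hand-built sample nodes (illustration only).
function ident(name) {
    return { type: "Identifier", loc: null, name: name };
}
function member(object, property, computed) {
    return { type: "MemberExpression", loc: null,
             object: object, property: property, computed: computed };
}

// obj.foo["bar"][0]
var chain = member(
    member(
        member(ident("obj"), ident("foo"), false),
        { type: "Literal", loc: null, value: "bar" }, true),
    { type: "Literal", loc: null, value: 0 }, true);
var chainText = memberToString(chain);
```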

Patterns

More to come.

Clauses

interface SwitchCase <: Node {
    type: "SwitchCase";
    test: Expression | null;
    consequent: [ Statement ];
}

A case (if test is an Expression) or default (if test === null) clause in the body of a switch statement.
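
For example, a tool can locate the default clause of a switch statement by looking for the case whose test is null. A minimal sketch with the hypothetical helper defaultCase and a hand-built node:

```javascript
// Return the default clause of a SwitchStatement, or null if there is none.
function defaultCase(switchNode) {
    var found = switchNode.cases.filter(function (c) {
        return c.test === null;
    });
    return found.length > 0 ? found[0] : null;
}

// Hand-built node for: switch (x) { case 1: ; default: }
var sampleSwitch = {
    type: "SwitchStatement", loc: null,
    test: { type: "Identifier", loc: null, name: "x" },
    cases: [
        { type: "SwitchCase", loc: null,
          test: { type: "Literal", loc: null, value: 1 },
          consequent: [ { type: "EmptyStatement", loc: null } ] },
        { type: "SwitchCase", loc: null, test: null, consequent: [] }
    ]
};
```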

interface CatchClause <: Node {
    type: "CatchClause";
    param: Identifier;
    guard: Expression | null;
    body: BlockStatement;
}

A catch clause following a try block. The optional guard property corresponds to the optional expression guard on the bound variable.

Note: the guard expression is SpiderMonkey-specific.

Miscellaneous

interface Identifier <: Node, Expression, Pattern {
    type: "Identifier";
    name: string;
}

An identifier. Note that an identifier may be an expression or a destructuring pattern.

interface Literal <: Node, Expression {
    type: "Literal";
    value: string | boolean | null | number | RegExp;
}

A literal token. Note that a literal can be an expression.

interface UnaryOperator <: Node {
    type: "UnaryOperator";
    token: "-" | "+" | "!" | "~" | "typeof" | "void" | "delete";
}

A unary operator token.

interface BinaryOperator <: Node {
    type: "BinaryOperator";
    token: "==" | "!=" | "===" | "!=="
            | "<" | "<=" | ">" | ">="
            | "<<" | ">>" | ">>>"
            | "+" | "-" | "*" | "/" | "%"
            | "|" | "^" | "&"
            | "in" | "instanceof"
            | ".."
}

A binary operator token.

Note: the .. operator is E4X-specific.

interface LogicalOperator <: Node {
    type: "LogicalOperator";
    token: "||" | "&&";
}

A logical operator token.

interface AssignmentOperator <: Node {
    type: "AssignmentOperator";
    token: "=" | "+=" | "-=" | "*=" | "/=" | "%="
            | "<<=" | ">>=" | ">>>="
            | "|=" | "^=" | "&="
}

An assignment operator token.

interface UpdateOperator <: Node {
    type: "UpdateOperator";
    token: "++" | "--";
}

An update (increment or decrement) operator token.

E4X

The following node types are for E4X.

More to come.
