
This section describes JavaScript's lexical grammar. The source text of ECMAScript scripts is scanned from left to right and converted into a sequence of input elements: tokens, control characters, line terminators, comments, or white space. ECMAScript also defines certain keywords and literals, and has rules for automatic insertion of semicolons to end statements.

Control characters

Control characters have no visual representation but are used to control the interpretation of the text.

Unicode format-control characters
Code point Name Abbreviation Description
U+200C Zero width non-joiner <ZWNJ> Placed between characters to prevent them from being connected into ligatures in certain languages (Wikipedia).
U+200D Zero width joiner <ZWJ> Placed between characters that would not normally be connected, to cause them to be rendered in their connected form in certain languages (Wikipedia).
U+FEFF Byte order mark <BOM> Used at the start of the script to mark it as Unicode and the text's byte order (Wikipedia).
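
The behavior of these characters can be observed from inside a script. The following is a minimal sketch, assuming an ES2015-capable engine such as Node.js; the variable names are illustrative and not part of the original text. It relies on the facts that the byte order mark counts as white space, while ZWNJ and ZWJ do not and may appear inside identifiers.

```javascript
// U+FEFF (BOM) is treated as white space, so trim() strips it.
const withBom = "\uFEFFabc";
const trimmed = withBom.trim();

// U+200C (ZWNJ) is a format-control character, not white space,
// so trim() leaves it in place.
const withZwnj = "\u200Cabc";
const zwnjLength = withZwnj.trim().length;

// ZWJ may appear inside an identifier (after the first character).
// The source text is built in a string so that the invisible
// character shows up here as an escape sequence.
const readIdentifier = new Function("var a\u200Db = 1; return a\u200Db;");
const identifierValue = readIdentifier();

console.log(trimmed, zwnjLength, identifierValue); // abc 4 1
```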

White space

White space characters improve the readability of the source text and separate tokens from each other. These characters are usually unnecessary for the functionality of the code. Minification tools are often used to remove white space in order to reduce the amount of data that needs to be transferred.

White space characters
Code point Name Abbreviation Description Escape sequence
U+0009 Character tabulation <HT> Horizontal tabulation \t
U+000B Line tabulation <VT> Vertical tabulation \v
U+000C Form feed <FF> Page breaking control character (Wikipedia). \f
U+0020 Space <SP> Normal space  
U+00A0 No-break space <NBSP> Normal space, but no point at which a line may break  
Others Other Unicode space characters <USP> Spaces in Unicode on Wikipedia  
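
As a small illustration of the table above (a sketch with illustrative variable names, not taken from the original), white space between tokens does not change the result of an expression, and the listed escape sequences map to the listed code points:

```javascript
// White space between tokens has no effect on the result.
const compact=1+2*3;
const spaced  =  1   +   2 * 3;

// Escape sequences from the table and their code points.
const tab = "\t".charCodeAt(0);          // 0x0009
const verticalTab = "\v".charCodeAt(0);  // 0x000B
const formFeed = "\f".charCodeAt(0);     // 0x000C

// No-break space has no escape sequence of its own, but it is
// white space and is matched by \s in regular expressions.
const nbspIsWhiteSpace = /\s/.test("\u00A0");

console.log(compact === spaced, tab, verticalTab, formFeed, nbspIsWhiteSpace);
// true 9 11 12 true
```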

Line terminators

In addition to white space characters, line terminator characters are used to improve the readability of the source text. However, in some cases, line terminators can influence the execution of JavaScript code, as there are a few places where they are forbidden. Line terminators also affect the process of automatic semicolon insertion. Line terminators are matched by the \s class in regular expressions.

Only the following Unicode code points are treated as line terminators in ECMAScript. Other line breaking characters are treated as white space (for example, Next Line, NEL, U+0085 is considered as white space).

Line terminator characters
Code point Name Abbreviation Description Escape sequence
U+000A Line Feed <LF> New line character in UNIX systems. \n
U+000D Carriage Return <CR> New line character in Commodore and early Mac systems. \r
U+2028 Line Separator <LS> Wikipedia  
U+2029 Paragraph Separator <PS> Wikipedia  
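
The four line terminators can be handled uniformly in code. The sketch below is one possible approach, not something prescribed by the specification; the LINE_TERMINATOR name and the sample text are assumptions made for illustration. It splits a string on every line terminator (treating CR followed by LF as a single break) and checks the regular expression behavior mentioned above.

```javascript
// Matches CR+LF as one break, or any single line terminator.
const LINE_TERMINATOR = /\r\n|[\n\r\u2028\u2029]/;

const text = "one\ntwo\r\nthree\u2028four\u2029five\rsix";
const lines = text.split(LINE_TERMINATOR);

// Every line terminator is matched by the \s class.
const allMatchWhitespaceClass = ["\n", "\r", "\u2028", "\u2029"]
  .every((ch) => /\s/.test(ch));

// By default, the dot does not match a line terminator.
const dotMatchesLineFeed = /./.test("\n");

console.log(lines.length, allMatchWhitespaceClass, dotMatchesLineFeed);
// 6 true false
```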

Comments

Comments are used to add hints, notes, suggestions, or warnings to JavaScript code. This can make it easier to read and understand. They can also be used to disable code to prevent it from being executed; this can be a valuable debugging tool.

JavaScript has two ways of adding comments to its code.

The first way is the // comment; this makes all text following it on the same line into a comment. For example:

function comment() {
  // This is a one line JavaScript comment
  alert("Hello world!");
}
comment();

The second way is the /* */ style, which is much more flexible.

For example, you can use it on a single line:

function comment() {
  /* This is a one line JavaScript comment */
  alert("Hello world!");
}
comment();

You can also make multiple-line comments, like this:

function comment() {
  /* This comment spans multiple lines. Notice
     that we don't need to end the comment until we're done. */
  alert("Hello world!");
}
comment();

You can also use it in the middle of a line, if you wish, although this can make your code harder to read so it should be used with caution:

function comment(x) {
  alert("Hello " + x /* insert the value of x */ + " !");
}
comment("world");

In addition, you can use it to disable code to prevent it from running, by wrapping code in a comment, like this:

function comment() {
  /* alert("Hello world!"); */
}
comment();

In this case, the alert() call is never issued, since it's inside a comment. Any number of lines of code can be disabled this way.

Keywords

Reserved keywords as of ECMAScript 6

Future reserved keywords

The following are reserved as future keywords by the ECMAScript specification. They have no special functionality at present, but they might at some future time, so they cannot be used as identifiers. These keywords may not be used in either strict or non-strict mode.

  • enum

The following are reserved as future keywords when they are found in strict mode code:

  • implements
  • package
  • protected
  • static
  • interface
  • private
  • public
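
The difference between the two lists can be checked at run time. This is a minimal sketch, assuming a standards-conforming engine; the parses helper is hypothetical and simply reports whether the Function constructor accepts a piece of source text as a function body.

```javascript
// Hypothetical helper: true if the source parses as a function body.
function parses(source) {
  try {
    new Function(source);
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) return false;
    throw e;
  }
}

// "implements" is only reserved in strict mode code.
const sloppyImplements = parses("var implements = 1;");
const strictImplements = parses('"use strict"; var implements = 1;');

// "enum" is reserved in both strict and non-strict code.
const sloppyEnum = parses("var enum = 1;");

console.log(sloppyImplements, strictImplements, sloppyEnum);
// true false false
```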

Future reserved keywords in older standards

The following are reserved as future keywords by older ECMAScript specifications (ECMAScript 1 through 3).

  • abstract
  • boolean
  • byte
  • char
  • double
  • final
  • float
  • goto
  • int
  • long
  • native
  • short
  • synchronized
  • transient
  • volatile

Additionally, the literals null, true, and false are reserved in ECMAScript for their normal uses.

Reserved word usage

Reserved words actually only apply to Identifiers (as opposed to IdentifierNames). As described in es5.github.com/#A.1, the following are all IdentifierNames, which do not exclude ReservedWords.

a.import
a["import"]
a = { import: "test" }

On the other hand the following is illegal because it's an Identifier, which is an IdentifierName without the reserved words. Identifiers are used for FunctionDeclaration and FunctionExpression.

function import() {} // Illegal.
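
The distinction can be verified with a short script. The following is a sketch under the assumption of a conforming engine; the object and variable names are illustrative. The illegal declaration is passed to the Function constructor as a string so that the rest of the example still parses.

```javascript
// Reserved words are allowed wherever an IdentifierName is expected.
const a = { import: "test" };
const viaDot = a.import;
const viaBracket = a["import"];

// A function name must be an Identifier, so this source is rejected.
let declarationError = null;
try {
  new Function("function import() {}");
} catch (e) {
  declarationError = e;
}

console.log(viaDot, viaBracket, declarationError instanceof SyntaxError);
// test test true
```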

Literals

Null literal

See also null for more information.

null

Boolean literal

See also Boolean for more information.

true
false

Numeric literals

Decimal

1234567890

Binary

Binary number syntax uses a leading zero followed by a lowercase or uppercase letter "B" (0b or 0B). If the digits after the 0b are not 0 or 1, a SyntaxError is thrown, for example: "Missing binary digits after 0b".

var FLT_SIGNBIT  = 0b10000000000000000000000000000000; // 2147483648
var FLT_EXPONENT = 0b01111111100000000000000000000000; // 2139095040
var FLT_MANTISSA = 0B00000000011111111111111111111111; // 8388607

Octal

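Octal number syntax uses a leading zero followed by a lowercase or uppercase letter "O" (0o or 0O). If the digits after the 0o are outside the range 0 to 7, a SyntaxError is thrown, for example: "Missing octal digits after 0o".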

var n = 0O755; // 493
var m = 0o644; // 420

Hexadecimal

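Hexadecimal number syntax uses a leading zero followed by a lowercase or uppercase letter "X" (0x or 0X). If the digits after the 0x are outside the hexadecimal range (0123456789ABCDEF), a SyntaxError is thrown, for example: "Identifier starts immediately after numeric literal".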

0xFFFFFFFFFFFFFFFFF // 295147905179352830000
0x123456789ABCDEF   // 81985529216486900
0XA                 // 10
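
The three prefixed notations are only different spellings of the same Number values. The sketch below assumes an engine with ECMAScript 6 numeric literal support; the parses helper is hypothetical and the exact error messages vary between engines, so only the error type is checked.

```javascript
// The same kind of value, written in different bases.
const binary = 0b1010; // 10
const octal = 0o755;   // 493
const hex = 0xFF;      // 255

// Converting back to text in the corresponding base.
const octalText = (493).toString(8); // "755"
const binaryText = (10).toString(2); // "1010"

// Hypothetical helper: true if the source parses as a function body.
function parses(source) {
  try {
    new Function(source);
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) return false;
    throw e;
  }
}

// Digits outside the allowed range are rejected at parse time.
const badBinary = parses("return 0b12;");
const badOctal = parses("return 0o8;");
const badHex = parses("return 0xG;");

console.log(binary, octal, hex, octalText, binaryText);
// 10 493 255 755 1010
console.log(badBinary, badOctal, badHex);
// false false false
```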

String literals

'foo'
"bar"

Hexadecimal escape sequences

'\xA9' // "©"

Unicode escape sequences

The Unicode escape sequences require exactly four hexadecimal digits following \u.

'\u00A9' // "©"

Unicode code point escapes

New in ECMAScript 6. With Unicode code point escapes, any character can be escaped using hexadecimal numbers so that it is possible to use Unicode code points up to 0x10FFFF. With simple Unicode escapes it is often necessary to write the surrogate halves separately to achieve the same.

See also String.fromCodePoint() or String.prototype.codePointAt().

'\u{2F804}'

// the same with simple Unicode escapes
'\uD87E\uDC04'
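
The relationship between the three escape forms can be checked directly. This is a minimal sketch, assuming an engine that supports ECMAScript 6 code point escapes; the variable names are illustrative.

```javascript
// Hexadecimal and four-digit Unicode escapes for the same character.
const copyrightHex = "\xA9";
const copyrightUnicode = "\u00A9";

// A code point above U+FFFF, written both ways.
const astral = "\u{2F804}";
const surrogates = "\uD87E\uDC04";

const sameCopyright = copyrightHex === copyrightUnicode; // true
const sameAstral = astral === surrogates;                // true

// The string holds two UTF-16 code units but one code point.
const codeUnits = astral.length;               // 2
const codePoints = [...astral].length;         // 1
const firstCodePoint = astral.codePointAt(0);  // 0x2F804
const rebuilt = String.fromCodePoint(0x2F804); // same string as astral

console.log(sameCopyright, sameAstral, codeUnits, codePoints);
// true true 2 1
```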

Regular expression literals

See also RegExp for more information.

/ab+c/g
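
A regular expression literal consists of a pattern between slashes, optionally followed by flags. As a rough sketch (variable names and the sample string are illustrative), the literal form is equivalent to calling the RegExp constructor with the same pattern and flags, and an unknown flag letter is rejected with a SyntaxError:

```javascript
// Literal form and constructor form of the same expression.
const literal = /ab+c/g;
const constructed = new RegExp("ab+c", "g");

const samePattern = literal.source === constructed.source; // true
const sameFlags = literal.flags === constructed.flags;     // true

// With the g flag, match() returns every match in the string.
const matches = "abc abbc ac".match(literal); // ["abc", "abbc"]

// An unknown flag such as "f" is a SyntaxError. The constructor is
// used here so that the rest of the example still parses.
let flagError = null;
try {
  new RegExp("ab+c", "f");
} catch (e) {
  flagError = e;
}

console.log(samePattern, sameFlags, matches, flagError instanceof SyntaxError);
```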

Automatic semicolon insertion

Some JavaScript statements must be terminated with semicolons and are therefore affected by automatic semicolon insertion (ASI):

  • Empty statement
  • let, const, variable statement
  • import, export, module declaration
  • Expression statement
  • debugger
  • continue, break, throw
  • return

The ECMAScript specification mentions three rules of semicolon insertion.

1.  A semicolon is inserted before a token that is not allowed by the grammar, when that token is preceded by a line terminator or is "}".

{ 1
2 } 3

// is transformed by ASI into 

{ 1
;2 ;} 3;
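
The effect of this rule can be observed by asking the engine whether a piece of source text parses. The following is a sketch with a hypothetical parses helper, which wraps the text in a function body using the Function constructor; it is an illustration, not a description of how engines implement ASI.

```javascript
// Hypothetical helper: true if the source parses as a function body.
function parses(source) {
  try {
    new Function(source);
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) return false;
    throw e;
  }
}

// With a line terminator between 1 and 2, a semicolon can be inserted.
const withLineBreak = parses("{ 1\n2 } 3");

// On a single line, 2 is neither preceded by a line terminator
// nor a "}", so no semicolon is inserted and parsing fails.
const onOneLine = parses("{ 1 2 } 3");

console.log(withLineBreak, onOneLine); // true false
```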

2.  A semicolon is inserted at the end of the input when the end of the input stream of tokens is reached and the parser is unable to parse the input stream as a complete program.

Here ++ is not treated as a postfix operator applying to variable b, because a line terminator occurs between b and ++.

a = b
++c

// is transformed by ASI into

a = b;
++c;
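
Running the same statements with concrete values shows which variable is incremented. This is a small sketch; the initial values are assumptions chosen for illustration, and the two statements are deliberately written without semicolons.

```javascript
let b = 1;
let c = 1;
let a;

// Written without semicolons on purpose, as in the example above.
a = b
++c

// b is assigned to a unchanged, and c is incremented by the prefix ++.
console.log(a, b, c); // 1 1 2
```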

3.  A semicolon is inserted when a statement with a restricted production in the grammar is followed by a line terminator. The statements with "no LineTerminator here" rules are:

  • PostfixExpressions (++ and --)
  • continue
  • break
  • return
  • yield, yield*
  • module

return
a + b

// is transformed by ASI into

return;
a + b;
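
This rule is a common source of bugs when a returned expression is moved onto its own line. The sketch below uses two hypothetical functions to contrast the problem with one possible way to avoid it, which is to open a parenthesis on the same line as return.

```javascript
// A semicolon is inserted right after "return", so the function
// returns undefined and the expression below is never evaluated.
function brokenSum(a, b) {
  return
  a + b;
}

// The opening parenthesis on the same line as "return" keeps the
// expression part of the return statement.
function fixedSum(a, b) {
  return (
    a + b
  );
}

console.log(brokenSum(1, 2)); // undefined
console.log(fixedSum(1, 2));  // 3
```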

Specifications

Specification Status Comment
ECMAScript 1st Edition. Standard Initial definition.
ECMAScript 5.1 (ECMA-262), the definition of 'Lexical Conventions' in that specification. Standard
ECMAScript 6 (ECMA-262), the definition of 'Lexical Grammar' in that specification. Draft Added: Binary and Octal Numeric literals, Unicode code point escapes, Templates

Browser compatibility

Feature Chrome Firefox (Gecko) Internet Explorer Opera Safari
Basic support (Yes) (Yes) (Yes) (Yes) (Yes)
Binary and octal numeric literals (0b and 0o) (Yes) 25 (25) ? ? ?
Unicode code point escapes (\u{}) ? Not supported (bug 952985) ? ? ?

Feature Android Chrome for Android Firefox Mobile (Gecko) IE Mobile Opera Mobile Safari Mobile
Basic support (Yes) (Yes) (Yes) (Yes) (Yes) (Yes)
Binary and octal numeric literals ? ? 25 (25) ? ? ?
Unicode code point escapes ? ? Not supported (bug 952985) ? ? ?

Firefox-specific notes

  • Prior to Firefox 5 (JavaScript 1.8.6), future reserved keywords could be used when not in strict mode. This ECMAScript violation was fixed in Firefox 5.

Document Tags and Contributors

 Contributors to this page: Sheppy
 Last updated by: Sheppy,