Literal character: a, b
A literal character specifies exactly itself to be matched in the input text.
A single character that is not one of the syntax characters described below.
In regular expressions, most characters can appear literally. They are usually the most basic building blocks of patterns. For example, here is a pattern from the Removing HTML tags example:
const pattern = /<.+?>/g;
In this example, ., +, and ? are called syntax characters. They have special meanings in regular expressions. The rest of the characters in the pattern (< and >) are literal characters. They match themselves in the input text: the left and right angle brackets.
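As a minimal sketch of how the literal angle brackets behave, the snippet below applies the same pattern to a sample string. The sample HTML is made up for illustration and is not part of the original example.

```javascript
// The literal < and > match themselves; ., + and ? are syntax characters.
const pattern = /<.+?>/g;

// Hypothetical input, chosen only for illustration.
const html = "<p>Hello, <b>world</b>!</p>";

// Every tag (a "<", at least one character, then the nearest ">") is removed.
const text = html.replace(pattern, "");
console.log(text); // Hello, world!
```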
The following characters are syntax characters in regular expressions, and they cannot appear as literal characters:
^ $ \ . * + ? ( ) [ ] { } |
Whenever you want to match a syntax character literally, you need to escape it with a backslash (\). For example, to match a literal * in a pattern, you need to write \* in the pattern. Using syntax characters as literal characters either leads to unexpected results or causes syntax errors. For example, /*/ is not a valid regular expression because the quantifier is not preceded by a pattern. In non-unicode mode, ], {, and } may appear literally if it's not possible to parse them as the end of a character class or quantifier delimiters. This is a deprecated syntax for web compatibility, and you should not rely on it.
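The escaping rule can be sketched with a small helper that puts a backslash in front of every syntax character. The function name escapeSyntaxCharacters and the sample strings are hypothetical, introduced here only for illustration.

```javascript
// Hypothetical helper: prefixes each syntax character with a backslash
// so that the resulting pattern matches the text literally.
function escapeSyntaxCharacters(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

const price = "$5.00 (approx.)";
const sentence = "It costs $5.00 (approx.) today";

// Unescaped, $ is an end-of-input assertion, so the pattern cannot match.
const unescaped = new RegExp(price);
console.log(unescaped.test(sentence)); // false

// Escaped, every character of the price is matched literally.
const escaped = new RegExp(escapeSyntaxCharacters(price));
console.log(escaped.test(sentence)); // true

// A single escaped syntax character in a regular expression literal.
console.log(/\*/.test("2 * 3")); // true
```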
Regular expression literals cannot be specified with certain non-syntax literal characters. / cannot appear as a literal character in a regular expression literal, because / is used as the delimiter of the literal itself. You need to escape it as \/ if you want to match a literal /. Line terminators cannot appear as literal characters in a regular expression literal either, because a literal cannot span multiple lines. You need to use a character escape like \n instead. There are no such restrictions when using the RegExp() constructor, although string literals have their own escaping rules (for example, "\\" actually denotes a single backslash character, so new RegExp("\\*") and /\*/ are equivalent).
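The difference between the two ways of creating a pattern can be sketched as follows. The variable names are illustrative only.

```javascript
// In a literal, / must be escaped; in a string passed to RegExp() it need not be.
const slashLiteral = /\//;
const slashConstructed = new RegExp("/");
console.log(slashLiteral.test("a/b"), slashConstructed.test("a/b")); // true true

// "\\*" is the two-character string \* and so produces the same pattern as /\*/.
const starLiteral = /\*/;
const starConstructed = new RegExp("\\*");
console.log(starLiteral.source === starConstructed.source); // true

// A string may contain an actual line feed, which a literal cannot.
const newlineConstructed = new RegExp("\n");
console.log(newlineConstructed.test("a\nb")); // true
```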
Within character classes, more characters can appear literally. For more information, see the Character class page. For example, \. and [.] both match a literal . character.
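A short sketch comparing an escaped dot, a dot inside a character class, and a bare dot. The test strings are arbitrary.

```javascript
const escapedDot = /\./; // backslash makes the dot literal
const classDot = /[.]/;  // inside a character class the dot is already literal
const bareDot = /./;     // unescaped, the dot is a wildcard

console.log(escapedDot.test("a.b"), classDot.test("a.b")); // true true
console.log(escapedDot.test("axb"), classDot.test("axb")); // false false
console.log(bareDot.test("axb")); // true
```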
In non-unicode mode, the pattern is interpreted as a sequence of UTF-16 code units. This means surrogate pairs actually represent two literal characters. This causes unexpected behaviors when paired with other features:
/^[😄]$/.test("😄"); // false, because the pattern is interpreted as /^[\ud83d\ude04]$/
/^😄+$/.test("😄😄"); // false, because the pattern is interpreted as /^\ud83d\ude04+$/
In unicode mode, the pattern is interpreted as a sequence of Unicode code points, and surrogate pairs do not get split. Therefore, you should always prefer to use the u flag.
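To make the contrast concrete, here is a small sketch that runs the same two patterns with and without the u flag, which enables unicode mode.

```javascript
const smiley = "😄";

// One code point, but two UTF-16 code units.
console.log(smiley.length); // 2

// Character class: without u it holds two separate code units.
console.log(/^[😄]$/.test(smiley));  // false
console.log(/^[😄]$/u.test(smiley)); // true

// Quantifier: without u it applies only to the second code unit.
console.log(/^😄+$/.test(smiley + smiley));  // false
console.log(/^😄+$/u.test(smiley + smiley)); // true
```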
Using literal characters
The following example is copied from Character escape. The a and b characters are literal characters in the pattern, and \n is an escaped character because it cannot appear literally in a regular expression literal.
const pattern = /a\nb/;
const string = `a
b`;
console.log(pattern.test(string)); // true
Specification: ECMAScript Language Specification