JavaScript Regex Guide
JavaScript's regex engine has caught up with the rest of the world over the last decade — lookbehind, named groups, Unicode property escapes, sticky matching, and matchAll all landed in modern engines. This guide is a practical reference: which method to call, which flag to set, which gotcha to watch for, and where the engine differs from PCRE, Python, or Java. Pair it with the regex tester and the explainer to learn by experimenting.
Two Ways to Create a Regex
JavaScript has two regex syntaxes: a literal with slashes, and the RegExp constructor:
// Literal — pattern is fixed at parse time
const literal = /\d{3}-\d{4}/g;
// Constructor — pattern can be built from variables
const dynamic = new RegExp(`\\d{${minDigits}}`, 'g');
Use the literal form whenever the pattern is known at write time. It is more readable and the parser checks the pattern at parse time rather than at first use. Use new RegExp() only when you need to build the pattern from runtime input — but be careful to escape user input first, or you create a regex injection vulnerability.
The literal needs single backslashes (/\d/) while the constructor needs doubled ones ("\\d") because the string parser eats one. This trips up everyone at least once.
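As a minimal sketch of the escaping step mentioned above, the helper below backslash-escapes the characters that carry special meaning in a pattern before the text reaches the constructor. The name escapeRegex and the sample input are assumptions made for illustration; they are not built-ins.

```javascript
// Hypothetical helper: backslash-escape every regex metacharacter in `text`.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Assumed user input; unescaped, the dot and parentheses would act as syntax.
const userInput = '1.5 (approx)';
const safe = new RegExp(escapeRegex(userInput), 'g');

const hits = 'cost: 1.5 (approx), not 105 (approx)'.match(safe);
console.log(hits); // [ '1.5 (approx)' ]
```

Without the helper, the same input would also match "105 approx", because the dot becomes a wildcard and the parentheses become a group.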
Flags
This guide covers seven regex flags, which can be combined freely (ES2024 adds an eighth, v, which is not covered here and cannot be combined with u):
const re = /pattern/gimsuy;
- g (global): find all matches, not just the first. Required for matchAll, replaceAll, and most extraction tasks.
- i (case-insensitive): match letters regardless of case.
- m (multiline): make ^ and $ match line boundaries instead of string boundaries.
- s (dotAll): make . match newline characters. Without this, . skips \n.
- u (Unicode): enable proper Unicode handling. Treats surrogate pairs as one character and unlocks \p{...} property escapes.
- y (sticky): match only at the position indicated by lastIndex. Useful for tokenizers; rare in everyday code.
- d (hasIndices): added in ES2022. Each match includes indices with the start/end positions of every capture group.
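The less familiar flags are easier to see in action. The sketch below, which uses an assumed two-line sample string, shows the effect of m, s, and d side by side.

```javascript
const text = 'first line\nsecond line';

// m: ^ matches at the start of every line, not only at the start of the string.
const lineStarts = text.match(/^\w+/gm);
console.log(lineStarts); // [ 'first', 'second' ]

// s: . also matches the newline, so the pattern can span both lines.
const withoutS = /first.*second/.test(text);
const withS = /first.*second/s.test(text);
console.log(withoutS, withS); // false true

// d: the match carries an `indices` array with [start, end] for each group.
const m = /(\w+) line/d.exec(text);
console.log(m.indices[1]); // [ 0, 5 ]
```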
The g flag is the source of most regex bugs in JavaScript. See the section on lastIndex below.
String Methods vs RegExp Methods
JavaScript exposes regex through both the String prototype and RegExp objects:
String methods (regex is the argument)
str.match(re) // returns array of matches, or null
str.matchAll(re) // returns iterator of detailed match objects (g flag required)
str.replace(re, fn) // replace first match (or all if g flag set)
str.replaceAll(re, fn) // replace all (g flag required if regex)
str.split(re) // split by matches
str.search(re) // index of first match, or -1
RegExp methods (regex is the receiver)
re.test(str) // returns true/false — the cheapest check
re.exec(str) // returns one detailed match; advances lastIndex if g flag set
Picking between them:
- Just checking a match exists? re.test(str): fastest and clearest.
- Want all matches with capture groups? str.matchAll(re): returns an iterator of full match objects.
- Replacing or transforming? str.replace or str.replaceAll with a callback.
- Tokenizing with positional control? re.exec in a loop with the y flag.
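The last case, exec in a loop with the y flag, might look like the sketch below. The token grammar (integers plus four operators) and the names TOKEN and tokenize are assumptions chosen for illustration, not part of any library.

```javascript
// Hypothetical token grammar: integers and the operators + - * /,
// separated by optional whitespace.
const TOKEN = /\s*(?:(?<num>\d+)|(?<op>[+\-*\/]))\s*/y;

function tokenize(source) {
  const tokens = [];
  TOKEN.lastIndex = 0; // start at the beginning of every new input
  while (TOKEN.lastIndex < source.length) {
    const start = TOKEN.lastIndex; // saved because a failed exec resets lastIndex to 0
    const m = TOKEN.exec(source);
    if (m === null) {
      throw new SyntaxError(`Unexpected input at index ${start}`);
    }
    if (m.groups.num !== undefined) {
      tokens.push({ type: 'num', value: Number(m.groups.num) });
    } else {
      tokens.push({ type: 'op', value: m.groups.op });
    }
  }
  return tokens;
}

const tokens = tokenize('12 + 3*4');
console.log(tokens.map(t => t.value)); // [ 12, '+', 3, '*', 4 ]
```

Because the y flag anchors each attempt at lastIndex, unexpected input fails immediately at that position instead of being silently skipped, which is the behavior a tokenizer usually wants.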
The lastIndex Trap
Regex objects with the g or y flag carry a mutable lastIndex property. Every call to exec or test updates it to the position after the match, and the next call resumes from there. If you reuse the same regex, this produces results that look random:
const re = /\d+/g;
re.test('abc123'); // true — matched, lastIndex = 6
re.test('abc123'); // false — resumed from index 6, no more matches
re.test('abc123'); // true — wrapped to 0, matched again
Three solutions, in order of preference:
- Use matchAll or match instead of exec in a loop. They handle iteration internally without exposing lastIndex.
- Construct the regex inline if you only use it once: str.match(/\d+/g).
- Reset explicitly: re.lastIndex = 0 before reuse, but this is easy to forget.
This bug is so common that many style guides forbid the g flag with test entirely. If you only need a boolean, use test without g.
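A short sketch of the difference, using assumed sample inputs: a shared global regex gives alternating answers from test, while a regex without g and an inline literal passed to matchAll behave predictably.

```javascript
const inputs = ['abc123', 'abc123', 'abc123'];

// Shared global regex: test() resumes from lastIndex, so results alternate.
const shared = /\d+/g;
const flaky = inputs.map(s => shared.test(s));
console.log(flaky); // [ true, false, true ]

// Same check without g: lastIndex is ignored and never advanced.
const stateless = /\d+/;
const stable = inputs.map(s => stateless.test(s));
console.log(stable); // [ true, true, true ]

// Extraction with an inline literal: a fresh regex on every call.
const numbers = [...'a1 b22 c333'.matchAll(/\d+/g)].map(m => m[0]);
console.log(numbers); // [ '1', '22', '333' ]
```

The inline literal matters for matchAll as well: it works on a copy of the regex, and that copy starts from the original's current lastIndex, so passing a previously used global regex can still skip early matches.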
Named Capture Groups
JavaScript has supported named groups since ES2018:
const m = '2026-04-30'.match(/(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/);
m.groups.year; // '2026'
m.groups.month; // '04'
m.groups.day; // '30'
// In replace callbacks
'2026-04-30'.replace(
/(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/,
(_, ...args) => {
const groups = args[args.length - 1];
return `${groups.day}/${groups.month}/${groups.year}`;
}
);
// '30/04/2026'
// In replacement strings
'2026-04-30'.replace(
/(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/,
'$<day>/$<month>/$<year>'
);
Backreferences to named groups use \k<name> in the pattern. The named-group syntax is identical to .NET's, but differs from Python's re module, which uses (?P<name>...). When porting regex from Python, do find-and-replace: (?P< → (?<, (?P=name) → \k<name>.
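To illustrate \k<name>, here is a small sketch that matches a quoted string whose closing quote must be the same character as its opening quote. The group names quote and body and the sample strings are assumptions for the example.

```javascript
// \k<quote> must match the same text that the group named "quote" captured.
const quoted = /(?<quote>['"])(?<body>.*?)\k<quote>/;

const m = quoted.exec(`say "it's fine" now`);
console.log(m.groups.body); // it's fine

// Mismatched quotes do not match.
const mismatched = quoted.test(`"unterminated'`);
console.log(mismatched); // false
```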
Lookahead and Lookbehind
Lookarounds match a position based on what comes before or after, without consuming any characters:
// Positive lookahead — followed by " dollars"
'100 dollars'.match(/\d+(?= dollars)/); // ['100']
// Negative lookahead — NOT followed by " euros"
'100 dollars'.match(/\d+(?! euros)/); // ['100']
// Positive lookbehind — preceded by "$"
'$50'.match(/(?<=\$)\d+/); // ['50']
// Negative lookbehind — NOT preceded by "$"
'50 items'.match(/(?<!\$)\d+/); // ['50']
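JavaScript's lookbehind landed in ES2018 (Chrome 62, Firefox 78, Safari 16.4, Node 10). Earlier browsers throw a SyntaxError. Unlike engines that restrict lookbehind to a fixed or bounded length (Python's re module requires fixed width, and Java has traditionally required a bounded length), JavaScript supports variable-length lookbehind, so patterns like (?<=\$\d*) work fine.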
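As a sketch of what variable-length lookbehind allows, the pattern below accepts either a currency code or a symbol, followed by any amount of whitespace, before the number. The currency list and the sample text are assumptions for illustration.

```javascript
// The lookbehind body has no fixed width: "USD", "$" or "€", then any whitespace.
const amount = /(?<=(?:USD|\$|€)\s*)\d+(?:\.\d+)?/g;

const found = 'Total: USD 1200, tax $  95.50, qty 3'.match(amount);
console.log(found); // [ '1200', '95.50' ]
```

The trailing "3" is skipped because nothing in the lookbehind matches before it, and the currency marker itself never becomes part of the match.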
Full coverage in the lookahead and lookbehind guide.
Unicode and the u Flag
Without u, JavaScript treats regex as a sequence of UTF-16 code units. Characters above U+FFFF (emoji, rare scripts) are surrogate pairs that the engine sees as two characters, breaking the otherwise-intuitive idea that . matches one character. The u flag fixes this:
// Without u — broken for non-BMP characters
/^.$/.test('💩'); // false — emoji is two code units
// With u — correct
/^.$/u.test('💩'); // true
The u flag also enables Unicode property escapes — \p{...} and \P{...} for any character property defined in the Unicode standard:
/\p{Letter}/u // any letter from any script
/\p{Script=Greek}/u // Greek letters
/\p{Number}/u // numeric characters in any script (digits, but also fractions, Roman numerals, etc.)
/\p{Emoji}/u // emoji (note: also matches the ASCII digits, # and *)
Always use u when matching anything potentially non-ASCII. The performance cost is negligible.
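A brief sketch of the u flag on mixed-script text follows; the sample string is an assumption. It extracts words with \p{L} (letters) plus \p{M} (combining marks), and contrasts code units with code points.

```javascript
// \p{L} matches letters in any script; \p{M} keeps combining marks attached.
const word = /[\p{L}\p{M}]+/gu;
const words = 'naïve Ελληνικά 東京 42'.match(word);
console.log(words); // [ 'naïve', 'Ελληνικά', '東京' ]

// A non-BMP character is two UTF-16 code units but one code point.
const units = '💩'.length;
const codePoints = [...'💩'].length;
console.log(units, codePoints); // 2 1
```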
Replacement Callback Patterns
The replace callback gets the full match, capture groups, offset, original string, and (for named groups) a groups object:
'$15.50'.replace(/\$(\d+)\.(\d{2})/, (match, dollars, cents, offset, str, groups) => {
return `${dollars} dollars and ${cents} cents`;
});
// '15 dollars and 50 cents'
Common patterns:
// HTML escape
str.replace(/[&<>"']/g, c => ({
'&':'&amp;', '<':'&lt;', '>':'&gt;', '"':'&quot;', "'":'&#39;'
}[c]));
// camelCase → kebab-case
str.replace(/([A-Z])/g, '-$1').toLowerCase();
// Strip leading/trailing whitespace (or use String.prototype.trim)
str.replace(/^\s+|\s+$/g, '');
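// Highlight matches in HTML (escape user input first!)
// escapeRegex is a small helper that backslash-escapes regex metacharacters;
// the built-in global escape() does not do this
text.replace(new RegExp(escapeRegex(query), 'gi'), m => `<mark>${m}</mark>`);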
Common Gotchas
- Forgetting to escape user input in new RegExp(): a malicious input like .* turns your filter into a wildcard. Use a small escapeRegex helper or a library function like lodash's _.escapeRegExp.
- Catastrophic backtracking: patterns like (a+)+b can hang the JS engine on long inputs that almost match. Avoid nested quantifiers. JavaScript has no atomic groups or possessive quantifiers, and a lazy *? does not help on its own, so rewrite the pattern so that each character can be matched in only one way (a+b instead of (a+)+b).
- Forgetting the g flag with replace: without it, only the first match is replaced. replaceAll requires g if the first argument is a regex (otherwise it throws).
- Confusing $& and $1 in replacement strings: $& is the whole match; $1 is the first capture group; $$ is a literal $.
- Greedy quantifiers eating too much: for "match between two delimiters," use .*? (lazy) or a negated character class like [^<]+.
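Several of these gotchas are easiest to check with a quick experiment. The sketch below uses assumed sample strings to show the replacement tokens, greedy versus lazy matching, and a flattened alternative to a nested quantifier.

```javascript
// $& is the whole match, $1 and $2 are groups, $$ is a literal dollar sign.
const priced = 'apple 5, pear 7'.replace(/(\w+) (\d+)/g, '[$&] $1 costs $$$2');
console.log(priced); // [apple 5] apple costs $5, [pear 7] pear costs $7

// Greedy .* runs to the last closing tag; lazy .*? and [^<]+ stop at the first.
const html = '<b>one</b> and <b>two</b>';
const greedy = html.match(/<b>.*<\/b>/)[0];
const lazy = html.match(/<b>.*?<\/b>/)[0];
const negated = html.match(/<b>[^<]+<\/b>/)[0];
console.log(greedy); // <b>one</b> and <b>two</b>
console.log(lazy, negated); // <b>one</b> <b>one</b>

// a+b accepts the same strings as (a+)+b but fails fast on near-misses.
const flat = /^a+b$/;
const nearMiss = flat.test('a'.repeat(40) + 'c');
console.log(nearMiss); // false
```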
Try Patterns Live
Paste any pattern from this guide into the regex tester with sample text to see matches in real time. For unfamiliar tokens, the regex explainer gives a plain-English breakdown of every piece. Start from the pattern library for ready-made email, URL, IP, and date patterns.
Frequently Asked Questions
Should I use String methods or RegExp methods in JavaScript?
Use String methods (match, matchAll, replace, replaceAll, split, search) when you are operating on a string and the regex is just an argument; they read naturally as text operations. Use RegExp methods (test, exec) when you have a long-lived regex object you reuse, or when you need exec's positional state via lastIndex for tokenizer-style iteration. test() is also the cheapest way to check if a pattern matches at all, since it returns a boolean and can stop on the first match.
Why does my regex with the g flag behave inconsistently?
RegExp objects with the g flag carry a lastIndex property that exec and test mutate after each call. If you reuse the same regex across calls, the next call resumes from where the last one left off, which produces surprising results. Solutions: use String.prototype.matchAll instead of repeated exec calls, reset lastIndex to 0 between batches, or construct a fresh regex per batch. matchAll returns an iterator and avoids the stateful lastIndex pitfall entirely.
Does JavaScript support lookbehind?
Yes — lookbehind landed in ES2018 and is supported in all modern browsers (Chrome 62+, Firefox 78+, Safari 16.4+) and Node.js 10+. Use (?<=pattern) for positive lookbehind and (?<!pattern) for negative lookbehind. Unlike some engines, JavaScript supports variable-length lookbehind, so patterns like (?<=\$\d*) work. If you must support very old browsers, fall back to a capturing group and slice the result.
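The fallback mentioned above can be sketched as follows. Building the lookbehind pattern with the constructor means an unsupported engine throws at run time, where it can be caught, rather than failing to parse the whole script. The function names supportsLookbehind and priceDigits are assumptions for illustration.

```javascript
// Feature-detect lookbehind: the constructor throws SyntaxError where unsupported.
function supportsLookbehind() {
  try {
    new RegExp('(?<=a)b');
    return true;
  } catch (e) {
    return false;
  }
}

// Return the digits that follow a "$", or null if there are none.
function priceDigits(text) {
  if (supportsLookbehind()) {
    const m = new RegExp('(?<=\\$)\\d+').exec(text);
    return m ? m[0] : null;
  }
  // Fallback: match the "$" as well, then keep only capture group 1.
  const m = /\$(\d+)/.exec(text);
  return m ? m[1] : null;
}

console.log(priceDigits('cost $50')); // 50
console.log(priceDigits('50 items')); // null
```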
How do I match Unicode characters in JavaScript regex?
Add the u flag for proper Unicode handling: surrogate pairs are treated as one character, shorthand classes like \w stay ASCII-based (use \p{...} when you need Unicode-aware classes), and Unicode property escapes \p{...} become available. With the u flag, /\p{Letter}/u matches any letter from any script, /\p{Emoji}/u matches emoji, and /\p{Script=Greek}/u matches Greek letters. Without the u flag, JavaScript matches against individual UTF-16 code units, which mishandles characters above U+FFFF.
What are named capture groups in JavaScript regex?
Named groups use the syntax (?<name>pattern), and matches are accessed via match.groups.name or in a replace callback's groups argument. They make complex patterns far more readable than relying on positional groups. JavaScript's syntax matches .NET's, but differs from Python's re module, which uses (?P<name>pattern). When migrating regex from Python, replace (?P< with (?< and (?P= with \k< for backreferences.