Regex Lookahead and Lookbehind

Lookarounds are the regex feature that finally lets you say "match X, but only if Y is nearby." They are zero-width assertions — they test a position rather than consuming characters, which means the matched text stays exactly what you wanted without trailing context attached. This guide covers all four forms (positive and negative, ahead and behind), the engine-support gotchas, and the patterns where lookarounds are the only clean tool.

The Four Lookarounds

Every regex engine that supports lookarounds spells them the same way:

(?=X)    Positive lookahead   — must be FOLLOWED by X
(?!X)    Negative lookahead   — must NOT be followed by X
(?<=X)   Positive lookbehind  — must be PRECEDED by X
(?<!X)   Negative lookbehind  — must NOT be preceded by X

The pattern X inside a lookaround can be anything — a literal, a character class, a group, even another lookaround. The lookaround does not advance the engine's position when it succeeds, so subsequent parts of the pattern see the same starting index they would have seen without it.
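A minimal sketch of that zero-width behaviour, using illustrative strings that are not taken from the examples below. The lookahead succeeds at index 3 without consuming anything, so a literal `bar` placed after it still matches the same characters:

```javascript
// A bare lookahead matches an empty string at the position where "bar" begins.
const bare = /(?=bar)/.exec('foobar');
console.log(bare[0] === '', bare.index); // true 3

// Because the assertion consumed nothing, the literal "bar" after it
// matches the very characters the lookahead just inspected.
const full = /foo(?=bar)bar/.exec('foobar');
console.log(full[0]); // foobar

// Without the trailing "bar", the match stops before the asserted text.
const prefixOnly = /foo(?=bar)/.exec('foobar');
console.log(prefixOnly[0]); // foo
```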

Positive Lookahead Examples

Use positive lookahead when you want to match X only if Y comes after, but you do not want Y in the match:

// Match dollar amounts (number, but only when followed by " dollars")
'100 dollars 50 euros'.match(/\d+(?= dollars)/g);
// ['100']

// Match the username portion of an email (text before @)
'jane@example.com'.match(/[\w.+-]+(?=@)/);
// ['jane']

// Match a number followed by a unit (without including the unit)
'5kg of apples and 3oz of pepper'.match(/\d+(?=\s*(kg|oz|lb))/g);
// ['5', '3']

The key is that the matched text never includes the asserted suffix. If you want the suffix, capture it with a group instead.
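For contrast, here is a sketch of the capturing-group alternative, reusing the unit example above. The variable names are illustrative; `matchAll` requires an ES2020 engine:

```javascript
// Capture the number and the unit separately instead of asserting the unit.
const input = '5kg of apples and 3oz of pepper';
const quantities = [...input.matchAll(/(\d+)\s*(kg|oz|lb)/g)].map(
  ([, amount, unit]) => ({ amount: Number(amount), unit })
);
console.log(quantities);
// [ { amount: 5, unit: 'kg' }, { amount: 3, unit: 'oz' } ]
```

Here the full match includes the unit, and the groups let you pick out each part.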

Negative Lookahead Examples

Negative lookahead inverts the assertion — match X only if Y does not follow:

// Numbers NOT followed by " dollars"
'100 dollars 50 euros'.match(/\b\d+\b(?! dollars)/g);
// ['50']

// Words that don't end with "ly"
'quickly slowly fast happy'.match(/\b(?!\w*ly\b)\w+/g);
// ['fast', 'happy']

// Lines that don't contain a specific word
text.match(/^(?!.*ERROR).+$/gm);  // every line that doesn't have ERROR

The "match X only when Y does NOT follow" pattern is one of the most common uses of negative lookahead. It is much cleaner than matching everything and then filtering out the unwanted matches in code.
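The line-filter example above leaves `text` undefined, so here is a self-contained sketch with made-up log lines (the sample data is an assumption for illustration):

```javascript
// Hypothetical log text standing in for the undefined `text` variable.
const text = [
  'INFO server started',
  'ERROR disk full',
  'WARN retrying',
  'INFO ERROR count reset',
].join('\n');

// With the m flag, ^ and $ match at line boundaries; the lookahead
// rejects a line if ERROR appears anywhere on it.
const cleanLines = text.match(/^(?!.*ERROR).+$/gm);
console.log(cleanLines); // [ 'INFO server started', 'WARN retrying' ]
```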

Positive Lookbehind Examples

Lookbehind tests what came before. Positive lookbehind matches X only if Y precedes:

// Numbers preceded by "$"
'$50 and 100 items'.match(/(?<=\$)\d+/g);
// ['50']

// Word after "Mr." or "Mrs." — title-stripped names
'Mr. Smith and Mrs. Jones'.match(/(?<=Mrs?\. )\w+/g);
// ['Smith', 'Jones']

// First word after a heading marker
'## Introduction\n## Setup'.match(/(?<=## )\w+/g);
// ['Introduction', 'Setup']

Lookbehind keeps the match free of the prefix you tested for. Without it, you would have to match the prefix as well and then slice it off in code, which is messier.
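The same property is handy in replacements, since only the matched text is rewritten. A small sketch with an illustrative input string:

```javascript
// Double every dollar amount while leaving the "$" prefix and bare numbers alone.
const priced = '$50 and 100 items, then $7';
const doubled = priced.replace(/(?<=\$)\d+/g, (n) => String(Number(n) * 2));
console.log(doubled); // $100 and 100 items, then $14
```

Because the "$" is asserted rather than matched, the callback never has to put it back.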

Negative Lookbehind Examples

Negative lookbehind matches X only if Y does not precede:

// Numbers NOT preceded by "$"
'$50 and 100 items'.match(/(?<!\$)\b\d+\b/g);
// ['100']

// Words that don't start with "un": lookbehind is the WRONG tool here
'unfair unhappy fair happy'.match(/\b(?<!un)\w+/g);
// ['unfair', 'unhappy', 'fair', 'happy'] (the lookbehind inspects the text
// before each word, which is never "un"); use a lookahead instead: /\b(?!un)\w+/g

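// Find unescaped quotes (in the JS literal below, \\" is a backslash followed by a quote)
'He said "hello" and \\"goodbye\\"'.match(/(?<!\\)"/g);
// ['"', '"'] (only the two quotes around hello)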

The "unescaped delimiter" pattern (negative lookbehind for the escape character) is a classic lookbehind use case. Without it, you have to manually walk the string tracking escape state.
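One caveat with the single-character lookbehind: it treats a quote that follows an escaped backslash (two backslashes, then a quote) as escaped, even though the backslash pair cancels out. A possible refinement, which relies on JavaScript's variable-length lookbehind and is a sketch rather than a full parser, accepts a quote only when an even number of backslashes precedes it:

```javascript
// String.raw keeps the backslashes literal: a \" b \\" c "
const sample = String.raw`a \" b \\" c "`;

// Simple form: rejects any quote that directly follows a backslash.
const simple = [...sample.matchAll(/(?<!\\)"/g)].map((m) => m.index);

// Refined form: zero or more backslash PAIRS may precede the quote,
// and the character before those pairs must not be a backslash.
const refined = [...sample.matchAll(/(?<=(?<!\\)(?:\\\\)*)"/g)].map((m) => m.index);

console.log(simple);  // [ 13 ]
console.log(refined); // [ 9, 13 ]
```

The refined pattern will not work in fixed-width engines, where walking the string may still be the simpler option.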

Combining Lookaheads (Password Validation)

Lookarounds compose. Stacking multiple lookaheads at the same position acts as logical AND, which is the standard regex idiom for password validation:

// Must contain: at least one lowercase, one uppercase, one digit, one symbol; minimum length 8
const strongPassword = /^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[!@#$%^&*]).{8,}$/;

strongPassword.test('Abc123!@');     // true
strongPassword.test('weakpass');     // false — no uppercase, digit, or symbol

Each lookahead independently asserts a property of the input, then .{8,}$ consumes the actual characters and checks length. The match succeeds only if every assertion passes and the body matches.
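The AND behaviour is not specific to passwords. A small sketch with an illustrative word list, where two lookaheads are tested at the same word start:

```javascript
// Words containing BOTH an "a" and an "e": each lookahead scans the word
// from its first letter, then \w+ consumes the word itself.
const words = 'apple tree banana grape';
const both = words.match(/\b(?=\w*a)(?=\w*e)\w+/g);
console.log(both); // [ 'apple', 'grape' ]

// Dropping one assertion loosens the filter accordingly.
const onlyA = words.match(/\b(?=\w*a)\w+/g);
console.log(onlyA); // [ 'apple', 'banana', 'grape' ]
```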

This pattern is included in the common patterns library. Note that pure complexity rules are weaker than length-based entropy — the password generator uses cryptographic randomness and produces stronger passwords than any complexity-validated user input.

Engine Support

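Lookahead is supported by most backtracking engines: JavaScript, Python, Java, .NET, PCRE, and Ruby all have it. Go's regexp package (RE2 syntax) and Rust's regex crate support neither lookahead nor lookbehind. Lookbehind arrived later in some engines, and its limits vary: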

Engine         Lookahead   Lookbehind   Variable-length lookbehind
JavaScript     Yes         ES2018+      Yes
Python re      Yes         Yes          No (fixed-width only)
Python regex   Yes         Yes          Yes
Java           Yes         Yes          Bounded only (needs an obvious maximum length)
.NET           Yes         Yes          Yes
PCRE / PHP     Yes         Yes          Limited (depends on version)
Ruby           Yes         Yes          Limited (Onigmo: top-level alternatives of different lengths only)
Go (RE2)       No          No           N/A

Fixed-width lookbehind means the asserted pattern must have a single known length. (?<=cat) works because "cat" is always 3 characters; (?<=cats?) would fail in fixed-width engines because it can be 3 or 4 characters. Workarounds for fixed-width engines: write multiple lookbehinds joined by alternation, or use a capturing group plus post-processing.
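Both workarounds can be sketched as follows. JavaScript is used only for consistency with the other examples (its engine accepts the variable-length form anyway); the point is that the second and third patterns avoid variable-length lookbehind, so the same idea should carry over to fixed-width engines:

```javascript
const sentence = 'Mr. Smith and Mrs. Jones';

// Variable-length lookbehind: fine in JavaScript, rejected by fixed-width engines.
const viaLookbehind = sentence.match(/(?<=Mrs?\. )\w+/g);

// Portable alternative: consume the title, capture the name, keep group 1.
const viaGroup = [...sentence.matchAll(/Mrs?\. (\w+)/g)].map((m) => m[1]);

// Alternation of fixed-width lookbehinds, one per possible prefix length.
const viaAlternation = sentence.match(/(?:(?<=Mr\. )|(?<=Mrs\. ))\w+/g);

console.log(viaLookbehind, viaGroup, viaAlternation);
// each array is [ 'Smith', 'Jones' ]
```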

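Go's RE2-based engine deliberately omits both lookahead and lookbehind: it guarantees matching in time linear in the input size, and general lookarounds do not fit that guarantee. If you need lookarounds in Go, use a third-party backtracking package, restructure the pattern around capturing groups, or switch to a different language for that particular task.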
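For the JavaScript row of the table, a runtime check is one way to cope with engines that predate ES2018. The helper names below are hypothetical; the sketch relies only on the fact that constructing a RegExp from an unsupported pattern throws a SyntaxError:

```javascript
// Hypothetical helper: returns true when the running engine parses lookbehind.
function supportsLookbehind() {
  try {
    // Built from a string so an old engine fails here, at run time,
    // instead of refusing to parse the whole script.
    new RegExp('(?<=a)b');
    return true;
  } catch (err) {
    return false;
  }
}

// Hypothetical helper: uses lookbehind when available, a capturing group otherwise.
function dollarAmounts(s) {
  if (supportsLookbehind()) {
    return s.match(new RegExp('(?<=\\$)\\d+', 'g')) || [];
  }
  const found = [];
  const re = /\$(\d+)/g;
  let m;
  while ((m = re.exec(s)) !== null) found.push(m[1]);
  return found;
}

console.log(dollarAmounts('$50 and 100 items')); // [ '50' ]
```

Both branches return the same digits; only the fallback has to discard the "$" through a group.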

When NOT to Use Lookarounds

Lookarounds are powerful but not always the right tool. Skip them when:

  • You actually want the surrounding context in the match. If you want to capture "the word after Mr." including the title, use a capturing group instead of a lookbehind.
  • You need maximum portability. Code that may run in Go's RE2 (no lookarounds at all) or in JavaScript engines that predate ES2018 (no lookbehind) benefits from staying within capturing groups.
  • The pattern can be expressed more simply with anchors or character classes. A lookahead like (?=[a-z]) placed directly before a token that consumes the same character is redundant: (?=[a-z])\w+ is better written as [a-z]\w*.
  • You want the match to be obvious to a reader. Stacked lookaheads in a complex pattern can be impenetrable. If you find yourself adding three lookarounds, consider splitting the work between regex and post-validation in code.
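One way to split the work between regex and code, sketched with the same password rules used earlier. The function name and messages are illustrative, not part of any library:

```javascript
// Each rule is a small, readable regex paired with a message.
const rules = [
  [/[a-z]/, 'needs a lowercase letter'],
  [/[A-Z]/, 'needs an uppercase letter'],
  [/\d/, 'needs a digit'],
  [/[!@#$%^&*]/, 'needs a symbol'],
  [/^.{8,}$/, 'needs at least 8 characters'],
];

// Hypothetical helper: returns the message of every rule that fails.
function passwordProblems(password) {
  return rules.filter(([re]) => !re.test(password)).map(([, msg]) => msg);
}

console.log(passwordProblems('Abc123!@')); // []
console.log(passwordProblems('weakpass'));
// [ 'needs an uppercase letter', 'needs a digit', 'needs a symbol' ]
```

Unlike the single stacked-lookahead pattern, this version can tell the user which rule failed.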

Common Mistakes

  • Forgetting that lookbehind is fixed-width in Python's stdlib. Variable patterns like (?<=\$\d*) fail with a "look-behind requires fixed-width pattern" error. Use the third-party regex module if you need variable-length.
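  • Capturing groups inside lookarounds. Groups inside a lookaround still capture and still count toward group numbering. After a positive lookaround succeeds, its groups hold the text the assertion matched, even though that text is not part of the overall match. Groups inside a negative lookaround are never set, because the assertion succeeds only when its pattern fails. Use (?:...) inside lookarounds unless you need the capture.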
  • Negative lookahead allowing partial matches. A pattern like \d+(?! dollars) can match "10" of "100 dollars" because the lookahead succeeds at "0 dollars" — anchor with \b to enforce word-aligned matching.
  • Confusing the order: (?=X) looks forward from the current position; (?<=X) looks backward. The = always means "positive"; the ! always means "negative".
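  • Trying to use lookarounds in Go. RE2 supports neither lookahead nor lookbehind. regexp.Compile returns an error (and regexp.MustCompile panics) when the pattern is compiled at run time, not when the program is built, so a pattern on a rarely exercised code path can go unnoticed until production.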
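Two of the mistakes above are easy to reproduce. A short sketch with illustrative inputs:

```javascript
// Pitfall: backtracking lets \d+ give up a digit so the lookahead passes.
const partial = '100 dollars'.match(/\d+(?! dollars)/);
console.log(partial[0]); // 10

// Word boundaries stop the engine from splitting the number.
const anchored = '100 dollars'.match(/\b\d+\b(?! dollars)/);
console.log(anchored); // null

// A group inside a positive lookahead still captures and is numbered.
const withUnit = /(\d+)(?= (dollars|euros))/.exec('50 euros');
console.log(withUnit[0]); // 50
console.log(withUnit[2]); // euros

// A group inside a negative lookahead stays unset after a successful match.
const notDollars = /(\d+)\b(?! (dollars))/.exec('50 euros');
console.log(notDollars[2]); // undefined
```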

Try Lookarounds Live

The regex tester uses the JavaScript engine, which supports all four lookarounds and variable-length lookbehind. Paste any pattern from this guide, paste sample text, and watch matches highlight in real time. For unfamiliar patterns, paste them into the regex explainer for a token-by-token breakdown — lookarounds are explicitly labelled in the output.

Frequently Asked Questions

What is the difference between lookahead and lookbehind?

Lookahead checks what comes AFTER the current position; lookbehind checks what comes BEFORE. Both are zero-width assertions — they match a position, not characters, so the matched text never includes what the lookaround tested. Positive lookahead (?=X) requires X to follow; negative lookahead (?!X) requires X to NOT follow. Positive lookbehind (?<=X) requires X to precede; negative lookbehind (?<!X) requires X to NOT precede.

Are lookaheads zero-width?

Yes. A lookaround tests a position without consuming characters. After the assertion succeeds, the regex engine's position has not advanced, so the next part of the pattern continues from where the lookaround started. This is what makes lookaheads useful: you can require that something follows the current match without including it in the matched text. The classic example is password validation: (?=.*\d) asserts that a digit appears somewhere later in the string, but the assertion itself consumes nothing, so the rest of the pattern still starts matching from the first character.

Which engines support lookbehind?

Lookbehind is supported in PCRE (PHP), Python's re module, Java's Pattern class, .NET, Ruby's Onigmo, and JavaScript since ES2018 (Chrome 62+, Firefox 78+, Safari 16.4+, Node 10+). Go's regexp package does not support lookbehind (or lookahead). Most engines historically required fixed-width lookbehind, meaning the asserted pattern had to have a single known length. Modern JavaScript, .NET, and the third-party Python regex module support variable-length lookbehind. Python's standard re module still requires fixed-width lookbehind.

Can I use multiple lookaheads in one regex?

Yes. Multiple lookaheads at the same position effectively AND-combine, which is the standard pattern for password validation. (?=.*[a-z])(?=.*[A-Z])(?=.*\d).{8,} demands a lowercase letter, an uppercase letter, and a digit somewhere in the rest of the line, and .{8,} then requires at least 8 characters. Each lookahead independently tests the same starting position, and the overall match succeeds only if all of them succeed. This composition is one of the most powerful uses of lookarounds.

Should I use a lookahead or a capturing group?

Use a lookahead when you need to assert something is present without including it in the match, for example when finding a number not followed by particular text. Use a capturing group when you also need the surrounding text, or when you are happy to match the context and read the part you care about from the group. Lookarounds keep the match clean (no extra characters to strip), but capturing groups are more portable across engines that may not support all lookaround forms.