Regex Cheat Sheet — Regular Expression Quick Reference

Character Classes

Character classes let you match specific sets or ranges of characters. They are the foundation of almost every regex pattern.

Pattern	Description	Example
`.`	Any character except newline	`a.c` matches "abc", "a1c"
`\d`	Any digit (0-9)	`\d{3}` matches "123"
`\D`	Any non-digit	`\D+` matches "abc"
`\w`	Word character (a-z, A-Z, 0-9, _)	`\w+` matches "hello_1"
`\W`	Non-word character	`\W` matches "@" in "a@b"
`\s`	Whitespace (space, tab, newline)	`a\sb` matches "a b"
`\S`	Non-whitespace character	`\S+` matches "hello"
`[abc]`	Any one of a, b, or c	`[aeiou]` matches vowels
`[^abc]`	Any character except a, b, or c	`[^0-9]` matches non-digits
`[a-z]`	Any character in range a through z	`[A-Za-z]` matches any letter
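
As a quick check of the table above, here is a minimal sketch using Python's standard `re` module. The sample strings are made up for illustration. Keep in mind that in Python 3, `\d` and `\w` also match non-ASCII digits and letters unless you pass `re.ASCII`.

```python
import re

text = "Order #42: 3 apples, 12 pears_ok"  # made-up sample input

digits = re.findall(r"\d+", text)              # runs of digits
words = re.findall(r"\w+", text)               # runs of letters, digits, underscore
vowels = re.findall(r"[aeiou]", "regex")       # any one listed character
non_digits = re.findall(r"[^0-9]+", "a1b22c")  # negated class: anything but 0-9

print(digits)      # ['42', '3', '12']
print(words)       # ['Order', '42', '3', 'apples', '12', 'pears_ok']
print(vowels)      # ['e', 'e']
print(non_digits)  # ['a', 'b', 'c']
```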

Quantifiers

Quantifiers specify how many times the preceding element must occur. By default they are greedy (match as much as possible). Append ? to make them lazy (match as little as possible).

Pattern	Description	Example
`*`	Zero or more	`ab*c` matches "ac", "abc", "abbc"
`+`	One or more	`ab+c` matches "abc", "abbc" but not "ac"
`?`	Zero or one (optional)	`colou?r` matches "color" and "colour"
`{n}`	Exactly n times	`\d{4}` matches "2026"
`{n,}`	n or more times	`\d{2,}` matches "12", "123", "1234"
`{n,m}`	Between n and m times (inclusive)	`\d{2,4}` matches "12", "123", "1234"
`*?`	Zero or more (lazy)	`<.*?>` matches one HTML tag at a time
`+?`	One or more (lazy)	`".+?"` matches individual quoted strings
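
The counting behaviour in the table can be sketched in Python as follows. The inputs are illustrative only; `re.fullmatch` is used so that each result reflects the whole string.

```python
import re

# * allows zero b's, + requires at least one.
star = [bool(re.fullmatch(r"ab*c", s)) for s in ("ac", "abc", "abbc")]
plus = [bool(re.fullmatch(r"ab+c", s)) for s in ("ac", "abc", "abbc")]

# {2,4} is greedy: it takes up to four digits when they are available.
greedy = re.findall(r"\d{2,4}", "1 12 123 12345")
# {2,4}? is lazy: it stops as soon as two digits have matched.
lazy = re.findall(r"\d{2,4}?", "12345")

print(star)    # [True, True, True]
print(plus)    # [False, True, True]
print(greedy)  # ['12', '123', '1234']
print(lazy)    # ['12', '34']
```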

Anchors

Anchors assert a position in the string rather than matching a character. They are essential for ensuring your pattern matches at the correct location.

Pattern	Description	Example
`^`	Start of string (or line with `m` flag)	`^Hello` matches "Hello world"
`$`	End of string (or line with `m` flag)	`world$` matches "Hello world"
`\b`	Word boundary	`\bcat\b` matches "cat" but not "catalog"
`\B`	Non-word boundary	`\Bcat\B` matches "cat" in "concatenate"
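
A short Python sketch of the anchors above, using made-up strings. Python's `re.MULTILINE` plays the role of the `m` flag.

```python
import re

# \b requires a word boundary on both sides of "cat".
whole_word = re.findall(r"\bcat\b", "cat catalog concatenate cat.")
# \B requires that there is NO boundary, so "cat" must sit inside a word.
inside = re.search(r"\Bcat\B", "concatenate")

lines = "first line\nsecond line"
start_of_string = re.findall(r"^\w+", lines)
start_of_lines = re.findall(r"^\w+", lines, flags=re.MULTILINE)
end_of_lines = re.findall(r"\w+$", lines, flags=re.MULTILINE)

print(whole_word)       # ['cat', 'cat']
print(inside.span())    # (3, 6)
print(start_of_string)  # ['first']
print(start_of_lines)   # ['first', 'second']
print(end_of_lines)     # ['line', 'line']
```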

Groups and Backreferences

Groups let you treat multiple characters as a single unit, capture matched text for later use, or apply alternation. Backreferences let you match the same text that was previously captured.

Pattern	Description	Example
`(abc)`	Capturing group	`(ha)+` matches "haha"
`(?:abc)`	Non-capturing group	`(?:ha)+` groups without capturing
`(?<name>abc)`	Named capturing group	`(?<year>\d{4})` captures as "year"
`\1`	Backreference to group 1	`(\w+)\s\1` matches "the the"
`a|b`	Alternation (a or b)	`cat|dog` matches "cat" or "dog"
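
To see backreferences and alternation in action, here is a minimal Python sketch. The sentence is a made-up example, and the added `\b` anchors are an assumption on top of the table's pattern, so that "the there" is not treated as a repeated word.

```python
import re

text = "it was the the best of times"

# \1 must repeat exactly what group 1 captured.
dup = re.search(r"\b(\w+)\s\1\b", text)

# In a Python replacement string, \1 inserts the captured text.
fixed = re.sub(r"\b(\w+)\s\1\b", r"\1", text)

# Alternation tries the left branch first, then the right.
pets = re.findall(r"cat|dog", "cat, dog, bird")

# A quantifier applied to a group repeats the whole group.
laugh = re.fullmatch(r"(?:ha)+", "hahaha")

print(dup.group(0))       # the the
print(dup.group(1))       # the
print(fixed)              # it was the best of times
print(pets)               # ['cat', 'dog']
print(laugh is not None)  # True
```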

Lookahead and Lookbehind

Lookaround assertions check for a pattern ahead or behind the current position without including it in the match. They are zero-width, meaning they do not consume characters.

Pattern	Description	Example
`(?=abc)`	Positive lookahead	`\d+(?= dollars)` matches "100" in "100 dollars"
`(?!abc)`	Negative lookahead	`\d+(?! dollars)` matches "100" in "100 euros"
`(?<=abc)`	Positive lookbehind	`(?<=\$)\d+` matches "50" in "$50"
`(?<!abc)`	Negative lookbehind	`(?<!\$)\d+` matches "50" in "50 items"

Lookbehind support varies by engine. JavaScript added lookbehind in ES2018, and current versions of the major browsers support it. Python and Java have supported lookbehind for many years. Note that some engines restrict what a lookbehind may contain: Python's `re` module, for example, requires lookbehind patterns to be fixed-width.
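
The sketch below runs the four assertions in Python on made-up strings. It also shows a backtracking caveat with the negative lookahead (the `\b` anchors are an added assumption, not part of the table's pattern) and uses a hypothetical `compiles` helper to show that Python's `re` module rejects a variable-width lookbehind.

```python
import re

ahead = re.findall(r"\d+(?= dollars)", "100 dollars, 200 euros")

# Caveat: \d+ can give up a digit to satisfy the negative lookahead.
naive = re.findall(r"\d+(?! dollars)", "100 dollars")
# Word boundaries force the whole number to match, or nothing.
bounded = re.findall(r"\b\d+\b(?! dollars)", "100 dollars, 200 euros")

behind = re.findall(r"(?<=\$)\d+", "$50 and 60 items")
not_behind = re.findall(r"(?<!\$)\b\d+", "$50 and 60 items")

def compiles(pattern):
    """Hypothetical helper: True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

print(ahead)       # ['100']
print(naive)       # ['10']
print(bounded)     # ['200']
print(behind)      # ['50']
print(not_behind)  # ['60']
print(compiles(r"(?<=\$)\d+"))   # True: fixed-width lookbehind
print(compiles(r"(?<=\$+)\d+"))  # False: variable-width lookbehind
```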

Flags (Modifiers)

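Flags change how the regex engine interprets your pattern. In JavaScript, flags are appended after the closing delimiter: /pattern/flags. The letters in the table below are the JavaScript flags. In Python, the equivalents are constants such as re.IGNORECASE, re.MULTILINE, and re.DOTALL, passed as the flags argument to re.compile(). Python has no g flag; use re.findall() or re.finditer() to get every match.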

Flag	Name	Description
`g`	Global	Find all matches, not just the first one
`i`	Case-insensitive	`/hello/i` matches "Hello", "HELLO", "hello"
`m`	Multiline	`^` and `$` match start/end of each line, not the whole string
`s`	Dotall (single-line)	`.` matches newline characters as well
`u`	Unicode	Enables full Unicode matching and proper surrogate pair handling
`y`	Sticky	Matches only at the index indicated by `lastIndex`
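
For Python users, a rough sketch of how the `i`, `m`, and `s` flags map onto the `re` module. The input strings are invented for the example.

```python
import re

# re.IGNORECASE corresponds to i; re.findall plays the role of g.
hellos = re.findall(r"hello", "Hello HELLO hello", flags=re.IGNORECASE)

# re.DOTALL corresponds to s: "." also matches "\n".
no_dotall = re.search(r"a.b", "a\nb")
dotall = re.search(r"a.b", "a\nb", flags=re.DOTALL)

# Flags combine with |; re.MULTILINE corresponds to m.
names = re.findall(
    r"^name: (\w+)$",
    "Name: ada\nNAME: bob",
    flags=re.IGNORECASE | re.MULTILINE,
)

print(hellos)              # ['Hello', 'HELLO', 'hello']
print(no_dotall)           # None
print(dotall is not None)  # True
print(names)               # ['ada', 'bob']
```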

Common Regex Patterns

These are practical patterns for everyday validation and extraction tasks. Keep in mind that real-world data can be messy, so these patterns cover common formats rather than every possible edge case. Test them with the Regex Tester before using them in production. For a fuller searchable library — including IPv6, MAC, semver, credit cards, and many more — see the common regex patterns library.

Email Address

A practical pattern that handles most valid email addresses. The full RFC 5322 specification is extremely complex, so this pattern balances accuracy with readability.

[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}
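
A minimal Python sketch of using this pattern for both validation and extraction. The helper name `is_email` and the addresses are hypothetical. The last check shows one of the malformed inputs the pattern still accepts.

```python
import re

EMAIL = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")

def is_email(text):
    """Validate: the whole string must match the pattern."""
    return EMAIL.fullmatch(text) is not None

# Extract: find addresses embedded in longer text.
found = EMAIL.findall("Contact alice@example.com or bob@test.org.")

print(is_email("user.name+tag@example.co.uk"))  # True
print(is_email("user@localhost"))               # False: no dot in the domain
print(is_email("a@b..com"))                     # True: consecutive dots slip through
print(found)  # ['alice@example.com', 'bob@test.org']
```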

URL

Matches HTTP and HTTPS URLs with optional port numbers and paths.

https?:\/\/[a-zA-Z0-9.-]+(?::\d+)?(?:\/[^\s]*)?
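
In Python the forward slashes need no escaping, so the sketch below drops the backslashes but is otherwise the same pattern. It finds URLs in made-up text and hands the first one to the standard library's `urlsplit` to pull out its parts. Note the last line: the host class includes `.`, so sentence punctuation directly after a bare host is captured too.

```python
import re
from urllib.parse import urlsplit

# Same pattern as above, written for a Python raw string.
URL = re.compile(r"https?://[a-zA-Z0-9.-]+(?::\d+)?(?:/[^\s]*)?")

text = "Docs: https://example.com:8080/path?q=1 mirror: http://test.org end"
urls = URL.findall(text)
parts = urlsplit(urls[0])
trailing = URL.search("Visit http://test.org.").group()

print(urls)   # ['https://example.com:8080/path?q=1', 'http://test.org']
print(parts.hostname, parts.port, parts.path)  # example.com 8080 /path
print(trailing)  # http://test.org.
```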

Phone Number (US)

Matches common US phone formats including optional country code, parentheses, dashes, dots, and spaces.

(?:\+?1[-.\s]?)?\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}

This matches formats like 555-123-4567, (555) 123-4567, +1 555.123.4567, and 5551234567.
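
One way to put the pattern to work is to validate first and then strip the punctuation. The helper `normalize_us_phone` below is a hypothetical sketch, not a complete phone parser.

```python
import re

PHONE = re.compile(r"(?:\+?1[-.\s]?)?\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}")

def normalize_us_phone(raw):
    """Return the 10 national digits, or None if raw does not match."""
    if PHONE.fullmatch(raw.strip()) is None:
        return None
    digits = re.sub(r"\D", "", raw)  # drop everything that is not a digit
    return digits[-10:]              # discard the optional leading country code

samples = ["555-123-4567", "(555) 123-4567", "+1 555.123.4567", "5551234567"]
print([normalize_us_phone(s) for s in samples])
# ['5551234567', '5551234567', '5551234567', '5551234567']
print(normalize_us_phone("123-45-6789"))  # None
```

Because each parenthesis is optional on its own, the pattern also accepts unbalanced input such as "(555 123-4567".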

IPv4 Address

Matches a standard dotted-quad IPv4 address. This simplified version does not check that each octet is in the range 0-255, so it also matches strings such as "999.999.999.999". It is suitable for most extraction tasks, but not for strict validation.

\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b

IPv4 Address (Strict 0-255)

This stricter version validates that each octet is between 0 and 255.

\b(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\.(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\.(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\.(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\b
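
The sketch below compares the simplified and strict patterns in Python. To keep it readable, the strict pattern is assembled from a single `OCTET` fragment, which should be equivalent to the written-out version above. The names `SIMPLE`, `STRICT`, `OCTET`, and `is_ipv4` are illustrative. The standard `ipaddress` module is shown as an alternative when you only need validation.

```python
import ipaddress
import re

SIMPLE = re.compile(r"\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b")
OCTET = r"(?:25[0-5]|2[0-4]\d|[01]?\d\d?)"
STRICT = re.compile(rf"\b{OCTET}(?:\.{OCTET}){{3}}\b")

def is_ipv4(text):
    """Validate with the standard library instead of a regex."""
    try:
        ipaddress.IPv4Address(text)
        return True
    except ValueError:
        return False

for candidate in ("192.168.0.255", "256.0.0.1", "999.1.1.1"):
    print(
        candidate,
        SIMPLE.fullmatch(candidate) is not None,
        STRICT.fullmatch(candidate) is not None,
        is_ipv4(candidate),
    )
# 192.168.0.255 True True True
# 256.0.0.1 True False False
# 999.1.1.1 True False False
```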

Date (YYYY-MM-DD)

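Matches dates in the ISO 8601 calendar format YYYY-MM-DD. It checks that the month is 01-12 and the day is 01-31, but not that the day exists in that month, so "2026-02-31" still matches.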

\d{4}-(?:0[1-9]|1[0-2])-(?:0[1-9]|[12]\d|3[01])
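
Since the pattern only checks the shape of the date, a common approach is to pair it with a real calendar check. The sketch below does that with Python's `datetime.date.fromisoformat`. The helper name `is_real_date` is hypothetical.

```python
import re
from datetime import date

ISO_DATE = re.compile(r"\d{4}-(?:0[1-9]|1[0-2])-(?:0[1-9]|[12]\d|3[01])")

def is_real_date(text):
    """Shape check with the regex, then a calendar check with datetime."""
    if ISO_DATE.fullmatch(text) is None:
        return False
    try:
        date.fromisoformat(text)
        return True
    except ValueError:  # e.g. February 31st
        return False

print(ISO_DATE.fullmatch("2026-02-31") is not None)  # True: the shape is fine
print(is_real_date("2026-02-31"))  # False: no such day
print(is_real_date("2024-02-29"))  # True: leap day
print(is_real_date("2026-13-01"))  # False: rejected by the regex
```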

Hex Color Code

Matches 3-digit and 6-digit hex color values with a leading hash.

#(?:[0-9a-fA-F]{3}){1,2}\b
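
As an illustration, the following Python sketch extracts colors from a made-up CSS fragment and converts them to RGB tuples. The `to_rgb` helper is hypothetical and assumes the 3-digit form expands by doubling each digit.

```python
import re

HEX_COLOR = re.compile(r"#(?:[0-9a-fA-F]{3}){1,2}\b")

def to_rgb(color):
    """Convert '#rgb' or '#rrggbb' to an (r, g, b) tuple of ints."""
    if HEX_COLOR.fullmatch(color) is None:
        raise ValueError(f"not a hex color: {color!r}")
    digits = color[1:]
    if len(digits) == 3:
        digits = "".join(ch * 2 for ch in digits)  # 'fa0' -> 'ffaa00'
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))

css = "color: #fff; background: #1A2b3C; bad: #12345g"
colors = HEX_COLOR.findall(css)

print(colors)                       # ['#fff', '#1A2b3C']
print([to_rgb(c) for c in colors])  # [(255, 255, 255), (26, 43, 60)]
```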

Frequently Asked Questions

What is the difference between greedy and lazy quantifiers in regex?

Greedy quantifiers (*, +, {n,m}) match as many characters as possible while still allowing the overall pattern to succeed. Lazy (or non-greedy) quantifiers (*?, +?, {n,m}?) match as few characters as possible. For example, given the string <b>bold</b>, the greedy pattern <.+> matches the entire string, while the lazy pattern <.+?> matches only <b>. Use lazy quantifiers when you want the shortest possible match.
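
The example from this answer can be reproduced in Python as follows. The negated character class on the last line is an added alternative that gives the same result here without relying on laziness.

```python
import re

html = "<b>bold</b>"

greedy = re.search(r"<.+>", html).group()
lazy = re.search(r"<.+?>", html).group()
all_tags = re.findall(r"<.+?>", html)
negated = re.findall(r"<[^>]+>", html)

print(greedy)    # <b>bold</b>
print(lazy)      # <b>
print(all_tags)  # ['<b>', '</b>']
print(negated)   # ['<b>', '</b>']
```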

What is the difference between a capturing group and a non-capturing group?

A capturing group, written as (pattern), matches and stores the matched text so you can reference it later using backreferences (\1, \2) or in replacement strings ($1, $2). A non-capturing group, written as (?:pattern), groups the pattern for applying quantifiers or alternation but does not store the match. Non-capturing groups are slightly more efficient when you do not need to reference the matched text.
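
In Python the difference is easy to observe with `re.findall`, which returns the captured group when the pattern has one and the whole match otherwise. Note that Python replacement strings use `\1` rather than `$1`. The inputs below are illustrative.

```python
import re

text = "haha hahaha"

capturing = re.findall(r"(ha)+", text)        # returns what group 1 last captured
non_capturing = re.findall(r"(?:ha)+", text)  # returns each whole match

# Captured groups can be reused in the replacement string.
swapped = re.sub(r"(\d{4})-(\d{2})", r"\2/\1", "2026-01")

# .groups reports how many capturing groups a compiled pattern has.
group_counts = (re.compile(r"(ha)+").groups, re.compile(r"(?:ha)+").groups)

print(capturing)      # ['ha', 'ha']
print(non_capturing)  # ['haha', 'hahaha']
print(swapped)        # 01/2026
print(group_counts)   # (1, 0)
```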

Are regular expressions the same across all programming languages?

The core syntax — character classes, quantifiers, anchors, and basic grouping — is largely consistent across languages. However, advanced features differ. JavaScript did not support lookbehind until ES2018, while Python and Java had it for years. PCRE (used by PHP) supports recursive patterns, which most other engines do not. Named capture groups use different syntax in some languages: Python uses (?P<name>...) while JavaScript and .NET use (?<name>...). Always consult the documentation for your specific regex engine.
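
To make the named-group difference concrete, here is a small sketch in Python's spelling, `(?P<name>...)`. The date string and the group names are made up for the example.

```python
import re

# Python's re module spells named groups (?P<name>...).
DATE = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})")

sentence = "Released on 2026-03-15."
m = DATE.search(sentence)
fields = m.groupdict()

# \g<name> refers to a named group in a replacement string.
us_style = DATE.sub(r"\g<month>/\g<day>/\g<year>", sentence)

# (?P=name) is a backreference to a named group inside the pattern.
repeated = re.search(r"\b(?P<word>\w+)\s(?P=word)\b", "the the cat")

print(fields)    # {'year': '2026', 'month': '03', 'day': '15'}
print(us_style)  # Released on 03/15/2026.
print(repeated.group("word"))  # the
```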