Common Regex Patterns

A curated library of practical regex patterns for everyday validation and extraction tasks — email, URL, IP addresses, phone numbers, dates, UUIDs, hex colors, credit cards, semver, and more. Each pattern is annotated with the cases it matches and the trade-offs it makes, so you can pick the right one without rereading the RFC.

Email Address

Practical pattern that catches the email addresses you actually encounter without trying to implement full RFC 5322.

[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}

Matches: user@example.com, jane.doe+tag@sub.example.co.uk

Note: Allows most real addresses. For deliverability, send a verification email — regex cannot prove an address exists.

HTTP/HTTPS URL

Loose pattern that finds candidate URLs in text. Pair with a real URL parser for structural validation.

https?:\/\/[a-zA-Z0-9.-]+(?::\d+)?(?:\/[^\s]*)?

Matches: https://example.com, http://api.example.com:8080/v1/users?id=42

Note: Does not validate IDN domains or strict host syntax. Use new URL(match) in JS to confirm validity.
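
The find-then-validate flow described in the Note can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: extractUrls is a hypothetical helper name, and the sample text is made up. It scans with the loose pattern, then keeps only the candidates that the built-in URL constructor accepts.

```javascript
// Loose URL pattern from this section, with the g flag for scanning text.
const URL_PATTERN = /https?:\/\/[a-zA-Z0-9.-]+(?::\d+)?(?:\/[^\s]*)?/g;

// extractUrls is a hypothetical helper name, not part of any library.
function extractUrls(text) {
  // String.prototype.match returns null when nothing matches.
  const candidates = text.match(URL_PATTERN) ?? [];
  return candidates.filter((candidate) => {
    try {
      new URL(candidate); // throws a TypeError on structurally invalid input
      return true;
    } catch {
      return false;
    }
  });
}

const sample =
  "Docs at https://example.com/guide and http://api.example.com:99999/v1";
console.log(extractUrls(sample)); // [ 'https://example.com/guide' ]
```

The second candidate passes the regex but is dropped by the parser because port 99999 is out of range, which is the kind of structural check the regex deliberately leaves out.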

Domain Name (FQDN)

Matches fully-qualified domain names with one or more labels and a TLD of at least two letters.

(?:[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?\.)+[a-zA-Z]{2,}

Matches: example.com, api.staging.example.co.uk

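Note: Per RFC 1035 each label is 1–63 chars and the full name is limited to 253, which this pattern does not enforce. ASCII only: Unicode IDN labels do not match, and punycoded TLDs (xn--...) fail the letters-only TLD check.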

IPv4 Address (lenient)

Quick dotted-quad match. Allows octets up to 999 — fast for extraction, not for validation.

\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b

Matches: 192.168.1.1, 8.8.8.8 (and also 999.999.999.999)

Note: Use the strict version below if octet range matters.

IPv4 Address (strict, 0–255)

Validates each octet falls in the legal 0–255 range.

\b(?:25[0-5]|2[0-4]\d|[01]?\d\d?)(?:\.(?:25[0-5]|2[0-4]\d|[01]?\d\d?)){3}\b

Matches: 192.168.1.1, 0.0.0.0, 255.255.255.255

Note: Rejects 256.0.0.1 and other out-of-range octets.
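
For whole-string validation, one option is to swap the \b boundaries for ^ and $ anchors, since \b still lets the pattern match inside a longer dotted string such as 1.2.3.4.5. The sketch below assumes that goal; isIpv4 and OCTET are hypothetical names used only for illustration.

```javascript
// One octet in the 0-255 range, copied from the strict pattern above.
const OCTET = "(?:25[0-5]|2[0-4]\\d|[01]?\\d\\d?)";

// Anchored with ^ and $ so the whole string must be an address.
const IPV4_ANCHORED = new RegExp(`^${OCTET}(?:\\.${OCTET}){3}$`);

// isIpv4 is a hypothetical helper name used only for this sketch.
function isIpv4(value) {
  return IPV4_ANCHORED.test(value);
}

console.log(isIpv4("192.168.1.1")); // true
console.log(isIpv4("256.0.0.1")); // false: first octet out of range
console.log(isIpv4("1.2.3.4.5")); // false: trailing ".5" is not allowed
```

Note that the octet expression still accepts leading zeros (01.2.3.4), which some parsers treat as octal, so a stricter policy may be needed for security-sensitive input.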

IPv6 Address

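Matches full and ::-compressed IPv6 forms written as hexadecimal groups. Dotted-quad IPv4-mapped forms such as ::ffff:192.0.2.1 and the bare :: address do not match. Anchor the pattern with ^ and $ when validating: unanchored, the alternation can stop early and return only a prefix such as 2001:0db8:85a3:: from a longer address.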

(?:(?:[0-9a-fA-F]{1,4}:){7}[0-9a-fA-F]{1,4}|(?:[0-9a-fA-F]{1,4}:){1,7}:|::(?:[0-9a-fA-F]{1,4}:){0,6}[0-9a-fA-F]{1,4}|(?:[0-9a-fA-F]{1,4}:){1,6}(?::[0-9a-fA-F]{1,4}){1,1}|(?:[0-9a-fA-F]{1,4}:){1,5}(?::[0-9a-fA-F]{1,4}){1,2}|(?:[0-9a-fA-F]{1,4}:){1,4}(?::[0-9a-fA-F]{1,4}){1,3}|(?:[0-9a-fA-F]{1,4}:){1,3}(?::[0-9a-fA-F]{1,4}){1,4}|(?:[0-9a-fA-F]{1,4}:){1,2}(?::[0-9a-fA-F]{1,4}){1,5})

Matches: 2001:0db8:85a3::8a2e:0370:7334, ::1, fe80::1

Note: A complete IPv6 regex is famously long. For full RFC compliance, use your language's IP address parser, for example ipaddress in Python or net/netip in Go.

MAC Address

Matches MAC addresses with either colons or hyphens as separators.

(?:[0-9A-Fa-f]{2}[:-]){5}[0-9A-Fa-f]{2}

Matches: 00:1A:2B:3C:4D:5E, 00-1A-2B-3C-4D-5E

Note: Does not enforce that all separators are the same character.
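
If consistent separators matter, one possible variant captures the first separator and reuses it through a backreference. This is a sketch under that assumption, not the library's pattern; isMacAddress is a hypothetical name, and backreferences are not available in RE2-based engines such as Go's regexp package.

```javascript
// ([:-]) captures the first separator; \1 requires every later one to match it.
const MAC_SAME_SEPARATOR =
  /^[0-9A-Fa-f]{2}([:-])(?:[0-9A-Fa-f]{2}\1){4}[0-9A-Fa-f]{2}$/;

// isMacAddress is a hypothetical helper name used only for this sketch.
function isMacAddress(value) {
  return MAC_SAME_SEPARATOR.test(value);
}

console.log(isMacAddress("00:1A:2B:3C:4D:5E")); // true
console.log(isMacAddress("00-1A-2B-3C-4D-5E")); // true
console.log(isMacAddress("00:1A-2B:3C:4D:5E")); // false: mixed separators
```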

US Phone Number

Tolerant US phone matcher accepting country code, parentheses, dashes, dots, and spaces.

(?:\+?1[-.\s]?)?\(?(\d{3})\)?[-.\s]?(\d{3})[-.\s]?(\d{4})

Matches: 555-123-4567, (555) 123-4567, +1 555.123.4567, 5551234567

Note: Capturing groups extract area code, exchange, and line number.
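
The three capturing groups can be read back with RegExp.prototype.exec, roughly as sketched here. parseUsPhone and the returned property names are hypothetical choices for illustration.

```javascript
// US phone pattern from this section, unchanged.
const US_PHONE =
  /(?:\+?1[-.\s]?)?\(?(\d{3})\)?[-.\s]?(\d{3})[-.\s]?(\d{4})/;

// parseUsPhone is a hypothetical helper name used only for this sketch.
function parseUsPhone(input) {
  const match = US_PHONE.exec(input);
  if (match === null) return null;
  // match[0] is the whole match; groups 1-3 are the captured parts.
  const [, areaCode, exchange, lineNumber] = match;
  return { areaCode, exchange, lineNumber };
}

console.log(parseUsPhone("+1 555.123.4567"));
// { areaCode: '555', exchange: '123', lineNumber: '4567' }
console.log(parseUsPhone("no number here")); // null
```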

International Phone (E.164-style)

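Matches numbers in E.164 international format: an optional plus, a non-zero first digit, and at most 15 digits in total. The pattern accepts as few as 2 digits; change \d{1,14} to \d{6,14} to require at least 7.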

\+?[1-9]\d{1,14}

Matches: +15551234567, 447911123456, +33123456789

Note: Does not allow spaces, dashes, or parentheses. Strip those before matching with str.replace(/[\s().-]/g, '').
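
The strip-then-match step from the Note might look like the sketch below. normalizePhone is a hypothetical name, and the ^ and $ anchors are an addition so that the whole cleaned string has to match.

```javascript
// E.164-style pattern from this section, anchored for whole-string checks.
const E164 = /^\+?[1-9]\d{1,14}$/;

// normalizePhone is a hypothetical helper name used only for this sketch.
function normalizePhone(raw) {
  // Remove spaces, parentheses, dots, and dashes before matching.
  const stripped = raw.replace(/[\s().-]/g, "");
  return E164.test(stripped) ? stripped : null;
}

console.log(normalizePhone("+44 7911 123456")); // "+447911123456"
console.log(normalizePhone("+0 123 456")); // null: first digit must be 1-9
```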

US ZIP Code

5-digit ZIP code with optional ZIP+4 extension.

\b\d{5}(?:-\d{4})?\b

Matches: 90210, 10001-1234

UK Postcode

Royal Mail format. Letters case-insensitive; the central space is optional.

\b[A-Z]{1,2}\d[A-Z\d]?\s*\d[A-Z]{2}\b

Matches: SW1A 1AA, EC1A1BB, M11AE

Note: Use the i flag for lowercase input.

Canadian Postal Code

Canadian format A1A 1A1, with optional space.

\b[ABCEGHJ-NPRSTVXY]\d[A-Z]\s?\d[A-Z]\d\b

Matches: K1A 0B1, M5V3L9

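Note: The first letter excludes D, F, I, O, Q, and U, which Canada Post does not use, plus W and Z, which never appear in the first position. The other letter positions accept any A–Z in this pattern.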

UUID / GUID

Matches version 1–7 UUIDs that carry the RFC 4122 variant, in canonical hyphenated form.

\b[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[1-7][0-9a-fA-F]{3}-[89abAB][0-9a-fA-F]{3}-[0-9a-fA-F]{12}\b

Matches: 550e8400-e29b-41d4-a716-446655440000

Note: Validates version (1–7) and variant (8/9/A/B) digits, so the nil UUID (all zeros) and version 8 do not match. Versions 6 and 7 were added by RFC 9562. See the UUID vs GUID guide for context.
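
Because the version digit sits at a fixed position, a capturing group around [1-7] is enough to read it out. The sketch below adds that group and ^/$ anchors to the pattern; uuidVersion is a hypothetical helper name.

```javascript
// Same pattern as above, anchored, with the version digit captured in group 1.
const UUID =
  /^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-([1-7])[0-9a-fA-F]{3}-[89abAB][0-9a-fA-F]{3}-[0-9a-fA-F]{12}$/;

// uuidVersion is a hypothetical helper name used only for this sketch.
function uuidVersion(value) {
  const match = UUID.exec(value);
  return match === null ? null : Number(match[1]);
}

console.log(uuidVersion("550e8400-e29b-41d4-a716-446655440000")); // 4
console.log(uuidVersion("00000000-0000-0000-0000-000000000000")); // null
```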

Username

Standard username rules: 3–20 characters, letters, digits, and underscore, must start with a letter.

^[a-zA-Z][a-zA-Z0-9_]{2,19}$

Matches: jane_doe, user123

Note: Adjust the {2,19} bound for your length policy.

URL Slug

Lowercase letters, digits, and hyphens, no leading/trailing hyphen.

^[a-z0-9]+(?:-[a-z0-9]+)*$

Matches: hello-world, my-blog-post-2026

Note: Generate slugs with the slug generator.

Strong Password (8+, mixed case, digit, symbol)

Demands at least 8 characters with at least one lowercase, one uppercase, one digit, and one symbol from the set !@#$%^&*. Other symbols are allowed but do not satisfy the symbol rule. Uses lookahead.

^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[!@#$%^&*]).{8,}$

Matches: Abc123!@, P4ssw0rd!

Note: Length-and-mix rules are weaker than true entropy. Prefer passphrases when possible.
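
The same rules can also be checked one at a time without lookahead, which makes it possible to report which rule failed and ports to engines that lack lookahead. The sketch below is roughly equivalent to the pattern above (it counts length in UTF-16 code units and, unlike the dot, does not exclude newlines); PASSWORD_RULES and failedPasswordRules are hypothetical names.

```javascript
// Each rule mirrors one lookahead (or the length bound) in the pattern above.
const PASSWORD_RULES = [
  { name: "length", test: (p) => p.length >= 8 },
  { name: "lowercase", test: (p) => /[a-z]/.test(p) },
  { name: "uppercase", test: (p) => /[A-Z]/.test(p) },
  { name: "digit", test: (p) => /\d/.test(p) },
  { name: "symbol", test: (p) => /[!@#$%^&*]/.test(p) },
];

// failedPasswordRules is a hypothetical helper name used only for this sketch.
function failedPasswordRules(password) {
  return PASSWORD_RULES.filter((rule) => !rule.test(password)).map(
    (rule) => rule.name
  );
}

console.log(failedPasswordRules("Abc123!@")); // []
console.log(failedPasswordRules("password")); // [ 'uppercase', 'digit', 'symbol' ]
```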

Credit Card (major brands)

Detects Visa, MasterCard, American Express, Discover, JCB, and Diners Club by prefix and length.

(?:4\d{12}(?:\d{3})?|(?:5[1-5]\d{2}|2[2-7]\d{2})\d{12}|3[47]\d{13}|6(?:011|5\d{2})\d{12}|(?:2131|1800|35\d{3})\d{11}|3(?:0[0-5]|[68]\d)\d{11})

Matches: 4111111111111111 (Visa), 378282246310005 (Amex)

Note: Strip spaces/dashes first. Always validate with the Luhn algorithm — regex only checks format.
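
A minimal sketch of the strip, format-check, and Luhn sequence described in the Note. The function names passesLuhn and isPlausibleCard are hypothetical, and the ^ and $ anchors are added so the whole cleaned string must match; a passing result still says nothing about whether the card exists.

```javascript
// Card pattern from this section, anchored for whole-string checks.
const CARD_FORMAT =
  /^(?:4\d{12}(?:\d{3})?|(?:5[1-5]\d{2}|2[2-7]\d{2})\d{12}|3[47]\d{13}|6(?:011|5\d{2})\d{12}|(?:2131|1800|35\d{3})\d{11}|3(?:0[0-5]|[68]\d)\d{11})$/;

// Luhn checksum: double every second digit from the right, subtract 9 from
// results above 9, and require the total to be a multiple of 10.
function passesLuhn(digits) {
  let sum = 0;
  let doubleNext = false;
  for (let i = digits.length - 1; i >= 0; i--) {
    let d = digits.charCodeAt(i) - 48; // "0" is char code 48
    if (doubleNext) {
      d *= 2;
      if (d > 9) d -= 9;
    }
    sum += d;
    doubleNext = !doubleNext;
  }
  return sum % 10 === 0;
}

// isPlausibleCard is a hypothetical helper name used only for this sketch.
function isPlausibleCard(input) {
  const digits = input.replace(/[\s-]/g, ""); // strip spaces and dashes first
  return CARD_FORMAT.test(digits) && passesLuhn(digits);
}

console.log(isPlausibleCard("4111 1111 1111 1111")); // true
console.log(isPlausibleCard("4111111111111112")); // false: Luhn check fails
```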

Hex Color (3, 4, 6, or 8 digit)

CSS-style hex colors with optional alpha channel.

#(?:[0-9a-fA-F]{3,4}|[0-9a-fA-F]{6}|[0-9a-fA-F]{8})\b

Matches: #fff, #ffffff, #ffaa00cc

Note: Convert between formats with the color converter.

RGB / RGBA Color

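CSS rgb() and rgba() function calls in the comma-separated syntax, with optional alpha. Channel values are not range-checked, so rgb(999, 0, 0) also matches.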

rgba?\(\s*\d{1,3}\s*,\s*\d{1,3}\s*,\s*\d{1,3}\s*(?:,\s*(?:0|1|0?\.\d+))?\s*\)

Matches: rgb(255, 0, 0), rgba(0, 0, 0, 0.5)

ISO Date (YYYY-MM-DD)

ISO 8601 calendar date with strict month and day ranges.

\d{4}-(?:0[1-9]|1[0-2])-(?:0[1-9]|[12]\d|3[01])

Matches: 2026-04-30, 1999-12-31

Note: Does not reject impossible dates such as 2026-02-30. Checking days per month and leap years requires a real date parser.
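
One way to add the calendar check in JavaScript is to build a Date from the captured parts and confirm that nothing rolled over. This sketch assumes capture groups and anchors added to the pattern above; isRealIsoDate is a hypothetical name.

```javascript
// ISO date pattern from this section, anchored, with year/month/day captured.
const ISO_DATE = /^(\d{4})-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$/;

// isRealIsoDate is a hypothetical helper name used only for this sketch.
function isRealIsoDate(value) {
  const match = ISO_DATE.exec(value);
  if (match === null) return false;
  const [year, month, day] = match.slice(1).map(Number);
  // setUTCFullYear avoids the 0-99 -> 1900s mapping that Date.UTC applies.
  const probe = new Date(0);
  probe.setUTCFullYear(year, month - 1, day);
  // An impossible day (Feb 30) rolls over into the next month.
  return (
    probe.getUTCFullYear() === year &&
    probe.getUTCMonth() === month - 1 &&
    probe.getUTCDate() === day
  );
}

console.log(isRealIsoDate("2026-04-30")); // true
console.log(isRealIsoDate("2026-02-30")); // false
console.log(isRealIsoDate("2024-02-29")); // true: leap year
```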

ISO 8601 Datetime

Full datetime with optional fractional seconds and a required timezone designator (Z for UTC or a ±HH:MM offset).

\d{4}-(?:0[1-9]|1[0-2])-(?:0[1-9]|[12]\d|3[01])T(?:[01]\d|2[0-3]):[0-5]\d:[0-5]\d(?:\.\d+)?(?:Z|[+-](?:[01]\d|2[0-3]):[0-5]\d)

Matches: 2026-04-30T14:23:00Z, 2026-04-30T14:23:00.123+02:00

Note: Convert epoch values with the timestamp converter.

Time (24-hour)

24-hour HH:MM with optional seconds.

(?:[01]\d|2[0-3]):[0-5]\d(?::[0-5]\d)?

Matches: 09:30, 23:59:59, 00:00

Integer (signed)

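Whole number with an optional minus sign (a leading + is not accepted). Leading zeros are rejected only when the pattern is anchored with ^ and $; unanchored, it still finds 0, 0, and 7 inside 007.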

-?(?:0|[1-9]\d*)

Matches: 0, 42, -100

Decimal Number

Signed integer or decimal with optional fractional part.

-?\d+(?:\.\d+)?

Matches: 3.14, -0.5, 42

Scientific Notation

Numbers in scientific (E-notation) form.

-?\d+(?:\.\d+)?[eE][+-]?\d+

Matches: 6.022e23, 1.6E-19, -3e10

Hexadecimal Number

Hex literal with 0x prefix. Convert with the base converter.

0[xX][0-9a-fA-F]+

Matches: 0xFF, 0X1a2b, 0xCAFEBABE

Semantic Version

SemVer 2.0.0 — major.minor.patch with optional pre-release and build metadata.

(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?

Matches: 1.0.0, 2.1.0-beta.1, 3.0.0-rc.1+build.42

Note: This is the regex suggested on semver.org with the ^ and $ anchors removed. Add them back when validating a whole string.
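
The capture groups map to major, minor, patch, pre-release, and build metadata, in that order. A rough sketch of reading them, with ^ and $ added for whole-string validation; parseSemver and the property names are hypothetical.

```javascript
// SemVer pattern from this section, anchored with ^ and $.
const SEMVER =
  /^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$/;

// parseSemver is a hypothetical helper name used only for this sketch.
function parseSemver(version) {
  const match = SEMVER.exec(version);
  if (match === null) return null;
  const [, major, minor, patch, prerelease, build] = match;
  return {
    major: Number(major),
    minor: Number(minor),
    patch: Number(patch),
    prerelease: prerelease ?? null, // undefined when the group did not match
    build: build ?? null,
  };
}

console.log(parseSemver("3.0.0-rc.1+build.42"));
// { major: 3, minor: 0, patch: 0, prerelease: 'rc.1', build: 'build.42' }
console.log(parseSemver("1.0")); // null
```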

HTML Tag

Matches an opening, closing, or self-closing HTML tag with attributes.

<\/?[a-zA-Z][a-zA-Z0-9-]*(?:\s+[a-zA-Z][a-zA-Z0-9-]*(?:\s*=\s*(?:"[^"]*"|'[^']*'|[^\s>]+))?)*\s*\/?>

Matches: <p>, <a href="/x">, <br/>, </div>

Note: Regex cannot fully parse HTML — this catches well-formed tags only. Use a real parser for serious work.

Leading / Trailing Whitespace

Match whitespace at the start or end of a string. Replace with empty for trim.

^\s+|\s+$

Matches: spaces, tabs, or newlines at the boundaries.

Note: Use the g flag with replace() to strip both ends. String.prototype.trim() is equivalent in JavaScript.

File Extension

Captures the trailing extension of a filename (after the last dot).

\.([a-zA-Z0-9]+)$

Matches: photo.jpg (captures jpg); archive.tar.gz (captures only gz).

Hashtag

Matches a hashtag that starts with a letter, followed by letters, digits, and underscores. The tag text is in capture group 1.

(?:^|\s)#([a-zA-Z][a-zA-Z0-9_]*)

Matches: #javascript, #regex_tips

Note: Requires whitespace or start-of-string before # to avoid matching #anchor in URLs.
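
Because the leading whitespace is part of the overall match, it is usually capture group 1 that you want. A small sketch using String.prototype.matchAll; extractHashtags is a hypothetical name and the sample sentence is made up.

```javascript
// Hashtag pattern from this section, with the g flag that matchAll requires.
const HASHTAG = /(?:^|\s)#([a-zA-Z][a-zA-Z0-9_]*)/g;

// extractHashtags is a hypothetical helper name used only for this sketch.
function extractHashtags(text) {
  // m[0] may include a leading space; m[1] is the tag without the #.
  return Array.from(text.matchAll(HASHTAG), (m) => m[1]);
}

const post =
  "Learning #regex_tips and #javascript, see https://example.com/page#anchor";
console.log(extractHashtags(post)); // [ 'regex_tips', 'javascript' ]
```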

Twitter / X Handle

Matches @-handle: 1–15 alphanumerics or underscores.

(?:^|\s)@([A-Za-z0-9_]{1,15})\b

Matches: @jane, @dev_42

JWT (JSON Web Token)

Three Base64URL segments joined by dots.

[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+

Matches: eyJhbGciOi...XYz.eyJzdWIiOi...abc.SflKxw...

Note: Decode with the JWT decoder to inspect claims.

How to Use This Tool

  1. Search or filter for the pattern type you need — email, ipv4, semver, color, etc.
  2. Read the description and Note on the pattern. Many patterns make practical trade-offs (loose vs strict, RFC-compliant vs real-world tolerant) — pick the one that fits your use case.
  3. Click Copy to copy the pattern to your clipboard, ready to paste into your code or the regex tester.
  4. Verify before deploying. Open the regex tester, paste the pattern, paste a sample of your real data, and confirm matches and non-matches behave as expected.
  5. Need an explanation? Paste any pattern into the regex explainer to see what every token does.

Pragmatism Over Pedantry

Most "definitive" regex patterns floating around the internet are either too loose (matching obvious nonsense) or too strict (rejecting real input). This library leans pragmatic: each pattern catches the cases you actually encounter in form inputs, log files, and user-generated content, while the Note line is honest about what it skips. The full RFC for an email address runs to several pages and accepts strings most users would consider clearly invalid; the email pattern here follows the same practical shape that many real-world validation libraries use, because it works on the addresses people actually type.

Two general principles for using regex on real data: first, regex is for finding, not validating. Use it to extract candidate values, then validate each one with a proper parser (new URL(), parseFloat(), the Luhn algorithm). Second, readable patterns beat clever patterns. A 200-character regex that handles every edge case is harder to maintain than a 50-character regex plus a few lines of follow-up validation code. Pick the version that future-you will be able to read.

Frequently Asked Questions

Are these regex patterns RFC-compliant?
Most are practical patterns that catch the cases you actually encounter in real data, not the formal grammar from the relevant RFC. The full RFC 5322 email grammar, for example, runs to several pages and accepts strings most users would consider clearly invalid. Most patterns above carry a Note describing what they cover and what they do not, so you can decide whether the trade-off matches your use case. For strict validation, follow the linked guides for language-specific approaches.
Do these patterns work in every regex engine?
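Almost all of the patterns use only common syntax (character classes, quantifiers, anchors, and non-capturing groups), which JavaScript, Python, Java, Go, .NET, PHP (PCRE), and Ruby all support. The one exception is the strong-password pattern, which uses lookahead: Go's standard regexp package (RE2 syntax) does not support lookahead, so there you need separate checks instead. None use lookbehind, named groups, or recursive patterns. Behavior still differs in small ways between engines: in Ruby, ^ and $ match at every line boundary (use \A and \z to anchor the whole string), and in Python 3 and .NET, \d also matches non-ASCII Unicode digits unless you opt into ASCII-only matching. If you need to use the patterns in a particular language, check the JavaScript and Python regex guides for engine-specific gotchas around flags and escape sequences.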
Why is the URL pattern so loose?
URL grammar is genuinely complicated — protocol, host, port, path, query, fragment all have their own rules, plus IPv6 addresses, IDN domains, and percent-encoding. A strict pattern would be hundreds of characters and still miss edge cases. The pragmatic approach is to use a loose pattern to find candidate URLs, then validate each candidate by feeding it to your language's URL parser (URL constructor in JavaScript, urllib.parse in Python). Regex finds the URL; the parser tells you if it is structurally valid.
Should I use regex to validate emails?
For form-input validation, a loose regex is fine: when anchored, it catches obvious typos like a missing @ or a trailing dot without rejecting unusual but valid addresses. For confirmation that an address actually exists, regex cannot help: only sending a verification email and waiting for the click can prove deliverability. The email pattern above is the practical middle ground that many validation libraries settle on. Do not write a stricter regex hoping to catch every invalid address, because you will block valid users instead.
How do I test these patterns?
Click Copy on any pattern, paste it into the regex tester linked at the top of this page, paste a sample of your real data into the test string, and watch matches highlight in real time. The tester runs entirely in your browser, so production data stays local. For an explanation of any unfamiliar token in a pattern, paste the same pattern into the regex explainer for a token-by-token breakdown.