Go Regex Guide
Go's regexp package is built on RE2, an engine that guarantees linear-time matching — which makes it immune to the catastrophic backtracking that can freeze other regex engines, at the cost of dropping two features you may expect (backreferences and lookaround). This guide is a practical reference for the package's API, the RE2 trade-off, and the Go-specific idioms that trip up developers from other languages. Pair it with the regex tester and the explainer to confirm any pattern.
RE2: The Defining Trade-Off
Go's regexp package is built on RE2, and that single choice shapes everything about how you use it. RE2 guarantees that matching runs in linear time relative to the input length, no matter how pathological the pattern. A regex can never cause the exponential "catastrophic backtracking" that freezes backtracking engines (such as those in Python, Java, JavaScript, and PHP) on patterns like (a+)+$ run against a long string of as followed by a character that makes the match fail.
The cost is that RE2 drops two features:
- Backreferences: you can't write (\w+)\s+\1 to match a repeated word.
- Lookahead and lookbehind: no (?=...), (?!...), (?<=...), or (?<!...).
This is a deliberate, defensible trade for a language built for servers: code that processes untrusted input can't be DoS'd by a malicious regex or malicious input. Most real-world patterns don't need the dropped features, and many that seem to can be restructured. When you genuinely need them, the third-party regexp2 package (a port of the .NET regex engine, not a PCRE binding) provides a backtracking engine that supports both, at the cost of the linear-time guarantee.
Compile Once with MustCompile
The idiomatic Go pattern is to compile regexes once at package level and reuse the *Regexp — it's safe for concurrent use by multiple goroutines.
package main

import (
	"fmt"
	"regexp"
)

// Compile once at package level. MustCompile panics on a bad pattern,
// so a typo fails at startup rather than mid-request.
var emailRe = regexp.MustCompile(`[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}`)

func main() {
	text := "Contact jane@example.com or sales@foo.co.uk"
	fmt.Println(emailRe.FindAllString(text, -1))
	// [jane@example.com sales@foo.co.uk]
}
Use regexp.Compile (which returns an error) only when the pattern is dynamic — built from user input or loaded from config:
re, err := regexp.Compile(userPattern)
if err != nil {
	return fmt.Errorf("invalid pattern: %w", err)
}
The Find Family
Go's matching methods follow a naming convention: Find + optional All + optional String + optional Submatch + optional Index. The combinations cover every need.
re := regexp.MustCompile(`(\w+)@(\w+)`)
text := "jane@example bob@test"
re.MatchString(text) // true — does it match anywhere?
re.FindString(text) // "jane@example" — first match
re.FindAllString(text, -1) // ["jane@example" "bob@test"] — all matches (-1 = no limit)
re.FindStringSubmatch(text) // ["jane@example" "jane" "example"] — first match + groups
re.FindAllStringSubmatch(text, -1)
// [["jane@example" "jane" "example"] ["bob@test" "bob" "test"]]
re.FindStringIndex(text) // [0 12] — byte offsets of first match
The -1 argument to the All variants is a limit; pass a positive number to cap the matches returned. There are byte-slice versions of every method (drop the String) for working with []byte directly, which avoids a conversion when you're processing file contents or network buffers.
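As a rough illustration of the byte-slice variants and a positive limit, the sketch below uses a hypothetical helper named firstN and a made-up input buffer. It also combines the Submatch and Index suffixes to get offsets for the match and each group.

```go
package main

import (
	"fmt"
	"regexp"
)

var addrRe = regexp.MustCompile(`(\w+)@(\w+)`)

// firstN returns at most n matches from a byte buffer without
// converting the buffer to a string first. Hypothetical helper.
func firstN(buf []byte, n int) [][]byte {
	return addrRe.FindAll(buf, n)
}

func main() {
	buf := []byte("jane@example bob@test amy@corp")

	// A limit of 2 stops after the second match.
	for _, m := range firstN(buf, 2) {
		fmt.Printf("%s\n", m)
	}
	// jane@example
	// bob@test

	// Submatch + Index: byte offsets of the first match, then of each group.
	fmt.Println(addrRe.FindSubmatchIndex(buf)) // [0 12 0 4 5 12]
}
```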
Named Groups
Go uses the (?P<name>...) syntax. Because Go has no map-returning match method, the idiom is to zip SubexpNames() with the submatch slice:
re := regexp.MustCompile(`(?P<user>\w+)@(?P<domain>[\w.]+)`)
// FindStringSubmatch returns nil when nothing matches, so check
// match != nil before indexing if the input isn't known to match.
match := re.FindStringSubmatch("jane@example.com")
result := make(map[string]string)
for i, name := range re.SubexpNames() {
	if i != 0 && name != "" {
		result[name] = match[i]
	}
}
// result == map[user:jane domain:example.com]
fmt.Println(result["user"]) // jane
Index 0 of SubexpNames() is always the empty string (the full match has no name), which is why the loop skips it. This map-building helper is common enough that many codebases extract it into a utility function.
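One possible shape for such a utility is sketched below. The name namedGroups and its (map, bool) return signature are assumptions for illustration, not a standard API; the bool reports whether the pattern matched at all, which avoids indexing a nil submatch slice.

```go
package main

import (
	"fmt"
	"regexp"
)

var addrRe = regexp.MustCompile(`(?P<user>\w+)@(?P<domain>[\w.]+)`)

// namedGroups returns the named capture groups of the first match of re in s.
// The second result is false when s does not match. Hypothetical helper.
func namedGroups(re *regexp.Regexp, s string) (map[string]string, bool) {
	match := re.FindStringSubmatch(s)
	if match == nil {
		return nil, false
	}
	groups := make(map[string]string)
	for i, name := range re.SubexpNames() {
		// Unnamed groups and the full match (index 0) have an empty name.
		if name != "" {
			groups[name] = match[i]
		}
	}
	return groups, true
}

func main() {
	if g, ok := namedGroups(addrRe, "jane@example.com"); ok {
		fmt.Println(g["user"], g["domain"]) // jane example.com
	}
	_, ok := namedGroups(addrRe, "no address here")
	fmt.Println(ok) // false
}
```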
Replacing
re := regexp.MustCompile(`(\w+)@(\w+)`)
// Backreferences in the replacement use $1, $2, or ${name}
re.ReplaceAllString("jane@example", "$2.$1")
// "example.jane"
// ${name} form for named groups
re2 := regexp.MustCompile(`(?P<user>\w+)@(?P<domain>\w+)`)
re2.ReplaceAllString("jane@example", "${user} at ${domain}")
// "jane at example"
// Function replacement — for logic the template form can't express
re.ReplaceAllStringFunc("jane@example bob@test", func(m string) string {
	return "[" + m + "]"
})
// "[jane@example] [bob@test]"
Use ${name} with braces in replacement strings whenever the reference is followed by a letter, digit, or underscore. Without braces, Go reads the longest possible name, so $1x means ${1x} (a group that doesn't exist, which expands to an empty string) rather than ${1}x. The braces are always safe, so many developers use them unconditionally.
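The difference is easiest to see side by side. In this sketch the function names withoutBraces and withBraces are made up for illustration; both apply the same two-group pattern and differ only in the replacement template.

```go
package main

import (
	"fmt"
	"regexp"
)

var pairRe = regexp.MustCompile(`(\w+)@(\w+)`)

// withoutBraces uses "$1_at_$2". The first reference is parsed as the
// group named "1_at_", which does not exist, so it expands to "".
func withoutBraces(s string) string {
	return pairRe.ReplaceAllString(s, "$1_at_$2")
}

// withBraces delimits each reference explicitly.
func withBraces(s string) string {
	return pairRe.ReplaceAllString(s, "${1}_at_${2}")
}

func main() {
	fmt.Println(withoutBraces("jane@example")) // example
	fmt.Println(withBraces("jane@example"))    // jane_at_example
}
```

Note that the unbraced version fails silently: there is no error, only missing text in the output.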
Inline Flags
Go has no flags argument — every flag is inline, written at the start of the pattern (or scoped to a group):
`(?i)hello` // case-insensitive
`(?m)^line` // multi-line: ^ and $ match at line boundaries
`(?s)a.b` // dotall: . matches newlines too
`(?im)^hello` // stack flags
`(?i:hello)WORLD` // scoped: only "hello" is case-insensitive
`(?U)a+` // ungreedy: quantifiers are lazy by default
The (?U) flag is unusual — it swaps the meaning of greedy and lazy quantifiers, so a+ becomes lazy and a+? becomes greedy. Rarely needed, but worth knowing it exists if you see it in someone else's pattern.
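The flags above can be checked with a few small calls. The helpers first and count are hypothetical conveniences for this sketch, and the inputs are made up.

```go
package main

import (
	"fmt"
	"regexp"
)

// first returns the first match of pattern in s, or "" if there is none.
func first(pattern, s string) string {
	return regexp.MustCompile(pattern).FindString(s)
}

// count returns how many non-overlapping matches pattern has in s.
func count(pattern, s string) int {
	return len(regexp.MustCompile(pattern).FindAllString(s, -1))
}

func main() {
	fmt.Printf("%q\n", first(`(?i)hello`, "Say HELLO")) // "HELLO"

	lines := "line one\nline two"
	fmt.Println(count(`^line`, lines))     // 1: ^ only matches at start of text
	fmt.Println(count(`(?m)^line`, lines)) // 2: ^ matches at each line start

	fmt.Printf("%q\n", first(`a.b`, "a\nb"))     // "": . skips newlines by default
	fmt.Printf("%q\n", first(`(?s)a.b`, "a\nb")) // "a\nb"

	fmt.Printf("%q\n", first(`(?i:hello)WORLD`, "HELLOWORLD")) // "HELLOWORLD"
	fmt.Printf("%q\n", first(`(?i:hello)WORLD`, "helloworld")) // "": WORLD stays case-sensitive

	fmt.Printf("%q\n", first(`(?U)a+`, "aaa"))  // "a": lazy under (?U)
	fmt.Printf("%q\n", first(`(?U)a+?`, "aaa")) // "aaa": greedy under (?U)
}
```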
Common Gotchas
Use backtick raw strings
Always write patterns as backtick strings (`\d+`) not double-quoted ("\\d+"). Double-quoted strings interpret backslash escapes, forcing you to double every backslash, and "\d" is actually a compile error in Go. Backticks pass the pattern through verbatim.
Byte offsets, not rune offsets
The Index methods return byte offsets, not character (rune) positions. For ASCII input the two are the same, but once multi-byte UTF-8 (accented characters, emoji, CJK) appears before the match, the byte offset no longer equals the number of characters that precede it, so it can't be used as an index into a []rune. Slice the original string by the returned byte offsets, s[loc[0]:loc[1]], and it works correctly because Go strings are byte sequences.
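A short sketch makes the gap between the two kinds of offset concrete. The helper name locate and the sample string are assumptions for illustration; the rune position is derived with utf8.RuneCountInString on the prefix before the match.

```go
package main

import (
	"fmt"
	"regexp"
	"unicode/utf8"
)

var wRe = regexp.MustCompile(`w\S+`)

// locate returns the byte offset of the first match, the equivalent
// rune (character) offset, and the matched text. Hypothetical helper;
// it assumes s contains a match.
func locate(s string) (byteStart, runeStart int, matched string) {
	loc := wRe.FindStringIndex(s)
	byteStart = loc[0]
	// Count the runes in the prefix to convert bytes to characters.
	runeStart = utf8.RuneCountInString(s[:loc[0]])
	// Slicing by byte offsets always yields the exact match.
	matched = s[loc[0]:loc[1]]
	return
}

func main() {
	s := "h\u00e9llo w\u00f6rld" // "héllo wörld": é and ö are 2 bytes each
	b, r, m := locate(s)
	fmt.Println(b, r, m) // 7 6 wörld
}
```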
FindAll with a limit of 0 returns nil
The n argument to FindAll* methods means "at most n matches." Passing 0 returns nil (zero matches), not "all matches" — that's what -1 is for. A common bug is passing 0 expecting everything.
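The three cases for n can be compared in a few lines. The wrapper name matches below is hypothetical and exists only to make the limit explicit.

```go
package main

import (
	"fmt"
	"regexp"
)

var wordRe = regexp.MustCompile(`\w+`)

// matches returns up to n matches of wordRe in s (n < 0 means no limit).
func matches(s string, n int) []string {
	return wordRe.FindAllString(s, n)
}

func main() {
	text := "one two three"
	fmt.Println(matches(text, 0) == nil) // true: zero matches requested
	fmt.Println(matches(text, 2))        // [one two]
	fmt.Println(matches(text, -1))       // [one two three]
}
```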
No backreferences means some patterns need a different approach
To find a doubled word (the the), PCRE uses \b(\w+)\s+\1\b. RE2 can't. In Go, find all words with the regex, then check adjacent words in code. The pattern that "should" be one regex becomes a regex plus a loop — usually clearer anyway.
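One way to write that regex-plus-loop is sketched below. The function name doubledWords is an assumption, and the comparison is case-sensitive, mirroring the backreference version; it also checks that only whitespace separates the two words, as \s+ would require.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

var wordRe = regexp.MustCompile(`\w+`)

// doubledWords returns each word that is immediately repeated in text,
// separated only by whitespace. Hypothetical helper for illustration.
func doubledWords(text string) []string {
	locs := wordRe.FindAllStringIndex(text, -1)
	var doubled []string
	for i := 1; i < len(locs); i++ {
		prev := text[locs[i-1][0]:locs[i-1][1]]
		curr := text[locs[i][0]:locs[i][1]]
		// The text between the two words must be whitespace only,
		// so "spring. Spring" or "is, is" do not count.
		gap := text[locs[i-1][1]:locs[i][0]]
		if prev == curr && strings.TrimSpace(gap) == "" {
			doubled = append(doubled, curr)
		}
	}
	return doubled
}

func main() {
	fmt.Println(doubledWords("Paris in the the spring. Spring is is here"))
	// [the is]
}
```

Swapping prev == curr for strings.EqualFold(prev, curr) would make the check case-insensitive, something the single-regex version needs a flag for.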
Try It Live
The regex tester uses JavaScript's engine, which supports lookahead and backreferences — so a pattern that works there may need adjusting for Go's RE2. Use it to prototype the core pattern, then verify against Go's behavior. The regex explainer breaks any pattern down token by token. For the same depth in other languages, see the Python and JavaScript regex guides.
Frequently Asked Questions
Does Go regex support lookahead, lookbehind, and backreferences?
No. Go's regexp package uses the RE2 engine, which deliberately omits lookahead, lookbehind, and backreferences because they can't be supported without giving up its linear-time guarantee; in backtracking engines, these features go hand in hand with the risk of exponential-time matching (catastrophic backtracking). In exchange, RE2 guarantees that every match runs in linear time relative to the input size: a regex can never hang your program, which matters for server code processing untrusted input. If you genuinely need backreferences or lookaround, you either restructure the problem (often possible) or reach for a third-party backtracking engine like the regexp2 package (a port of the .NET regex engine), which trades the linear-time guarantee for the extra features.
What is the difference between regexp.Compile and regexp.MustCompile?
regexp.Compile(pattern) returns (*Regexp, error) — use it when the pattern comes from user input or a config file and might be invalid, so you can handle the error gracefully. regexp.MustCompile(pattern) returns just *Regexp and panics if the pattern is invalid — use it for patterns hard-coded in your source, compiled once at package level as a var. The convention is to compile package-level regexes with MustCompile in a var block so the panic happens at startup, not mid-request, if you ever introduce a typo.
How do I use named capture groups in Go regex?
Use the (?P<name>...) syntax (same as Python). FindStringSubmatch returns a slice where index 0 is the full match and the remaining entries are the capture groups; the SubexpNames() method returns the group names in the same order. The common idiom is to build a map: iterate re.SubexpNames() alongside the submatch slice and skip the empty-string name at index 0. Named groups make the extraction code readable when a pattern has many capture groups, which is worth the slightly verbose access pattern.
Why do Go regex patterns use backtick raw strings?
Go has two string literal forms: double-quoted strings interpret backslash escapes (so "\d" is an error and "\\d" is needed), and backtick raw strings pass every character literally. Always write regex patterns as backtick raw strings — `\d+` instead of "\\d+" — so the pattern matches what you see and you avoid double-escaping every backslash. The only exception is when your pattern needs to contain a literal backtick, which raw strings can't express; then fall back to a double-quoted string with escaped backslashes.
How do I make a Go regex case-insensitive?
Prefix the pattern with the inline flag (?i): regexp.MustCompile(`(?i)hello`) matches "hello", "HELLO", and "Hello". Go's regexp package has no separate flags argument — all flags are inline. Other useful inline flags: (?m) for multi-line mode (^ and $ match line boundaries), (?s) for dotall (. matches newlines), and (?i)(?m) or the combined (?im) to stack them. You can also scope a flag to part of the pattern with (?i:hello)WORLD, where only "hello" is case-insensitive.