Regular Expressions Tutorial for Beginners (with Examples)
Regular expressions (regex) look intimidating at first, but they follow simple rules. This tutorial takes you from zero to writing complex patterns, step by step. Try every example in our free regex tester.
What is a Regular Expression?
A regular expression is a pattern that describes a set of strings. You use it to search, match, validate, or replace text. Most mainstream programming languages support regex, including JavaScript, Python, Java, Go, and PHP, though the exact syntax and feature set vary slightly between engines.
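As a rough illustration of those four uses, here is a minimal Python sketch built on the standard `re` module. The sample text and variable names are made up for this example.

```python
import re

text = "say hello world, hello again"

# Search: locate the first occurrence of the pattern.
first_index = re.search(r"hello", text).start()

# Match: collect every occurrence.
all_hits = re.findall(r"hello", text)

# Validate: the entire string must match the pattern, not just part of it.
whole_string_is_hello = re.fullmatch(r"hello", text) is not None

# Replace: substitute every occurrence with new text.
replaced = re.sub(r"hello", "goodbye", text)

print(first_index)            # 4
print(all_hits)               # ['hello', 'hello']
print(whole_string_is_hello)  # False
print(replaced)               # say goodbye world, goodbye again
```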
// JavaScript example
const pattern = /hello/;
pattern.test("say hello world"); // true
// Python example
import re
re.search(r"hello", "say hello world")  # Match found
1. Literal Characters
The simplest regex is just normal text. The pattern cat matches the literal string "cat" anywhere in the input.
Pattern: cat
"the cat sat" → matches "cat" at index 4 (counting from 0)
"concatenate" → matches "cat" at index 3
"dog" → no match
Regex is case-sensitive by default. Cat does not match "cat" unless you use the i flag.
2. Metacharacters (Special Characters)
These characters have special meaning in regex:
. ^ $ * + ? { } [ ] ( ) | \
To match them literally, escape with a backslash:
\. matches a literal dot
\$ matches a literal dollar sign
\( matches a literal opening parenthesis
3. The Dot (.)
Matches any single character except newline.
Pattern: c.t
"cat" → match
"cot" → match
"cut" → match
"coat" → no match (two characters between c and t)
"ct" → no match (zero characters between c and t)
4. Character Classes [ ]
Match one character from a set.
[aeiou] — any vowel
[0-9] — any digit (0 through 9)
[a-z] — any lowercase letter
[A-Z] — any uppercase letter
[a-zA-Z] — any letter
[a-zA-Z0-9] — any alphanumeric character
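To see each class pick out single characters, here is a small Python sketch. The sample strings are invented for illustration.

```python
import re

# Each class matches exactly one character, so findall returns single characters.
vowels = re.findall(r"[aeiou]", "regular")
digits = re.findall(r"[0-9]", "Room 42B")
uppercase = re.findall(r"[A-Z]", "Room 42B")
alphanumeric = re.findall(r"[a-zA-Z0-9]", "a-B_9!")

print(vowels)        # ['e', 'u', 'a']
print(digits)        # ['4', '2']
print(uppercase)     # ['R', 'B']
print(alphanumeric)  # ['a', 'B', '9']
```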
Examples:
Pattern: b[aeiou]t
"bat" → match
"bet" → match
"bit" → match
"but" → match
"bot" → match
"bxt" → no match
Negated character class [^ ]
A caret inside brackets means "anything except these."
[^0-9] — anything that is NOT a digit
[^aeiou] — anything that is NOT a vowel
[^\s] — anything that is NOT whitespace
5. Shorthand Character Classes
\d — digit [0-9]
\D — non-digit [^0-9]
\w — word character [a-zA-Z0-9_]
\W — non-word [^a-zA-Z0-9_]
\s — whitespace [ \t\n\r\f\v]
\S — non-whitespace [^ \t\n\r\f\v]
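A quick Python sketch of the shorthands on an invented sample string. One caveat specific to Python 3: for `str` patterns, `\d`, `\w`, and `\s` also match non-ASCII digits, letters, and spaces by default. Passing `re.ASCII` restricts them to the bracket sets listed above.

```python
import re

sample = "Order 66, aisle_7\t(back)"

digits = re.findall(r"\d", sample)       # single digits
words = re.findall(r"\w+", sample)       # runs of word characters
whitespace = re.findall(r"\s", sample)   # spaces and the tab
non_word = re.findall(r"\W", sample)     # everything that is not a word character

print(digits)      # ['6', '6', '7']
print(words)       # ['Order', '66', 'aisle_7', 'back']
print(whitespace)  # [' ', ' ', '\t']
print(non_word)    # [' ', ',', ' ', '\t', '(', ')']
```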
Example:
Pattern: \d\d\d-\d\d\d\d
"555-1234" → match
"abc-defg" → no match
6. Anchors
Anchors do not match characters — they match positions.
^ — start of string (or start of line with m flag)
$ — end of string (or end of line with m flag)
\b — word boundary
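The effect of the m flag on ^ and $ is easiest to see on a two-line string. This Python sketch assumes `re.MULTILINE` as the equivalent of the m flag. The sample text is made up.

```python
import re

text = "hello world\nhello again"

# Without the multiline flag, ^ only matches at the very start of the string.
start_default = re.findall(r"^hello", text)
start_multiline = re.findall(r"^hello", text, re.MULTILINE)

# "world" ends the first line but not the whole string.
end_default = re.search(r"world$", text) is not None
end_multiline = re.search(r"world$", text, re.MULTILINE) is not None

# \b requires a word boundary on both sides, so "concat" is skipped.
whole_words = re.findall(r"\bcat\b", "cat concat cat.")

print(start_default)    # ['hello']
print(start_multiline)  # ['hello', 'hello']
print(end_default)      # False
print(end_multiline)    # True
print(whole_words)      # ['cat', 'cat']
```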
Examples:
Pattern: ^hello
"hello world" → match (hello is at the start)
"say hello" → no match
Pattern: world$
"hello world" → match (world is at the end)
"world cup" → no match
Pattern: \bcat\b
"the cat sat" → matches "cat"
"concatenate" → no match (cat is not a whole word)
7. Quantifiers
Control how many times a character or group is matched.
* — 0 or more
+ — 1 or more
? — 0 or 1 (optional)
{3} — exactly 3
{2,5} — between 2 and 5
{3,} — 3 or more
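One way to compare the quantifiers side by side is to run each against the same candidates. In this Python sketch, the `matching` helper and the candidate strings are hypothetical, and `re.fullmatch` is used so the whole string has to fit the pattern.

```python
import re

candidates = ["a", "ab", "abb", "abbb", "abbbbbb"]  # the last one has six b's

def matching(pattern: str) -> list[str]:
    """Return the candidates that the pattern matches in full."""
    return [s for s in candidates if re.fullmatch(pattern, s)]

print(matching(r"ab*"))      # ['a', 'ab', 'abb', 'abbb', 'abbbbbb']
print(matching(r"ab+"))      # ['ab', 'abb', 'abbb', 'abbbbbb']
print(matching(r"ab?"))      # ['a', 'ab']
print(matching(r"ab{3}"))    # ['abbb']
print(matching(r"ab{2,5}"))  # ['abb', 'abbb']
print(matching(r"ab{3,}"))   # ['abbb', 'abbbbbb']
```

Note that {2,5} is inclusive at both ends: two b's and five b's both qualify, six do not.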
Examples:
Pattern: colou?r
"color" → match (u appears 0 times)
"colour" → match (u appears 1 time)
Pattern: go+gle
"gogle" → match
"google" → match
"gooogle" → match
"ggle" → no match (need at least 1 'o')
Pattern: \d{3}-\d{4}
"555-1234" → match
"55-1234" → no match (need exactly 3 digits)
Greedy vs Lazy
By default, quantifiers are greedy — they match as much as possible. Add ? after a quantifier to make it lazy (match as little as possible).
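The same difference shows up when extracting quoted text. This is a minimal Python sketch with an invented input string.

```python
import re

text = 'say "one" and "two"'

# Greedy: .* runs to the LAST closing quote.
greedy = re.findall(r'".*"', text)

# Lazy: .*? stops at the FIRST closing quote each time.
lazy = re.findall(r'".*?"', text)

print(greedy)  # ['"one" and "two"']
print(lazy)    # ['"one"', '"two"']
```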
Input: "<b>bold</b> and <i>italic</i>"
Greedy: <.*> → matches "<b>bold</b> and <i>italic</i>"
(everything from first < to LAST >)
Lazy: <.*?> → matches "<b>", then "</b>", then "<i>", then "</i>"
(stops at the FIRST > each time)
8. Groups ( )
Parentheses group characters together, creating a sub-expression that can be quantified, alternated, or captured.
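The three roles of a group (quantify, alternate, capture) can be sketched in Python as follows. The example words are chosen only for illustration.

```python
import re

# Quantify: (ab)+ repeats the whole group, while ab+ repeats only the "b".
group_repeat = re.fullmatch(r"(ab)+", "ababab") is not None
char_repeat = re.fullmatch(r"ab+", "ababab") is not None

# Alternate: the parentheses limit the | to the letters inside them.
spellings = [w for w in ["gray", "grey", "groy"] if re.fullmatch(r"gr(a|e)y", w)]

# Capture: the text matched by the group can be read back afterwards.
captured = re.fullmatch(r"gr(a|e)y", "grey").group(1)

print(group_repeat)  # True
print(char_repeat)   # False
print(spellings)     # ['gray', 'grey']
print(captured)      # e
```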
Capturing groups
Pattern: (\d{3})-(\d{4})
Input: "Call 555-1234"
Group 0 (full match): "555-1234"
Group 1: "555"
Group 2: "1234"
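For comparison, a Python sketch of the same extraction, assuming the standard `re` module, where groups are read with `group(n)` or `groups()`:

```python
import re

m = re.search(r"(\d{3})-(\d{4})", "Call 555-1234")

full_match = m.group(0)   # the entire match
prefix = m.group(1)       # first pair of parentheses
line_number = m.group(2)  # second pair of parentheses
all_groups = m.groups()   # every capture as a tuple, excluding group 0

print(full_match)   # 555-1234
print(prefix)       # 555
print(line_number)  # 1234
print(all_groups)   # ('555', '1234')
```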
// JavaScript
const match = "Call 555-1234".match(/(\d{3})-(\d{4})/);
// match[1] === "555"
// match[2] === "1234"
Non-capturing groups
Use (?:...) when you need grouping but do not need to capture.
Pattern: (?:http|https)://\S+
— Groups http|https for alternation
— Does NOT create a capture group (more efficient)
9. Alternation (OR)
The pipe | means "or".
Pattern: cat|dog
"I have a cat" → matches "cat"
"I have a dog" → matches "dog"
Pattern: (Mon|Tues|Wednes|Thurs|Fri|Satur|Sun)day
Matches any day of the week
10. Lookahead & Lookbehind
Assert that something exists ahead or behind the current position without including it in the match.
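The four kinds of assertion can be tried in Python, whose `re` module supports lookahead and fixed-width lookbehind. The inputs in this sketch are invented. In every case the asserted text is checked but left out of the match.

```python
import re

text = "I have 100 dollars and $50"

# Positive lookahead: digits that are followed by " dollars".
before_dollars = re.findall(r"\d+(?= dollars)", text)

# Positive lookbehind: digits that are preceded by "$".
after_sign = re.findall(r"(?<=\$)\d+", text)

# Negative lookahead: a "q" that is NOT followed by "u".
q_without_u = re.findall(r"q(?!u)", "Iraq quit")

# Negative lookbehind plus negative lookahead: exactly three digits
# with no digit on either side.
isolated = re.findall(r"(?<!\d)\d{3}(?!\d)", "555-1234 and 999")

print(before_dollars)  # ['100']
print(after_sign)      # ['50']
print(q_without_u)     # ['q']
print(isolated)        # ['555', '999']
```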
Positive lookahead (?=...)
Pattern: \d+(?= dollars)
Input: "I have 100 dollars"
Match: "100" (followed by " dollars", but " dollars" is not in the match)
// Useful: match a word only if followed by something
Pattern: \w+(?=\()
Input: "call myFunction(42)"
Match: "myFunction"
Negative lookahead (?!...)
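Pattern: \d{3}(?!-)
Input: "555-1234 and 999"
Matches: "123" and "999" (three digits NOT immediately followed by a hyphen)
// "555" is followed by "-", so it is skipped. "123" inside "1234" still matches because its next character is "4".
// To match only standalone three-digit numbers, add word boundaries: \b\d{3}\b(?!-) matches just "999".
Positive lookbehind (?<=...)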
Pattern: (?<=\$)\d+
Input: "Price: $50"
Match: "50" (preceded by $, but $ is not in the match)
Negative lookbehind (?<!...)
Pattern: (?<!\d)\d{3}(?!\d)
— Matches exactly 3 digits not surrounded by other digits
11. Flags / Modifiers
i — case-insensitive: /hello/i matches "Hello", "HELLO"
g — global: find ALL matches, not just the first
m — multiline: ^ and $ match line boundaries
s — dotAll: . also matches newline characters
u — unicode: full Unicode support
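The flag letters above follow JavaScript's syntax. As a rough Python counterpart, the sketch below uses `re.IGNORECASE`, `re.MULTILINE`, and `re.DOTALL`. Python has no g flag: `re.search` returns the first match, while `re.findall` and `re.finditer` return all of them. Python 3 `str` patterns are Unicode-aware by default.

```python
import re

text = "Hello\nhello\nHELLO"

# i: ignore case. findall already returns every match, like the g flag.
any_case = re.findall(r"hello", text, re.IGNORECASE)

# Without a "global" function, search stops at the first match.
first_only = re.search(r"hello", text).group()

# m: ^ matches at the start of every line. Flags combine with |.
line_starts = re.findall(r"^h\w+", text, re.MULTILINE | re.IGNORECASE)

# s: the dot only crosses the newline when DOTALL is set.
dot_default = re.search(r"Hello.hello", text) is not None
dot_all = re.search(r"Hello.hello", text, re.DOTALL) is not None

print(any_case)     # ['Hello', 'hello', 'HELLO']
print(first_only)   # hello
print(line_starts)  # ['Hello', 'hello', 'HELLO']
print(dot_default)  # False
print(dot_all)      # True
```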
// JavaScript example
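const text = "Pattern matching with a pattern";
const matches = [...text.matchAll(/pattern/gi)]; // matchAll returns an iterator, so spread it into an array
12. Common Real-World Patterns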
Email (simplified)
^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$
Breakdown:
^ — start of string
[a-zA-Z0-9._%+-]+ — one or more valid email chars (local part)
@ — literal @
[a-zA-Z0-9.-]+ — domain name
\. — literal dot
[a-zA-Z]{2,} — TLD (at least 2 letters)
$ — end of string
URL
https?://[\w.-]+(?:\.[a-zA-Z]{2,})(?:/[^\s]*)?
Breakdown:
https? — http or https
:// — literal ://
[\w.-]+ — domain characters
(?:\.[a-zA-Z]{2,}) — .com, .org, etc.
(?:/[^\s]*)? — optional path
Phone number (US format)
^\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}$
Matches:
(555) 123-4567
555-123-4567
555.123.4567
5551234567
IP address (IPv4)
^(?:(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(?:25[0-5]|2[0-4]\d|[01]?\d\d?)$
Matches: 192.168.1.1, 10.0.0.255
Rejects: 256.1.1.1, 192.168.1
Strong password validation
^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$
Requirements:
(?=.*[a-z]) — at least one lowercase
(?=.*[A-Z]) — at least one uppercase
(?=.*\d) — at least one digit
(?=.*[@$!%*?&]) — at least one special character
{8,} — minimum 8 characters
Date (YYYY-MM-DD)
^\d{4}-(?:0[1-9]|1[0-2])-(?:0[1-9]|[12]\d|3[01])$
Matches: 2026-04-02, 2025-12-31
Rejects: 2026-13-01, 2026-00-15
Note: the pattern checks the format only, not month lengths, so 2026-02-31 still passes.
HTML tag
<([a-z][a-z0-9]*)\b[^>]*>(.*?)</\1>
Matches: <p>text</p>, <div class="x">content</div>
\1 is a backreference to the tag name captured in group 1
13. Tips & Common Pitfalls
- Start simple, then refine. Get a basic pattern working before adding edge case handling.
- Use a regex tester. Try your patterns interactively with our regex tester tool — it highlights matches and groups in real time.
- Beware of catastrophic backtracking. Patterns like (a+)+$ can cause exponential runtime on certain inputs. Keep quantifiers unambiguous.
- Don't parse HTML with regex. Use a proper HTML parser. Regex is fine for simple tag extraction but falls apart with nested structures.
- Use raw strings. In Python use r"...", in JavaScript use /pattern/; this avoids double-escaping backslashes.
- Test edge cases. Try empty strings, strings with only whitespace, very long inputs, and Unicode characters.
Regex Cheatsheet
| Pattern | Meaning |
|---|---|
| . | Any character (except newline) |
| \d | Digit [0-9] |
| \w | Word character [a-zA-Z0-9_] |
| \s | Whitespace |
| ^ | Start of string |
| $ | End of string |
| \b | Word boundary |
| * | 0 or more |
| + | 1 or more |
| ? | 0 or 1 |
| {n} | Exactly n |
| {n,m} | Between n and m |
| [abc] | Any of a, b, or c |
| [^abc] | Not a, b, or c |
| (group) | Capture group |
| a\|b | a or b |
| (?=...) | Positive lookahead |
| (?!...) | Negative lookahead |
Practice Regex Right Now
Open our free regex tester and try every pattern from this tutorial. Real-time highlighting, group extraction, and flag toggles included.