Syntax and Creating Regex
// Regex literal: preferred for static patterns
const pattern1 = /hello/;
const pattern2 = /hello/gi; // with flags
// RegExp constructor: for dynamic patterns
const word = 'hello';
const pattern3 = new RegExp(word, 'i'); // /hello/i
const pattern4 = new RegExp(`\\b${word}\\b`, 'g'); // word boundary
// Flags overview
// g - global: find ALL matches, not just the first
// i - case-insensitive
// m - multiline: ^ and $ match start/end of each line
// s - dotAll: . matches newlines too (ES2018)
// u - unicode: full Unicode support
// y - sticky: match only at the lastIndex position
const re = /pattern/gim;
console.log(re.flags); // 'gim'
console.log(re.global); // true
console.log(re.source); // 'pattern'
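The m, s, and y flags are easier to understand when you see them change a result. The following is a minimal sketch; the sample string and variable names are made up for illustration.

```javascript
// Hypothetical two-line sample text
const log = 'error: disk full\nerror: timeout';

// Without m, ^ only matches at the very start of the string
const withoutM = log.match(/^error/g).length;
// With m, ^ also matches right after each line break
const withM = log.match(/^error/gm).length;

// Without s, . stops at the newline; with s it can cross it
const dotPlain = /disk.*timeout/.test(log);
const dotAll = /disk.*timeout/s.test(log);

// Sticky: the match must begin exactly at lastIndex
const sticky = /error/y;
sticky.lastIndex = 1;
const stickyMiss = sticky.test(log); // 'error' does not start at index 1
// A failed sticky match resets lastIndex to 0, so this attempt starts at index 0
const stickyHit = sticky.test(log);

console.log(withoutM, withM);       // 1 2
console.log(dotPlain, dotAll);      // false true
console.log(stickyMiss, stickyHit); // false true
```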
Character Classes and Anchors
// Character classes
/[abc]/ // matches 'a', 'b', or 'c'
/[a-z]/ // matches any lowercase letter
/[A-Z]/ // matches any uppercase letter
/[0-9]/ // matches any digit
/[a-zA-Z0-9_]/ // alphanumeric + underscore
/[^abc]/ // negated class: anything EXCEPT a, b, c
// Shorthand classes
// \d digit [0-9]
// \D non-digit [^0-9]
// \w word char [a-zA-Z0-9_]
// \W non-word
// \s whitespace [ \t\n\r\f\v] plus Unicode spaces
// \S non-whitespace
// . any char except newline (with the s flag, . matches \n too)
console.log(/\d+/.test('abc123')); // true
console.log(/^\d+$/.test('123')); // true (only digits)
console.log(/^\d+$/.test('123a')); // false
// Anchors
// ^ start of string (or line with /m)
// $ end of string (or line with /m)
// \b word boundary
// \B non-word boundary
/^hello/.test('hello world'); // true (starts with 'hello')
/world$/.test('hello world'); // true (ends with 'world')
/\bcat\b/.test('cat'); // true
/\bcat\b/.test('catch'); // false (no word boundary after 'cat')
| Token | Matches | Equivalent |
|---|---|---|
| \d | Any digit | [0-9] |
| \D | Any non-digit | [^0-9] |
| \w | Word character | [a-zA-Z0-9_] |
| \W | Non-word character | [^a-zA-Z0-9_] |
| \s | Whitespace | [ \t\n\r\f\v] plus Unicode spaces |
| \S | Non-whitespace | - |
| . | Any char except \n | Use /s flag to include \n |
| ^ | Start of string/line | Use /m for per-line |
| $ | End of string/line | Use /m for per-line |
| \b | Word boundary | - |
Quantifiers
// Quantifiers
// * zero or more
// + one or more
// ? zero or one (also makes quantifiers lazy)
// {n} exactly n
// {n,} n or more
// {n,m} between n and m
/\d*/.test('abc'); // true (zero digits is fine)
/\d+/.test('abc'); // false (requires at least one)
/\d?/.test('abc'); // true (zero or one)
/\d{3}/.test('123'); // true
/\d{3}/.test('12'); // false
// Greedy vs lazy
const html = '<p>First</p><p>Second</p>';
html.match(/.*<\/p>/)[0]; // '<p>First</p><p>Second</p>' GREEDY
html.match(/.*?<\/p>/)[0]; // '<p>First</p>' LAZY (non-greedy)
// Add ? after quantifier to make it lazy: *? +? {n,m}?
Greedy: <p>First</p><p>Second</p>
Lazy: <p>First</p>
By default quantifiers are greedy: they match as many characters as possible. Add ? to make them lazy, so they match as few characters as possible. This matters when matching HTML tags, strings in quotes, or any pattern with repeated delimiters.
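The same difference shows up with quoted strings. The sketch below uses a made-up sample line and compares a greedy match, a lazy match, and a negated character class, which is a common alternative to a lazy quantifier when the delimiter is a single character.

```javascript
// Hypothetical sample line with two quoted strings
const line = 'say "hi" and "bye"';

// Greedy: .* runs to the end, then backtracks to the LAST quote
const greedy = line.match(/".*"/)[0];

// Lazy: .*? stops at the FIRST closing quote
const lazy = line.match(/".*?"/)[0];

// Negated class: [^"]* can never run past a quote, so no backtracking surprise
const eachQuoted = line.match(/"[^"]*"/g);

console.log(greedy);     // "hi" and "bye"
console.log(lazy);       // "hi"
console.log(eachQuoted); // [ '"hi"', '"bye"' ]
```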
Groups and Alternation
// Capturing group ( ): captures the matched substring
const dateStr = '2026-06-15';
const dateRe = /(\d{4})-(\d{2})-(\d{2})/;
const match = dateStr.match(dateRe);
console.log(match[0]); // '2026-06-15' (full match)
console.log(match[1]); // '2026' (first group)
console.log(match[2]); // '06'
console.log(match[3]); // '15'
// Named capturing groups (?<name>...) (ES2018)
const namedRe = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;
const { groups: { year, month, day } } = dateStr.match(namedRe);
console.log(year, month, day); // 2026 06 15
// Non-capturing group (?:...): group without capturing
/(?:https?):\/\//.test('https://example.com'); // true
// Alternation |
/cat|dog/.test('I have a dog'); // true
/^(cat|dog)$/.test('cat'); // true
/^(cat|dog)$/.test('cats'); // false
// Backreferences \1, \2 β reference earlier group
/(['"])(.*?)\1/.test('"hello"'); // true (matching quotes)
/(['"])(.*?)\1/.test('"hello\''); // false (mismatched)
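Groups are also available inside a replacement string: $1 and $2 refer to numbered groups, and $<name> refers to a named group. A short sketch with illustrative variable names:

```javascript
// Reorder a date using named groups in the replacement string
const isoDate = '2026-06-15';
const isoRe = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;
const ukDate = isoDate.replace(isoRe, '$<day>/$<month>/$<year>');

// Swap two words using numbered groups
const swapped = 'hello world'.replace(/(\w+) (\w+)/, '$2 $1');

// Detect an accidentally doubled word with a backreference
const hasDoubledWord = /\b(\w+) \1\b/.test('it is is fine');

console.log(ukDate);         // 15/06/2026
console.log(swapped);        // world hello
console.log(hasDoubledWord); // true
```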
Lookahead and Lookbehind
// Lookahead (?=...): match X only if followed by Y (Y is not consumed)
const prices = ['$10', '€20', '$30', '£15'];
const dollarAmounts = prices.filter(p => /^\$(?=\d+$)/.test(p)); // ['$10', '$30']
// Match the number that comes before 'USD'
'100USD'.match(/\d+(?=USD)/)[0]; // '100'
// Negative lookahead (?!...): match X only if NOT followed by Y
'hello world'.match(/\b\w+\b(?! world)/g); // ['world'] ('hello' is followed by ' world')
'foo123'.replace(/(\d)(?!.*\d)/, '$1!'); // 'foo123!' (adds ! after the last digit)
// Lookbehind (?<=...): match X only if preceded by Y (ES2018)
'$100 €200 £300'.match(/(?<=\$)\d+/g); // ['100'] (digits preceded by $)
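// Negative lookbehind (?<!...): match X only if NOT preceded by Y (ES2018)
// Password check: each requirement is assumed to be one simple test
function checkPassword(password) {
  const hasUpper = /[A-Z]/.test(password);
  const hasLower = /[a-z]/.test(password);
  const hasDigit = /\d/.test(password);
  const hasSpecial = /[^a-zA-Z0-9]/.test(password);
  const isLong = password.length >= 8;
  return hasUpper && hasLower && hasDigit && hasSpecial && isLong;
}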
console.log(checkPassword('Passw0rd!')); // true
console.log(checkPassword('password')); // false
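For negative lookbehind, one possible illustration is picking out the amounts that are not in dollars. This is a sketch with a made-up sample string. It also shows a common pitfall: the lookbehind has to exclude a preceding digit as well, otherwise the engine matches the tail of a number it was supposed to skip.

```javascript
// Hypothetical sample with three currencies
const amounts = '$100 €200 £300';

// Naive attempt: '00' sneaks in because it is preceded by '1', not by '$'
const naive = amounts.match(/(?<!\$)\d+/g);

// Also forbid a preceding digit so a match can only start at the beginning of a number
const nonDollar = amounts.match(/(?<![$\d])\d+/g);

console.log(naive);     // [ '00', '200', '300' ]
console.log(nonDollar); // [ '200', '300' ]
```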
Regex Methods
const text = 'The price is $42.50 and $18.99';
const priceRe = /\$[\d.]+/g;
// test(str): returns boolean
console.log(/\d/.test('abc123')); // true
// exec(str): returns match array or null; advances lastIndex with /g
const re = /\$[\d.]+/g;
let m;
while ((m = re.exec(text)) !== null) {
console.log(`Found ${m[0]} at index ${m.index}`);
}
// Found $42.50 at index 13
// Found $18.99 at index 24
// String methods that accept regex:
// match(regex): returns array of matches
const matches = text.match(priceRe);
console.log(matches); // ['$42.50', '$18.99']
// matchAll(regex): returns iterator of all matches with groups
const allMatches = [...text.matchAll(/\$(\d+)\.(\d+)/g)];
console.log(allMatches[0].groups); // undefined (no named groups here)
console.log(allMatches[0][1], allMatches[0][2]); // '42' '50'
// search(regex): returns index of first match (-1 if none)
console.log(text.search(/\$/)); // 13
// replace(regex, replacement)
const sanitised = text.replace(/\$[\d.]+/g, '[PRICE]');
console.log(sanitised); // 'The price is [PRICE] and [PRICE]'
// replace with function
const converted = text.replace(/\$([\d.]+)/g, (match, amount) =>
  `£${(parseFloat(amount) * 0.79).toFixed(2)}`
);
console.log(converted); // 'The price is £33.58 and £15.00'
// split(regex)
'one1two2three3four'.split(/\d/); // ['one','two','three','four']
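matchAll is most useful together with named groups, because each result then carries a populated groups object. A minimal sketch, reusing the price sentence from above; the group names dollars and cents are chosen for illustration.

```javascript
const receipt = 'The price is $42.50 and $18.99';

// Named groups make each match self-describing
const pricePartsRe = /\$(?<dollars>\d+)\.(?<cents>\d+)/g;

// Spread the iterator into an array, then pick out what is needed
const parsed = [...receipt.matchAll(pricePartsRe)].map(m => ({
  dollars: Number(m.groups.dollars),
  cents: Number(m.groups.cents),
  index: m.index,
}));

console.log(parsed);
// [ { dollars: 42, cents: 50, index: 13 }, { dollars: 18, cents: 99, index: 24 } ]
```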
Common Patterns
// Email (simplified; RFC 5322 is far more complex)
const emailRe = /^[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}$/;
emailRe.test('user@example.com'); // true
emailRe.test('not-an-email'); // false
// URL
const urlRe = /^https?:\/\/[^\s/$.?#].[^\s]*$/i;
urlRe.test('https://ylearner.org'); // true
urlRe.test('ftp://old.com'); // false
// UK phone number (simplified)
const ukPhoneRe = /^(\+44|0)[\d\s\-]{9,13}$/;
ukPhoneRe.test('+44 7700 900123'); // true
// IPv4 address
const ipRe = /^(\d{1,3}\.){3}\d{1,3}$/;
ipRe.test('192.168.1.1'); // true (does not validate ranges 0-255)
// Hex colour
const hexRe = /^#([A-Fa-f0-9]{6}|[A-Fa-f0-9]{3})$/;
hexRe.test('#FF5733'); // true
hexRe.test('#FFF'); // true
hexRe.test('#GGGGGG'); // false
// Username (3-16 alphanumeric + underscore)
const usernameRe = /^[a-zA-Z0-9_]{3,16}$/;
usernameRe.test('alice_99'); // true
usernameRe.test('ab'); // false (too short)
The full RFC 5322 email specification is so complex that a correct regex fills a page. For real validation, use a simple regex to catch obvious mistakes, then send a verification email. Server-side validation should always be the final gate.
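The IPv4 pattern above accepts octets such as 999. If range checking is needed, one option is to spell out the 0-255 range in the pattern itself. This is a sketch rather than a definitive validator; the names octet and strictIpRe are illustrative, and rejecting leading zeros is a design choice of this version.

```javascript
// One octet: 250-255, 200-249, 100-199, or 0-99 without a leading zero
const octet = '(?:25[0-5]|2[0-4]\\d|1\\d\\d|[1-9]?\\d)';

// Four octets separated by dots, anchored at both ends
const strictIpRe = new RegExp(`^${octet}(?:\\.${octet}){3}$`);

console.log(strictIpRe.test('192.168.1.1'));     // true
console.log(strictIpRe.test('255.255.255.255')); // true
console.log(strictIpRe.test('256.1.1.1'));       // false (256 is out of range)
console.log(strictIpRe.test('1.2.3'));           // false (only three octets)
console.log(strictIpRe.test('01.2.3.4'));        // false (leading zero)
```

Splitting on dots and checking each number with ordinary code is often easier to read than this regex; which one fits depends on where the check runs.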
Interview Questions
- What is the difference between test and exec?
- Why does using a /g regex with exec in a loop work, but reusing the same regex variable between calls can cause bugs?
- What is the difference between a capturing group and a non-capturing group?
- How do lookaheads and lookbehinds differ from regular groups?
- What does the s flag do, and when would you need it?
Practical Exercise
Write a maskSensitiveData(text) function that uses regex to:
- Replace credit card numbers (16 digits, optionally spaced in groups of 4) with ****-****-****-NNNN (keep last 4 digits).
- Replace email addresses with ***@domain.tld.
- Replace UK phone numbers with ***-****-****.
Test with a string that contains all three.
Challenge Exercise
Build a simple template engine using regex. Given a template string like "Hello, {{name}}! You have {{count}} messages." and a data object { name: 'Alice', count: 3 }, write a render(template, data) function that replaces all {{variable}} placeholders with their values. Handle missing variables gracefully (substitute empty string), and support a pipe syntax for simple transforms: {{name|upper}} calls .toUpperCase().
Frequently Asked Questions
- What is the lastIndex property and why does it cause bugs?
  When you use a /g or /y regex with exec or test, it remembers where it left off via the lastIndex property. If you reuse the same regex object on a different string without resetting lastIndex to 0, the next call starts from the wrong position. To avoid this, reset lastIndex before reuse, use an inline regex literal, or create a new RegExp each time.
- What is the difference between match and matchAll?
  With a /g flag, match returns an array of all matched strings but loses group information. matchAll (ES2020) returns an iterator of full match objects, each with the full match, groups, and index, making it far more useful when you need capturing groups from multiple matches.
- How do I escape special characters in a dynamic regex?
  Use str.replace(/[.*+?^${}()|[\]\\]/g, '\\$&') to escape all regex metacharacters in a string before passing it to new RegExp(). Newer engines also provide a dedicated RegExp.escape() method; check support before relying on it.
- Is regex always the best tool for parsing?
  No. For HTML/XML use a proper DOM parser; for JSON use JSON.parse(); for complex grammars use a dedicated parser library. Regex excels at pattern matching in strings but becomes unreadable and unreliable for nested or context-sensitive structures.
- What is catastrophic backtracking?
  Certain regex patterns with nested quantifiers can cause exponential time complexity on specific inputs, hanging or crashing the engine. For example: /^(a+)+$/ on a long string of 'a's followed by a non-matching character. Always test regex patterns against adversarial inputs before using them in production.
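Two of the answers above can be reproduced in a few lines: the lastIndex pitfall and escaping user input. The sketch below uses illustrative names (sharedRe, escapeRegExp, userInput); the escaping helper is the replace call quoted in the answer, wrapped in a function.

```javascript
// 1. The lastIndex pitfall with a shared /g regex
const sharedRe = /\d/g;
sharedRe.test('a1');                    // true; lastIndex is now 2
const stale = sharedRe.test('1a');      // false: the search began at index 2

sharedRe.test('a1');                    // true again; lastIndex is back at 2
sharedRe.lastIndex = 0;                 // explicit reset before reuse
const afterReset = sharedRe.test('1a'); // true

// 2. Escaping metacharacters before building a dynamic regex
function escapeRegExp(str) {
  return str.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

const userInput = '1+1=2?';
// Unescaped, + is a quantifier, so /1+1/ looks for '11', not '1+1'
const unescapedHit = new RegExp('1+1').test('1+1');
// Escaped, every character is matched literally
const escapedHit = new RegExp(escapeRegExp(userInput)).test('is 1+1=2? yes');

console.log(stale, afterReset);        // false true
console.log(unescapedHit, escapedHit); // false true
```

For plain yes/no checks, leaving off the g flag avoids the first problem entirely, since a non-global, non-sticky regex always starts at index 0.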
Summary
- Use regex literals /pattern/flags for static patterns; new RegExp(str, flags) for dynamic ones.
- Common flags: g (global), i (case-insensitive), m (multiline), s (dotAll).
- Shorthand classes: \d (digit), \w (word char), \s (space); uppercase negates (\D, etc.).
- Add ? after a quantifier (*?, +?) to make it lazy instead of greedy.
- Capturing groups () capture matches; non-capturing (?:) group without capturing.
- Named groups (?<name>) let you access matches by name via match.groups.
- Lookahead (?=) and lookbehind (?<=) assert context without consuming characters.
- Use matchAll with /g to get all matches with group information.