🔧 Intermediate JS

JavaScript Regular Expressions – Matching Patterns in Strings

Regular expressions are a concise language for describing patterns in text. Validating emails, extracting data from URLs, sanitising input, and replacing substrings are all dramatically simpler with regex. This guide builds your regex vocabulary from first principles to advanced lookarounds.

⏱️ 26 min read 🎯 Intermediate 📅 Updated 2026

Syntax and Creating Regex

JavaScript
// Regex literal – preferred for static patterns
const pattern1 = /hello/;
const pattern2 = /hello/gi; // with flags

// RegExp constructor – for dynamic patterns
const word = 'hello';
const pattern3 = new RegExp(word, 'i');          // /hello/i
const pattern4 = new RegExp(`\\b${word}\\b`, 'g'); // word boundary

// Flags overview
// g  – global: find ALL matches, not just first
// i  – case-insensitive
// m  – multiline: ^ and $ match start/end of each line
// s  – dotAll: . matches newlines too (ES2018)
// u  – unicode: full Unicode support
// y  – sticky: match only at lastIndex position

const re = /pattern/gim;
console.log(re.flags);  // 'gim'
console.log(re.global); // true
console.log(re.source); // 'pattern'
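
When the dynamic part of a pattern comes from user input, metacharacters such as . or + have to be escaped before they reach the RegExp constructor. The following is a minimal sketch of that idea; escapeRegExp and wholeWord are hypothetical helper names, not built-ins.

```javascript
// Hypothetical helper: escape regex metacharacters so the text is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: build a case-insensitive whole-word matcher for a term.
function wholeWord(term) {
  return new RegExp(`\\b${escapeRegExp(term)}\\b`, 'i');
}

console.log(escapeRegExp('1+1=2'));                             // 1\+1=2
console.log(wholeWord('node.js').test('I use Node.js daily'));  // true
console.log(wholeWord('node.js').test('I use nodexjs daily'));  // false
// Without escaping, the dot matches any character:
console.log(new RegExp('node.js', 'i').test('nodexjs'));        // true
```

Note that \b only behaves as expected here when the term starts and ends with a word character; a term such as 'c++' would need a different boundary check.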

Character Classes and Anchors

JavaScript
// Character classes
/[abc]/      // matches 'a', 'b', or 'c'
/[a-z]/      // matches any lowercase letter
/[A-Z]/      // matches any uppercase letter
/[0-9]/      // matches any digit
/[a-zA-Z0-9_]/ // alphanumeric + underscore
/[^abc]/     // negated class: anything EXCEPT a, b, c

// Shorthand classes
// \d  digit        [0-9]
// \D  non-digit    [^0-9]
// \w  word char    [a-zA-Z0-9_]
// \W  non-word
// \s  whitespace   [ \t\n\r\f]
// \S  non-whitespace
// .   any char except newline (. in dotAll /s flag matches \n too)

console.log(/\d+/.test('abc123'));  // true
console.log(/^\d+$/.test('123'));   // true  (only digits)
console.log(/^\d+$/.test('123a')); // false

// Anchors
// ^  start of string (or line with /m)
// $  end of string (or line with /m)
// \b word boundary
// \B non-word boundary

/^hello/.test('hello world');  // true (starts with 'hello')
/world$/.test('hello world');  // true (ends with 'world')
/\bcat\b/.test('cat');         // true
/\bcat\b/.test('catch');       // false (no word boundary after 'cat')
Token   Matches                  Equivalent
\d      Any digit                [0-9]
\D      Any non-digit            [^0-9]
\w      Word character           [a-zA-Z0-9_]
\W      Non-word character       [^a-zA-Z0-9_]
\s      Whitespace               [ \t\n\r\f]
\S      Non-whitespace           [^\s]
.       Any char except \n       Use /s flag to include \n
^       Start of string/line     Use /m for per-line
$       End of string/line       Use /m for per-line
\b      Word boundary            (none)
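
The table notes that /m and /s change how ^, $ and . behave. A short sketch of the difference, using made-up sample strings (log and block are illustrative names only):

```javascript
const log = 'ERROR disk full\nINFO started\nERROR timeout';

// Without /m, ^ only matches at the very start of the string.
const firstOnly = log.match(/^ERROR .*/g);
// With /m, ^ also matches right after each line break.
const everyLine = log.match(/^ERROR .*/gm);

console.log(firstOnly);  // [ 'ERROR disk full' ]
console.log(everyLine);  // [ 'ERROR disk full', 'ERROR timeout' ]

// Without /s, . stops at a newline; with /s it crosses lines.
const block = 'start\nmiddle\nend';
console.log(/start.*end/.test(block));   // false
console.log(/start.*end/s.test(block));  // true
```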

Quantifiers

JavaScript
// Quantifiers
// *    zero or more
// +    one or more
// ?    zero or one (also makes quantifiers lazy)
// {n}  exactly n
// {n,} n or more
// {n,m} between n and m

/\d*/.test('abc');    // true (zero digits is fine)
/\d+/.test('abc');    // false (requires at least one)
/\d?/.test('abc');    // true (zero or one)
/\d{3}/.test('123');  // true
/\d{3}/.test('12');   // false

// Greedy vs lazy
const html = '<p>First</p><p>Second</p>';
html.match(/<p>.*<\/p>/)[0];  // '<p>First</p><p>Second</p>'  GREEDY
html.match(/<p>.*?<\/p>/)[0]; // '<p>First</p>'               LAZY (non-greedy)
// Add ? after quantifier to make it lazy: *? +? {n,m}?
▶ Output
Greedy:  <p>First</p><p>Second</p>
Lazy:    <p>First</p>
💡
Greedy vs Lazy Matching

By default quantifiers are greedy: they match as many characters as possible. Add ? to make them lazy, so they match as few characters as possible. This matters when matching HTML tags, strings in quotes, or any pattern with repeated delimiters.
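
The same effect shows up with quoted strings. The sketch below compares a greedy pattern, a lazy pattern, and a negated character class on an illustrative sample line:

```javascript
const line = 'say "hello" and "goodbye"';

// Greedy: .* runs to the last quote on the line.
const greedy = line.match(/".*"/)[0];
// Lazy: .*? stops at the first closing quote.
const lazy = line.match(/".*?"/g);
// Negated class: says "anything but a quote", so it can never run past one.
const negated = line.match(/"[^"]*"/g);

console.log(greedy);   // "hello" and "goodbye"
console.log(lazy);     // [ '"hello"', '"goodbye"' ]
console.log(negated);  // [ '"hello"', '"goodbye"' ]
```

When the delimiter is a single character, the negated class is often the clearest option because it states the intent directly.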

Groups and Alternation

JavaScript
// Capturing group (  )  – captures the matched substring
const dateStr = '2026-06-15';
const dateRe = /(\d{4})-(\d{2})-(\d{2})/;
const match = dateStr.match(dateRe);
console.log(match[0]); // '2026-06-15' (full match)
console.log(match[1]); // '2026' (first group)
console.log(match[2]); // '06'
console.log(match[3]); // '15'

// Named capturing groups (?<name>...)  (ES2018)
const namedRe = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;
const { groups: { year, month, day } } = dateStr.match(namedRe);
console.log(year, month, day); // 2026 06 15

// Non-capturing group (?:...)  – group without capturing
/(?:https?):\/\//.test('https://example.com'); // true

// Alternation |
/cat|dog/.test('I have a dog'); // true
/^(cat|dog)$/.test('cat');      // true
/^(cat|dog)$/.test('cats');     // false

// Backreferences \1, \2 – reference earlier group
/(['"])(.*?)\1/.test('"hello"');  // true (matching quotes)
/(['"])(.*?)\1/.test('"hello\''); // false (mismatched)
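
Groups are also available inside replacement strings. As a small sketch, the hypothetical helper toUkDates below rewrites ISO dates using $<name> references; the function name and sample text are assumptions for illustration.

```javascript
const isoDateRe = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/g;

// Hypothetical helper: rewrite ISO dates (YYYY-MM-DD) as DD/MM/YYYY.
function toUkDates(text) {
  // $<name> in the replacement string refers to a named group.
  return text.replace(isoDateRe, '$<day>/$<month>/$<year>');
}

console.log(toUkDates('Due 2026-06-15, review 2026-07-01'));
// Due 15/06/2026, review 01/07/2026

// Numbered references ($1, $2, ...) still work alongside names.
console.log('2026-06-15'.replace(isoDateRe, '$3.$2.$1')); // 15.06.2026
```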

Lookahead and Lookbehind

JavaScript
// Lookahead (?=...)  – match X followed by Y (Y not consumed)
const prices = ['$10', '€20', '$30', '£15'];
// Keep prices where a literal $ (escaped as \$) is followed by a digit
const dollarAmounts = prices.filter(p => /^\$(?=\d)/.test(p)); // ['$10', '$30']
// Match digits only when followed by 'USD'
'100USD'.match(/\d+(?=USD)/)[0]; // '100'

// Negative lookahead (?!...)  – match X NOT followed by Y
'hello world'.match(/\b\w+\b(?! world)/g); // ['world'] ('hello' is followed by ' world')
'foo123'.replace(/(\d)(?!.*\d)/, '$1!'); // 'foo123!' (adds ! after the last digit)

// Lookbehind (?<=...)  – match X preceded by Y (ES2018)
'$100 €200 £300'.match(/(?<=\$)\d+/g); // ['100'] (digits preceded by $)

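// Negative lookbehind (?<!...)  – match X NOT preceded by Y (ES2018)
'$100 €200 £300'.match(/(?<![$\d])\d+/g); // ['200', '300'] (digits not preceded by $)

// Practical use: password strength check (one plausible set of rules)
function checkPassword(password) {
  const hasUpper = /[A-Z]/.test(password);
  const hasLower = /[a-z]/.test(password);
  const hasDigit = /\d/.test(password);
  const hasSpecial = /[^A-Za-z0-9]/.test(password);
  const isLong = password.length >= 8;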
  return hasUpper && hasLower && hasDigit && hasSpecial && isLong;
}
console.log(checkPassword('Passw0rd!'));  // true
console.log(checkPassword('password'));   // false
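
A common practical use of lookahead is inserting thousands separators, because the pattern has to find positions rather than characters. The sketch below assumes the input is a plain string of digits; addThousandsSeparators is a hypothetical helper name.

```javascript
// Hypothetical helper: add thousands separators to a string of digits.
function addThousandsSeparators(digits) {
  // \B           : a position between two digits, never at the start or end
  // (?=(\d{3})+  : followed by one or more complete groups of three digits
  // (?!\d))      : and those groups run right up to the end of the digit run
  return digits.replace(/\B(?=(\d{3})+(?!\d))/g, ',');
}

console.log(addThousandsSeparators('1234567')); // 1,234,567
console.log(addThousandsSeparators('1000'));    // 1,000
console.log(addThousandsSeparators('999'));     // 999
```

The regex matches only zero-width positions, so replace inserts commas without removing any digits. It is not suitable as-is for values with a decimal part, since digits after the point would be grouped too.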

Regex Methods

JavaScript
const text = 'The price is $42.50 and $18.99';
const priceRe = /\$[\d.]+/g;

// test(str) – returns boolean
console.log(/\d/.test('abc123'));  // true

// exec(str) – returns match array or null; advances lastIndex with /g
const re = /\$[\d.]+/g;
let m;
while ((m = re.exec(text)) !== null) {
  console.log(`Found ${m[0]} at index ${m.index}`);
}
// Found $42.50 at index 13
// Found $18.99 at index 24

// String methods that accept regex:
// match(regex) – returns array of matches
const matches = text.match(priceRe);
console.log(matches); // ['$42.50', '$18.99']

// matchAll(regex) – returns iterator of all matches with groups
const allMatches = [...text.matchAll(/\$(\d+)\.(\d+)/g)];
console.log(allMatches[0].groups);  // undefined (no named groups here)
console.log(allMatches[0][1], allMatches[0][2]); // '42' '50'

// search(regex) – returns index of first match (-1 if none)
console.log(text.search(/\$/)); // 13

// replace(regex, replacement)
const sanitised = text.replace(/\$[\d.]+/g, '[PRICE]');
console.log(sanitised); // 'The price is [PRICE] and [PRICE]'

// replace with function
const converted = text.replace(/\$([\d.]+)/g, (match, amount) =>
  `£${(parseFloat(amount) * 0.79).toFixed(2)}`
);
console.log(converted); // 'The price is £33.58 and £15.00'

// split(regex)
'one1two2three3four'.split(/\d/); // ['one','two','three','four']
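
To show matchAll working together with named groups, here is a rough sketch that pulls key=value pairs out of a URL. parseQuery is a hypothetical helper and the URL is made up for the example.

```javascript
// Hypothetical helper: collect key=value pairs from a URL's query string.
function parseQuery(url) {
  const pairRe = /[?&](?<key>[^=&#]+)=(?<value>[^&#]*)/g;
  const result = {};
  for (const m of url.matchAll(pairRe)) {
    // Each match object carries .groups, .index and the numbered captures.
    result[m.groups.key] = decodeURIComponent(m.groups.value);
  }
  return result;
}

console.log(parseQuery('https://example.com/search?q=regex%20tips&page=2#top'));
// { q: 'regex tips', page: '2' }
```

This is meant as a regex exercise. For real URL handling, the built-in URL and URLSearchParams classes cover edge cases (repeated keys, + as space) that this sketch ignores.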

Common Patterns

JavaScript
// Email (simplified – RFC 5322 is far more complex)
const emailRe = /^[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}$/;
emailRe.test('user@example.com');    // true
emailRe.test('not-an-email');        // false

// URL
const urlRe = /^https?:\/\/[^\s/$.?#].[^\s]*$/i;
urlRe.test('https://ylearner.org');  // true
urlRe.test('ftp://old.com');         // false

// UK phone number (simplified)
const ukPhoneRe = /^(\+44|0)[\d\s\-]{9,13}$/;
ukPhoneRe.test('+44 7700 900123');   // true

// IPv4 address
const ipRe = /^(\d{1,3}\.){3}\d{1,3}$/;
ipRe.test('192.168.1.1');  // true (does not validate ranges 0-255)

// Hex colour
const hexRe = /^#([A-Fa-f0-9]{6}|[A-Fa-f0-9]{3})$/;
hexRe.test('#FF5733');  // true
hexRe.test('#FFF');     // true
hexRe.test('#GGGGGG'); // false

// Username (3-16 alphanumeric + underscore)
const usernameRe = /^[a-zA-Z0-9_]{3,16}$/;
usernameRe.test('alice_99'); // true
usernameRe.test('ab');       // false (too short)
💡
Don't Over-Rely on Regex for Email Validation

The full RFC 5322 email specification is so complex that a correct regex fills a page. For real validation, use a simple regex to catch obvious mistakes, then send a verification email. Server-side validation should always be the final gate.
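
The same "regex for shape, code for meaning" approach applies to the IPv4 pattern above, which does not check that each octet is in the range 0-255. A minimal sketch, with isValidIPv4 as a hypothetical helper name:

```javascript
// Hypothetical helper: the regex checks the shape, plain code checks the ranges.
function isValidIPv4(address) {
  const match = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(address);
  if (match === null) return false;
  // match[1]..match[4] hold the four octets as strings.
  return match.slice(1).every(octet => Number(octet) <= 255);
}

console.log(isValidIPv4('192.168.1.1'));  // true
console.log(isValidIPv4('256.1.1.1'));    // false (octet out of range)
console.log(isValidIPv4('1.2.3'));        // false (wrong shape)
```

This sketch still accepts octets with leading zeros such as '01'; whether that is acceptable depends on your application.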

Interview Questions

  • What is the difference between test and exec?
  • Why does calling exec in a loop with a /g regex work, while reusing the same regex object across different strings can cause bugs?
  • What is the difference between a capturing group and a non-capturing group?
  • How do lookaheads and lookbehinds differ from regular groups?
  • What does the s flag do, and when would you need it?

πŸ‹οΈ Practical Exercise

Write a maskSensitiveData(text) function that uses regex to:

  1. Replace credit card numbers (16 digits, optionally spaced in groups of 4) with ****-****-****-NNNN (keep last 4 digits).
  2. Replace email addresses with ***@domain.tld.
  3. Replace UK phone numbers with ***-****-****.

Test with a string that contains all three.

🔥 Challenge Exercise

Build a simple template engine using regex. Given a template string like "Hello, {{name}}! You have {{count}} messages." and a data object { name: 'Alice', count: 3 }, write a render(template, data) function that replaces all {{variable}} placeholders with their values. Handle missing variables gracefully (substitute empty string), and support a pipe syntax for simple transforms: {{name|upper}} calls .toUpperCase().

Frequently Asked Questions

What is the lastIndex property and why does it cause bugs?
When you use a /g or /y regex with exec or test, it remembers where it left off via the lastIndex property. If you reuse the same regex object on a different string without resetting lastIndex to 0, the next call starts from the wrong position and can miss a match. To avoid this, reset lastIndex before reuse, drop the g flag when you only need a yes/no test, or create a fresh regex for each string.
What is the difference between match and matchAll?
With a /g flag, match returns an array of all matched strings but loses group information. matchAll (ES2020) returns an iterator of full match objects, each with the full match, groups, and index, which makes it far more useful when you need capturing groups from multiple matches.
How do I escape special characters in a dynamic regex?
Use str.replace(/[.*+?^${}()|[\]\\]/g, '\\$&') to escape all regex metacharacters in a string before passing it to new RegExp(). A dedicated RegExp.escape() method has also been standardised and is shipping in recent engines; check support in your target environments before relying on it.
Is regex always the best tool for parsing?
No. For HTML/XML use a proper DOM parser; for JSON use JSON.parse(); for complex grammars use a dedicated parser library. Regex excels at pattern matching in strings but becomes unreadable and unreliable for nested or context-sensitive structures.
What is catastrophic backtracking?
Certain regex patterns with nested quantifiers can cause exponential time complexity on specific inputs, hanging or crashing the engine. For example: /^(a+)+$/ on a long string of 'a's followed by a non-matching character. Always test regex patterns against adversarial inputs before using them in production.
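
The lastIndex behaviour described in the first answer is easy to reproduce. The sketch below uses illustrative variable names (hasDigit, hasDigitSafe) to show the bug and two ways around it:

```javascript
const hasDigit = /\d/g; // shared /g regex: keeps state in lastIndex

console.log(hasDigit.test('a1'));  // true
console.log(hasDigit.lastIndex);   // 2 (the next search resumes from here)
console.log(hasDigit.test('1a'));  // false: the scan started at index 2
console.log(hasDigit.lastIndex);   // 0 (reset after the failed match)

// Fix 1: drop the g flag when you only need a yes/no answer.
const hasDigitSafe = /\d/;
console.log(hasDigitSafe.test('a1'), hasDigitSafe.test('1a')); // true true

// Fix 2: reset lastIndex before reusing a /g regex on a new string.
hasDigit.lastIndex = 0;
console.log(hasDigit.test('1a'));  // true
```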

📋 Summary

  • Use regex literals /pattern/flags for static patterns; new RegExp(str, flags) for dynamic ones.
  • Common flags: g (global), i (case-insensitive), m (multiline), s (dotAll).
  • Shorthand classes: \d (digit), \w (word char), \s (whitespace); uppercase negates (\D, etc.).
  • Add ? after a quantifier (*?, +?) to make it lazy instead of greedy.
  • Capturing groups () capture matches; non-capturing (?:) group without capturing.
  • Named groups (?<name>) let you access matches by name via match.groups.
  • Lookahead (?=) and lookbehind (?<=) assert context without consuming characters.
  • Use matchAll with /g to get all matches with group information.