What is Regex Validation?

Regex validation is a technique where you describe a text pattern using a compact, standardized syntax and reject any response that doesn't fit that pattern. It may sound complicated (because... it is), but once you get the hang of it, using regular expressions for validation can give you much cleaner data from your forms.

Why do you need regex validation? In other words, why does standard validation fall short?

Most form builder tools offer built-in validation, sometimes without you noticing it:

  • The email field only accepts an answer when you use @ and follow it with a domain.
  • Phone fields sometimes have settings where you can choose a country.
  • Number fields often have numeric range settings.

These work fine for general-purpose forms. Where they fall short is anywhere you're collecting data that follows your own internal logic. Here are some cases that built-in validation can't handle without regex:

  • An employee ID with a specific alphanumeric structure (e.g., EMP-2024-0042)
  • A product SKU that must start with a category code
  • A UK National Insurance number (AB 12 34 56 C)
  • A postal code from a specific country that standard "zip code" fields don't recognize
  • A coupon code where only certain character combinations are valid

In all of these cases, a text field without a regex will accept anything. That means messy, inconsistent data on the backend, and extra cleanup work for whoever processes the submissions.
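As a rough illustration, two of the formats above can be written as patterns. This is a minimal sketch in Python: the pattern names and the helper function are made up for this example, and both patterns are simplified assumptions based only on the sample values shown (the official National Insurance rules are stricter about which letters are allowed).

```python
import re

# Simplified, illustrative patterns based on the example values above.
EMPLOYEE_ID = r"^EMP-\d{4}-\d{4}$"                  # e.g. EMP-2024-0042
NI_NUMBER = r"^[A-Z]{2} \d{2} \d{2} \d{2} [A-Z]$"   # e.g. AB 12 34 56 C

def fits(pattern: str, value: str) -> bool:
    """Return True when the whole value fits the pattern."""
    return re.fullmatch(pattern, value) is not None

print(fits(EMPLOYEE_ID, "EMP-2024-0042"))  # True
print(fits(EMPLOYEE_ID, "emp 2024 42"))    # False
print(fits(NI_NUMBER, "AB 12 34 56 C"))    # True
print(fits(NI_NUMBER, "AB123456C"))        # False
```

A plain text field would have accepted all four inputs; the patterns keep only the two that follow the expected structure.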

The details you should be aware of upfront

  • Regex is not beginner-friendly: Writing a regular expression from scratch requires knowing the syntax: character classes, quantifiers, anchors, and groups. There are entire books on this. If you've never written regex before, don't worry; common AI tools like ChatGPT or Claude are good at generating regex patterns, and tools like regex101.com let you test patterns against sample inputs before using them in your form.
  • Not all regex engines behave the same: Most form builders use JavaScript regex. Some use PCRE (Perl-Compatible Regular Expressions). A pattern that works perfectly in one environment may need tweaking in another. Check which engine your tool uses before copying a regex from the internet.
  • An invalid regex can break your form: Some form builder tools will block form submission entirely if the regex pattern is malformed. Respondents see an error, they can't submit, and they're stuck. So, always test your form after adding regex validation.
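One way to catch a malformed pattern before it reaches a live form is to try compiling it first. The sketch below uses Python's built-in re module; the function name is hypothetical, and the check only tells you whether the pattern is valid for Python's engine, which may differ from the engine your form builder uses.

```python
import re

def compiles(pattern: str) -> bool:
    """Return True if the pattern is syntactically valid for Python's re engine."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

print(compiles(r"^[A-Z]{3}-\d{5}$"))   # True
print(compiles(r"^([A-Z]{3}-\d{5}$"))  # False: the opening parenthesis is never closed
```

A pattern that compiles is not necessarily correct, so this complements testing the form itself rather than replacing it.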

Regex validation examples

Here are some real-world scenarios where regex validation makes a meaningful difference.

Internal reference number

Imagine a company uses a form to log support tickets for sold products. Every ticket must include an order number, formatted as three uppercase letters, a dash, and five digits (e.g., RFG-00421). Without regex, respondents may enter whatever they feel like, and the support team will have a hard time matching tickets with orders. With the pattern ^[A-Z]{3}-\d{5}$ applied to that field, only correctly formatted order numbers get through.
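To see how that pattern behaves, here is a small sketch in Python. The function name and sample inputs are invented for illustration; a form builder would run the equivalent check for you.

```python
import re

ORDER_NUMBER = r"^[A-Z]{3}-\d{5}$"

def is_order_number(value: str) -> bool:
    """Return True when the value is three uppercase letters, a dash, and five digits."""
    return re.fullmatch(ORDER_NUMBER, value) is not None

print(is_order_number("RFG-00421"))        # True
print(is_order_number("rfg-00421"))        # False: lowercase letters
print(is_order_number("RFG00421"))         # False: missing dash
print(is_order_number("RFG-421"))          # False: only three digits
print(is_order_number("order RFG-00421"))  # False: extra text before the number
```

The anchors ^ and $ are what reject the last sample: without them, a pattern can match a valid-looking fragment inside a longer, invalid answer.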

Region-specific postal code

A logistics company accepts orders from the UK only. Standard zip code validation is US-centric and won't catch UK postcode formatting. A regex pattern for UK postcodes filters out entries that don't match the expected format, so the company catches mistakes before they cause shipping problems downstream.
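A simplified postcode check might look like the sketch below. The pattern is an assumption chosen for illustration: it covers the common shapes (one or two letters, a digit, an optional letter or digit, an optional space, then a digit and two letters) but it is not the official specification and will accept some combinations that are not real postcodes.

```python
import re

# Simplified UK postcode shape; an approximation, not the official rules.
UK_POSTCODE = r"^[A-Z]{1,2}\d[A-Z\d]? ?\d[A-Z]{2}$"

def looks_like_uk_postcode(value: str) -> bool:
    """Return True when the value has the general shape of a UK postcode."""
    return re.fullmatch(UK_POSTCODE, value) is not None

print(looks_like_uk_postcode("SW1A 1AA"))  # True
print(looks_like_uk_postcode("M1 1AE"))    # True
print(looks_like_uk_postcode("90210"))     # False: US-style zip code
print(looks_like_uk_postcode("SW1A 1A"))   # False: incomplete ending
```

This sketch expects uppercase input. Whether to accept lowercase entries, or convert them before checking, is a choice that depends on your form.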

Domain-restricted email

Imagine a university wants to restrict an internal nominations form to staff only. The built-in email field accepts anything that looks like an email. A regex pattern like ^[\w.]+@university\.edu$ restricts submissions to addresses from their own domain. To use this pattern, you just need to change the domain; keep the backslash before the dot so that it matches a literal dot rather than any character.
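The sketch below shows the domain restriction in Python, assuming the dot before edu is escaped so that it only matches a literal dot. The second pattern is included only to show what goes wrong when the backslash is left out; the names and sample addresses are invented.

```python
import re

STAFF_EMAIL = r"^[\w.]+@university\.edu$"      # dot escaped: literal "."
UNESCAPED_DOT = r"^[\w.]+@university.edu$"     # dot unescaped: matches any character

def matches(pattern: str, value: str) -> bool:
    """Return True when the whole value fits the pattern."""
    return re.fullmatch(pattern, value) is not None

print(matches(STAFF_EMAIL, "jane.doe@university.edu"))  # True
print(matches(STAFF_EMAIL, "jane.doe@gmail.com"))       # False
print(matches(STAFF_EMAIL, "jane@universityxedu"))      # False
print(matches(UNESCAPED_DOT, "jane@universityxedu"))    # True: the loose dot lets it through
```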

Vehicle registration plate

A parking permit application asks for the applicant's vehicle registration. In the UK, the current format is two letters, two digits, a space, and three letters (e.g., AB12 CDE). A regex catches entries in clearly wrong formats, such as plates typed with dashes, plates with the space skipped, or plates that include the country identifier by mistake.
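A pattern for that format could be sketched as follows. It is an assumption built directly from the description above (two letters, two digits, a space, three letters) and does not attempt to cover older plate formats or rules about which letters are actually issued.

```python
import re

# Two letters, two digits, a space, three letters; uppercase only in this sketch.
UK_PLATE = r"^[A-Z]{2}\d{2} [A-Z]{3}$"

def looks_like_current_plate(value: str) -> bool:
    """Return True when the value has the shape of a current-format UK plate."""
    return re.fullmatch(UK_PLATE, value) is not None

print(looks_like_current_plate("AB12 CDE"))     # True
print(looks_like_current_plate("AB-12-CDE"))    # False: dashes
print(looks_like_current_plate("AB12CDE"))      # False: missing space
print(looks_like_current_plate("GB AB12 CDE"))  # False: country identifier included
```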

When is regex not enough? Or not what you are looking for?

Regex is good at checking shape, but cannot check meaning. That distinction matters more than it might seem.

  • Regex cannot verify that something actually exists: A pattern like ^[A-Z]{3}-\d{5}$ will accept XYZ-00001 even if that reference number doesn't exist in your systems. For that kind of lookup, you need a real-time API call or server-side validation, which form builders generally don't offer natively.
  • Regex cannot validate dates properly: You can use regex to check that a date looks like a date (^\d{2}/\d{2}/\d{4}$), but you cannot use it to check that the date is real. 31/02/2025 would pass that pattern even though February 31st doesn't exist. For dates, use a dedicated date field with range restrictions, not a text field with regex. Luckily, most form builder tools have validation settings built in for the date & time fields.
  • Regex cannot do conditional validation across fields: If you want a field to match one pattern when another field is set to "A" and a different pattern when it's set to "B," regex alone can't do that. For this, you need to create the same field twice with different regex validations and set up conditional logic.
  • Regex cannot validate files, images, or selections: It only works on text input. Dropdown answers, checkbox selections, file uploads, and signature fields cannot be validated with regex.
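The date limitation is easy to demonstrate. In the sketch below, the regex from the list above accepts 31/02/2025 because it has the right shape, while a real date parser rejects it. The function names are invented for this example, and Python's datetime module stands in for whatever date handling your form builder provides.

```python
import re
from datetime import datetime

DATE_SHAPE = r"^\d{2}/\d{2}/\d{4}$"

def looks_like_date(value: str) -> bool:
    """Shape check only: two digits, slash, two digits, slash, four digits."""
    return re.fullmatch(DATE_SHAPE, value) is not None

def is_real_date(value: str) -> bool:
    """Meaning check: the value must parse as an actual calendar date (DD/MM/YYYY)."""
    try:
        datetime.strptime(value, "%d/%m/%Y")
        return True
    except ValueError:
        return False

print(looks_like_date("31/02/2025"))  # True: the shape is fine
print(is_real_date("31/02/2025"))     # False: February has no 31st
print(is_real_date("28/02/2025"))     # True
```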

Frequently asked questions

Can regex validation be combined with other validation rules?

Yes, in most form builder tools. You can typically combine regex with required fields, character limits, and conditional logic. For example: the field is only required if a previous question was answered a certain way, and when it is required, the input must match your pattern.
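Conceptually, the combined rules run in sequence. The sketch below is one possible way to picture it in Python; the function, its parameters, the error messages, and the "existing customer" condition are all hypothetical and do not reflect how any particular form builder implements validation.

```python
import re
from typing import Optional

def validate(value: str, required: bool, max_length: int, pattern: str) -> Optional[str]:
    """Return an error message, or None when the value passes every rule."""
    if value == "":
        return "This field is required." if required else None
    if len(value) > max_length:
        return f"Use at most {max_length} characters."
    if re.fullmatch(pattern, value) is None:
        return "The value does not match the expected format."
    return None

# Conditional logic (hypothetical): the order number is required only
# when a previous question marked the respondent as an existing customer.
is_existing_customer = True
order_pattern = r"[A-Z]{3}-\d{5}"

print(validate("", is_existing_customer, 9, order_pattern))           # This field is required.
print(validate("RFG-00421", is_existing_customer, 9, order_pattern))  # None
print(validate("", False, 9, order_pattern))                          # None
```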

Can regex validate email addresses more strictly than the built-in email field?

Yes. The built-in email field in most form builders checks that the input looks like an email address (something@something.something). If you need stricter rules, such as accepting only emails from a specific domain or only corporate addresses, a regex pattern gives you that control. For example, ^[\w.]+@yourcompany\.com$ would only accept addresses from your company's domain.

What does ([a-zA-Z0-9_]) mean in regex?

It matches exactly one character that is either a letter (uppercase or lowercase), a digit, or an underscore. Breaking it down: the square brackets [] define a character class — a set of characters where any one of them is acceptable. Inside, a-z means any lowercase letter from a to z, A-Z means any uppercase letter, 0-9 means any digit, and _ is a literal underscore. The parentheses around it create a capture group, which lets you reference that matched character separately if needed. You'll often see this pattern used to validate usernames; it's a way of saying "only standard characters, no spaces, no symbols."
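A short Python sketch makes the "exactly one character" point concrete. The username check at the end adds a + quantifier, which is an addition for illustration and not part of the pattern in the question.

```python
import re

ONE_CHAR = r"([a-zA-Z0-9_])"

match = re.fullmatch(ONE_CHAR, "x")
print(match.group(1))                               # x (the captured character)
print(re.fullmatch(ONE_CHAR, "xy") is not None)     # False: two characters
print(re.fullmatch(ONE_CHAR, "-") is not None)      # False: not in the class

# With a + quantifier the class applies to one or more characters,
# which is the form usually used for usernames.
USERNAME = r"[a-zA-Z0-9_]+"
print(re.fullmatch(USERNAME, "jane_doe42") is not None)  # True
print(re.fullmatch(USERNAME, "jane doe") is not None)    # False: contains a space
```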

How do I check if a regex pattern is valid before using it in my form?

Regex101.com is the standard tool for checking regex patterns. Paste your pattern in, select the engine your form builder uses (usually JavaScript), and test it against a handful of sample inputs, both ones that should pass and ones that should fail. If a valid input gets rejected, or an invalid one slips through, you'll see it immediately and can adjust the pattern before it goes anywhere near a live form. It also explains what each part of your pattern does in plain English, which is useful if you inherited a pattern from someone else and aren't sure what it's actually checking.
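The same pass/fail approach can be sketched in a few lines of Python if you prefer to keep your sample inputs alongside the pattern. The function name and samples below are invented, and the check runs on Python's engine, so treat it as a complement to testing in the engine your form builder uses.

```python
import re

def check_pattern(pattern, should_pass, should_fail):
    """Return the samples whose result differs from what was expected."""
    surprises = []
    for sample in should_pass:
        if re.fullmatch(pattern, sample) is None:
            surprises.append(("rejected", sample))
    for sample in should_fail:
        if re.fullmatch(pattern, sample) is not None:
            surprises.append(("accepted", sample))
    return surprises

good_inputs = ["RFG-00421", "ABC-12345"]
bad_inputs = ["RFG-0042", "rfg-00421"]

print(check_pattern(r"[A-Z]{3}-\d{5}", good_inputs, bad_inputs))  # []
print(check_pattern(r"[A-Z]+-\d+", good_inputs, bad_inputs))      # [('accepted', 'RFG-0042')]
```

An empty list means every sample behaved as expected; the second call shows a pattern that is too loose, because it lets a four-digit number through.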

What is $& in regex?

`$&` is a replacement reference, not a matching pattern. It appears in the *replacement* part of a find-and-replace operation, and it means "insert the full matched text here." For example, if you matched the word `hello` and replaced it with `[$&]`, the result would be `[hello]`: the original match wrapped in brackets. In the context of form builders, you're unlikely to use `$&` in a validation pattern. Where it shows up is in tools that support reformatting, like Cognito Forms, where you can manipulate how matched input is restructured before it's displayed or stored.
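The `$&` notation is used by JavaScript and some other engines. Python's re module does not support it; the equivalent reference to the whole match there is `\g<0>`. The sketch below reproduces the bracket example using that Python form.

```python
import re

# \g<0> is Python's way of saying "insert the full matched text here".
result = re.sub(r"hello", r"[\g<0>]", "say hello")
print(result)  # say [hello]
```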

What does \s* mean in regex?

\s matches any whitespace character: a space, a tab, a newline, or a carriage return. The * after it means "zero or more times." So \s* matches any amount of whitespace, including none at all. It's commonly used to make a pattern more forgiving of spacing. For example, ^\s*hello\s*$ would match "hello", " hello", "hello ", and " hello ", that is, the word with or without leading or trailing spaces. In form validation, \s* is useful when you want to allow respondents to include extra spaces without failing validation, though it's worth thinking carefully about whether you want to allow spaces at all in structured fields like codes or IDs.
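Here is the same example as a quick Python sketch. Note that \s* only tolerates whitespace where you place it in the pattern: a space in the middle of the word still fails.

```python
import re

FORGIVING_HELLO = r"^\s*hello\s*$"

def accepts(value: str) -> bool:
    """Return True when the value is "hello" with optional surrounding whitespace."""
    return re.fullmatch(FORGIVING_HELLO, value) is not None

print(accepts("hello"))       # True
print(accepts(" hello"))      # True
print(accepts("hello "))      # True
print(accepts("  hello\t"))   # True: tabs count as whitespace too
print(accepts("hel lo"))      # False: whitespace inside the word is not allowed
```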

What does \\ mean in regex?

A single backslash \ is the escape character in regex: it changes how the engine reads the character that follows it, either giving an ordinary character a special meaning (like \d for digits or \s for whitespace) or making a special character literal (like \. for a dot). Two backslashes \\ match a literal backslash character in the input. This comes up in programming contexts more than in form validation: if you're writing a pattern in a language where strings also use the backslash as an escape character (like JavaScript or Python), you sometimes need to double up, so \\d in your code becomes \d by the time the regex engine sees it. If you're pasting a regex directly into a form builder's validation field, one backslash is usually enough; the form builder handles the interpretation for you.
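The doubling-up behavior can be seen in a short Python sketch. Python's raw strings (the r"..." prefix) are one way to avoid the extra layer of escaping; the file path used here is just a made-up sample input.

```python
import re

# In an ordinary string literal, "\\d" produces the two-character pattern \d.
# The raw string r"\d" produces exactly the same pattern.
print("\\d" == r"\d")  # True

# To match one literal backslash in the input, the regex itself needs two.
literal_backslash = r"\\"
print(re.fullmatch(literal_backslash, "\\") is not None)  # True: the input is a single backslash

# A sample input containing backslashes (printed form: C:\Users\jane).
sample_path = "C:\\Users\\jane"
print(re.search(r"C:\\Users", sample_path) is not None)   # True
```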