
Regex to String Generator

Generate matching strings from a regex pattern

By Bikram Nath

Paste a regular expression and get back sample strings the pattern would match. The most common use is seeding unit-test fixtures: feed `/^\d{4}-\d{2}-\d{2}$/` and you get strings like `2019-03-14` ready to paste into test files. Unlike regex101, which validates strings you already have, this tool works in reverse and constructs matching strings directly from the pattern structure.


What is Regex to String Generator?

A regex-to-string generator takes a pattern like `/[A-Z]{2}\d{5}/` and produces concrete strings, for example `AB12345` or `QR00891`, that the pattern accepts. It works by traversing the regex syntax tree: literals emit themselves, character classes pick a random member, and quantifiers repeat within their stated bounds.
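The traversal described above can be sketched in a few lines. This is a minimal illustration, not the tool's actual implementation: the node shapes (`literal`, `class`, `repeat`, `sequence`), the `generate` and `randomInt` functions, and the hand-built `plateTree` are all hypothetical names chosen for this example, and the tree is written out by hand instead of being produced by a real regex parser.

```javascript
// Hypothetical node shapes used only in this sketch (not any library's API):
//   { type: "literal",  value: "-" }
//   { type: "class",    chars: "ABC" }
//   { type: "repeat",   min: 2, max: 5, node: <node> }
//   { type: "sequence", nodes: [<node>, ...] }

// Pick an integer in [min, max] from a random source that returns values in [0, 1).
function randomInt(min, max, rng) {
  return min + Math.floor(rng() * (max - min + 1));
}

// Walk the tree and emit one string that the corresponding pattern accepts.
function generate(node, rng = Math.random) {
  switch (node.type) {
    case "literal":
      // Literals emit themselves.
      return node.value;
    case "class":
      // Character classes pick one random member.
      return node.chars[randomInt(0, node.chars.length - 1, rng)];
    case "repeat": {
      // Quantifiers repeat the inner node a random number of times within bounds.
      const count = randomInt(node.min, node.max, rng);
      let out = "";
      for (let i = 0; i < count; i++) out += generate(node.node, rng);
      return out;
    }
    case "sequence":
      return node.nodes.map((child) => generate(child, rng)).join("");
    default:
      throw new Error(`Unsupported node type: ${node.type}`);
  }
}

const UPPER = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
const DIGITS = "0123456789";

// Hand-built tree corresponding to /[A-Z]{2}\d{5}/
const plateTree = {
  type: "sequence",
  nodes: [
    { type: "repeat", min: 2, max: 2, node: { type: "class", chars: UPPER } },
    { type: "repeat", min: 5, max: 5, node: { type: "class", chars: DIGITS } },
  ],
};

const sample = generate(plateTree);
console.log(sample, /^[A-Z]{2}\d{5}$/.test(sample));
```

The `rng` parameter is there so a test can inject a deterministic random source when it needs the same fixture on every run.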

Reach for this when you need fixture data fast. regex101 tells you whether a string you wrote matches a pattern; this inverts that workflow and manufactures the strings instead. It also differs from Faker.js, which produces semantically meaningful data like real names or cities. Here the only guarantee is structural conformance to the pattern, which is exactly what boundary-condition tests need.

The main gotcha is unbounded quantifiers. A bare `+`, `*`, or open-ended `{n,}` has no upper bound written in the pattern, so the generator must pick an arbitrary cap, often somewhere between 10 and 100 repetitions depending on the implementation. That means `/a*/` might produce `aaaaaaaaaa` rather than an empty string, even though both are valid matches. If short outputs matter for your tests, replace `*` and `+` with explicit bounds like `{0,3}` and `{1,3}` before generating.
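If editing each pattern by hand is tedious, the rewrite can be automated. The sketch below is one possible approach, with `boundQuantifiers` a hypothetical helper name. It assumes the input is the pattern source without the surrounding slashes and that `max` is at least 1. It skips escaped characters and anything inside a character class, and it does not touch open-ended `{n,}` quantifiers.

```javascript
// Rewrite unbounded * and + quantifiers in a regex source string to explicit bounds.
// Assumption: `source` is the pattern text without slashes, and max >= 1.
function boundQuantifiers(source, max = 3) {
  let out = "";
  let inClass = false; // true while inside [...], where * and + are plain characters
  for (let i = 0; i < source.length; i++) {
    const ch = source[i];
    if (ch === "\\") {
      // Copy an escape sequence such as \* or \d untouched.
      out += ch + (source[i + 1] ?? "");
      i++;
    } else if (inClass) {
      if (ch === "]") inClass = false;
      out += ch;
    } else if (ch === "[") {
      inClass = true;
      out += ch;
    } else if (ch === "*") {
      out += `{0,${max}}`;
    } else if (ch === "+") {
      out += `{1,${max}}`;
    } else {
      out += ch;
    }
  }
  return out;
}

console.log(boundQuantifiers("ab*c+"));           // ab{0,3}c{1,3}
console.log(boundQuantifiers("\\d+-[*+]\\*", 4)); // \d{1,4}-[*+]\*
```

A lazy quantifier such as `a*?` becomes `a{0,3}?`, which is still a valid lazy quantifier, so the trailing `?` needs no special handling.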

When to use Regex to String Generator

Seed a unit-test suite with phone numbers or order IDs conforming to a pattern like `\+1-\d{3}-\d{3}-\d{4}` without writing each string by hand.
Verify that a new validation regex accepts all intended formats before wiring it into a form submission or API request handler.
Generate structured dummy records for database fixtures when a schema defines a column format as a regex constraint.

Frequently Asked Questions

Why does a pattern with `.*` produce such a long string?
Unbounded quantifiers like `*` and `+` have no maximum repetition count baked into the pattern, so the generator must choose a cap arbitrarily. Many implementations default to a cap somewhere between 10 and 100, which means `.*` can produce anything from an empty string to around a hundred characters depending on the library. To get shorter, predictable outputs, rewrite the quantifier with an explicit upper bound before generating, for example `{0,5}` instead of `*` or `{1,4}` instead of `+`.
Does the generator handle backreferences like `(\w+)\s\1`?
Most browser-based reverse-regex tools do not support backreferences. A backreference like `\1` requires the engine to remember what the first capture group matched and then emit that exact string again, which breaks the token-by-token generation model. If the tool encounters a backreference it typically either throws a parse error or ignores the constraint and emits an independent string for `\1`, producing output that would not satisfy the original pattern. Always validate generated strings against the original regex to catch this case.
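The validation step can be a small filter run over whatever the generator returns. The sketch below is illustrative. `partitionByPattern` is a hypothetical helper, and the two candidate strings are hand-written stand-ins for generator output. It wraps the pattern in `^(?:...)$` on the assumption that each generated string is meant to match the pattern as a whole.

```javascript
// Split candidate strings into those the original pattern accepts and those it rejects.
function partitionByPattern(pattern, candidates) {
  // Anchor the pattern so the whole string must match, not just a substring.
  // The g and y flags are dropped so that test() does not carry state between calls.
  const flags = pattern.flags.replace(/[gy]/g, "");
  const anchored = new RegExp(`^(?:${pattern.source})$`, flags);
  const accepted = [];
  const rejected = [];
  for (const s of candidates) {
    (anchored.test(s) ? accepted : rejected).push(s);
  }
  return { accepted, rejected };
}

// "hello hello" honors the backreference; "hello world" is the kind of string
// a generator that ignores \1 might emit.
const result = partitionByPattern(/(\w+)\s\1/, ["hello hello", "hello world"]);
console.log(result.accepted); // [ 'hello hello' ]
console.log(result.rejected); // [ 'hello world' ]
```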
Does it support Unicode property escapes like `\p{Lu}` for uppercase letters?
Unicode property escapes were standardized in ECMAScript 2018 and, when the pattern carries the `u` flag (or the newer `v` flag), parse correctly in current V8, SpiderMonkey, and JavaScriptCore. Whether the generator expands them depends on the underlying library. Libraries that enumerate category members can handle `\p{Letter}` or `\p{Lu}`, but doing so is expensive because a single category can contain thousands of code points. If the output contains a literal `p` followed by `{Lu}`, the library is not expanding the property and you need to substitute an explicit character range like `[A-Z]` before generating.
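To see what enumerating category members involves, the sketch below asks the JavaScript engine itself which code points a property matches. The `expandProperty` name and its default range (U+0000 to U+024F, the Latin blocks) are assumptions made for this example. A real library would have to cover a far wider range, which is where the cost comes from.

```javascript
// Collect every character in [start, end] that a Unicode property escape matches.
// `property` is the text inside \p{...}, for example "Lu".
function expandProperty(property, start = 0x0000, end = 0x024f) {
  // The u flag is required for \p{...} to be treated as a property escape.
  const matcher = new RegExp(`^\\p{${property}}$`, "u");
  const members = [];
  for (let cp = start; cp <= end; cp++) {
    const ch = String.fromCodePoint(cp);
    if (matcher.test(ch)) members.push(ch);
  }
  return members;
}

const upper = expandProperty("Lu");
console.log(upper.length, upper.slice(0, 5).join(""));
```

Sampling a random element of `upper` then behaves like sampling from any other character class.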
Why do the generated email-like strings fail real validation even though they match the regex?
A regex enforces structural rules, not semantic ones. A pattern like `/[\w.]+@[\w.]+\.[a-z]{2,}/` accepts `...@1.zz`, which is structurally conformant but not a deliverable address, just as readily as it accepts `a@test.io`. The generator produces strings guaranteed to satisfy the pattern's syntax tree and nothing beyond that. If you need strings that also pass RFC 5322 compliance or mail-server checks, you need a more restrictive pattern or a semantically aware generator like Faker.js with its `internet.email()` method.
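A quick check makes the gap concrete. In the sketch below the pattern is the one from the answer with `^` and `$` anchors added (an assumption, so the whole string is tested), and the third sample string is a made-up negative case.

```javascript
// The email-like pattern from the text, anchored so the whole string must match.
const emailLike = /^[\w.]+@[\w.]+\.[a-z]{2,}$/;

for (const s of ["...@1.zz", "a@test.io", "no-at-sign.example"]) {
  console.log(s, emailLike.test(s));
}
// ...@1.zz true
// a@test.io true
// no-at-sign.example false
```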
Are the generated strings random, or does the tool always produce the same output for a given pattern?
Character classes are sampled randomly from their members on each run, so generating from `/[a-z]{3}/` twice usually produces different three-letter strings. Literals and anchors are always deterministic. The count for bounded quantifiers like `{2,5}` is typically chosen randomly within the stated range. This means repeated generations give you a distribution of valid strings rather than one fixed example, which is useful for surfacing edge cases that a single hand-written fixture might never hit.
