Your Java Regex Can Be Weaponized (And How To Stop It)
Source: Dev.to
Most developers don’t realize their input validation is a denial‑of‑service vulnerability waiting to happen. Let me show you what I mean.
String regex = "^([a-zA-Z0-9]+)+@[a-zA-Z0-9]+\\.[a-zA-Z]{2,}$";
Pattern.compile(regex).matcher(input).matches();
Looks fine, right? Now feed it this input:
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa!
Your CPU just pegged at 100 % and stayed there. This is called ReDoS (Regular Expression Denial of Service) and it happens because Java’s regex engine uses backtracking. Certain patterns cause exponential time complexity when matching fails in specific ways.
Attackers know about this. They send crafted inputs to your validation endpoints and watch your servers melt.
The Backtracking Problem
Java’s java.util.regex uses an NFA (Nondeterministic Finite Automaton) with backtracking. When a match fails partway through, the engine backtracks and tries alternative paths. With nested quantifiers like ([a-zA-Z0-9]+)+ the number of paths explodes exponentially.
The input aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa! has 33 characters. The regex engine will try somewhere around 2³³ combinations before giving up—about 8 billion operations for a single validation call.
The Fix
There’s a different kind of regex engine called RE2 that Google built. It uses a DFA (Deterministic Finite Automaton) which guarantees linear‑time matching. No matter how evil the pattern or input, it finishes in O(n) time where n is the input length.
I’ve been working on a Java validation library called Rules that bundles a patched fork of RE2J (the Java port of RE2). The fork includes fixes for some vulnerabilities found in the original implementation.
Rule email = StringRules.matches(
"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}$"
);
ValidationResult result = email.validate(userInput);
if (result.isValid()) {
// safe to use
}
Same pattern, but now it can’t be weaponized. The underlying RE2J engine simply won’t allow catastrophic backtracking because it doesn’t backtrack at all.
What Else Can Go Wrong
HashDoS
Attackers can craft keys that all hash to the same bucket, turning your O(1) HashMap lookups into O(n). The library includes SecureHashMap, which uses SipHash‑2‑4 with random keys to prevent this.
Timing Attacks
Comparing secrets with equals() leaks information through timing differences. Character‑by‑character comparison bails early on mismatch, allowing attackers to guess passwords one character at a time. The library provides constant‑time comparison functions.
Stack Overflow via Recursion
Self‑referential data structures can blow your stack during validation. The library tracks depth and detects cycles.
Quick Example
Validating a user registration with multiple rules composed together:
Rule username = Rules.all(
StringRules.notBlank(),
StringRules.lengthBetween(3, 20),
StringRules.matches("^[a-zA-Z0-9_]+$")
);
Rule password = Rules.all(
StringRules.minLength(12),
StringRules.matches("[A-Z]"),
StringRules.matches("[a-z]"),
StringRules.matches("[0-9]")
);
username.validate("user_dev").isValid(); // true
password.validate("SecurePass123").isValid(); // true
Getting It
Maven
io.github.xoifaii
rules
1.0.0
Or clone from the GitHub repository if you want to poke around. Full API docs are in the wiki. The RE2J fork is bundled and has zero external dependencies.
If you have a public‑facing Java app doing input validation with regex, it’s worth checking whether your patterns are vulnerable. Tools like recheck can analyze patterns for ReDoS, which is very useful.
Or just use an engine that makes the problem impossible in the first place.