Selenium keeps getting blocked by Cloudflare? Here's what the fingerprint actually catches (and how to stop triggering it)
Source: Dev.to
The actual thing Cloudflare catches
Selenium’s ChromeDriver leaks the automation flag in at least three observable ways:
- `navigator.webdriver === true` — exposed by design; the WebDriver spec requires it
- CDP client signature — ChromeDriver wraps Chrome’s DevTools Protocol with a specific RPC pattern that’s detectable via the timing and order of `Target.*` calls
- Missing browser UI signals — Selenium launches Chrome without certain accessibility/window events that real users always generate
One of the top comments on the Reddit thread summarized it well:
“Selenium operates using a ChromeDriver or a GeckoDriver binary, which any respectable company that doesn’t want bots on its website can fingerprint. That doesn’t mean Selenium is broken — it just means it was not made for what you’re trying to do. Selenium’s purpose is automated testing.”
Selenium was designed for QA, where you want the site to know you’re an automated test. Cloudflare’s Bot Management scores those same signals against a human baseline, and the score drops quickly.
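Cloudflare’s actual scoring model is proprietary, but as a toy Python illustration, a detector combining the three leak surfaces above might weight them like this (the weights and function name are made up for the sketch):

```python
# Toy illustration only: Cloudflare's real Bot Management model is proprietary.
# This sketches how the three leak surfaces above could combine into a score,
# where each boolean signal pushes the session further from a human baseline.

def bot_score(webdriver_flag: bool, cdp_signature: bool, missing_ui_events: bool) -> int:
    """Return a 0-100 'humanness' score; lower means more bot-like."""
    score = 100
    if webdriver_flag:        # navigator.webdriver === true
        score -= 50
    if cdp_signature:         # ChromeDriver's distinctive Target.* call pattern
        score -= 30
    if missing_ui_events:     # no accessibility/window events a real user generates
        score -= 20
    return max(score, 0)

# A stock Selenium session trips all three signals:
print(bot_score(True, True, True))    # 0: blocked outright
# A patched driver may still leak the CDP call pattern:
print(bot_score(False, True, False))  # 70
```

The point of the sketch: fixing only `navigator.webdriver` (what most stealth patches do first) still leaves a degraded score from the other two surfaces.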
What the comments recommend (and what actually works)
| Tool | What it does | Catch |
|---|---|---|
| `undetected-chromedriver` | Patches the WebDriver flag + CDP strings | Cloudflare pushes updates that re-detect it every few months |
| SeleniumBase CDP mode | Skips ChromeDriver, talks CDP directly to Chrome | Works on most CF sites; still one process per browser |
| `curl_cffi` | Impersonates a browser’s TLS JA3 fingerprint | No JS execution — breaks on sites that hydrate with React |
| `nodriver` / `zendriver` | Headful (non-headless) Chrome with patched CDP | Good at low scale; resource-heavy at 1 M pages |
| Real Chrome + stealth profile | Actual Chrome binary, persistent profile, cookies survive | What most anti-bot services assume |
The last row is what I’ll show below — and it’s what I’ve been using.
The actual result
The two captures are from the same browser process, same machine, same IP. The only variable was the fingerprint configuration.
How I’m doing it now
I’ve been using browser-act — a CLI that drives a real Chrome with a persistent stealth profile. One command:
```shell
# Install (uses the skills package registry):
npx skills add browser-act/skills --skill browser-act

# Open a Cloudflare-protected page in a stealth session:
browser-act --session scrape browser open https://target.site
browser-act --session scrape get markdown > out.md
```
The profile persists cookies and storage between runs, so the “warm browser” signals (history, localStorage, prior CF cookies) look human. For the r/webscraping OP’s scale question (~1 M pages), you’d run this with a pool of profile IDs and rotate — but that’s a separate post.
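The rotation details are deferred to that separate post, but a minimal sketch of the idea, assuming only the CLI usage shown above (the pool size and `scrape-N` profile names are illustrative; a real run would shell these out via `subprocess.run`):

```python
# Minimal sketch of rotating browser-act stealth profiles across a URL batch.
# Assumes the CLI usage shown above; profile names and pool size are
# illustrative, and a real run would execute each command via subprocess.run.
from itertools import cycle

PROFILES = [f"scrape-{i}" for i in range(4)]  # pool of persistent profile IDs

def plan_commands(urls):
    """Pair each URL with the next profile in round-robin order."""
    pool = cycle(PROFILES)
    return [
        f"browser-act --session {next(pool)} browser open {url}"
        for url in urls
    ]

cmds = plan_commands(["https://target.site/a", "https://target.site/b"])
print(cmds[0])  # browser-act --session scrape-0 browser open https://target.site/a
```

Because each profile persists its own cookies and storage, round-robin keeps every profile “warm” instead of burning one session on all 1 M pages.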
Things worth arguing about
- If your target is Cloudflare Turnstile specifically (not the full JS challenge), you’re in a different regime — `curl_cffi` + an injected widget can work, as one of the r/webscraping replies showed.
- `undetected-chromedriver` is the cheapest entry point if you already have Selenium code and low volume.
- Residential proxies matter almost as much as the browser fingerprint. If your IP is on a datacenter ASN, nothing in the browser layer saves you.
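To make the ASN point concrete, here is a toy Python sketch; the lookup table is a made-up stand-in for a real IP-intelligence database (MaxMind, ipinfo), and the function name is hypothetical:

```python
# Toy illustration of the ASN point above: anti-bot vendors classify the
# requesting IP's autonomous system before the browser layer is even scored.
# The table below is a made-up stand-in for a real database (MaxMind, ipinfo).

DATACENTER_ASNS = {16509: "Amazon AWS", 14061: "DigitalOcean", 15169: "Google"}

def ip_reputation(asn: int) -> str:
    """Classify an ASN as datacenter (instant suspicion) or residential."""
    if asn in DATACENTER_ASNS:
        return f"datacenter ({DATACENTER_ASNS[asn]}): high risk"
    return "residential/ISP: passes the network layer"

print(ip_reputation(16509))  # flagged before any fingerprint check runs
print(ip_reputation(7922))   # AS7922 is Comcast, a residential ISP
```

This is why a perfect browser fingerprint on an EC2 box still gets challenged: the network-layer verdict arrives first.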
If you’re fighting this problem right now, I’d love to hear what site you’re on and what’s been rejected — happy to compare notes. The full discussion is at the r/webscraping original thread.