Why Your Selenium Tests Are Flaky (And How to Fix Them Forever)

Published: December 15, 2025 at 08:08 AM EST

4 min read

Source: Dev.to

What This Article Covers

The Flakiness Problem – Why time.sleep() and WebDriverWait aren’t enough
What Causes Flaky Tests – Racing against UI state changes
The Stability Solution – Monitoring DOM, network, animations, and layout shifts
One‑Line Integration – Wrap your driver with stabilize() — zero test rewrites
Full Diagnostics – Know exactly why tests are blocked

If you’ve worked with Selenium for more than a week, you’ve probably written code like this:

driver.get("https://myapp.com/dashboard")
time.sleep(2)  # Wait for page to load
driver.find_element(By.ID, "submit-btn").click()
time.sleep(1)  # Wait for AJAX

You may feel the shame of knowing it’s wrong, but also the relief of “it works.” Until it doesn’t: the CI server is 10% slower than your machine, and suddenly your tests fail 20% of the time.

This is the story of flaky tests, why they happen, and how a library called waitless can eliminate them.

The Flakiness Problem

Consider a real scenario: a React dashboard where a user clicks a button, an API call is made, data returns, React re‑renders, a spinner disappears, and a table appears. The whole sequence takes about 400 ms, but the test does this:

button = driver.find_element(By.ID, "load-data")
button.click()
table = driver.find_element(By.ID, "data-table")  # 💥 BOOM

The table doesn’t exist yet; Selenium throws NoSuchElementException. The quick “fix” is often:

button.click()
time.sleep(2)
table = driver.find_element(By.ID, "data-table")  # Works… usually

The Problem with `time.sleep()`

Adds unnecessary delay (a 2-second sleep wastes about 1.6 seconds when the UI is ready after 400 ms)
Remains flaky when the API takes longer than expected
Provides no insight when a failure occurs

Why Traditional Solutions Don’t Work

`time.sleep()` — The Naïve Approach

Sleep for a fixed duration and hope the UI is ready.

Problems: Too short → test fails; Too long → test suite drags; No feedback on actual state.
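To put numbers on that trade-off, here is a small sketch that scores a fixed sleep against a set of observed UI latencies. The helper `evaluate_fixed_sleep` and the latency values are hypothetical, chosen only to illustrate the point; they are not measurements from a real suite.

```python
def evaluate_fixed_sleep(sleep_s, latencies_s):
    """Return (failures, wasted_seconds) for a fixed sleep over observed UI latencies."""
    # A run fails when the UI needed longer than the sleep allowed.
    failures = sum(1 for t in latencies_s if t > sleep_s)
    # A run wastes whatever part of the sleep was not actually needed.
    wasted = sum(sleep_s - t for t in latencies_s if t <= sleep_s)
    return failures, wasted


# Hypothetical latencies (seconds) for the same click: four fast runs, one slow CI run.
latencies = [0.4, 0.4, 0.5, 0.6, 2.3]

for sleep_s in (0.5, 2.0, 5.0):
    failures, wasted = evaluate_fixed_sleep(sleep_s, latencies)
    print(f"sleep({sleep_s}): {failures} failure(s), {wasted:.1f} s wasted")
```

No single value wins: the shortest sleep fails most often, and the longest one passes every run only by wasting the most time.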

`WebDriverWait` — The “Correct” Approach

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

WebDriverWait(driver, 10).until(
    EC.element_to_be_clickable((By.ID, "submit-btn"))
)

This waits for a specific condition, but it only checks one element. It ignores:

Overlays from ongoing animations
Pending AJAX requests
React re‑renders that move elements
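One partial workaround is a page-level wait condition, since `WebDriverWait.until()` accepts any callable that takes the driver. The sketch below is an assumption about how such a condition might look, not something from waitless: the name `page_is_quiet` is made up, and it covers only document load state and running animations, so pending AJAX requests and re-renders are still missed.

```python
# JavaScript evaluated in the page. document.readyState and
# document.getAnimations() are standard browser APIs.
PAGE_QUIET_JS = """
return document.readyState === 'complete' &&
       document.getAnimations().every(a => a.playState !== 'running');
"""


def page_is_quiet(driver):
    """Hypothetical wait condition: document loaded and no animation running."""
    return bool(driver.execute_script(PAGE_QUIET_JS))


# Usage with a real driver (assumes Selenium is installed):
#   WebDriverWait(driver, 10).until(page_is_quiet)
#   WebDriverWait(driver, 10).until(
#       EC.element_to_be_clickable((By.ID, "submit-btn")))
```

Every test would have to chain this before each interaction, and a page with an infinite spinner animation would never count as quiet.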

Retry Decorators — The Denial Approach

@retry(tries=3, delay=1)
def test_dashboard():
    driver.find_element(By.ID, "submit-btn").click()

Retries merely hide flakiness; they don’t solve it.
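A little arithmetic shows how well retries hide the problem. The sketch assumes each attempt fails independently with the same probability, which is a simplification; the 20% figure comes from the introduction, and the suite size of 50 is hypothetical.

```python
def failure_rate_with_retries(p_fail, tries):
    """Probability that every one of `tries` independent attempts fails."""
    return p_fail ** tries


def suite_pass_rate(p_test_fail, n_tests):
    """Probability that all `n_tests` independent tests pass."""
    return (1 - p_test_fail) ** n_tests


per_test = failure_rate_with_retries(0.20, tries=3)  # 0.2 ** 3
suite = suite_pass_rate(per_test, n_tests=50)

print(f"per-test failure after retries: {per_test:.3%}")
print(f"chance a 50-test suite is green: {suite:.1%}")
```

Under these assumptions a test that fails one run in five looks almost healthy after three tries (under 1% failure), yet a suite of 50 such tests still goes red on roughly one run in three. The underlying race is untouched.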

What Actually Causes Flaky Tests?

Across hundreds of debugged flaky tests, the root cause is almost always the same: the test is racing against the UI.

What You Do	What’s Actually Happening
Click a button	DOM is being mutated by the framework
Assert text content	AJAX response still in flight
Interact with modal	CSS transition still animating
Click navigation link	Layout shift moves the element

The real question isn’t “Is this element clickable?” but “Is the entire page stable and ready for interaction?”

Defining “Stability”

Four key signals indicate a stable UI:

DOM Stability – No elements are being added, removed, or modified.
Detection: MutationObserver watching the document root; track time since the last mutation.
Network Idle – All AJAX requests have completed.
Detection: Intercept fetch() and XMLHttpRequest; count pending requests.
Animation Complete – All CSS animations and transitions have finished.
Detection: Listen for animationstart, animationend, transitionstart, transitionend events.
Layout Stable – Elements have stopped moving; no more layout shifts.
Detection: Track bounding‑box positions of interactive elements over time.
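The four signals can be combined into a single decision. The sketch below is an illustration under assumed names (`StabilitySnapshot`, `blocking_reasons`, and the pixel tolerance are hypothetical); only the 100 ms quiet period mirrors the instrumentation snippet in this article.

```python
from dataclasses import dataclass


@dataclass
class StabilitySnapshot:
    """One reading of the four signals (field names are illustrative)."""
    pending_requests: int
    ms_since_last_mutation: float
    active_animations: int
    max_layout_shift_px: float  # largest movement of any tracked element


def blocking_reasons(snap, quiet_ms=100, shift_tolerance_px=1.0):
    """Return human-readable reasons the page is not yet stable."""
    reasons = []
    if snap.ms_since_last_mutation < quiet_ms:
        reasons.append(f"DOM mutated {snap.ms_since_last_mutation:.0f} ms ago")
    if snap.pending_requests > 0:
        reasons.append(f"{snap.pending_requests} request(s) in flight")
    if snap.active_animations > 0:
        reasons.append(f"{snap.active_animations} animation(s) running")
    if snap.max_layout_shift_px > shift_tolerance_px:
        reasons.append(f"layout moved {snap.max_layout_shift_px:.1f} px")
    return reasons


busy = StabilitySnapshot(pending_requests=1, ms_since_last_mutation=40,
                         active_animations=0, max_layout_shift_px=0.0)
idle = StabilitySnapshot(pending_requests=0, ms_since_last_mutation=350,
                         active_animations=0, max_layout_shift_px=0.0)

print(blocking_reasons(busy))
print(blocking_reasons(idle))  # an empty list means the page is stable
```

Returning reasons rather than a bare boolean also gives a timeout message something useful to report.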

The Architecture

JavaScript Instrumentation (runs in the browser)

window.__waitless__ = {
    pendingRequests: 0,
    lastMutationTime: Date.now(),
    activeAnimations: 0,

    isStable() {
        if (this.pendingRequests > 0) return false;
        if (Date.now() - this.lastMutationTime < 100) return false;
        if (this.activeAnimations > 0) return false;
        // Layout-shift checks would be added here
        return true;
    }
};

The script is injected via execute_script() and monitors DOM mutations, network activity, and animations.
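For illustration, here is one way such a script could be wired up and injected from Python. This is a sketch under assumptions, not the script waitless actually ships: it tracks only `fetch()` (no `XMLHttpRequest`), skips layout tracking, and the names `INSTRUMENTATION_JS` and `inject_instrumentation` are made up.

```python
INSTRUMENTATION_JS = """
if (!window.__waitless__) {
    const state = window.__waitless__ = {
        pendingRequests: 0,
        lastMutationTime: Date.now(),
        activeAnimations: 0,
        isStable() {
            return this.pendingRequests === 0 &&
                   this.activeAnimations === 0 &&
                   Date.now() - this.lastMutationTime >= 100;
        }
    };

    // DOM: record the time of every mutation under the document root.
    new MutationObserver(() => { state.lastMutationTime = Date.now(); })
        .observe(document.documentElement,
                 {childList: true, subtree: true,
                  attributes: true, characterData: true});

    // Network: count fetch() calls that have not settled yet.
    const originalFetch = window.fetch;
    window.fetch = function (...args) {
        state.pendingRequests += 1;
        return originalFetch.apply(this, args).finally(() => {
            state.pendingRequests -= 1;
        });
    };

    // Animations: count starts, subtract ends and cancellations.
    const begin = () => { state.activeAnimations += 1; };
    const end = () => {
        state.activeAnimations = Math.max(0, state.activeAnimations - 1);
    };
    for (const name of ['animationstart', 'transitionstart']) {
        document.addEventListener(name, begin, true);
    }
    for (const name of ['animationend', 'animationcancel',
                        'transitionend', 'transitioncancel']) {
        document.addEventListener(name, end, true);
    }
}
"""


def inject_instrumentation(driver):
    """Install the sketch into the current page; a repeat call is a no-op."""
    driver.execute_script(INSTRUMENTATION_JS)
```

A navigation replaces `window`, so a script injected this way has to be installed again after every page load.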

Python Engine (evaluates stability)

class StabilizationEngine:
    def wait_for_stability(self):
        """Wait until all stability signals are satisfied."""
        # Checks performed automatically:
        # ✓ DOM mutations have settled
        # ✓ Network requests completed
        # ✓ Animations finished
        # ✓ Layout is stable

The engine repeatedly queries the browser state until isStable() returns True.
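That polling loop might look roughly like the following. It is a minimal sketch: the function signature, timeout, and poll interval are assumptions, and it relies only on the `window.__waitless__.isStable()` hook from the instrumentation snippet above.

```python
import time

# Evaluates to false when the instrumentation is missing or the page is busy.
STABLE_QUERY = (
    "return Boolean(window.__waitless__ && window.__waitless__.isStable());"
)


def wait_for_stability(driver, timeout=10.0, poll_interval=0.05):
    """Poll the page until it reports stable; raise TimeoutError otherwise."""
    deadline = time.monotonic() + timeout
    while True:
        if driver.execute_script(STABLE_QUERY):
            return
        if time.monotonic() >= deadline:
            raise TimeoutError(f"page not stable after {timeout} s")
        time.sleep(poll_interval)
```

Unlike a fixed sleep, the loop returns as soon as the page settles and fails loudly when it never does.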

The Magic: One‑Line Integration

Zero test modifications are required. Add a single line to wrap the driver:

from selenium import webdriver
from selenium.webdriver.common.by import By
from waitless import stabilize

driver = webdriver.Chrome()
driver = stabilize(driver)  # ← Only change needed

# Existing tests work unchanged
driver.find_element(By.ID, "button").click()  # Auto‑waits!

stabilize() returns a StabilizedWebDriver that intercepts find_element() calls. Returned elements are wrapped in StabilizedWebElement, whose click() method first waits for stability:

class StabilizedWebElement:
    def click(self):
        self._engine.wait_for_stability()  # Auto‑wait!
        return self._element.click()      # Then click

Your tests no longer know they’re waiting—they just stop failing.
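The wrapping itself is a plain delegation pattern. The sketch below shows the idea with hypothetical `SketchDriver` and `SketchElement` classes; it is not the waitless source, and the constructor arguments are assumptions.

```python
class SketchElement:
    """Wraps an element so that click() waits for stability first."""

    def __init__(self, element, wait_for_stability):
        self._element = element
        self._wait = wait_for_stability

    def click(self):
        self._wait()                   # block until the page is stable
        return self._element.click()   # then perform the real click

    def __getattr__(self, name):
        # Anything not overridden is forwarded to the real element.
        return getattr(self._element, name)


class SketchDriver:
    """Wraps a driver so that find_element() returns SketchElement objects."""

    def __init__(self, driver, wait_for_stability):
        self._driver = driver
        self._wait = wait_for_stability

    def find_element(self, *args, **kwargs):
        found = self._driver.find_element(*args, **kwargs)
        return SketchElement(found, self._wait)

    def __getattr__(self, name):
        # get(), title, quit(), and so on pass straight through.
        return getattr(self._driver, name)
```

Because unknown attributes fall through to the wrapped object, existing test code keeps calling the same methods and only the intercepted ones gain a wait.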

Handling Edge Cases

Real applications have perpetual activity (spinners, analytics polling, WebSocket heartbeats). waitless offers configurable thresholds.

Example: Ignoring Infinite Animations

from waitless import StabilizationConfig, stabilize

config = StabilizationConfig(
    network_idle_threshold=2,      # Allow up to 2 pending requests
    animation_detection=False,    # Ignore spinners/continuous animations
    strictness='relaxed'          # Only check DOM mutations
)

driver = stabilize(driver, config=config)

You can tailor the detection to suit your app’s behavior without rewriting tests.
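To see why those thresholds matter, consider how a configurable check could treat a page that always has a spinner and a background poll. This is a sketch of the idea only: `SketchConfig`, `is_stable`, and `dom_quiet_ms` are hypothetical, the two option names are borrowed from the example above, and how waitless combines them internally is not shown in this article.

```python
from dataclasses import dataclass


@dataclass
class SketchConfig:
    network_idle_threshold: int = 0   # pending requests tolerated
    animation_detection: bool = True  # whether animations block stability
    dom_quiet_ms: int = 100           # hypothetical quiet period after a mutation


def is_stable(state, config):
    """Evaluate one state reading against the configured thresholds."""
    if state["ms_since_last_mutation"] < config.dom_quiet_ms:
        return False
    if state["pending_requests"] > config.network_idle_threshold:
        return False
    if config.animation_detection and state["active_animations"] > 0:
        return False
    return True


# A page with a permanent spinner and one analytics poll in flight.
state = {"ms_since_last_mutation": 400,
         "pending_requests": 1,
         "active_animations": 1}

strict = SketchConfig()
relaxed = SketchConfig(network_idle_threshold=2, animation_detection=False)

print(is_stable(state, strict))
print(is_stable(state, relaxed))
```

With strict defaults this page would never be considered stable, so every wait would end in a timeout; the relaxed settings let the same reading pass.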
