Adaptive Rate Limiting with Redis and Lua

Published: April 28, 2026 at 01:36 PM EDT
4 min read
Source: Dev.to

Making Rate Limiting Correct Under Concurrency

Most rate limiting tutorials stop at the single‑instance case. That’s fine for learning, but it breaks quickly in production. Once you have multiple instances and real traffic patterns, the problem changes. It’s no longer just about picking an algorithm — it’s about correctness under concurrency. This article walks through what actually goes wrong and how to fix it.

The In‑Memory Trap

The first implementation most people write looks like this:

  • keep a counter in memory
  • increment on each request
  • reject when the limit is reached

This works perfectly in a single instance. Deploy two instances and each has its own counter, so a client can exceed your intended limit simply by hitting different instances. At that point, you don’t have a rate limiter anymore.
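
The naive version can be sketched in a few lines of TypeScript (a minimal in-process sketch; the names `allow`, `counters`, and `LIMIT` are illustrative, not from a specific library):

```typescript
// Naive in-memory rate limiter: works only within a single process.
const counters = new Map<string, number>();
const LIMIT = 100;

function allow(clientId: string): boolean {
  const count = counters.get(clientId) ?? 0; // current count for this client
  if (count >= LIMIT) return false;          // reject once the limit is hit
  counters.set(clientId, count + 1);         // otherwise count this request
  return true;
}

// A second instance has its own `counters` map, so a client that spreads
// requests across instances can exceed LIMIT overall.
```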

Redis Fixes Distribution, Not Concurrency

Moving state to Redis makes all instances share the same counters. A typical implementation looks like this:

  1. Read current count from Redis
  2. Check against limit
  3. Increment and write back

These are separate operations. Under concurrent load:

  • two requests read the same value
  • both pass the check
  • both increment

Now your limit is only approximate.
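
The interleaving can be demonstrated without a real Redis. In this sketch, `FakeRedis` is a hypothetical stand-in that mimics the async read and write steps; fired concurrently, both requests read the counter before either writes it back:

```typescript
// Simulated race on the non-atomic read -> check -> write sequence.
// FakeRedis stands in for a real Redis client (assumption: no network,
// but the same async interleaving between steps).
class FakeRedis {
  private data = new Map<string, number>();
  async get(key: string): Promise<number> {
    return this.data.get(key) ?? 0;
  }
  async set(key: string, value: number): Promise<void> {
    this.data.set(key, value);
  }
}

const LIMIT = 1;

async function allowRacy(store: FakeRedis, key: string): Promise<boolean> {
  const current = await store.get(key); // step 1: read
  if (current >= LIMIT) return false;   // step 2: check
  await store.set(key, current + 1);    // step 3: increment and write back
  return true;
}

async function demo(): Promise<boolean[]> {
  const store = new FakeRedis();
  // Both requests read the counter before either writes it back,
  // so both pass the check even though LIMIT is 1.
  return Promise.all([
    allowRacy(store, "client:1"),
    allowRacy(store, "client:1"),
  ]);
}
```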

The Real Problem: Atomicity

The issue isn’t Redis; it’s that the decision is split across multiple steps. What you need is a single, atomic operation that reads state, applies logic, and updates state.

The Fix: Lua Scripts in Redis

Redis supports Lua scripts that execute atomically—no other command runs between the start and end of the script. Instead of multiple round trips (read → apply logic → update → return decision), you do everything inside one script.

-- KEYS[1] = counter key, ARGV[1] = limit, ARGV[2] = window in seconds
local current = tonumber(redis.call("GET", KEYS[1])) or 0
if current >= tonumber(ARGV[1]) then
  return {0, current}
end

current = redis.call("INCR", KEYS[1])
-- Set the expiry only on the first request in the window,
-- so later requests don't keep extending it
if current == 1 then
  redis.call("EXPIRE", KEYS[1], ARGV[2])
end

return {1, current}

This ensures:

  • no race conditions
  • consistent decisions across instances
  • predictable behavior under load
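
By contrast with the racy sketch, when read, check, and increment run as one uninterruptible step (simulated here with a synchronous method standing in for the server-side Lua script; no real Redis involved), only the allowed number of requests pass:

```typescript
// Atomic counterpart: the whole decision happens in one synchronous
// step, mirroring what the Lua script does inside Redis.
class AtomicStore {
  private data = new Map<string, number>();

  // Read, check, and increment with nothing interleaved in between
  checkAndIncr(key: string, limit: number): boolean {
    const current = this.data.get(key) ?? 0;
    if (current >= limit) return false;
    this.data.set(key, current + 1);
    return true;
  }
}

const store = new AtomicStore();
const results = [1, 2, 3].map(() => store.checkAndIncr("client:1", 2));
// With a limit of 2, exactly two requests pass and the third is rejected.
```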

Where Algorithms Fit In

At this point you can plug in different strategies:

  • Token Bucket → allows bursts, smooths over time
  • Sliding Window → more accurate but heavier
  • Leaky Bucket → enforces steady flow

The key point: the algorithm matters less than where the decision happens. If your logic isn’t atomic, the algorithm won’t save you.
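
As one illustration, a token bucket fits in a few lines. This is an in-process sketch with invented class and parameter names; in the Redis setup above, the same logic would live inside the Lua script:

```typescript
// Minimal token bucket sketch: allows bursts up to `capacity`,
// then smooths to `refillPerSec` requests per second.
class TokenBucket {
  private tokens: number;
  private lastRefill: number;

  constructor(
    private capacity: number,     // max burst size
    private refillPerSec: number, // steady refill rate
    now: number = Date.now(),
  ) {
    this.tokens = capacity;
    this.lastRefill = now;
  }

  allow(now: number = Date.now()): boolean {
    // Refill proportionally to elapsed time, capped at capacity
    const elapsedSec = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSec);
    this.lastRefill = now;
    if (this.tokens >= 1) {
      this.tokens -= 1; // spend one token for this request
      return true;
    }
    return false;
  }
}
```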

Static Limits Miss Real Traffic Behavior

Even with correct enforcement, static limits are too rigid. Real traffic includes:

  • legitimate bursts
  • scrapers probing endpoints
  • repeated identical requests
  • denial loops

A fixed limit treats all of these the same.

Adding a Behavior Layer

A simple improvement is to track short‑term behavior:

  • request volume over a short window (burst detection)
  • repeated request fingerprints
  • number of unique routes hit (scan detection)
  • repeated denials

This produces a basic risk score that maps to tiers:

  • normal
  • elevated
  • suspicious
  • blocked

The important separation:

  • Limiter → enforces limits
  • Policy → decides how strict to be

This keeps the system easier to reason about and tune.
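
A minimal sketch of the policy side, assuming invented signal weights and tier cutoffs (none of these numbers come from the article):

```typescript
// Hypothetical behavior layer: score short-term signals, map the score
// to a tier, and let the tier scale the limiter's strictness.
type Tier = "normal" | "elevated" | "suspicious" | "blocked";

interface Signals {
  burstRequests: number;        // volume in a short window
  repeatedFingerprints: number; // identical request shapes
  uniqueRoutes: number;         // distinct endpoints hit (scan detection)
  recentDenials: number;        // repeated denials
}

function riskScore(s: Signals): number {
  // Weights are illustrative and would need tuning against real traffic
  return s.burstRequests * 1 +
         s.repeatedFingerprints * 2 +
         s.uniqueRoutes * 1.5 +
         s.recentDenials * 3;
}

function tierFor(score: number): Tier {
  if (score >= 100) return "blocked";
  if (score >= 50) return "suspicious";
  if (score >= 20) return "elevated";
  return "normal";
}

// The policy layer only decides how strict to be; enforcement
// stays in the atomic limiter.
function limitMultiplier(tier: Tier): number {
  switch (tier) {
    case "normal": return 1.0;
    case "elevated": return 0.5;
    case "suspicious": return 0.2;
    case "blocked": return 0;
  }
}
```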

Tradeoffs

This approach is not free:

  • Lua scripts add complexity
  • Debugging moves closer to Redis
  • Redis becomes a critical dependency

For systems that need consistency under concurrency, the tradeoff is worth it.

Key Takeaway

The biggest lesson is not about token buckets or sliding windows. Correctness in rate limiting comes from atomic decision‑making. Once you ensure:

  • a single source of truth
  • atomic execution
  • consistent state across instances

the rest becomes much easier.

Closing

I built this approach into a small system to explore the problem end‑to‑end. If you’re interested in seeing a full implementation (TypeScript + Redis + Lua), check it out here:

👉

If you’ve dealt with this problem in production, I’d be interested to hear how you approached it.
