Rate Limiting Your API: Algorithms, Implementation, and the Strategic Thinking Behind It

Published: 1 month ago (March 29, 2026 at 08:22 PM EDT)

Source: Dev.to

Every API you expose to the internet will eventually be abused—automated scrapers, credential‑stuffing bots, misbehaving integrations, or even a well‑meaning client with a loop that runs too fast. Without rate limiting, a single bad actor can consume all your server resources and degrade the experience for every other user.

Threats Addressed by Rate Limiting

Resource protection – Prevent any single client from consuming a disproportionate share of CPU, memory, database connections, or bandwidth.
Cost control – Unconstrained clients can rack up significant charges (e.g., AI inference APIs, SMS providers, payment processors) in minutes.
Abuse prevention – Credential stuffing and enumeration attacks rely on volume; rate limiting raises the cost for attackers.
Fair access – In multi‑tenant systems, rate limiting ensures one tenant’s spike doesn’t degrade everyone else’s experience.

Common Algorithms

Fixed Window

Counts requests per client in a fixed time interval.
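Implementation in Redis: an INCR on a per-window key, with an EXPIRE so the key disappears once the window has passed.
Weakness: the boundary problem. A client can send its full quota at the end of one window and again at the start of the next, so up to twice the limit gets through in a short span around the boundary.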
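The boundary problem is easier to see in code. The following is a minimal in-memory sketch, not the article's implementation: the class name FixedWindowLimiter, the allow method, and the injectable clock are all hypothetical, and a dictionary stands in for Redis.

```python
import time
from collections import defaultdict


class FixedWindowLimiter:
    """In-memory fixed-window counter: one counter per client per window."""

    def __init__(self, limit, window_seconds, clock=time.time):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock  # injectable so the behaviour can be tested
        self.counts = defaultdict(int)  # (client_id, window_index) -> count

    def allow(self, client_id):
        # Every timestamp inside the same interval maps to the same index.
        window_index = int(self.clock() // self.window)
        key = (client_id, window_index)
        # A Redis version would INCR this key and set an EXPIRE on it;
        # this sketch never evicts old keys.
        self.counts[key] += 1
        return self.counts[key] <= self.limit
```

With a limit of 5 per 60 seconds, five calls at t = 59 and five more at t = 60 are all allowed, because the two batches fall into different windows. That is ten requests in about one second against a nominal limit of five per minute.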

Sliding Window Log

Tracks timestamps of every request in the window, eliminating the boundary problem.
Drawback: memory‑intensive (e.g., 1,000 requests/min × 10,000 clients = 10 million timestamps).
Best suited for low‑volume, high‑value endpoints like login or password reset.
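As a rough illustration of why the log is exact but memory-hungry, here is a sketch that keeps one timestamp per accepted request. The names (SlidingWindowLog, allow, logs) are assumptions made for this example, and it stores data in process memory rather than in a shared store.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLog:
    """Keeps a timestamp for every accepted request in the last window."""

    def __init__(self, limit, window_seconds, clock=time.time):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock
        self.logs = defaultdict(deque)  # client_id -> timestamps, oldest first

    def allow(self, client_id):
        now = self.clock()
        log = self.logs[client_id]
        # Drop timestamps that have fallen out of the window.
        while log and log[0] <= now - self.window:
            log.popleft()
        if len(log) >= self.limit:
            return False  # rejected requests are not recorded in this sketch
        log.append(now)
        return True
```

Unlike the fixed window, five requests at t = 59 followed by a request at t = 60 results in a rejection, because the window is measured back from the current moment. The cost is that each client's deque can hold up to limit entries, which is where the memory figure above comes from.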

Sliding Window Counter (recommended default)

Maintains counters for the current and previous fixed windows, then computes a weighted count based on how far into the current window you are.
Offers a good balance of accuracy, memory efficiency, and implementation simplicity.
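One common way to compute the weighted count is previous × (1 − fraction of the current window elapsed) + current. The sketch below assumes that formula; the class and method names are hypothetical, and it keeps state in a local dictionary rather than Redis.

```python
import time


class SlidingWindowCounter:
    """Approximates a sliding window from two fixed-window counters."""

    def __init__(self, limit, window_seconds, clock=time.time):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock
        self.state = {}  # client_id -> (window_index, current, previous)

    def estimate(self, client_id):
        """Weighted request count over roughly the last `window` seconds."""
        now = self.clock()
        index = int(now // self.window)
        current, previous = self._roll(client_id, index)
        elapsed_fraction = (now % self.window) / self.window
        # The previous window counts less the further we are into this one.
        return previous * (1.0 - elapsed_fraction) + current

    def allow(self, client_id):
        if self.estimate(client_id) + 1 > self.limit:
            return False
        index, current, previous = self.state[client_id]
        self.state[client_id] = (index, current + 1, previous)
        return True

    def _roll(self, client_id, index):
        stored, current, previous = self.state.get(client_id, (index, 0, 0))
        if stored == index - 1:
            current, previous = 0, current  # moved into the next window
        elif stored != index:
            current, previous = 0, 0  # idle for more than one full window
        self.state[client_id] = (index, current, previous)
        return current, previous
```

For example, with a limit of 10 per 60 seconds, a client that used all 10 requests in the previous window and is 15 seconds into the current one has an estimate of 10 × 0.75 + 0 = 7.5, so two more requests fit before the limit is reached. The estimate assumes the previous window's requests were spread evenly, which is why the result is approximate rather than exact.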

Token Bucket

Models rate limiting as a bucket of tokens that refills at a steady rate; each request spends a token, and requests are rejected while the bucket is empty.
Two parameters: refill rate (sustained throughput) and bucket capacity (burst tolerance).
Widely used by cloud providers and maps naturally to tiered pricing models.
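The two parameters can be sketched as follows. This is an illustrative single-bucket version with hypothetical names (TokenBucket, allow, cost); a real limiter would hold one bucket per client and would need to handle concurrent access.

```python
import time


class TokenBucket:
    """Single token bucket with lazy refill."""

    def __init__(self, refill_rate, capacity, clock=time.monotonic):
        self.refill_rate = refill_rate  # tokens added per second
        self.capacity = capacity        # maximum burst size
        self.clock = clock
        self.tokens = float(capacity)   # start with a full bucket
        self.updated = clock()

    def allow(self, cost=1.0):
        now = self.clock()
        # Refill based on the time since the last call, capped at capacity.
        refilled = self.tokens + (now - self.updated) * self.refill_rate
        self.tokens = min(self.capacity, refilled)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

With refill_rate=2 and capacity=5, a client can burst five requests at once, then sustain two per second. Idle time never accumulates more than five tokens, so capacity bounds the largest possible burst.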

Layered Rate Limiting

Edge / Load Balancer (Nginx, Cloudflare, AWS API Gateway) – protects application servers from excessive traffic before it reaches them.
API Gateway or Middleware – enforces business‑level limits by authenticated user, API key, subscription tier, or endpoint.
Individual Services – in microservice architectures, prevents a misbehaving upstream service from overwhelming a downstream dependency.

Each layer mitigates different failure modes; don’t rely on a single line of defense.

Transparency to Clients

Include rate‑limit headers in every response:
- X-RateLimit-Limit – the maximum number of requests allowed.
- X-RateLimit-Remaining – how many requests are left in the current window.
- X-RateLimit-Reset – when the window resets (epoch time).
On a 429 Too Many Requests response, add a Retry-After header telling the client how long to wait before retrying, either as a number of seconds or as an HTTP date.
This turns rate limiting from a blunt instrument into a collaborative mechanism between your API and its consumers.
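A small helper can assemble these headers from whatever state the limiter exposes. The function below is a sketch under assumptions: the name rate_limit_headers and its parameters are hypothetical, and the X-RateLimit-* names follow the convention listed above rather than a formal standard.

```python
import math


def rate_limit_headers(limit, remaining, reset_epoch, now_epoch, allowed):
    """Build rate-limit response headers as a dict of strings."""
    headers = {
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": str(max(0, remaining)),
        "X-RateLimit-Reset": str(int(reset_epoch)),  # epoch seconds
    }
    if not allowed:
        # Sent alongside a 429: whole seconds until the window resets,
        # rounded up so the client does not retry slightly too early.
        wait = max(0, math.ceil(reset_epoch - now_epoch))
        headers["Retry-After"] = str(wait)
    return headers
```

A rejected request 11.5 seconds before the reset would carry Retry-After: 12, while allowed requests carry only the three X-RateLimit-* headers.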

Practical Recommendations

Default algorithm: Sliding Window Counter. It is close to the accuracy of a full log while storing only two counters per client, and it avoids the complexity of maintaining per-request timestamps.
Identify clients by API key, not by IP address. IP‑based limiting is increasingly unreliable due to shared corporate proxies and botnet‑distributed requests.
Treat rate limits as a product decision as well as a technical one: define tier limits, decide when to return 429 versus graceful degradation, and size burst capacity to match user experience expectations.

Read the full article for complete algorithm implementations, Redis code examples, and a guide to designing rate‑limit tiers for subscription‑based APIs.

Originally published at NovVista.
