Designing a URL Shortener

Published: (February 28, 2026 at 11:44 PM EST)
Source: Dev.to

Requirements

Core

  • Users can create a short URL from a long URL.
  • Visiting the short URL redirects to the original URL.

Optional

  • Custom alias (e.g., short.ly/myname)
  • Expiry time for links
  • Analytics (click count, geo, device info)

Non‑functional

  • High availability (service should almost never go down)
  • Low‑latency redirection (redirect should feel instant)
  • Massive scale (millions of URLs, billions of redirects)
  • Durable storage (no data loss even if a server crashes)

Assumptions & Load Estimates

  • 10 M new URLs per month
  • 1 B redirects per month
  • Read‑to‑write ratio: ~100:1
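As a rough check on these estimates, the monthly figures can be converted into average per-second rates. The sketch below assumes a 30-day month and uniform traffic; real peaks will be noticeably higher than the average.

```python
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumption: 30-day month

new_urls_per_month = 10_000_000
redirects_per_month = 1_000_000_000

# Average rates; peak traffic is not modeled here.
writes_per_second = new_urls_per_month / SECONDS_PER_MONTH
reads_per_second = redirects_per_month / SECONDS_PER_MONTH
read_write_ratio = redirects_per_month / new_urls_per_month

print(f"writes/s = {writes_per_second:.1f}")   # 3.9
print(f"reads/s  = {reads_per_second:.1f}")    # 385.8
print(f"ratio    = {read_write_ratio:.0f}:1")  # 100:1
```

So the average is only about 4 writes per second against roughly 390 reads per second, which is why the read path gets most of the design attention.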

Implications

  • The system is read‑heavy.
  • Caching is critical.
  • The database must scale horizontally.
  • Writes are manageable; reads can explode.

High‑Level Architecture

Client ──► Load Balancer ──► URL Generation Service ──► Key‑Value Store
                     │                                 │
                     ▼                                 ▼
                Redirection Service ◄── Cache (Redis) ◄───

Write Path (Create Short URL)

  1. POST /shorten request arrives at the load balancer.
  2. URL Generation Service creates a unique short code.
  3. Mapping (short_code → long_url) is stored in a key‑value database.
  4. The short URL is returned to the client.
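The write path above can be sketched in a few lines. This is a minimal in-memory illustration, not the article's implementation: `ShortenService` is a hypothetical name, a dict stands in for the key-value database, and the hexadecimal code is only a placeholder for whichever ID strategy is chosen.

```python
import itertools


class ShortenService:
    """Hypothetical in-memory stand-in for the URL Generation Service."""

    def __init__(self, base_url="https://short.ly"):
        self._ids = itertools.count(1)  # placeholder ID source
        self._store = {}                # dict standing in for the key-value database
        self._base_url = base_url

    def shorten(self, long_url):
        # Basic validation (an assumption; the article does not specify rules).
        if not long_url.startswith(("http://", "https://")):
            raise ValueError("long_url must be an http(s) URL")
        short_code = format(next(self._ids), "x")  # placeholder encoding (hex)
        self._store[short_code] = long_url         # step 3: store the mapping
        return {"short_url": f"{self._base_url}/{short_code}"}  # step 4


service = ShortenService()
print(service.shorten("https://example.com/very/long/url"))
```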

Read Path (Redirect)

  1. GET /{short_code} request hits the load balancer.
  2. Redirection Service checks the cache (e.g., Redis).
  3. Cache hit: return HTTP 301/302 redirect immediately.
  4. Cache miss: fetch mapping from the database, populate the cache, then redirect.
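The read path is a cache-aside lookup. The sketch below uses plain dicts in place of Redis and the database so it runs anywhere; `RedirectService` is a hypothetical name, and returning 404 for an unknown code and 302 for a hit are assumptions made for the example.

```python
class RedirectService:
    """Hypothetical cache-aside resolver; dicts stand in for Redis and the DB."""

    def __init__(self, cache, database):
        self.cache = cache
        self.database = database

    def resolve(self, short_code):
        """Return (status, location) for a GET /{short_code} request."""
        long_url = self.cache.get(short_code)
        if long_url is None:
            # Cache miss: fall back to the database.
            long_url = self.database.get(short_code)
            if long_url is None:
                return 404, None  # unknown code (assumed behavior)
            self.cache[short_code] = long_url  # populate the cache for next time
        return 302, long_url


database = {"abc123": "https://example.com/very/long/url"}
cache = {}
service = RedirectService(cache, database)
print(service.resolve("abc123"))  # first call misses the cache, then fills it
print(service.resolve("abc123"))  # second call is served from the cache
```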

API Specification

Create Short URL

POST /shorten
Content-Type: application/json
{
  "long_url": "https://example.com/very/long/url",
  "custom_alias": "optional",
  "expiry": "optional timestamp"
}

Response

{
  "short_url": "https://short.ly/abc123"
}

Redirect

GET /{short_code}

Response – HTTP 301 or 302 redirect to the original URL.

ID Generation Strategies

Auto‑increment ID + Base62

  • Increment a numeric ID.
  • Convert the ID to Base62 (a‑zA‑Z0‑9).
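  • Guarantees uniqueness and is deterministic. A single counter is simple, but with multiple servers it becomes a coordination point (for example, each server needs its own pre‑allocated ID range), and sequential codes are easy to guess.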
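A minimal sketch of the Base62 conversion follows. The alphabet order mirrors the "a‑zA‑Z0‑9" ordering mentioned above; any fixed order works as long as encoding and decoding agree. The function names are illustrative.

```python
import string

# 26 lowercase + 26 uppercase + 10 digits = 62 symbols.
ALPHABET = string.ascii_lowercase + string.ascii_uppercase + string.digits
BASE = len(ALPHABET)


def encode_base62(number):
    """Convert a non-negative integer ID into a Base62 string."""
    if number < 0:
        raise ValueError("number must be non-negative")
    if number == 0:
        return ALPHABET[0]
    digits = []
    while number:
        number, remainder = divmod(number, BASE)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))  # most significant digit first


def decode_base62(code):
    """Convert a Base62 string back into the integer ID."""
    number = 0
    for char in code:
        number = number * BASE + ALPHABET.index(char)
    return number


print(encode_base62(62))                     # "ba"
print(decode_base62(encode_base62(123456)))  # 123456
```

Six Base62 characters cover 62^6 = 56,800,235,584 distinct codes, far more than the 10 M new URLs per month assumed earlier would use.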

Hash‑based

  • Hash the long URL and take the first 6–8 characters.
  • Problem: possible collisions → requires collision resolution.
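One way to resolve collisions is to re-hash with a salt until a free code is found. The sketch below is an assumption-laden illustration: it uses SHA-256, a 7-character hex prefix (Base16 rather than Base62, for brevity), a dict as the store, and a hypothetical `hash_code` helper.

```python
import hashlib


def hash_code(long_url, store, length=7, max_attempts=5):
    """Return a short code for long_url, retrying with a salt on collision."""
    for attempt in range(max_attempts):
        # Attempt 0 hashes the URL itself; later attempts append a salt.
        salted = long_url if attempt == 0 else f"{long_url}#{attempt}"
        code = hashlib.sha256(salted.encode("utf-8")).hexdigest()[:length]
        existing = store.get(code)
        if existing is None:
            store[code] = long_url  # free slot: claim it
            return code
        if existing == long_url:
            return code             # same URL was already shortened
        # Otherwise a different URL owns this code: try the next salt.
    raise RuntimeError("no free code found; consider a longer code")


store = {}
first = hash_code("https://example.com/a", store)
again = hash_code("https://example.com/a", store)
print(first == again)  # True: the same URL maps to the same code
```

In a real database the check-then-insert step would need to be atomic (for example, an insert that fails on a duplicate key), which this in-memory sketch does not model.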

Composite (Timestamp + Machine ID + Sequence)

  • Useful when multiple servers generate IDs concurrently.
  • Provides high scalability and uniqueness.
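A composite ID can be built by packing the three fields into one integer. The sketch below assumes a Snowflake-style layout with 10 bits for the machine ID and 12 bits for the per-millisecond sequence; those widths, the class name, and the injectable `clock` parameter are illustrative choices, not something the article specifies.

```python
import threading
import time


class CompositeIdGenerator:
    MACHINE_BITS = 10    # assumption: up to 1024 machines
    SEQUENCE_BITS = 12   # assumption: up to 4096 IDs per machine per millisecond

    def __init__(self, machine_id, clock=lambda: int(time.time() * 1000)):
        if not 0 <= machine_id < (1 << self.MACHINE_BITS):
            raise ValueError("machine_id out of range")
        self.machine_id = machine_id
        self._clock = clock  # returns the current time in milliseconds
        self._lock = threading.Lock()
        self._last_ms = -1
        self._sequence = 0

    def next_id(self):
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) % (1 << self.SEQUENCE_BITS)
                if self._sequence == 0:
                    # Sequence exhausted: wait for the next millisecond.
                    while now <= self._last_ms:
                        now = self._clock()
            else:
                self._sequence = 0
            self._last_ms = now
            shift = self.MACHINE_BITS + self.SEQUENCE_BITS
            return (now << shift) | (self.machine_id << self.SEQUENCE_BITS) | self._sequence


generator = CompositeIdGenerator(machine_id=3)
print(generator.next_id() < generator.next_id())  # True: IDs increase over time
```

The resulting integer can then be Base62-encoded to obtain the short code. Because the machine ID is part of every value, two servers never produce the same ID without talking to each other.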

Sharding

  • Shard key: hash(short_code) % N
  • Benefits: even distribution of keys, fewer hotspots, simple lookup. Changing N remaps most keys, so resharding needs a migration plan or consistent hashing.
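Shard selection is a one-line function, but the hash must be stable across processes. In Python the built-in `hash()` is randomized per process for strings, so the sketch below uses `zlib.crc32` instead; the function name and the shard count of 4 are assumptions for the example.

```python
import zlib
from collections import Counter


def shard_for(short_code, num_shards):
    """Map a short code to a shard index in [0, num_shards)."""
    # crc32 gives the same value in every process, unlike built-in hash() on str.
    return zlib.crc32(short_code.encode("utf-8")) % num_shards


# Rough look at how 10,000 synthetic codes spread across 4 shards.
distribution = Counter(shard_for(f"code{i}", 4) for i in range(10_000))
print(sorted(distribution.items()))
```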

Database Replication

  • Primary node: handles writes (new short URLs).
  • Read replicas: serve redirects.

Advantages

  • Reads dominate traffic → can scale reads independently.
  • Improves availability; a replica can be promoted if the primary fails.

Caching Layer

  • Cache key: short_code → long_url.
  • Typical cache: Redis or in‑memory store.

Cache workflow

  1. Check Redis.
  2. On hit → immediate redirect.
  3. On miss → query DB, update cache, then redirect.

Cache Tuning

  • TTL for entries (optional).
  • LRU eviction policy.
  • Pre‑warm cache for popular URLs.
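To make the TTL and LRU settings concrete, here is a small in-process cache that combines both. It is only a sketch of the behavior; with Redis the equivalent would be configured on the server rather than written by hand. `LruTtlCache` and its `clock` parameter are hypothetical names.

```python
import time
from collections import OrderedDict


class LruTtlCache:
    def __init__(self, capacity, ttl_seconds, clock=time.monotonic):
        self.capacity = capacity
        self.ttl = ttl_seconds
        self._clock = clock
        self._entries = OrderedDict()  # short_code -> (long_url, expires_at)

    def get(self, short_code):
        entry = self._entries.get(short_code)
        if entry is None:
            return None
        long_url, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[short_code]    # expired: treat as a miss
            return None
        self._entries.move_to_end(short_code)  # mark as most recently used
        return long_url

    def put(self, short_code, long_url):
        self._entries[short_code] = (long_url, self._clock() + self.ttl)
        self._entries.move_to_end(short_code)
        while len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # evict the least recently used


cache = LruTtlCache(capacity=2, ttl_seconds=60)
cache.put("a", "https://example.com/a")
cache.put("b", "https://example.com/b")
cache.get("a")                           # "a" is now the most recently used
cache.put("c", "https://example.com/c")  # evicts "b"
print(cache.get("b"))                    # None
```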

Analytics & Click Counting

Updating click counts synchronously slows down redirects.

Better approach

  1. Emit a click event to a message queue (Kafka, RabbitMQ, etc.).
  2. Background workers consume events and update analytics asynchronously.

Benefits

  • Keeps the redirect path fast.
  • Decouples analytics from user experience.
  • Allows independent scaling.
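The queue-based approach can be illustrated with the standard library alone. In this sketch `queue.Queue` stands in for Kafka or RabbitMQ and a `Counter` stands in for the analytics store; both substitutions, and the function names, are assumptions for the example.

```python
import queue
import threading
from collections import Counter

click_events = queue.Queue()  # stand-in for the message queue
click_counts = Counter()      # stand-in for the analytics store


def record_click(short_code):
    """Called on the redirect path: enqueue the event and return immediately."""
    click_events.put(short_code)


def analytics_worker():
    """Background consumer that updates counts off the redirect path."""
    while True:
        short_code = click_events.get()
        if short_code is None:  # sentinel: stop the worker
            click_events.task_done()
            break
        click_counts[short_code] += 1
        click_events.task_done()


worker = threading.Thread(target=analytics_worker, daemon=True)
worker.start()

for code in ["abc123", "abc123", "xyz789"]:
    record_click(code)

click_events.join()  # wait until every queued event has been processed
print(click_counts["abc123"], click_counts["xyz789"])  # 2 1

click_events.put(None)  # shut the worker down
worker.join()
```

The redirect handler only pays for a queue insert; if the worker falls behind, analytics become delayed but redirects stay fast.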

Edge Cases & Trade‑offs

  • Custom alias collisions: reject or prompt user to choose another alias.
  • Short code collisions: use deterministic generation (auto‑increment) or resolve collisions with retries.
  • Malicious/spam URLs: integrate URL safety checks (e.g., Google Safe Browsing).
  • Expired links: return a 410 Gone or a custom error page.
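  • 301 vs 302: a 301 (permanent) can be cached by browsers, so repeat visits may skip the service and go uncounted; a 302 (temporary) keeps every click flowing through the service and lets the destination change later, at the cost of more traffic.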
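Two of these cases, expired links and custom alias collisions, are easy to express in code. The sketch below assumes each stored record is a dict with `long_url` and an optional timezone-aware `expires_at`; that record shape, the function names, and the 404 for unknown codes are illustrative assumptions.

```python
from datetime import datetime, timedelta, timezone


def resolve_with_expiry(record, now=None):
    """Return (status, location), honoring an optional expiry time."""
    if now is None:
        now = datetime.now(timezone.utc)
    if record is None:
        return 404, None              # unknown short code (assumed behavior)
    expires_at = record.get("expires_at")
    if expires_at is not None and now >= expires_at:
        return 410, None              # expired link: 410 Gone
    return 302, record["long_url"]


def reserve_alias(store, alias, long_url, expires_at=None):
    """Claim a custom alias; return False if it is already taken."""
    if alias in store:
        return False                  # collision: caller asks the user for another
    store[alias] = {"long_url": long_url, "expires_at": expires_at}
    return True


store = {}
yesterday = datetime.now(timezone.utc) - timedelta(days=1)
print(reserve_alias(store, "myname", "https://example.com/me", expires_at=yesterday))  # True
print(reserve_alias(store, "myname", "https://example.com/other"))                     # False
print(resolve_with_expiry(store["myname"]))                                            # (410, None)
```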

Summary

A URL shortener is a classic read‑heavy system. Optimizing the redirection path with aggressive caching, read replicas, and asynchronous analytics is essential for low latency and high scalability. Design decisions—such as ID generation, sharding, and cache policies—must balance uniqueness, performance, and operational simplicity.
