Scaling PostgreSQL without Microservices: Lessons from Notion’s 480 Shards

Published: February 21, 2026 at 01:03 AM EST
2 min read
Source: Dev.to

TL;DR: Scaling Notion’s Monolith

  • Application‑Level Sharding – 480 logical shards mapped to a smaller set of physical nodes.
  • Shard Router – Implemented in TypeScript using space_id % 480 to route requests instantly.
  • PgBouncer – Acts as a traffic controller, pooling connections to prevent overload.
  • Zero‑Downtime Migrations – “Shadow Write” strategy moves billions of rows while keeping the app live.
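The routing step above can be sketched in TypeScript. The post only states that routing is `space_id % 480`; the string hashing (real `space_id`s are not plain integers) and the 15-schemas-per-node mapping below are illustrative assumptions, not Notion's actual code.

```typescript
// Minimal shard-router sketch. SHARD/NODE counts come from the post;
// the FNV-1a hash and block mapping are assumptions for illustration.
const LOGICAL_SHARDS = 480;
const PHYSICAL_NODES = 32; // 480 / 32 = 15 logical shards per node

// FNV-1a hash so a non-numeric space_id can be reduced to an integer.
function hashSpaceId(spaceId: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < spaceId.length; i++) {
    h ^= spaceId.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h;
}

// The "space_id % 480" step from the post.
function logicalShard(spaceId: string): number {
  return hashSpaceId(spaceId) % LOGICAL_SHARDS; // 0..479
}

// Contiguous blocks of 15 logical shards map to each physical node.
function physicalNode(spaceId: string): number {
  return Math.floor(logicalShard(spaceId) / (LOGICAL_SHARDS / PHYSICAL_NODES));
}
```

Because the function is pure and deterministic, any app server can route a request without coordination, which is what lets the monolith scale horizontally.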

Chapter 1: The Problem with the Monolith

In Notion’s early architecture, a Node.js backend was paired with a single PostgreSQL instance. Over time, the team encountered:

  • CPU Saturation – Daily spikes regularly hit 90 %+ utilization.
  • The Vacuum Problem – Autovacuum couldn’t keep up, risking a Transaction ID wraparound, which would halt writes to protect data integrity.
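To see why wraparound is an emergency, it helps to put numbers on it: PostgreSQL's transaction IDs live in a 2³¹ horizon, and the server halts writes before the oldest unfrozen XID reaches it. A minimal headroom calculation, assuming a hypothetical monitor that already has `age(datfrozenxid)` for a database:

```typescript
// Illustrative only: PostgreSQL forces writes to stop as the oldest
// transaction ID approaches the 2^31 wraparound horizon. Given the
// age(datfrozenxid) value for a database, compute remaining headroom.
const WRAPAROUND_LIMIT = 2 ** 31; // ~2.1 billion XIDs

function xidHeadroom(datfrozenxidAge: number): number {
  return WRAPAROUND_LIMIT - datfrozenxidAge;
}

function wraparoundRisk(datfrozenxidAge: number): boolean {
  // Alert when under 10% headroom remains (threshold is an assumption).
  return xidHeadroom(datfrozenxidAge) < WRAPAROUND_LIMIT * 0.1;
}
```

When autovacuum can't freeze rows fast enough, this headroom shrinks steadily, which is exactly the failure mode sharding averts by spreading write volume across instances.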

Chapter 2: Why Not Microservices?

The typical microservice argument is “split the code, split the load.” Notion chose a different path, keeping a monolithic codebase while scaling the data layer.

Chapter 3: The 480‑Shard Blueprint

Logical Shards

  • Partition Key – space_id is used so all data for a workspace stays together, enabling fast joins.
  • Setup – 480 independent schemas (logical shards) are distributed across 32 physical AWS RDS instances.

Benefits

  • When a server becomes overloaded, a logical schema can be moved to a new server, providing linear scalability.
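The rebalancing benefit can be made concrete with a sketch: if the router holds a logical-shard-to-host map, moving an overloaded schema is just a background data copy plus a one-entry map update. The host names and 15-per-node layout here are hypothetical.

```typescript
// Why schema-level shards rebalance cheaply: the router consults a
// logical-shard -> physical-host map, so relocation is a map update.
// Host names and the contiguous 15-per-node layout are assumptions.
const shardMap: string[] = Array.from(
  { length: 480 },
  (_, shard) => `pg-node-${Math.floor(shard / 15)}.internal`
);

function hostForShard(shard: number): string {
  return shardMap[shard];
}

// After copying the schema's data to the new server, repoint one entry;
// no application code or neighboring shards are touched.
function relocateShard(shard: number, newHost: string): void {
  shardMap[shard] = newHost;
}
```

Because only the map changes, capacity can be added one server at a time: this is the "linear scalability" the post refers to.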

Chapter 4: The Great “Shadow” Migration

Backfill

  • Historical data is moved to the new shards in the background.

Double Writing

  • New changes are written to both the old database and the new shards simultaneously.

Cutover

  • After a comparison engine verifies data parity, traffic is switched to the new shards with zero downtime.
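The three phases above can be sketched as a small write-path state machine. The `Store` interface, phase names, and class shape are assumptions; the post only describes backfill, double writing, and cutover.

```typescript
// Shadow-write sketch: one write path that advances through the
// migration phases described in the post. Interfaces are hypothetical.
interface Store {
  write(spaceId: string, row: unknown): void;
}

type Phase = "backfill" | "double-write" | "cutover";

class ShadowMigrator {
  constructor(
    private oldDb: Store,
    private newShards: Store,
    public phase: Phase = "backfill"
  ) {}

  write(spaceId: string, row: unknown): void {
    switch (this.phase) {
      case "backfill":
        // Live traffic still hits only the old database while
        // historical rows are copied in the background.
        this.oldDb.write(spaceId, row);
        break;
      case "double-write":
        // Every change lands in both stores so a comparison
        // engine can verify parity.
        this.oldDb.write(spaceId, row);
        this.newShards.write(spaceId, row);
        break;
      case "cutover":
        // Parity verified: traffic switches to the new shards.
        this.newShards.write(spaceId, row);
        break;
    }
  }
}
```

The key property is that the phase transitions are one-directional and the app never stops serving writes, which is what makes the migration zero-downtime.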

Future Improvements

  • Scaling to 96 Nodes – Expanding the physical node count to further distribute load.
  • Blocks All the Way Down – Continuing to treat every piece of content as a block for fine‑grained sharding.
  • Data Lakes & Connection Hubs – Exploring centralized analytics and integration points.

Feel free to share your thoughts in the comments!
