Cache Organization: How SQLite Actually Holds Pages in Memory
Source: Dev.to
Pager and Page Cache Overview
So far, we’ve talked about the pager as a transaction manager and a gatekeeper.
Now we turn to the physical organization of the page cache itself—the place where database pages actually live while SQLite is working. This is the level where “a page” stops being an abstract idea and becomes a real chunk of memory with structure, metadata, and rules.
- Every pager owns exactly one page cache, managed through a handler object called PCache.
- The pager does not manage cache internals directly; it holds a reference to a PCache object and talks to it via a well‑defined interface. This indirection allows SQLite to support pluggable cache implementations—if a user supplies their own cache module, SQLite will use it.
The built‑in implementation lives in pcache1.c, which is what we’ll be studying.
PCache Object
Inside the PCache object, a pointer (pCache) refers to the actual cache engine being used. From the pager’s point of view, the cache is a black box: fast, predictable, and replaceable.
Hash‑Table Slot Organization
SQLite does not store pages in a flat list. Cached pages are stored in slots, indexed through a hash table:
- The hash table starts empty. As pages are requested, slots are created and inserted.
- The total number of slots is bounded by
PCache.nMax. - Default limits:
- 2000 pages for the main and attached databases
- 500 pages for the temp database
- Unlimited for in‑memory databases (bounded only by available address space)
This design keeps lookups fast while preventing unbounded memory growth.
PgHdr Structure
Each cached page is represented by a PgHdr object, visible only to the pager (the tree module sees only raw page memory). A PgHdr carries everything the pager needs to enforce correctness:
| Field | Meaning |
|---|---|
pgno | Which database page this is |
dirty | Whether it has been modified |
needSync | Whether the journal must be flushed before writing it back |
nRef | Reference count (pinning) |
pDirtyNext / pDirtyPrev | Links in the dirty‑page list |
- If
nRef > 0, the page is pinned—actively in use and untouchable. - If
nRef == 0, the page is unpinned and eligible for reuse.
PCache1 – SQLite’s Built‑In Cache
SQLite’s built‑in cache (PCache1) adds another layer. Each hash‑table slot is represented by a PgHdr1 object. The slot layout in memory looks roughly like this:
[ PgHdr | page image | private space | (optional recovery pointers) ]
- Page image – the raw database page.
- Private space – used by the tree module for per‑page in‑memory state.
- Recovery metadata – stored here for in‑memory databases.
The entire block is zero‑initialized when a page enters the cache, ensuring no stale state leaks across reuse.
All slots are reachable through the PCache1.apHash array, where each entry points to a bucket implemented as an unordered singly linked list—simple, fast, and without clever tricks.
Cache Group Mode (Optional)
When cache group mode is enabled:
- Multiple
PCache1instances are placed into a group. - Unpinned pages from one cache can be recycled by another.
- Memory pressure is handled globally instead of per‑cache.
This is useful when many connections exist in the same process and memory is tight. Pinned pages remain private; only unpinned slots are shared.
Content‑Addressed Access
A cache is not an array you index into. Clients never know:
- Where a page lives in memory.
- Which slot it occupies.
- Whether it was recently evicted.
Pages are requested by page number, not by memory address. When the tree module wants a page, it calls sqlite3PagerGet(P)—the pager handles the rest.
Fetch‑On‑Demand Policy
SQLite follows a strict fetch‑on‑demand policy:
- The pager pins the page and returns it.
- The tree module owns the page temporarily.
- The client eventually calls
sqlite3PagerUnref.
After Unref, the page becomes recyclable again. Pinned pages cannot be evicted. SQLite also enforces a minimum cache size (10 pages as of SQLite 3.7.8) to ensure there is always some room to maneuver.
What Happens When Pages Change?
The next logical question is: What happens when pages are modified? In the upcoming post we’ll cover:
- Cache update rules
- How dirty pages are tracked
- Replacement policies
- How SQLite balances performance with safety
That’s where caching stops being passive and starts shaping execution.
References & Further Reading
- SQLite Database System: Design and Implementation – Sibsankar Haldar (n.d.)
- My experiments and hands‑on executions related to SQLite
FreeDevTools
👉 Check out FreeDevTools – an open‑source hub for dev tools, cheat codes, and TL;DRs.
Any feedback or contributors are welcome! It’s online, open‑source, and ready for anyone to use.
⭐ Star it on GitHub: