Mini-Pages: Rethinking the Leaf Page Boundary

Published: 1 month ago (December 16, 2025 at 01:57 PM EST)

5 min read

Source: Dev.to

Hello, I’m Maneshwar. I’m working on FreeDevTools online, currently building one place for all dev tools, cheat codes, and TLDRs — a free, open‑source hub where developers can quickly find and use tools without the hassle of searching all over the internet.

In yesterday’s article we looked at why both B‑Trees and LSM‑Trees struggle to fully exploit modern hardware—one weighed down by cache pollution, the other by compaction, amplification, and operational complexity.

Today we move past why these structures fall short and look at what replaces the page itself.

The Bf‑Tree Answer: Mini‑Pages

Bf‑Tree’s answer is not another cache, buffer, or tuning trick, but a structural rethink of the leaf node.
At the center of this redesign is a new primitive: the mini‑page.

A mini‑page is a slim, in‑memory version of a leaf page, tightly scoped and purposely built.
Unlike buffer‑pool pages or LSM memtables, a mini‑page exists only for leaf nodes, and only one mini‑page can exist per leaf page.

This design choice is intentional. Mini‑pages are not a general cache; they are a precise extension of the leaf page, designed to:

Buffer recent updates
Cache frequently accessed records

Crucially, they do this without pulling full pages into memory.

Absorbing Writes Without Page‑Level Amplification

When a write targets a leaf page, Bf‑Tree does not immediately touch the on‑disk page. Instead, the write is inserted into the leaf’s mini‑page:

If no mini‑page exists, a minimal one is created (as small as 64 bytes, aligned to a cache line).
Records inside the mini‑page are kept sorted, preserving spatial locality.
Inserts use binary search, not pointer chasing.

As updates accumulate, the mini‑page grows dynamically:

It doubles in size when full.
Growth continues until it reaches an upper bound (up to 4 KB).

When the mini‑page reaches that bound—or must be evicted—the system batch‑merges the mini‑page into the base leaf page on disk.

This achieves two things simultaneously:

Writes are absorbed and coalesced in memory.
Disk writes happen in larger, more efficient batches.

There is no delta chain, no per‑update logging into multiple structures, and no background compaction pipeline. Updates stay local, ordered, and cache‑friendly.

Caching Hot Records Without Caching Pages

Reads follow a similarly disciplined path. Before touching disk, a lookup binary‑searches the mini‑page:

Hit: the record is returned immediately.
Miss: the corresponding leaf page is read from disk.

After loading the record, Bf‑Tree may cache it back into the mini‑page, but only probabilistically (e.g., 1 % of the time). This avoids flooding the mini‑page with cold data while allowing hot records to naturally accumulate.

Key distinction: Mini‑pages cache records, not pages.
Traditional B‑Trees cache entire pages, even if only one record is hot. LSM systems split caching into row caches and block caches, each with separate memory budgets and tuning knobs. Mini‑pages avoid both extremes.

Adapting to Range Scans with Cached Gaps

Record‑level caching alone isn’t enough. Range scans depend on spatial locality, and caching isolated records works against that. Bf‑Tree addresses this with an elegant pivot.

When a mini‑page grows to full size, it can be merged and promoted into a full leaf‑page representation, mirroring the on‑disk layout. At that point:

The mini‑page effectively becomes a page‑level cache.
Range scans regain spatial locality.
No separate cache tier is required.

Thus the same memory budget can fluidly support:

Point lookups (via small mini‑pages)
Range scans (via full‑sized pages)

Unlike systems that require static partitioning between row cache and block cache, Bf‑Tree adapts automatically as workload patterns shift.

One Layout, Two Roles

Mini‑pages and leaf pages share the exact same physical layout; the only difference is size.

Both store sorted key–value pairs and support efficient lookups.
Mini‑pages allow variable lengths, while leaf pages remain fixed‑size on disk.

This unified layout:

Reduces implementation complexity
Avoids duplicated logic
Allows optimizations to apply uniformly to memory and disk

Internally, each page begins with a compact node‑metadata header, followed by metadata entries and the actual key–value data packed from opposite ends. Inserts shift metadata, not full records, keeping operations fast and cache‑friendly.

Mapping Pages Without Breaking Concurrency

Leaf pages always live on disk; mini‑pages live in memory. Both are referenced through the same logical page ID.

Bf‑Tree uses an in‑memory mapping table to translate this page ID to either:

A memory address (mini‑page), or
A disk offset (leaf page)

The mapping table also embeds a reader–writer lock alongside the page address, packed into a single 64‑bit word. Because the mini‑page and its leaf page share the same page ID:

Locking one implicitly locks the other.
The concurrency model stays simple.
No extra coordination layer is required.

This indirection decouples page location from page identity, enabling movement between memory and disk without rewriting pointers across the tree.

Why Mini‑Pages Matter

Mini‑pages are not a cache tweak; they are a fundamental redefinition of what a page is. They blur the boundary between memory and disk while preserving order, locality, and predictability.

Writes become local.
Reads become selective.
Range scans regain efficiency—without background compaction, multi‑level probing, or cache partitioning.

This is where Bf‑Tree… (continue as needed).

- **Tree** stops being a “better B‑Tree” and starts looking like a **new storage primitive**—one that finally aligns data structures with modern hardware realities.

[![FreeDevTools](https://media2.dev.to/dynamic/image/width=800,height=,fit=scale-down,gravity=auto,format=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo5e57lh7rtwwwe2d694n.png)](https://hexmos.com/freedevtools/)

👉 **Check out:** [FreeDevTools](https://hexmos.com/freedevtools/)

Any feedback or contributors are welcome!  

It’s online, open‑source, and ready for anyone to use.

⭐ **Star it on GitHub:** [FreeDevTools](https://github.com/HexmosTech/FreeDevTools)

Mini-Pages: Rethinking the Leaf Page Boundary

The Bf‑Tree Answer: Mini‑Pages

Absorbing Writes Without Page‑Level Amplification

Caching Hot Records Without Caching Pages

Adapting to Range Scans with Cached Gaps

One Layout, Two Roles

Mapping Pages Without Breaking Concurrency

Why Mini‑Pages Matter

Related posts

Why Bf-Tree Pins Inner Nodes and What That Unlocks

Managing Mini-Page Memory: The Buffer Pool Behind Bf-Tree

Autocomplete / Type-ahead System for a Search Box - Part 2

Using CuCollections Nvidia Data Structures Library