Bf-트리: 페이지 장벽을 깨다

발행: (2025년 12월 5일 오후 04:58 GMT+9)
2 min read
원문: Dev.to

Source: Dev.to

Hello, I’m Maneshwar. I’m working on FreeDevTools – an online, open‑source hub that consolidates dev tools, cheat codes, and TLDRs in one place, making it easy for developers to find and use tools without endless searching.

The Real Problem: Pages Are Too Big

  • A database page is typically 4 KB or 8 KB.
  • Every update or read pulls in a lot of irrelevant data:
    • A 50‑byte record update rewrites a 4 KB page.
    • Caching 5 hot records also caches 200 cold ones.
    • Range scans and point lookups compete for the same page‑level cache.
  • Thought experiment: if the record size equaled the page size, these problems would disappear instantly.

The Core Idea Behind Bf‑Trees

Bf‑Trees break the assumption that cache pages must mirror disk pages.

Disk page  !=  Cache page

Instead, memory holds mini‑pages: variable‑length chunks that store only what matters:

  • Individual hot records
  • A small key range
  • Buffered recent writes
  • A full page when needed for range scans

Mini‑pages grow and shrink dynamically based on workload, changing the entire performance landscape.

Comparison: Traditional B‑Tree vs. Bf‑Tree

AspectTraditional B‑TreeBf‑Tree
UpdatesRewrite whole pagesRewrite just the mini‑page
Cache contentsHolds cold dataHolds only hot data
Unit sizeFixed 4 KB unitsVariable‑sized fine‑grained blocks
Record/cache splitBolted on topMini‑pages unify cache + buffer + data

Variable‑Length Buffer Pool

To manage mini‑pages efficiently, Bf‑Tree uses a circular buffer in memory:

  • Fixed total size.
  • Mini‑pages allocated by advancing the tail.
  • Free space tracked in a free list.
  • Growth handled by copy‑on‑write (RCU).
  • When full → evict mini‑pages near the head back to disk.

This design enables concurrent allocation, resizing, and eviction without fragmentation hell.

What This Enables

  • Point lookups: Fast, because mini‑pages cache only what’s hot.
  • Range scans: Mini‑pages expand to full pages when needed.
  • Writes: Only the mini‑page changes.

The Numbers

Evaluations show:

  • 6× faster writes vs. B‑Tree.
  • 2× faster point lookups vs. B‑Trees & LSMs.
  • 2.5× faster scans vs. RocksDB (LSM‑Tree).

In short, Bf‑Tree beats both B‑Trees and LSM‑Trees across all workloads—a rare achievement where reads and writes both improve.

Why This Matters

Modern systems push data beyond RAM size. Bf‑Tree demonstrates that the future isn’t a binary B‑Tree vs. LSM‑Tree debate; it’s about rethinking pages.

Next Article

Coming up next, I’ll dive into three areas:

  1. Understanding LSM‑Tree internals and where they shine or break.
  2. How modern NVMe SSD behavior changes the cost model of writes.
  3. The architectural details behind Bf‑Tree and mini‑pages.

These topics will lay the foundation for exploring RCU growth/shrink, eviction strategy, and concurrency guarantees. Stay tuned.

👉 Check out: FreeDevTools
Any feedback or contributions are welcome! It’s online, open‑source, and ready for anyone to use.
Star it on GitHub:

Back to Blog

관련 글

더 보기 »

Tailwind가 실제로 느린가요?

Tailwind는 많은 개발자들의 마음을 사로잡는 동시에 다른 많은 사람들에게는 가장 혐오받는 CSS 프레임워크가 되었습니다. 사람들은 그것을 사랑하고 미워합니다 f...

왜 ORM을 사용해야 할까요?

‘Why would you ever use an ORM?’의 커버 이미지: https://media2.dev.to/dynamic/image/width=1000,height=420,fit=cover,gravity=auto,format=auto/https%3A%2F%2Fdev-to...

개발자 없이 사이트 속도 높이는 방법

느린 웹사이트는 트래픽, 순위, 그리고 매출에 악영향을 줄 수 있지만—좋은 소식은 성능 문제를 해결하기 위해 항상 개발자가 필요하지는 않다는 것입니다. 올바른 도구를 사용하면…