How B-Tree Indexes Work in SQL Databases: An Intuitive Guide

Published: 5 hours ago (February 3, 2026 at 01:51 PM EST)

6 min read

Source: Dev.to

Why Indexes Exist (The Real Problem)

Relational databases are optimized to store large volumes of data reliably, but they are not necessarily optimized to search that data efficiently by default. When you ask a database to find a specific row without an index, it often performs a full table scan, looking at every row one by one. This works at a small scale but becomes painfully slow as data grows.

Indexes exist to avoid this brute‑force approach. Instead of checking every row, an index allows the database to quickly narrow down where the matching data resides.

B‑Tree indexes are the default choice in most SQL databases (PostgreSQL, MySQL/InnoDB, SQL Server, Oracle) because they keep data sorted while supporting inserts and updates in a controlled, predictable way.

The intuition – even with 10 million rows, a B‑Tree index typically needs only a handful of comparison steps to find a single value. Locating a row is a matter of a few targeted jumps rather than a scan through millions of entries. Writes do carry overhead—every insert must maintain this structure—but that cost grows slowly, allowing B‑Trees to scale exceptionally well.

What Problem a B‑Tree Solves

Data structure	Search	Insert
Sorted array	O(log n) (binary search)	O(n) (shifts)
Linked list	O(n)	O(1)
B‑Tree	O(log n)	O(log n) (localized node splits)

A B‑Tree is designed specifically for storage systems that:

Keep data sorted for range queries and ordering
Provide logarithmic search (O(log n))
Minimize disk I/O by being shallow and wide
Support efficient writes through localized node splits

Key constraint: databases read data in pages, not individual rows.

Pages: How Databases Actually Store Data

Before talking about B‑Trees, it helps to understand one fundamental idea: databases work with pages.

A page is a fixed‑size block of data (commonly 8 KB).
When the database needs anything—a row, an index entry, or a pointer—it reads the entire page containing it.

This applies to:

Table data (the actual rows)
Index data (B‑Tree nodes)

The performance goal is simple: do as much useful work as possible per page read. B‑Tree nodes are designed to fit neatly inside one page, allowing the database to make massive progress with every single disk hit. This page‑based model is why B‑Trees look very different from textbook binary trees.

Page layout illustration

B‑Tree Structure (High Level)

A B‑Tree consists of three layers:

Root node – the starting point of every search
Internal nodes – the signposts used for navigation
Leaf nodes – the bottom layer where the actual data (or pointers) live

Properties

Sorted keys – keys inside a node are always kept in order
High fan‑out – each node contains many keys (often hundreds), not just two
Balanced depth – all leaf nodes are at the exact same depth, guaranteeing predictable performance regardless of which key you are seeking

Internal vs. Leaf Nodes

Internal nodes contain keys and pointers to child pages; they are used purely for navigation.
Leaf nodes contain keys and pointers to the actual rows.

Implementation details differ:

DBMS	Leaf‑node payload
MySQL (InnoDB)	Full row (clustered index)
PostgreSQL	Row pointer (heap TID)

Leaf nodes are usually linked together (left‑to‑right), allowing the database to perform range scans without climbing back up to the root.

B‑Tree storing of data

How a Lookup Works

Consider the query:

SELECT * FROM users WHERE email = 'a@b.com';

Start at the root page.
Compare the search key with the keys in the node to find the correct range.
Follow the child pointer to the next page.
Repeat until a leaf node is reached.
Fetch the row pointer (or the row itself).

Each step is a page read, not a row‑by‑row operation.

Why B‑Tree Height Stays Small

Because each node holds many keys, the tree grows wide, not tall.

Typical numbers:

Page size: 8 KB
Key + pointer: ~40 bytes
Keys per node (fan‑out): ~200

A tree with height 3 can support:

200³ ≈ 8,000,000 entries

Hence B‑Tree lookups remain extremely fast even at massive scale.

Inserts and Node Splits

When you insert a new row, the database must update the B‑Tree:

Locate the correct leaf node.
If the node has space, insert the key (trivial).
If the node is full, split it into two nodes.
Promote the middle key to the parent node.
Splits can propagate upward to the root, but they are rare in practice.

This balancing act is the tax paid to keep the tree efficient for future reads.

Range Queries and Ordering

Because B‑Trees keep keys sorted, they enable efficient range scans:

SELECT * FROM orders
WHERE created_at BETWEEN '2025-01-01' AND '2025-01-31';

Locate the first key ('2025‑01‑01') using the standard lookup.
Walk the linked leaf nodes sequentially until the end of the range is reached.

Consequences:

B‑Trees natively support ORDER BY.
They support prefix scans (e.g., LIKE 'abc%').
Hash indexes cannot do either of these.

Composite Indexes

When you create an index on multiple columns, such as (user_id, created_at), the B‑Tree sorts first by user_id and then by created_at. This ordering allows queries that filter on the leading column(s) to benefit from the index, and it also enables efficient range scans on the trailing column(s) when the leading column(s) are fixed.

B‑Tree Indexes: When They Shine and When They Don’t

How B‑Trees Work

Structure – A balanced tree where each node holds a range of key values and pointers to child pages.
Search – Starts at the root, follows the appropriate child pointer, and repeats until the leaf page containing the target key is reached.
Complexity – O(log N) page reads, where N is the number of rows. Because pages are sized for the underlying storage (e.g., 8 KB), the tree height stays low even for very large tables.

Typical Use‑Cases

Scenario	Why a B‑Tree Helps
Primary key look‑ups (`WHERE id = …`)	Direct equality match on the leading column.
Range scans (`WHERE created_at BETWEEN …`)	Tree is ordered, so it can walk sequentially through matching leaves.
ORDER BY / GROUP BY on indexed columns	Data is already sorted, eliminating an extra sort step.
Prefix searches (`WHERE name LIKE 'A%'`)	The prefix defines a contiguous range in the tree.

Composite (Multi‑Column) Indexes

A composite index stores the columns in the order they are defined.
Example: CREATE INDEX ix_user_created ON events (user_id, created_at);

Efficient for
- WHERE user_id = ?
- WHERE user_id = ? AND created_at > ? (the leading column is used, then the second column narrows the range)
Inefficient for
- WHERE created_at = ? alone – the index can’t be used because created_at isn’t the leading column.

Analogy: Think of a phone book sorted by (Last Name, First Name). It’s trivial to locate all “Smiths,” but you can’t jump straight to every “John” without scanning the whole book.

Common Misconceptions

Myth	Reality
“Indexes speed up all queries.”	They accelerate reads but add overhead to every `INSERT`, `UPDATE`, and `DELETE`.
“More indexes are always better.”	Each index consumes disk space and increases write amplification.
“B‑Trees are only for equality.”	Their real strength is handling ranges and producing sorted output.

When a B‑Tree Is the Wrong Choice

Avoid B‑Tree indexes in the following situations:

Low cardinality – e.g., is_active = true for 90 % of rows. The optimizer will often prefer a full table scan because reading the index plus the table is costlier.
High‑frequency equality‑only look‑ups on unique keys – a hash index can be faster (though less flexible).
Append‑only / time‑series data – consider BRIN (Block Range Indexes) when rows are naturally ordered by time and the table is massive.
Full‑text search – use GIN or other inverted indexes designed for searching words inside large text fields.

Practical Takeaways

B‑Trees are page‑optimized, aiming to minimize disk I/O rather than CPU cycles.
Tree height matters more than total row count; a shallow tree yields fast look‑ups.
The order of columns in a composite index is critical; the leading column must appear in the query’s WHERE clause for the index to be useful.
Always weigh your read‑vs‑write ratio before adding new indexes—more indexes help reads but hurt writes.