Clustered vs Non-Clustered Index in SQL(Complete Guide with Examples)
Source: Dev.to
Introduction: Why Do We Need Indexes?
Speed and efficiency are key when querying a database. An index is a data structure that guides the database to quickly locate a particular data set, significantly speeding up queries.
Analogies
- The Book – Instead of scanning every page to find a chapter, you consult the index at the back of the book.
- The Hotel – Rather than knocking on every room, you look at a lobby map to locate room 5001.
How SQL Stores Data
When you create a table and insert data, SQL stores the data in files on disk, divided into fixed‑size 8‑KB blocks called pages.
A data page consists of:
- Page Header – Metadata for the page.
- Data Rows – The actual row data (size‑dependent).
- Offset Array – An internal map that tracks where each row starts on the page.
If no index is defined, the table is stored as a heap:
- Write logic – New rows are appended to the next available page in the order they are inserted (unsorted).
- Read logic (full table scan) – To find a specific record (e.g.,
CustomerID = 14), SQL must scan every page until it locates the row. - Trade‑off – Fast writes, but poor read performance.
Clustered Index
A clustered index determines the physical order of the data rows. When you create a clustered index on a column (e.g., User_ID), SQL physically sorts all existing pages by that column.
SQL uses a B‑Tree (balanced tree) to navigate the sorted data:
- Root node – Points to intermediate pages.
- Intermediate nodes – Direct to ranges of data (e.g., IDs 1‑10 left, 11‑20 right).
- Leaf nodes – In a clustered index, these are the actual data pages.
Characteristics
- Only one clustered index per table (data can be sorted in only one way).
- Ideal for primary keys: they are unique and rarely updated. Updating a clustered‑index column requires moving the row, which is resource‑intensive.
Non‑Clustered Index
A non‑clustered index improves read performance without altering the physical order of the underlying table.
When you create a non‑clustered index, SQL builds a separate B‑Tree structure:
- Leaf nodes – Contain the indexed column values plus a Row Identifier (RID), which points to the exact location of the row (file ID, page number, offset).
- Navigation – SQL traverses the non‑clustered B‑Tree to the leaf node, retrieves the RID, and then makes a single “jump” to the data page to fetch the row.
Characteristics
- You can have multiple non‑clustered indexes on a single table.
- Best for columns frequently used in
WHEREclauses (e.g.,Last_Name) or join conditions.
Creating Indexes in SQL Server
SQL Server provides straightforward syntax for creating indexes. By default, defining a primary key creates a clustered index, and the default index structure is a B‑Tree.
-- Clustered index
CREATE CLUSTERED INDEX idx_customers_id
ON sales.customers (customer_id);
-- Non‑clustered index
CREATE NONCLUSTERED INDEX idx_customers_lastname
ON sales.customers (last_name);
(If you omit NONCLUSTERED, SQL Server creates a non‑clustered index by default.)
For more details on choosing the best unique identifier for a table’s clustered index, see my related post: How to Choose the Fastest Unique Identifier for a Clustered Index.