Inside SQLite Backend: Virtual Machine, Storage, and the Build Process

Published: January 11, 2026 at 07:00 AM EST
Source: Dev.to

Virtual Machine (VDBE)

Once the frontend finishes compilation, it hands a bytecode program to the Virtual Database Engine (VDBE), SQLite's virtual machine.

A bytecode program has a simple shape:

  • It is a linear sequence of instructions
  • Each instruction has an opcode and up to five operands (P1 through P5)
  • Instructions execute one at a time, in order, unless an instruction jumps to another address

The VM behaves like a custom CPU, designed specifically for database operations such as scanning tables, comparing values, managing cursors, and enforcing transactional semantics.
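To see such a program, you can prefix a statement with EXPLAIN, which makes SQLite return the bytecode instead of running the query. The snippet below is a minimal sketch using Python's standard sqlite3 module; the `explain` helper and the `users` table are made up for illustration, and the exact opcodes you get depend on your SQLite version.

```python
import sqlite3

def explain(sql, setup=()):
    """Return the VDBE program for `sql` as a list of row tuples.

    `explain` is a hypothetical helper, not part of SQLite or Python.
    """
    conn = sqlite3.connect(":memory:")
    try:
        for stmt in setup:
            conn.execute(stmt)
        # EXPLAIN yields one row per instruction:
        # (addr, opcode, p1, p2, p3, p4, p5, comment)
        return conn.execute("EXPLAIN " + sql).fetchall()
    finally:
        conn.close()

program = explain(
    "SELECT name FROM users WHERE id = 1",
    setup=["CREATE TABLE users(id INTEGER PRIMARY KEY, name TEXT)"],
)

for addr, opcode, p1, p2, p3, p4, p5, comment in program:
    print(f"{addr:>3}  {opcode:<12} {p1:>3} {p2:>3} {p3:>3}  {p4 or ''}  {p5}")
```

Each printed row is one instruction: an address, an opcode, and its operands. The listing always ends up reaching a Halt instruction, which stops the program.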

Tree Module (B‑tree Storage)

SQLite stores data using tree structures:

  • Tables → B+ trees
  • Indexes → B‑trees

Each table and index has its own independent tree structure. The implementation resides in:

  • btree.c – tree logic
  • btree.h – public interface

The tree module supports searching, insertion, deletion, updates, and structural changes (e.g., creating or dropping tables and indexes).
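One way to observe the "one tree per table and index" rule is to read the rootpage column of sqlite_master, which records the page where each tree starts. This is a small sketch with Python's standard sqlite3 module; the `root_pages` helper and the table and index names are assumptions for the example, and the actual page numbers may differ on your system.

```python
import sqlite3

def root_pages():
    """Return (type, name, rootpage) for every object in a demo schema."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE users(id INTEGER PRIMARY KEY, name TEXT)")
        conn.execute("CREATE INDEX users_by_name ON users(name)")
        # Each table and index is listed with the page number of its tree root.
        return conn.execute(
            "SELECT type, name, rootpage FROM sqlite_master ORDER BY rootpage"
        ).fetchall()
    finally:
        conn.close()

for kind, name, rootpage in root_pages():
    print(f"{kind:<6} {name:<14} root page {rootpage}")
```

The table and its index report different root pages, which is what you would expect if each one owns a separate tree.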

Pager

The pager is a critical component that mediates all file I/O. The tree module never accesses the database file directly; instead, it works with fixed‑size pages requested from the pager.
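The fixed-size page model is visible from the outside: the database file is a whole number of pages. The sketch below, using Python's standard sqlite3 module, compares the file size with page_size × page_count. The `page_layout` helper, the file name, and the row contents are assumptions for the example; the page size itself depends on how your SQLite was built.

```python
import os
import sqlite3
import tempfile

def page_layout(rows=1000):
    """Create a throwaway database and return (page_size, page_count, file_size)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")
        conn = sqlite3.connect(path)
        try:
            conn.execute("CREATE TABLE t(x TEXT)")
            conn.executemany(
                "INSERT INTO t VALUES (?)", [("x" * 100,) for _ in range(rows)]
            )
            conn.commit()
            page_size = conn.execute("PRAGMA page_size").fetchone()[0]
            page_count = conn.execute("PRAGMA page_count").fetchone()[0]
        finally:
            conn.close()
        # Measure the file after closing so everything has been written out.
        file_size = os.path.getsize(path)
    return page_size, page_count, file_size

page_size, page_count, file_size = page_layout()
print(page_size, page_count, file_size, page_size * page_count == file_size)
```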

Key responsibilities of the pager:

  • Reads and writes database pages
  • Maintains an in‑memory page cache
  • Handles file locking
  • Manages rollback journals
  • Enforces transaction boundaries

In effect, the pager acts as a data manager, lock manager, log manager, and transaction manager. Its source files are:

  • pager.c
  • pager.h

The pager enables SQLite to deliver ACID guarantees using a single database file.
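Two of these responsibilities, locking and rollback, are easy to observe from application code. The following is a rough sketch with Python's standard sqlite3 module, assuming the default rollback-journal mode; the `locking_and_rollback` helper and the connection names are made up for the example. One connection starts a write transaction, a second connection is refused the write lock, and the first connection's uncommitted insert disappears after ROLLBACK.

```python
import os
import sqlite3
import tempfile

def locking_and_rollback():
    """Return (second_writer_blocked, row_count_after_rollback)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")
        # isolation_level=None: we issue BEGIN/ROLLBACK ourselves.
        # timeout=0: fail immediately instead of waiting for a lock.
        writer = sqlite3.connect(path, isolation_level=None, timeout=0)
        other = sqlite3.connect(path, isolation_level=None, timeout=0)
        try:
            writer.execute("CREATE TABLE t(x)")
            writer.execute("INSERT INTO t VALUES (1)")

            # The first connection takes the write lock and changes a page.
            writer.execute("BEGIN IMMEDIATE")
            writer.execute("INSERT INTO t VALUES (2)")

            # A second writer cannot get the lock while the first holds it.
            try:
                other.execute("BEGIN IMMEDIATE")
                other.execute("ROLLBACK")
                blocked = False
            except sqlite3.OperationalError:
                blocked = True

            # Undo the uncommitted insert; only the first row should remain.
            writer.execute("ROLLBACK")
            count = other.execute("SELECT count(*) FROM t").fetchone()[0]
        finally:
            other.close()
            writer.close()
    return blocked, count

blocked, count = locking_and_rollback()
print("second writer blocked:", blocked, "| rows after rollback:", count)
```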

lovestaco@i3nux-mint:~/pers/sqlite$ ll /home/lovestaco/pers/sqlite/bld/sqlite3
-rwxrwxr-x 1 lovestaco lovestaco 6.9M Jan 11 17:14 /home/lovestaco/pers/sqlite/bld/sqlite3

Build Process

SQLite’s build process reflects its philosophy of self‑containment and reproducibility. It consists of six major steps:

  1. Generate sqlite3.h
  2. Build the SQL parser
  3. Generate VM opcodes
  4. Generate opcode names
  5. Generate SQL keyword tables
  6. Compile the library

During the build:

  • lemon.c generates parse.c and parse.h
  • mkkeywordhash.c generates keywordhash.h
  • awk and sed generate sqlite3.h, opcodes.h, and opcodes.c (newer source trees use Tcl scripts for this step)

opcodes.h assigns numeric values to VM instructions, while opcodes.c maps opcodes to human‑readable names useful for debugging and diagnostics.
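To make the relationship between the two generated files concrete, here is a toy generator in Python. It is only an illustration of the idea, not SQLite's actual script: the `generate_opcode_tables` function is hypothetical, and the numbers it assigns are made up and do not match the values in any real opcodes.h. The opcode names are real VDBE opcodes.

```python
def generate_opcode_tables(names):
    """Return (header_text, name_table) for a list of opcode names.

    header_text mimics opcodes.h: one "#define OP_<name> <number>" per opcode.
    name_table mimics opcodes.c: index it by number to get the name back.
    The numbering is illustrative only, not SQLite's real numbering.
    """
    numbered = list(enumerate(names))
    header_text = "\n".join(
        f"#define OP_{name:<12} {number}" for number, name in numbered
    )
    name_table = [name for _, name in numbered]
    return header_text, name_table

header_text, name_table = generate_opcode_tables(
    ["Init", "OpenRead", "Rewind", "Column", "ResultRow", "Next", "Halt"]
)
print(header_text)
print(name_table[2])  # look up the name for opcode number 2
```

The VM only needs the numbers to dispatch instructions; the name table exists so that tools such as EXPLAIN can print something readable.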

Modern releases provide a single amalgamation file, sqlite3.c, along with sqlite3.h. Advantages of using the amalgamation include:

  • Roughly 5–10% better performance
  • More aggressive compiler optimizations, since all the code sits in one translation unit
  • Simplified build process
  • Easier embedding into applications

The command‑line utility additionally requires shell.c.

Summary

  • SQL is compiled into bytecode and executed by a purpose‑built VM.
  • SQLite ensures serializable execution using database‑level locking.
  • Journaling guarantees atomicity and recovery.
  • Each database lives in a single native file whose schema is recorded in the sqlite_master table.

The architecture is modular and cleanly layered, and the source code is in the public domain.

Looking Ahead

The next chapter will dive deeper into database and journal file storage structures, revealing how SQLite’s on‑disk layout materializes these abstractions.

Reference: SQLite Database System: Design and Implementation. Sibsankar Haldar (n.d.).
