Internal Datatypes and Record Format in SQLite

Published: (February 27, 2026 at 01:20 PM EST)
5 min read
Source: Dev.to

Source: Dev.to

git‑lrc: A Git Hook for Checking AI‑Generated Code

Hello, I’m Maneshwar. I’m working on git‑lrc, a Git hook that reviews code produced by AI agents before it lands in your repository.

Inside the SQLite Virtual Machine (VM)

So far, we’ve seen how the VM executes instructions, manages cursors, and moves through bytecode programs.
Today we zoom into something even more fundamental:

What exactly is a value inside the VM?

Before a value is written to disk, compared in a WHERE clause, or returned through sqlite3_step(), it exists inside the VM as a memory object. Understanding that representation is key to understanding SQLite’s flexibility.

  • The VM uses an arbitrary number of numbered memory locations (registers).
  • Each register holds exactly one value at a time.

Every value processed by the VM must belong to one of five primitive storage types:

TypeDescription
INTEGERSigned integer
REALSigned floating‑point number (IEEE)
TEXTCharacter string (with encoding meta)
BLOBRaw byte image
NULLSQL NULL

There are no other primitive types inside SQLite’s execution engine. Everything—whether it originated from disk, an expression, or a bound parameter—becomes one of these five.

Physical Representation

Storage classPhysical representation
INTEGERStored in integer form
REALStored in IEEE floating‑point format
TEXTStored with encoding metadata
BLOBRaw bytes
NULLSpecial marker

Cached Alternate Representations

Some values can have multiple representations cached simultaneously.
Example: the value 123 may exist as:

  • INTEGER 123
  • REAL 123.0
  • TEXT "123"

The VM keeps these representations synchronized when necessary, avoiding repeated conversions during expression evaluation.

BLOB and NULL cannot have alternate representations; they remain as they are.

If a conversion is needed (e.g., comparing TEXT with INTEGER), the VM performs implicit type conversion on demand. SQLite’s dynamic typing model relies heavily on this flexibility.

Inspecting Storage Types

SELECT a, typeof(a) FROM t1;

At the C‑API level, sqlite3_column_type() reveals the storage class of a value returned by sqlite3_step().

The Mem Object

Internally, almost every value inside the VM is represented as a Mem object. Each element of the Vdbe.aMem array is one such structure.

Conceptual layout of a Mem object

  • The actual value
  • Flags describing its current storage type
  • Metadata about encoding
  • Cached alternate representations

Invariant: A value always has exactly one canonical storage class, even if multiple representations are cached. The storage‑class flag indicates which representation is canonical.

This distinction between storage class and cached representation enables:

  • Manifest typing
  • On‑the‑fly type coercion
  • Efficient comparison semantics

—all without rigid schema enforcement at runtime.

Records vs. Registers

Individual values in registers are not what get stored directly in B‑trees. The VM first composes them into records—logically contiguous byte strings that contain:

  • A key
  • An optional payload (value portion)

Although the tree module may physically split a record across pages or overflow areas, the VM sees it as one continuous byte sequence. The VM alone is responsible for formatting records; the tree module merely stores and retrieves byte strings.

Two conceptual ways to format records

ApproachCharacteristics
Fixed‑lengthEach column occupies a predetermined amount of space; record size is constant; padding may be required; column sizes are known at table creation time.
Variable‑lengthColumn sizes vary per record; no padding; record size depends on actual data; offsets must be computed dynamically.

SQLite uses a variant of variable‑length formatting because it provides:

  • Smaller database files (no wasted padding)
  • Faster I/O (fewer bytes moved)
  • Support for dynamic, manifest typing

This is crucial: SQLite allows an INTEGER column to store TEXT, a TEXT column to store numbers, or any column to store NULL. Variable‑length formatting makes this possible without structural constraints.

Manifest Typing

Traditional databases enforce type at the schema level. SQLite enforces type at the value level—manifest typing. Each value carries its own storage class, independent of the column declaration. That’s why the Mem object stores type flags with the value itself.

Examples

INSERT INTO t1(c1) VALUES (123);   -- stores an INTEGER
INSERT INTO t1(c1) VALUES ('abc'); -- stores a TEXT

Both rows can coexist even if c1 was declared as TEXT.

In the next sections we will explore how this manifest‑typing model interacts with:

  • Record headers
  • Serial types
  • On‑disk encoding

Understanding SQLite’s byte‑level record encoding will make the design of the VM, the Mem object, and the B‑tree layer click into place.

About git‑lrc

AI agents write code fast, but they can silently:

  • Remove logic
  • Change behavior
  • Introduce bugs

Often you discover these issues only in production.

git‑lrc fixes this. It hooks into git commit and reviews every diff before it lands.

  • Free – unlimited AI code reviews that run on each commit
  • Easy – 60‑second setup
  • Open source – source‑available, ready for anyone to use

See It In Action

Detect serious security issues such as leaked credentials, expensive cloud operations, and sensitive material in log statements.

Watch demo video (git‑lrc‑intro‑60s.mp4)

Why Use git‑lrc?

🤖AI agents silently break things – code removed, logic changed, edge cases gone. You won’t notice until production.
🔍Catch it before it ships.

Any feedback or contributions are welcome!

Repository: git‑lrc – free, unlimited AI code reviews that run on commit.

fore it ships. AI-powered inline comments show you exactly what changed and what looks wrong.

🔁 **Build a habit, ship better code.**  
Regular review → fewer bugs → more robust code → better results in your team.

🔗 **Why git?**  
Git is universal. Every editor, every IDE, every AI…

[View on GitHub](#)
0 views
Back to Blog

Related posts

Read more »

SQL Joins & Window Functions

markdown !Musungu Ruth Ambogohttps://media2.dev.to/dynamic/image/width=50,height=50,fit=cover,gravity=auto,format=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws...

SQL JOINS AND WINDOW FUNCTIONS

SQL Joins and Window Functions !tonny otienohttps://media2.dev.to/dynamic/image/width=50,height=50,fit=cover,gravity=auto,format=auto/https%3A%2F%2Fdev-to-uplo...