DuckDB Full-Text Search vs PostgreSQL FTS vs Meilisearch: 100 Million Document Index — Build Time, Query Latency, Memory
Source: Dev.to
Test Setup: Real‑World Workload, Real Hardware
- Corpus: 100 million Reddit comments (~50 GB raw text, 14.8 GB compressed Parquet)
- Server: Hetzner AX‑52 (AMD Ryzen 7 7700, 64 GB RAM, 2 × 1 TB NVMe)
- Queries: Derived from production search logs, covering four classes: simple matches, multi‑word phrases, fuzzy matches, and boolean queries.
- Versions: DuckDB 1.1, PostgreSQL 17.4, Meilisearch 1.10 – each tuned for performance.
Key Finding 1: Index Build Time — DuckDB Surprises
- DuckDB: 38 minutes (2.4× faster than PostgreSQL)
- PostgreSQL: 91 minutes
- Meilisearch: 44 minutes
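The speedup ratio and the implied indexing throughput follow directly from these build times. A quick check in Python, using only the figures reported above (the function and variable names are illustrative):

```python
# Build times reported above, in minutes, for the 100-million-document corpus.
DOCS = 100_000_000
build_minutes = {"DuckDB": 38, "PostgreSQL": 91, "Meilisearch": 44}

def docs_per_second(minutes: float, docs: int = DOCS) -> float:
    """Average indexing throughput implied by a total build time."""
    return docs / (minutes * 60)

def speedup(baseline: str, candidate: str) -> float:
    """How many times faster `candidate` built its index than `baseline`."""
    return build_minutes[baseline] / build_minutes[candidate]

for engine, minutes in build_minutes.items():
    print(f"{engine}: {docs_per_second(minutes):,.0f} docs/s")

print(f"DuckDB vs PostgreSQL: {speedup('PostgreSQL', 'DuckDB'):.1f}x")
print(f"DuckDB vs Meilisearch: {speedup('Meilisearch', 'DuckDB'):.2f}x")
```

This works out to roughly 43,900 documents per second for DuckDB against roughly 18,300 for PostgreSQL, which is where the 2.4× figure comes from. The gap to Meilisearch is much narrower, at about 1.16×.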
DuckDB’s advantage stems from columnar I/O and pipelined tokenization: it reads only the indexed column (body) from Parquet, avoiding unnecessary data movement. PostgreSQL must first ingest rows into heap pages and only then build the GIN index, which roughly doubles the I/O.
Meilisearch, while fast, peaked at 29 GB of RAM during indexing, which can be prohibitive for smaller deployments. PostgreSQL excelled at incremental updates (14 seconds for 1 M new docs) thanks to its GIN index, whereas DuckDB’s columnar architecture made partial updates more expensive than a full rebuild.
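To make the incremental-update point concrete, here is a toy inverted index in Python. It sketches the general data structure that GIN-style indexes are built around, not the implementation of any of the three engines. The `tokenize` rule, the `InvertedIndex` class, and the sample documents are all assumptions made for illustration.

```python
import re
from collections import defaultdict

def tokenize(text: str) -> list[str]:
    """Lowercase and split on non-alphanumeric characters (a deliberately naive rule)."""
    return re.findall(r"[a-z0-9]+", text.lower())

class InvertedIndex:
    """Maps each term to the set of document ids that contain it."""

    def __init__(self) -> None:
        self.postings: dict[str, set[int]] = defaultdict(set)

    def add(self, doc_id: int, text: str) -> None:
        # Incremental update: only the posting lists of this document's terms change.
        for term in set(tokenize(text)):
            self.postings[term].add(doc_id)

    def search_and(self, query: str) -> set[int]:
        """Boolean AND: ids of documents containing every query term."""
        terms = tokenize(query)
        if not terms:
            return set()
        # Intersect the smallest posting lists first to keep intermediate sets small.
        lists = sorted((self.postings.get(t, set()) for t in terms), key=len)
        result = set(lists[0])
        for posting in lists[1:]:
            result &= posting
        return result

# Hypothetical sample documents.
index = InvertedIndex()
index.add(1, "DuckDB reads Parquet columns directly")
index.add(2, "PostgreSQL builds a GIN index over heap pages")
index.add(3, "DuckDB can also attach a PostgreSQL database")

print(sorted(index.search_and("duckdb postgresql")))  # [3]
```

Adding a document touches only the posting lists for the terms it contains, which is the intuition behind cheap incremental updates on an inverted index. Real engines add compression, on-disk layout, and write buffering on top of this idea.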
Key Finding 2: Query Latency — Specialization Matters
| Workload | Engine | P50 Latency | P99 Latency |
|---|---|---|---|
| Simple boolean AND | PostgreSQL GIN | 4 ms | — |
| Fuzzy / analytical (e.g., Levenshtein) | DuckDB | ~1 ms (≈4× faster than PostgreSQL) | — |
| Typo‑tolerant ranking | Meilisearch | — | 800 ms+ |
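P50 and P99 are the median and the 99th percentile of per-query latency. As a sketch of how such figures are commonly derived from raw timings, the snippet below applies the nearest-rank method to made-up data. The `percentile` helper and the sample values are illustrative and are not taken from the benchmark.

```python
import math

def percentile(samples: list[float], p: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least p% of values at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank into the sorted samples
    return ordered[rank - 1]

# Hypothetical latencies in milliseconds: 98 fast queries and 2 slow outliers.
latencies_ms = [4.0] * 98 + [250.0, 800.0]

print("P50:", percentile(latencies_ms, 50))  # 4.0
print("P99:", percentile(latencies_ms, 99))  # 250.0
```

In this made-up sample the median stays at 4 ms while the tail is dominated by the two outliers, which is why a table that reports P50 for one engine and P99 for another should be read with care.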
- PostgreSQL’s mature query planner shines on simple boolean queries.
- DuckDB dominates fuzzy and analytical queries by leveraging fast columnar scans and aggregations, which makes it well suited to “search‑as‑an‑analytics‑primitive” use cases.
- Meilisearch provides the best typo‑tolerant ranking but struggles at scale, likely due to its single‑shard design.
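The fuzzy workload above is based on edit distance. For readers unfamiliar with the metric, here is a minimal Python sketch of the standard dynamic-programming Levenshtein algorithm. It illustrates the metric only and makes no claim about how DuckDB or Meilisearch implement fuzzy matching internally. The `fuzzy_filter` helper and its `max_distance` parameter are hypothetical names.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, or substitutions turning a into b."""
    if len(a) < len(b):
        a, b = b, a  # iterate over the longer string so the row buffer stays short
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]

def fuzzy_filter(term: str, vocabulary: list[str], max_distance: int = 1) -> list[str]:
    """Keep the vocabulary words within max_distance edits of term (hypothetical helper)."""
    return [w for w in vocabulary if levenshtein(term, w) <= max_distance]

print(levenshtein("kitten", "sitting"))  # 3
print(fuzzy_filter("serach", ["search", "march", "seraph"], max_distance=2))
```

Because a plain Levenshtein comparison like this has to be evaluated against every candidate term, it behaves more like a scan than an index lookup. That is consistent with the article's observation that the engine with the fastest columnar scans leads on this class of query.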
Key Finding 3: Resource Efficiency — DuckDB’s Disk Advantage
- DuckDB index: ~3× smaller than Meilisearch’s, thanks to compressed columnar storage.
- PostgreSQL GIN index: Larger than DuckDB’s but more compact than Meilisearch’s.
For disk‑constrained environments, DuckDB’s smaller footprint can be decisive.
Verdict: No Universal Winner
| If you need | Recommended Engine |
|---|---|
| OLTP‑integrated search with fast boolean queries and incremental updates | PostgreSQL |
| Fast analytical queries, low disk usage, batch indexing | DuckDB |
| Strong typo tolerance and developer experience (smaller corpora or horizontal scaling) | Meilisearch |
Originally published at NovVista.