IVFFlat Indexing in pgvector
Source: Dev.to
What is IVFFlat in pgvector?
IVFFlat (Inverted File with Flat Vectors) is an approximate nearest neighbor (ANN) index. Instead of a brute‑force scan that compares a query vector against every vector in the table, IVFFlat partitions vectors into multiple “lists” (or clusters). During a query, only the lists most relevant to the query vector are searched.
Key Benefits
- Faster similarity search on large datasets
- Approximate results, but accuracy is tunable
- Well‑suited for high‑dimensional embeddings
How IVFFlat Works
IVFFlat uses a centroid‑based clustering approach.
1. Training Step
- Vectors are clustered into lists using k‑means.
- Each list is represented by its centroid (the mean of the vectors in that cluster).
2. Index Structure
- Each vector is assigned to the closest centroid/list.
- The index stores lists of vectors (inverted lists).
3. Query Execution
- The query vector is compared to all centroids.
- The lists whose centroids are closest to the query are selected; the probes setting decides how many.
- Only vectors within those lists are compared.
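The three steps above can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not pgvector's implementation: it uses Euclidean distance on 2‑dimensional random points, a basic k‑means loop, and hypothetical helper names (train_centroids, build_index, search) chosen only for this example.

```python
import math
import random


def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def train_centroids(vectors, lists, iterations=10, seed=0):
    """Training step: plain k-means that returns `lists` centroids."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, lists)]
    for _ in range(iterations):
        buckets = [[] for _ in range(lists)]
        for v in vectors:
            nearest = min(range(lists), key=lambda c: l2(v, centroids[c]))
            buckets[nearest].append(v)
        for c, members in enumerate(buckets):
            if members:  # keep the old centroid if a cluster went empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def build_index(vectors, centroids):
    """Index structure: one inverted list of vector ids per centroid."""
    inverted = [[] for _ in centroids]
    for vid, v in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda c: l2(v, centroids[c]))
        inverted[nearest].append(vid)
    return inverted


def search(query, vectors, centroids, inverted, probes, k):
    """Query execution: scan only the `probes` lists closest to the query."""
    order = sorted(range(len(centroids)), key=lambda c: l2(query, centroids[c]))
    candidates = [vid for c in order[:probes] for vid in inverted[c]]
    # Ties are broken by id so the result order is deterministic.
    return sorted(candidates, key=lambda vid: (l2(query, vectors[vid]), vid))[:k]


rng = random.Random(42)
data = [[rng.random(), rng.random()] for _ in range(200)]
centroids = train_centroids(data, lists=8)
inverted = build_index(data, centroids)
query = [0.5, 0.5]

# Brute-force baseline: compare the query against every vector.
exact = sorted(range(len(data)), key=lambda vid: (l2(query, data[vid]), vid))[:5]
approx = search(query, data, centroids, inverted, probes=2, k=5)
full = search(query, data, centroids, inverted, probes=8, k=5)

print("exact   :", exact)
print("probes=2:", approx)
print("probes=8:", full)
```

With probes equal to the number of lists, every vector becomes a candidate, so the result matches the brute‑force scan. With fewer probes, a true neighbor that sits in an unscanned list can be missed, which is where the approximation comes from.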
What do we control?
- lists – number of clusters
- probes – number of clusters searched during a query
Increasing the number of probes improves accuracy but reduces speed.
Implementing IVFFlat Indexing in pgvector
1. Enable the pgvector extension (the pgvector package must already be installed on the server)
CREATE EXTENSION IF NOT EXISTS vector;
2. Create a table with vector embeddings
CREATE TABLE documents (
    id bigserial PRIMARY KEY,
    embedding vector(768)
);
3. Create IVFFlat Index
CREATE INDEX vector_ivfflat_idx
ON documents
USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 1000);
Note:
- Create the index after the table already holds a representative amount of data: the k‑means centroids are computed from the rows present at build time.
- Aim for at least 1 000 rows per list.
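To turn that sizing advice into a number, here is a small sketch. It assumes the starting point suggested in the pgvector README (rows / 1000 for up to 1M rows, sqrt(rows) above that); the function name suggest_lists is hypothetical, and the result is only a first guess to refine by measuring recall and latency on your own data.

```python
import math


def suggest_lists(row_count):
    """Starting value for `lists` (rule of thumb, not a guarantee)."""
    if row_count <= 1_000_000:
        # About 1,000 rows per list for small and medium tables.
        return max(1, row_count // 1000)
    # Above 1M rows, grow the number of lists with the square root of the row count.
    return max(1, round(math.sqrt(row_count)))


print(suggest_lists(1_000_000))  # 1000
print(suggest_lists(4_000_000))  # 2000
```

By this rule, the lists = 1000 used in the example above corresponds to a table of roughly one million rows.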
4. Querying with IVFFlat
Example cosine similarity search:
SET ivfflat.probes = 20;
SELECT id
FROM documents
ORDER BY embedding <=> '[0.5, 0.3, …]'
LIMIT 10;
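Set globally if desired (the change is written to postgresql.auto.conf and takes effect after a configuration reload):
ALTER SYSTEM SET ivfflat.probes = 20;
SELECT pg_reload_conf();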
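An index built with vector_cosine_ops serves queries that order by cosine distance, which pgvector exposes as the <=> operator. Cosine distance is one minus cosine similarity, so smaller values mean more similar vectors and an ascending ORDER BY returns the closest rows first. A minimal Python sketch of that formula (the function name is hypothetical, and this is the math only, not pgvector's code):

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity: 0 means same direction, 2 means opposite."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


print(cosine_distance([1.0, 0.0], [1.0, 0.0]))   # 0.0
print(cosine_distance([1.0, 0.0], [0.0, 1.0]))   # 1.0
print(cosine_distance([1.0, 0.0], [-1.0, 0.0]))  # 2.0
```

Because only the angle between the vectors matters, scaling an embedding by a positive constant does not change its cosine distance to the query.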
Tuning probes in IVFFlat
- Probes control how many IVF lists are scanned during a query.
- Lower probes → faster but less accurate (fewer clusters searched).
- Higher probes → better accuracy at the cost of speed.
- Choose the optimal value based on your priority between speed and recall.
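To put a number on accuracy, a common approach is recall@k: run the same query once as an exact search (for example with index scans disabled for that one query) and once through the IVFFlat index, then check how many of the exact top‑k ids the index returned. A minimal sketch, with a hypothetical function name and made‑up ids used purely for illustration:

```python
def recall_at_k(exact_ids, approx_ids):
    """Fraction of the exact top-k ids that the approximate search also returned."""
    exact = set(exact_ids)
    if not exact:
        raise ValueError("exact_ids must not be empty")
    return len(exact & set(approx_ids)) / len(exact)


# Illustrative ids only: the index missed one of the five true neighbors.
exact_top5 = [11, 42, 7, 3, 19]
ivfflat_top5 = [11, 42, 7, 88, 19]
print(recall_at_k(exact_top5, ivfflat_top5))  # 0.8
```

Averaging this value over a sample of real queries at several probes settings shows where extra probes stop buying meaningful recall.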
Recommended Ranges
Low probes (1–10)
- ✅ Fastest search performance
- ✅ Ideal for real‑time or high‑throughput workloads
- ❌ Lower accuracy and recall
- ❌ May miss similar vectors if clusters are coarse
Medium probes (~10 % of total lists)
- ✅ Balanced speed and accuracy
- ✅ Suitable for most production workloads
- ✅ Good recall without major performance sacrifice
- ❌ Slightly slower than low‑probe settings
High probes (50–100 % of lists)
- ✅ Near‑exact search results (high recall)
- ✅ Good for quality‑sensitive workloads (e.g., search relevance)
- ❌ Much slower due to scanning many lists
- ❌ Reduces the performance benefit of ANN indexing
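The ranges above depend on the number of lists, so it can help to compute them for a concrete index. The sketch below translates the three tiers into numbers; it also includes sqrt(lists), which the pgvector README suggests as a starting point for probes. The function name and dictionary keys are hypothetical.

```python
import math


def probe_ranges(lists):
    """Translate the low / medium / high tiers into probe counts for `lists`."""
    return {
        "low": (1, min(10, lists)),          # 1-10 probes
        "medium": max(1, round(lists / 10)),  # about 10% of the lists
        "high": ((lists + 1) // 2, lists),    # 50-100% of the lists
        "sqrt_start": max(1, round(math.sqrt(lists))),  # README starting point
    }


print(probe_ranges(1000))
# {'low': (1, 10), 'medium': 100, 'high': (500, 1000), 'sqrt_start': 32}
```

For the lists = 1000 index built earlier, the ivfflat.probes = 20 used in the query example sits between the low tier and the sqrt(lists) starting point.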
Maintenance Tasks: REINDEX, ANALYZE, VACUUM
IVFFlat indexes need regular maintenance to keep search performance stable.
1. ANALYZE – Improve Query Planning
Run ANALYZE after large batches of insertions, or rely on the automatic ANALYZE that autovacuum performs.
ANALYZE documents;
-- Check the last ANALYZE time for your table
SELECT relname, last_analyze, last_autoanalyze
FROM pg_stat_all_tables
WHERE relname = 'documents';
2. REINDEX – Required After Massive Data Changes
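Centroids are computed once, when the index is built. If many vectors are inserted or deleted afterwards, those centroids no longer match the data distribution, and recall and performance can degrade.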
REINDEX INDEX vector_ivfflat_idx;
-- Keep the table available for reads and writes during the rebuild (PostgreSQL 12+)
REINDEX INDEX CONCURRENTLY vector_ivfflat_idx;
When to REINDEX
- After inserting millions of new rows
- After deleting a large portion of data
- If recall noticeably decreases
3. VACUUM – Keep Storage Clean
Regular VACUUM helps maintain table and index health.
VACUUM (VERBOSE, ANALYZE) documents;
Keep autovacuum enabled (it is on by default) for continuous maintenance.
Conclusion
IVFFlat is a powerful ANN indexing method available in pgvector, offering a balance of performance, memory efficiency, and simplicity. With proper configuration and maintenance, IVFFlat can deliver high‑performance vector search right inside PostgreSQL—no external database required.