Building a Sub-Millisecond Vector Database in Rust/WASM
Source: Dev.to
Overview
EdgeVec is a high‑performance vector database that runs entirely in the browser using WebAssembly. It achieves sub‑millisecond search times, which makes client‑side semantic search, retrieval‑augmented generation (RAG), and recommendations feasible without a server.
Performance
| Vectors | Float32 search latency | Quantized (SQ8) search latency |
|---|---|---|
| 10 k | 203 µs | 88 µs |
| 50 k | 480 µs | 167 µs |
| 100 k | 572 µs | 329 µs |
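Every configuration stays below 1 ms per query, and SQ8 is roughly 1.7× to 2.9× faster than Float32. That puts query latency in the same range as server‑side solutions while running completely client‑side.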
Implementation Details
Algorithm
EdgeVec uses Hierarchical Navigable Small World (HNSW) graphs, the same algorithm employed by production vector databases such as Weaviate and Qdrant.
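To make the idea concrete, here is a minimal sketch of the greedy walk that HNSW performs on a single graph layer. It is not EdgeVec's code: the `Layer` struct, `dist2`, and `greedy_search` are hypothetical names chosen for illustration. A full HNSW index stacks several such layers and runs a wider, candidate‑list search on the bottom layer, which this sketch leaves out.

```rust
/// One layer of a proximity graph (illustrative layout, not EdgeVec's).
/// `neighbors[i]` lists the nodes linked to node `i`.
pub struct Layer {
    pub vectors: Vec<Vec<f32>>,
    pub neighbors: Vec<Vec<usize>>,
}

/// Squared Euclidean distance between two equal-length vectors.
fn dist2(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

impl Layer {
    /// Greedy walk: starting at `entry`, repeatedly move to the neighbor
    /// closest to `query`, and stop once no neighbor is closer than the
    /// current node. Returns the final node id and its squared distance.
    pub fn greedy_search(&self, query: &[f32], entry: usize) -> (usize, f32) {
        let mut current = entry;
        let mut best = dist2(&self.vectors[current], query);
        loop {
            let mut next = current;
            for &n in &self.neighbors[current] {
                let d = dist2(&self.vectors[n], query);
                if d < best {
                    best = d;
                    next = n;
                }
            }
            if next == current {
                // Local minimum: no neighbor improves on the current node.
                return (current, best);
            }
            current = next;
        }
    }
}
```

Because `best` strictly decreases on every move, the walk always terminates. In a layered index, the node returned here would typically serve as the entry point for the next, denser layer.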
Quantization
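Instead of storing 32‑bit floats (768 dimensions × 4 bytes = 3,072 bytes, about 3 KB per vector), EdgeVec compresses each component to an 8‑bit integer (SQ8 scalar quantization). This yields a 3.6× memory reduction, a little under the 4× that the element sizes alone would give (presumably because of quantization metadata and index overhead), with minimal impact on accuracy.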
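A minimal sketch of 8‑bit scalar quantization follows, assuming a simple per‑vector min/max scheme. The article does not describe EdgeVec's actual quantizer, so `Sq8Params`, `quantize_sq8`, and `dequantize_sq8` are hypothetical names, and the real implementation may choose ranges differently.

```rust
/// Parameters needed to map u8 codes back to f32 values
/// (hypothetical layout, not taken from EdgeVec).
#[derive(Debug, Clone, Copy)]
pub struct Sq8Params {
    pub min: f32,   // value represented by code 0
    pub scale: f32, // width of one quantization step
}

/// Quantize a vector to u8 codes using its own min/max range.
pub fn quantize_sq8(v: &[f32]) -> (Vec<u8>, Sq8Params) {
    let min = v.iter().copied().fold(f32::INFINITY, f32::min);
    let max = v.iter().copied().fold(f32::NEG_INFINITY, f32::max);
    // Guard against constant (or empty) input, where max is not above min.
    let scale = if max > min { (max - min) / 255.0 } else { 1.0 };
    let codes = v
        .iter()
        .map(|&x| ((x - min) / scale).round().clamp(0.0, 255.0) as u8)
        .collect();
    (codes, Sq8Params { min, scale })
}

/// Reconstruct approximate f32 values from the codes.
pub fn dequantize_sq8(codes: &[u8], p: Sq8Params) -> Vec<f32> {
    codes.iter().map(|&c| p.min + c as f32 * p.scale).collect()
}
```

With this layout, a 768‑dimensional vector takes 768 bytes of codes plus 8 bytes of parameters, versus 3,072 bytes as `f32`, and each reconstructed component is off by at most half a quantization step.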
SIMD Optimization
- Native (x86_64): AVX2 instructions
- WebAssembly: simd128 where available
Rust’s portable SIMD is used to vectorize distance calculations, delivering the sub‑millisecond query times shown above.
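The sketch below illustrates the kind of distance kernel that benefits from vectorization. It is written in stable Rust rather than with the portable SIMD API, so it is a stand‑in and not EdgeVec's kernel: the function name `l2_squared` and the lane count of 8 are assumptions. Keeping eight independent accumulators gives the compiler a chance to map the inner loop onto SIMD lanes when the target feature is enabled (for example `-C target-feature=+simd128` on wasm32, or `+avx2` on x86_64).

```rust
/// Squared Euclidean distance using 8 independent accumulators,
/// a shape that compilers can often auto-vectorize.
pub fn l2_squared(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have equal length");
    const LANES: usize = 8;
    let mut acc = [0.0f32; LANES];

    let mut a_chunks = a.chunks_exact(LANES);
    let mut b_chunks = b.chunks_exact(LANES);
    for (ca, cb) in a_chunks.by_ref().zip(b_chunks.by_ref()) {
        for i in 0..LANES {
            let d = ca[i] - cb[i];
            acc[i] += d * d;
        }
    }

    // Scalar pass over the elements that did not fill a whole chunk.
    let tail: f32 = a_chunks
        .remainder()
        .iter()
        .zip(b_chunks.remainder())
        .map(|(x, y)| (x - y) * (x - y))
        .sum();

    acc.iter().sum::<f32>() + tail
}
```

Whether the loop is actually vectorized depends on the compiler version and flags, so inspecting the generated code or benchmarking is the only way to confirm it.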
Build Size
Compiled with wasm-pack, the final bundle is only 148 KB gzipped, small enough for any web application.
Where Client‑Side Vector Search Makes Sense
- Privacy – embeddings never leave the device.
- Latency – zero network round‑trip.
- Offline – works without an internet connection.
- Cost – eliminates server expenses.
These properties make EdgeVec a good fit for browser extensions, local‑first apps, and privacy‑preserving RAG pipelines.
Usage Example
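import init, { EdgeVec, EdgeVecConfig } from 'edgevec';
// Load and instantiate the WASM module before using the API
await init();
// Create an index for 768-dimensional vectors
const config = new EdgeVecConfig(768);
const index = new EdgeVec(config);
// Insert a vector
index.insert(new Float32Array(768).fill(0.1));
// Perform a search with a placeholder query of the same dimensionality
const query = new Float32Array(768).fill(0.1);
const results = index.search(query, 10);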
// results: [{ id: 0, score: 0.0 }, ...]
Resources
- GitHub:
- npm:
npm install edgevec
Alpha release – feedback welcome!