[Paper] When More Cores Hurts: The Vector Database Scaling Paradox in HPC

Published: June 7, 2026 at 10:51 PM EDT

Source: arXiv - 2606.08950v1

Overview

Vector databases have been designed and optimized for cloud environments; however, emerging scientific AI workloads (e.g., molecular search, meteorological trajectory detection, and literature-driven hypothesis generation) demand efficient, scalable execution on HPC systems. We present a large-scale evaluation of three state-of-the-art vector databases — Qdrant, Milvus, and Weaviate — on two production supercomputers, scaling to 256 distributed workers across 64 compute nodes. We evaluate representative workload patterns — mixed read/write and write-then-read — using popular benchmarks, multimodal embeddings, and a novel real-world scientific dataset. Our results reveal that workload characteristics can limit latency reduction, additional cores can reduce query throughput by up to 30.67%, and scaling from 16 to 256 workers (16x) only yields a 5.46x improvement. This scaling paradox exposes the fundamental mismatch between cloud-oriented designs and HPC systems, highlighting the need for new, HPC-aware vector database designs.
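The scaling figures above can be put in perspective with a little arithmetic. The sketch below is not taken from the paper: it assumes the common definition of scaling efficiency as observed speedup divided by the increase in workers, and the function name `scaling_efficiency` is introduced here for illustration only.

```python
def scaling_efficiency(speedup: float, scale_factor: float) -> float:
    """Return the fraction of ideal linear scaling that was achieved.

    Assumption: efficiency = observed speedup / increase in workers.
    This is a standard definition, not necessarily the paper's metric.
    """
    if speedup <= 0 or scale_factor <= 0:
        raise ValueError("speedup and scale_factor must be positive")
    return speedup / scale_factor


# Figures quoted in the overview: 16 -> 256 workers, 5.46x improvement.
workers_before = 16
workers_after = 256
observed_speedup = 5.46

scale_factor = workers_after / workers_before  # 16.0
efficiency = scaling_efficiency(observed_speedup, scale_factor)

print(f"scale factor: {scale_factor:.0f}x")
print(f"observed speedup: {observed_speedup}x")
print(f"scaling efficiency: {efficiency:.0%}")  # roughly one third of ideal
```

Under this definition, the reported 5.46x gain from 16x more workers corresponds to roughly a third of ideal linear scaling, which is the gap the overview calls the scaling paradox.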

Key Contributions

This paper presents research in the following arXiv categories:

cs.DC (Distributed, Parallel, and Cluster Computing)
cs.DB (Databases)

Methodology

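The study benchmarks Qdrant, Milvus, and Weaviate on two production supercomputers under mixed read/write and write-then-read workloads. Please refer to the full paper for the detailed methodology.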

Practical Implications

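The results indicate that adding cores or workers to a cloud-oriented vector database on an HPC system does not guarantee proportional gains and can even reduce query throughput. The authors argue that this calls for new, HPC-aware vector database designs.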

Authors

Seth Ockerman
Song Young Oh
Amal Gueroudji
Rochana Chaturvedi
Philip Carns
Nicholas Chia
Matthieu Dorier
Robert Latham
Tanwi Mallick
Swan Perarnau
Robert Underwood
Kyle Chard
Ian Foster
Robert Ross
Shivaram Venkataraman

Paper Information

arXiv ID: 2606.08950v1
Categories: cs.DC, cs.DB
Published: June 8, 2026
