[Paper] Neighborhood Stability as a Measure of Nearest Neighbor Searchability
Source: arXiv - 2602.16673v1
Overview
The paper introduces Neighborhood Stability as a lightweight, data‑only way to predict how well a clustering‑based Approximate Nearest Neighbor Search (ANNS) will work on a given high‑dimensional dataset. By measuring how “stable” the nearest‑neighbor relationships are inside and across clusters, the authors provide two practical metrics that let engineers decide early whether a clustering‑driven ANNS pipeline is likely to be accurate—without having to run expensive search experiments first.
Key Contributions
- Clustering‑Neighborhood Stability Measure (clustering‑NSM) – an internal quality metric that evaluates a given clustering by looking at how often true nearest‑neighbors stay within the same cluster.
- Point‑Neighborhood Stability Measure (point‑NSM) – a dataset‑level “clusterability” score that predicts the clustering‑NSM value before any clustering is performed.
- Theoretical link between point‑NSM and clustering‑NSM, showing that high point‑NSM implies high clustering‑NSM, and thus higher ANNS accuracy.
- Empirical validation on several real‑world high‑dimensional benchmarks (image embeddings, text vectors, recommendation data) confirming that both measures correlate strongly with recall@k of popular clustering‑based ANNS methods (e.g., IVF, ScaNN).
- Distance‑agnostic design – the metrics rely only on nearest‑neighbor relationships (rankings), not on raw Euclidean distances, making them usable with inner‑product, cosine similarity, or custom similarity functions.
Methodology
- Define neighbor sets – For each point x_i in the dataset, compute its top‑k true nearest neighbors under the exact metric.
- Clustering‑NSM
- Given a flat clustering (e.g., k‑means, hierarchical k‑means), count how many of the k true neighbors of each point fall into the same cluster as the point itself.
- Normalize this count across all points to obtain a stability score in [0, 1]; higher values mean the clustering respects true neighbor relationships.
- Point‑NSM
- Without any clustering, examine the mutual nearest‑neighbor graph: two points are linked if each appears in the other’s top‑k list.
- Compute the proportion of points whose mutual‑NN graph forms tightly connected, well‑separated components (i.e., low conductance: many internal links, few links crossing component boundaries). This reflects how naturally the data can be partitioned into stable neighborhoods.
- Predictive pipeline – Use point‑NSM to estimate the expected clustering‑NSM for a chosen clustering algorithm and hyper‑parameters (e.g., number of centroids).
- Evaluation – Run standard clustering‑based ANNS pipelines (IVF‑Flat, IVF‑PQ, ScaNN) on the same datasets, measure recall@k, and correlate it with the two stability scores.
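The pipeline above can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's implementation: the brute‑force neighbor search, the function names, and the mutual‑link fraction used as a stand‑in for the component‑based point‑NSM are all assumptions made here for clarity.

```python
import numpy as np

def top_k_neighbors(X, k, metric="euclidean"):
    """Exact top-k neighbor indices for each row of X (self excluded), brute force.

    Only the resulting *rankings* are used downstream, so any similarity
    that induces a ranking (cosine, inner product) can be swapped in.
    """
    if metric == "euclidean":
        scores = -np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    else:  # "inner_product"
        scores = X @ X.T
    np.fill_diagonal(scores, -np.inf)          # never pick yourself
    return np.argsort(-scores, axis=1)[:, :k]  # best neighbor first

def clustering_nsm(neighbors, labels):
    """Mean fraction of each point's true top-k neighbors that share its cluster."""
    return (labels[neighbors] == labels[:, None]).mean()

def point_nsm(neighbors):
    """Fraction of top-k neighbor links that are mutual -- a simplified,
    clustering-free stability proxy in the spirit of point-NSM."""
    n, k = neighbors.shape
    nbr_sets = [set(row.tolist()) for row in neighbors]
    mutual = sum(int(i in nbr_sets[j]) for i in range(n) for j in neighbors[i])
    return mutual / (n * k)
```

On two well‑separated groups of points, both scores approach 1.0; shuffling the cluster labels drives clustering‑NSM down while point‑NSM, which never sees labels, is unchanged.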
Results & Findings
| Dataset (embedding dim) | Point‑NSM | Clustering‑NSM (IVF‑Flat) | Recall@10 (IVF‑Flat) |
|---|---|---|---|
| ImageNet‑ResNet (128) | 0.78 | 0.71 | 0.92 |
| SQuAD‑BERT (768) | 0.62 | 0.55 | 0.81 |
| MovieLens‑ALS (64) | 0.84 | 0.80 | 0.95 |
- Strong linear correlation (Pearson ≈ 0.88) between clustering‑NSM and recall across all tested methods.
- Point‑NSM predicts clustering‑NSM with an R² of 0.81, meaning you can estimate ANNS performance before clustering.
- The metrics remain stable when swapping Euclidean distance for cosine similarity or inner‑product, confirming distance‑agnostic behavior.
- For datasets with low point‑NSM (e.g., highly noisy or uniformly distributed points), clustering‑based ANNS consistently underperforms, suggesting alternative indexing (graph‑based, product quantization) may be preferable.
Practical Implications
- Pre‑deployment sanity check – Before investing compute in building an IVF or ScaNN index, run a cheap nearest‑neighbor graph on a sample and compute point‑NSM. A low score warns you that clustering‑based ANNS will likely yield poor recall.
- Hyper‑parameter tuning shortcut – Point‑NSM can guide the choice of the number of centroids: higher point‑NSM allows fewer centroids while preserving accuracy, saving memory and indexing time.
- Cross‑metric portability – Because the measures only need neighbor rankings, they can be applied to embeddings from any model (vision, language, recommendation) without re‑deriving distance‑specific formulas.
- Hybrid pipelines – For mixed datasets (some sub‑spaces high‑stability, others not), you can partition data based on point‑NSM and apply clustering‑based ANNS only where it’s justified, falling back to graph‑based search elsewhere.
- Monitoring drift – In production, recompute point‑NSM periodically on newly ingested data to detect when the data distribution has shifted enough to degrade the existing ANNS index.
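A pre‑deployment or drift check along these lines could look as follows. This is a sketch, not code from the paper: the sample size, the 0.6 decision threshold, and the mutual‑link fraction used as a point‑NSM proxy are illustrative assumptions.

```python
import numpy as np

def point_nsm_sample(X, k=10, sample=2000, seed=0):
    """Point-NSM-style score on a random sample: the fraction of mutual
    top-k neighbor links (brute-force, simplified proxy)."""
    rng = np.random.default_rng(seed)
    if len(X) > sample:
        X = X[rng.choice(len(X), size=sample, replace=False)]
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)               # exclude self-matches
    nbrs = np.argsort(d, axis=1)[:, :k]
    nbr_sets = [set(row.tolist()) for row in nbrs]
    n = len(X)
    mutual = sum(int(i in nbr_sets[j]) for i in range(n) for j in nbrs[i])
    return mutual / (n * k)

def clustering_ann_advisable(X, threshold=0.6, **kw):
    """Gate an IVF/ScaNN index build on the sampled stability score.
    The default threshold is illustrative, not taken from the paper."""
    return point_nsm_sample(X, **kw) >= threshold
```

The same function, rerun periodically on newly ingested vectors, doubles as the drift monitor: a falling score signals that the existing clustering‑based index may no longer match the data distribution.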
Limitations & Future Work
- Exact neighbor computation is required to obtain the ground‑truth neighbor sets for the metrics, which can be costly on massive datasets. The authors suggest using approximate neighbor graphs as a proxy, though this introduces noise.
- The study focuses on flat (single‑level) clusterings; extending the stability concepts to hierarchical or multi‑level indexes (e.g., HNSW‑IVF hybrids) remains open.
- Point‑NSM assumes a fixed k (typically 10–20); sensitivity to this hyper‑parameter is not fully explored.
- Real‑world systems often combine multiple similarity functions (e.g., hybrid of inner‑product and Euclidean); evaluating stability under mixed metrics is future work.
Authors
- Thomas Vecchiato
- Sebastian Bruch
Paper Information
- arXiv ID: 2602.16673v1
- Categories: cs.LG, cs.IR
- Published: February 18, 2026