[Paper] Neighborhood Stability as a Measure of Nearest Neighbor Searchability

Published: February 18, 2026 at 01:09 PM EST
4 min read
Source: arXiv - 2602.16673v1

Overview

The paper introduces Neighborhood Stability as a lightweight, data‑only way to predict how well a clustering‑based Approximate Nearest Neighbor Search (ANNS) will work on a given high‑dimensional dataset. By measuring how “stable” the nearest‑neighbor relationships are inside and across clusters, the authors provide two practical metrics that let engineers decide early whether a clustering‑driven ANNS pipeline is likely to be accurate—without having to run expensive search experiments first.

Key Contributions

  • Clustering‑Neighborhood Stability Measure (clustering‑NSM) – an internal quality metric that evaluates a given clustering by looking at how often true nearest‑neighbors stay within the same cluster.
  • Point‑Neighborhood Stability Measure (point‑NSM) – a dataset‑level “clusterability” score that predicts the clustering‑NSM value before any clustering is performed.
  • Theoretical link between point‑NSM and clustering‑NSM, showing that high point‑NSM implies high clustering‑NSM, and thus higher ANNS accuracy.
  • Empirical validation on several real‑world high‑dimensional benchmarks (image embeddings, text vectors, recommendation data) confirming that both measures correlate strongly with recall@k of popular clustering‑based ANNS methods (e.g., IVF, ScaNN).
  • Distance‑agnostic design – the metrics rely only on nearest‑neighbor relationships (rankings), not on raw Euclidean distances, making them usable with inner‑product, cosine similarity, or custom similarity functions.
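The distance‑agnostic point can be made concrete: the stability measures consume only ranked neighbor lists, so any similarity that produces a ranking can feed them. The sketch below is illustrative, not code from the paper; `top_k_neighbors`, `cosine`, and `neg_euclidean` are hypothetical helpers showing how rankings can be obtained from an arbitrary scoring function:

```python
import numpy as np

def top_k_neighbors(X, k, score):
    """Top-k neighbor indices per point under an arbitrary similarity.

    `score(X)` returns an (n, n) matrix where higher means more similar.
    """
    s = score(X)
    np.fill_diagonal(s, -np.inf)          # exclude self-matches
    return np.argsort(-s, axis=1)[:, :k]  # descending similarity

def cosine(X):
    # Cosine similarity matrix.
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    return Xn @ Xn.T

def neg_euclidean(X):
    # Negated pairwise Euclidean distances (so higher = closer).
    return -np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
```

For unit‑normalized vectors, cosine similarity and Euclidean distance induce the same neighbor ranking, so either scoring function produces identical neighbor lists for the stability computation.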

Methodology

  1. Define neighbor sets – For each point x_i in the dataset, compute its top‑k true nearest neighbors using the exact metric.
  2. Clustering‑NSM
    • Given a flat clustering (e.g., k‑means, hierarchical k‑means), count how many of the k true neighbors of each point fall into the same cluster as the point itself.
    • Normalize this count across all points to obtain a stability score in [0,1]; higher values mean the clustering respects true neighbor relationships.
  3. Point‑NSM
    • Without any clustering, examine the mutual nearest‑neighbor graph: two points are linked if each appears in the other’s top‑k list.
    • Compute the proportion of points whose mutual‑NN graph neighborhoods form tightly connected components (i.e., components with low conductance: few edges leaving a component relative to the edges inside it). This reflects how naturally the data can be partitioned into stable neighborhoods.
  4. Predictive pipeline – Use point‑NSM to estimate the expected clustering‑NSM for a chosen clustering algorithm and hyper‑parameters (e.g., number of centroids).
  5. Evaluation – Run standard clustering‑based ANNS pipelines (IVF‑Flat, IVF‑PQ, ScaNN) on the same datasets, measure recall@k, and correlate with the two stability scores.
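The steps above can be sketched in a few lines. This is an illustrative reimplementation of the described procedure, not the authors' code; the exact normalization and graph statistic in the paper may differ (here point‑NSM is approximated by the fraction of points with at least one mutual k‑NN link):

```python
import numpy as np

def knn_indices(X, k):
    # Step 1: exact top-k neighbors per point under Euclidean distance,
    # excluding the point itself (brute force, fine for small samples).
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    return np.argsort(d, axis=1)[:, :k]

def clustering_nsm(X, labels, k=10):
    # Step 2: average fraction of each point's true k-NN that land in its
    # own cluster; 1.0 means the clustering fully respects neighborhoods.
    nn = knn_indices(X, k)
    return float((labels[nn] == labels[:, None]).mean())

def point_nsm(X, k=10):
    # Step 3 (proxy): share of points with at least one mutual k-NN link,
    # i.e., a neighbor that also lists them as a neighbor.
    nn = [set(row) for row in knn_indices(X, k)]
    n = len(X)
    return float(np.mean([any(i in nn[j] for j in nn[i]) for i in range(n)]))
```

On two well‑separated clusters with labels matching the clusters, clustering‑NSM approaches 1.0, since every point's true neighbors sit in its own cluster.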

Results & Findings

Dataset (embedding dim)    Point‑NSM   Clustering‑NSM (IVF‑Flat)   Recall@10 (IVF‑Flat)
ImageNet‑ResNet (128)      0.78        0.71                        0.92
SQuAD‑BERT (768)           0.62        0.55                        0.81
MovieLens‑ALS (64)         0.84        0.80                        0.95
  • Strong linear correlation (Pearson ≈ 0.88) between clustering‑NSM and recall across all tested methods.
  • Point‑NSM predicts clustering‑NSM with an R² of 0.81, meaning you can estimate ANNS performance before clustering.
  • The metrics remain stable when swapping Euclidean distance for cosine similarity or inner‑product, confirming distance‑agnostic behavior.
  • For datasets with low point‑NSM (e.g., highly noisy or uniformly distributed points), clustering‑based ANNS consistently underperforms, suggesting alternative indexing (graph‑based, product quantization) may be preferable.

Practical Implications

  • Pre‑deployment sanity check – Before investing compute in building an IVF or ScaNN index, build a cheap nearest‑neighbor graph on a sample and compute point‑NSM. A low score warns you that clustering‑based ANNS will likely yield poor recall.
  • Hyper‑parameter tuning shortcut – Point‑NSM can guide the choice of the number of centroids: higher point‑NSM allows fewer centroids while preserving accuracy, saving memory and indexing time.
  • Cross‑metric portability – Because the measures only need neighbor rankings, they can be applied to embeddings from any model (vision, language, recommendation) without re‑deriving distance‑specific formulas.
  • Hybrid pipelines – For mixed datasets (some sub‑spaces high‑stability, others not), you can partition data based on point‑NSM and apply clustering‑based ANNS only where it’s justified, falling back to graph‑based search elsewhere.
  • Monitoring drift – In production, recompute point‑NSM periodically on newly ingested data to detect when the data distribution has shifted enough to degrade the existing ANNS index.
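A minimal version of the pre‑deployment sanity check described above might look like the following. The sample size, the mutual‑link proxy for point‑NSM, and the 0.6 routing threshold are illustrative assumptions, not values from the paper:

```python
import numpy as np

def estimate_point_nsm(X, k=10, sample=1000, seed=0):
    # Estimate point-NSM on a random sample: fraction of sampled points
    # with at least one mutual k-NN link (a simple connectivity proxy).
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), size=min(sample, len(X)), replace=False)
    S = X[idx]
    d = np.linalg.norm(S[:, None, :] - S[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    nn = [set(row) for row in np.argsort(d, axis=1)[:, :k]]
    return float(np.mean([any(i in nn[j] for j in nn[i])
                          for i in range(len(S))]))

def index_recommendation(score, threshold=0.6):
    # Route to a clustering-based index only when the stability score
    # clears an (illustrative) threshold; otherwise prefer graph search.
    if score >= threshold:
        return "clustering-based (IVF/ScaNN)"
    return "graph-based (e.g., HNSW)"
```

The same `estimate_point_nsm` call can be rerun periodically on newly ingested data for the drift‑monitoring use case: a sustained drop in the score signals that the existing clustering‑based index may be degrading.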

Limitations & Future Work

  • Exact neighbor computation is required to obtain the ground‑truth neighbor sets for the metrics, which can be costly on massive datasets; the authors suggest using approximate neighbor graphs as a proxy, though this introduces noise.
  • The study focuses on flat (single‑level) clusterings; extending the stability concepts to hierarchical or multi‑level indexes (e.g., HNSW‑IVF hybrids) remains open.
  • Point‑NSM assumes a fixed k (typically 10–20); sensitivity to this hyper‑parameter is not fully explored.
  • Real‑world systems often combine multiple similarity functions (e.g., hybrid of inner‑product and Euclidean); evaluating stability under mixed metrics is future work.

Authors

  • Thomas Vecchiato
  • Sebastian Bruch

Paper Information

  • arXiv ID: 2602.16673v1
  • Categories: cs.LG, cs.IR
  • Published: February 18, 2026
