[Paper] Survey on Neural Routing Solvers

Published: February 25, 2026
Source: arXiv - 2602.21761v1

Overview

Neural Routing Solvers (NRSs) are a new breed of deep‑learning models that aim to solve vehicle routing problems (VRPs) by learning the “rules of thumb” that human‑crafted heuristics use. This survey paper shines a light on the heuristic nature of NRSs, organizes the rapidly growing literature into a clear taxonomy, and proposes a more realistic way to benchmark these models for real‑world generalization.

Key Contributions

  • Heuristic‑centric perspective: Re‑frames NRS research as the evolution of classic routing heuristics (e.g., savings, insertion, local search) that are now learned rather than manually coded.
  • Hierarchical taxonomy: Introduces a three‑level classification (problem‑scope → heuristic principle → architectural family) that makes it easy to locate any NRS in the literature.
  • Generalization‑focused evaluation pipeline: Designs a benchmark that tests models on out‑of‑distribution (OOD) instances, varying size, geography, and demand patterns—addressing the over‑fitting problem of existing pipelines.
  • Comprehensive empirical comparison: Runs a head‑to‑head study of representative NRSs under both the traditional and the new pipelines, revealing hidden performance gaps and robustness issues.
  • Open‑source toolbox: Releases code for the taxonomy, dataset generators, and evaluation scripts, enabling reproducible research and rapid prototyping.

Methodology

  1. Literature mapping: The authors collected 70+ NRS papers (spanning 2018‑2024) and annotated each with (a) the VRP variant it targets, (b) the heuristic principle it emulates (construction, improvement, or hybrid), and (c) the neural architecture (graph neural network, transformer, reinforcement‑learning agent, etc.).
  2. Taxonomy construction: Using the annotations, they built a tree‑like hierarchy:
    • Level 1 – Problem scope: CVRP, VRPTW, PDPTW, etc.
    • Level 2 – Heuristic principle: Construction (e.g., learned savings), Improvement (learned local search moves), Hybrid (learned meta‑heuristics).
    • Level 3 – Architecture family: GNN‑based encoders, attention‑based decoders, RL policy networks, diffusion models, etc.
  3. Evaluation pipelines:
    • Conventional pipeline: Train on a fixed set of instances (often from a single distribution) and test on similarly sized, same‑distribution instances.
    • Generalization pipeline (proposed): Create multiple test suites that differ in size (small → large), spatial distribution (clustered vs. uniform), and demand stochasticity. Models are trained once and evaluated across all suites, mimicking real‑world deployment where problem characteristics shift.
  4. Benchmarking: Selected 10 representative NRSs (e.g., Attention Model, POMO, Neural Large‑Neighbourhood Search, Graph‑Based RL) and ran them on standard CVRP datasets (Solomon, Augerat) plus the OOD suites. Metrics include solution quality (percentage gap to optimal/Best‑Known), inference speed, and robustness (variance across OOD sets).
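The gap and robustness metrics used in the benchmark can be sketched in a few lines. This is an illustrative reconstruction, not code from the released toolbox; the function names, suite names, and numbers are made up for the example:

```python
import statistics

def optimality_gap(cost: float, best_known: float) -> float:
    """Percentage gap of a solution's cost to the best-known cost."""
    return 100.0 * (cost - best_known) / best_known

def robustness(gaps_per_suite: dict) -> tuple:
    """Mean gap and across-suite variance, in the spirit of the
    generalization pipeline: a solver is evaluated on several OOD
    suites and judged on both average quality and its stability.

    `gaps_per_suite` maps a suite name (e.g. "large-clustered")
    to the per-instance gaps a solver achieved on that suite.
    """
    suite_means = [statistics.mean(g) for g in gaps_per_suite.values()]
    return statistics.mean(suite_means), statistics.pvariance(suite_means)

# Hypothetical per-suite gaps for one solver:
gaps = {
    "small-uniform":   [1.2, 1.9, 1.5],
    "large-clustered": [3.8, 4.4, 4.0],
}
mean_gap, var = robustness(gaps)
```

A low `var` across suites corresponds to the "low variance" column in the results table: the solver degrades gracefully rather than collapsing on one distribution.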

Results & Findings

| Pipeline | Best‑performing NRS (avg. gap) | Notable observations |
| --- | --- | --- |
| Conventional | POMO – 1.8 % gap | Works well when train‑test distributions match. |
| Generalization | Neural Large‑Neighbourhood Search (NLNS) – 3.4 % gap | Maintains relatively low degradation on larger, clustered instances. |
| Conventional (speed) | Attention Model – 0.5 ms per instance | Very fast, but quality drops sharply on OOD data. |
| Generalization (robustness) | Hybrid GNN‑RL – 4.1 % gap, low variance | Shows consistent performance across diverse test suites. |

Key takeaways

  • Many NRSs that look impressive under the conventional pipeline suffer 2‑5× larger optimality gaps when faced with OOD instances.
  • Architectures that incorporate local search or neighborhood exploration (e.g., NLNS) generalize better than pure end‑to‑end sequence models.
  • Inference speed remains a strong advantage of NRSs, but the trade‑off with robustness must be considered for production use.

Practical Implications

  • For logistics software vendors: The survey suggests that plugging a vanilla attention‑based NRS into an existing routing engine may yield quick wins on static, well‑characterized routes, but a more robust hybrid (construction + learned improvement) is needed for dynamic fleets with varying order patterns.
  • For developers building custom routing solutions: The taxonomy helps you pick a starting point—e.g., if you already have a heuristic insertion routine, you can replace its decision rule with a GNN‑based policy rather than rebuilding from scratch.
  • Edge deployment: Because most NRSs run inference in milliseconds on a GPU/TPU, they are suitable for real‑time dispatching, but you should validate on OOD data that mirrors your city’s geography and demand spikes.
  • Tooling & reproducibility: The released toolbox lets you generate OOD benchmark suites with a single command, making it easier to integrate NRS evaluation into CI pipelines.
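As a rough illustration of what such an OOD suite generation looks like, the sketch below varies the spatial distribution (clustered vs. uniform) the way the proposed pipeline does. This is a hypothetical example, not the released toolbox's API; all names and parameters are assumptions:

```python
import random

def generate_instance(n: int, layout: str, seed: int = 0):
    """Hypothetical CVRP instance generator for OOD test suites.

    layout: "uniform" scatters customers over the unit square;
            "clustered" samples them around a few random centres,
            mimicking the survey's clustered-vs-uniform test suites.
    Returns customer coordinates and integer demands.
    """
    rng = random.Random(seed)
    if layout == "uniform":
        coords = [(rng.random(), rng.random()) for _ in range(n)]
    elif layout == "clustered":
        centres = [(rng.random(), rng.random()) for _ in range(3)]
        coords = []
        for _ in range(n):
            cx, cy = rng.choice(centres)
            # Gaussian jitter around the centre, clipped to the unit square.
            coords.append((min(max(cx + rng.gauss(0, 0.05), 0.0), 1.0),
                           min(max(cy + rng.gauss(0, 0.05), 0.0), 1.0)))
    else:
        raise ValueError(f"unknown layout: {layout}")
    demands = [rng.randint(1, 9) for _ in range(n)]
    return coords, demands
```

Sweeping `n` and `layout` (and, analogously, the demand distribution) produces the kind of suite matrix on which a deployed model should be validated before going live.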

Limitations & Future Work

  • Dataset bias: The surveyed papers largely focus on CVRP and VRPTW; other complex variants (e.g., stochastic demand, multi‑modal fleets) remain under‑explored.
  • Scalability ceiling: While inference is fast, training still requires massive synthetic data and GPU hours, limiting accessibility for small companies.
  • Explainability: Learned heuristics are often opaque; the survey calls for methods to extract human‑readable rules from NRSs, facilitating trust and regulatory compliance.
  • Future directions:
    1. Unified benchmark suites covering a broader set of VRP flavors.
    2. Meta‑learning approaches that adapt a single NRS to new distributions with few‑shot fine‑tuning.
    3. Tighter integration of NRSs with classic OR solvers (e.g., using NRSs to generate warm‑starts for mixed‑integer programming).

Authors

  • Yunpeng Ba
  • Xi Lin
  • Changliang Zhou
  • Ruihao Zheng
  • Zhenkun Wang
  • Xinyan Liang
  • Zhichao Lu
  • Jianyong Sun
  • Yuhua Qian
  • Qingfu Zhang

Paper Information

  • arXiv ID: 2602.21761v1
  • Categories: math.OC, cs.AI, cs.LG, cs.NE
  • Published: February 25, 2026
