[Paper] A Methodology for Effective Surrogate Learning in Complex Optimization
Source: arXiv - 2602.08825v1
Overview
The paper introduces PTME, a systematic methodology for evaluating deep‑learning surrogate models that stand in for costly, real‑world optimization problems. By jointly measuring Precision, Time, Memory, and Energy, the authors show how to build surrogates that are both trustworthy and lightweight enough for large‑scale experimentation—demonstrated on the challenging task of optimizing city‑wide traffic‑light networks across Europe.
Key Contributions
- PTME framework: a four‑dimensional evaluation rubric (Precision, Time, Memory, Energy) for surrogate models.
- Design and comparison of multiple surrogate architectures (e.g., feed‑forward nets, graph‑based models) tailored to traffic‑light scheduling.
- Empirical study of how sampling strategies, dataset sizes, and hardware choices affect PTME metrics.
- Integration of the top‑ranked surrogate into meta‑heuristic optimizers, yielding measurably better traffic‑light schedules for a real city.
- Guidelines for re‑using PTME in other domains that rely on surrogate‑based optimization (e.g., energy grids, manufacturing, autonomous systems).
Methodology
- Problem definition – The target optimization problem is the coordinated timing of traffic lights in a European city, a combinatorial task with a huge search space and expensive simulation cost.
- Surrogate construction – Several deep‑learning models are trained to predict the objective (e.g., total travel time) from a candidate traffic‑light schedule. Input representations range from flat vectors to graph neural networks that respect the road‑network topology.
- PTME evaluation –
  - Precision: prediction error (MAE, RMSE) on a held‑out test set.
  - Time: inference latency per candidate schedule.
  - Memory: RAM/GPU memory footprint of the model.
  - Energy: measured power draw (joules) during inference, using on‑board sensors or software estimators.
- Experimental factors – The authors vary (a) sampling method (random vs. stratified), (b) dataset size (10k–1M samples), and (c) hardware platform (CPU, mid‑range GPU, edge accelerator).
- Optimization loop – The top‑ranked surrogate feeds into a meta‑heuristic (e.g., CMA‑ES or a genetic algorithm) that searches the schedule space, using the surrogate for fast fitness evaluation and periodically falling back to the full traffic simulator to validate promising candidates.
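The four PTME measurements can be collected in a single pass over a held‑out set. Below is a minimal sketch, assuming a generic `model.predict(X)` interface (not the paper's code) and approximating energy as average power × elapsed time; the actual study reads on‑board sensors or software estimators instead:

```python
import time
import tracemalloc

import numpy as np


def ptme_evaluate(model, X_test, y_test, watts=None):
    """Score a surrogate on the four PTME axes.

    `model` is any object with a `predict(X)` method (a hypothetical
    interface); `watts` is an assumed average device power draw, used
    here as a stand-in for hardware energy counters.
    """
    tracemalloc.start()
    t0 = time.perf_counter()
    y_pred = model.predict(X_test)
    elapsed = time.perf_counter() - t0
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()

    err = y_pred - y_test
    return {
        "precision_rmse": float(np.sqrt(np.mean(err ** 2))),
        "precision_mae": float(np.mean(np.abs(err))),
        "time_per_sample_s": elapsed / len(X_test),
        "memory_peak_mb": peak_bytes / 2**20,
        # Energy approximated as power x time; a real study would read
        # hardware counters (e.g. RAPL on CPUs, NVML on NVIDIA GPUs).
        "energy_j": (watts * elapsed) if watts is not None else None,
    }
```

Note that `tracemalloc` only tracks Python-level allocations; GPU memory and model-weight footprints would need framework-specific queries.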
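The surrogate‑in‑the‑loop search can be sketched as a small genetic algorithm. The bit‑string schedule encoding, the `surrogate(schedule)` and `simulator(schedule)` callables, and all hyper‑parameters below are illustrative assumptions, not the paper's exact setup:

```python
import random


def surrogate_ga(surrogate, simulator, n_genes, pop_size=40,
                 generations=50, validate_every=10, seed=0):
    """Minimal surrogate-assisted GA (lower fitness = better).

    `surrogate` gives a cheap fitness estimate; `simulator` is the
    expensive ground truth, consulted only occasionally.
    """
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_genes)]
           for _ in range(pop_size)]
    for gen in range(generations):
        # Rank the population with the cheap surrogate, keep the elite half.
        scored = sorted(pop, key=surrogate)
        elite = scored[: pop_size // 2]
        # Refill with uniform crossover plus occasional bit-flip mutation.
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            child = [x if rng.random() < 0.5 else y for x, y in zip(a, b)]
            if rng.random() < 0.2:
                child[rng.randrange(n_genes)] ^= 1
            children.append(child)
        pop = elite + children
        # Periodic fallback: validate the incumbent on the real simulator.
        if gen % validate_every == 0:
            _ = simulator(scored[0])
    best = min(pop, key=surrogate)
    return best, simulator(best)  # one final ground-truth check
```

The key design point is the cost asymmetry: the surrogate scores every candidate every generation, while the simulator is touched only a handful of times for validation.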
Results & Findings
| Metric | Best Surrogate (Graph‑NN) | Baseline Feed‑Forward |
|---|---|---|
| Precision (RMSE, relative error) | 1.8 % | 4.5 % |
| Inference Time | 0.7 ms per schedule | 2.3 ms per schedule |
| Memory | 210 MB | 85 MB |
| Energy | 0.12 J per inference | 0.09 J per inference |
| Optimization outcome (average travel‑time reduction) | 12 % vs. current city plan | 6 % vs. current city plan |
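No single model wins on all four axes (the graph‑NN leads on precision and time, the feed‑forward baseline on memory and energy), so one natural way to reason about PTME scores is Pareto dominance. A minimal sketch, with metric tuples ordered (precision, time, memory, energy), all lower‑is‑better; this comparison scheme is an assumption, not necessarily the paper's ranking procedure:

```python
def dominates(a, b):
    """True if model `a` is at least as good as `b` on every PTME
    metric (all lower-is-better) and strictly better on at least one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(models):
    """Keep only surrogates not dominated by any other candidate.
    `models` maps name -> (precision, time, memory, energy)."""
    return {
        name: m for name, m in models.items()
        if not any(dominates(other, m)
                   for o, other in models.items() if o != name)
    }
```

Applied to the table above, both models survive the Pareto filter, which is why the takeaways below weigh the precision gain against the extra memory and energy.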
Key takeaways:
- Precision gains from graph‑aware models translate into noticeably better traffic‑light schedules.
- Time and energy remain low enough to evaluate millions of candidates per hour, enabling meta‑heuristics to converge faster.
- Dataset size shows diminishing returns after ~200 k samples; further growth hurts memory/energy without meaningful precision improvement.
- Sampling strategy matters: stratified sampling that respects traffic‑flow patterns yields 15 % lower error than pure random sampling.
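The stratified‑sampling takeaway can be made concrete: group candidate schedules by a traffic‑flow bucket and draw evenly from each group, so that rare regimes (e.g., peak‑hour congestion) are represented in the training set. The `stratum_of` labelling function below is a hypothetical stand‑in; the paper's exact stratification criteria are not detailed here:

```python
import random
from collections import defaultdict


def stratified_sample(candidates, stratum_of, n_total, seed=0):
    """Draw roughly n_total items, spread evenly across strata.

    `stratum_of(candidate)` returns a hashable stratum label, e.g.
    a peak/off-peak traffic-flow bucket (an assumed function).
    """
    rng = random.Random(seed)
    buckets = defaultdict(list)
    for c in candidates:
        buckets[stratum_of(c)].append(c)
    per_stratum = n_total // len(buckets)
    sample = []
    for items in buckets.values():
        # Never ask for more items than a stratum actually has.
        sample.extend(rng.sample(items, min(per_stratum, len(items))))
    return sample
```

Random sampling, by contrast, would draw from each stratum in proportion to its frequency, under‑covering rare traffic regimes and leaving the surrogate less precise exactly where schedules differ most.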
Practical Implications
- Faster prototyping: Engineers can replace heavyweight traffic simulators with a PTME‑validated surrogate, cutting iteration cycles from hours to seconds.
- Edge deployment: The low‑energy surrogate can run on city‑edge devices (e.g., traffic‑control cabinets) for on‑the‑fly schedule adjustments.
- Scalable meta‑heuristics: By feeding a high‑precision, low‑cost surrogate into evolutionary algorithms, practitioners can explore richer solution spaces without prohibitive compute budgets.
- Cross‑domain reuse: The PTME checklist is generic—any team building surrogates for power‑grid optimization, supply‑chain routing, or autonomous‑vehicle planning can adopt the same four‑metric lens to balance accuracy against operational constraints.
Limitations & Future Work
- Generalization to unseen cities: The surrogates were trained on a set of European cities; performance may degrade on markedly different road networks without transfer learning.
- Hardware specificity: Energy measurements were taken on a particular GPU; results could vary on ASICs or newer accelerators.
- Dynamic traffic conditions: The study assumes static demand patterns; incorporating real‑time traffic fluctuations would require online surrogate updates.
- Future directions suggested by the authors include (1) exploring continual‑learning pipelines for adaptive surrogates, (2) extending PTME to multi‑objective settings (e.g., emissions + travel time), and (3) benchmarking PTME on other large‑scale combinatorial problems.
Authors
- Tomohiro Harada
- Enrique Alba
- Gabriel Luque
Paper Information
- arXiv ID: 2602.08825v1
- Categories: cs.NE
- Published: February 9, 2026