[Paper] Pareto-Optimal Anytime Algorithms via Bayesian Racing
Source: arXiv - 2603.08493v1
Overview
The paper introduces PolarBear, a new framework for comparing and selecting optimization algorithms when you don’t know in advance how much time you’ll have to run them. By treating “anytime performance” (how solution quality evolves over time) as a Pareto‑optimization problem, the authors can rank algorithms without needing problem‑specific bounds or manual plot inspection. The result is a statistically sound, automatically generated set of algorithms, none of which is dominated at any possible runtime.
Key Contributions
- Pareto‑based anytime comparison: Formalizes algorithm dominance across all time points, not just a single budget or an aggregated scalar.
- Ranking‑only inference: Uses only relative rankings of solution quality (no absolute objective values), eliminating the need for normalization or known optima.
- Temporal Plackett‑Luce model: Extends the classic Plackett‑Luce ranking model to handle time‑indexed data, providing posterior probabilities of pairwise dominance.
- Bayesian racing procedure (PolarBear): Adaptive sampling that discards algorithms once they are confidently dominated, dramatically reducing evaluation cost.
- Decision‑ready output: Returns a posterior‑calibrated Pareto set that can be plugged directly into downstream selection tools, supporting arbitrary time‑budget preferences and risk tolerances.
Methodology
- Data collection: Run each candidate algorithm on a collection of problem instances, recording the best objective value it attains at a series of time checkpoints (e.g., every second).
- Convert to rankings: At each checkpoint, sort the algorithms by their current best value per instance and assign a rank (1 = best, 2 = second‑best, …). This step removes any dependence on the actual numeric values.
- Temporal Plackett‑Luce model: Treat the sequence of rankings as observations generated from a latent “skill” parameter for each algorithm that can evolve over time. Bayesian inference yields a posterior distribution over these skills at every checkpoint.
- Bayesian racing: Using the posterior, compute the probability that algorithm A dominates algorithm B at all future checkpoints. If this probability exceeds a user‑defined confidence threshold, B is eliminated from the race. The process iterates, focusing computational budget on the still‑alive contenders.
- Pareto set extraction: After racing stops (either all but a few algorithms are eliminated or the budget is exhausted), the remaining algorithms constitute the anytime Pareto set—no algorithm in the set is dominated across the entire time horizon.
Results & Findings
- Efficiency: On benchmark suites (e.g., SAT, TSP, continuous optimization), PolarBear eliminated up to 70 % of candidate algorithms after spending only ~30 % of the total evaluation budget, a considerable compute saving.
- Robustness: Because rankings are used, the method performed consistently even when objective scales differed wildly across instances or when optimal values were unknown.
- Decision quality: When downstream selection was simulated with varied time‑budget preferences, the Pareto set produced by PolarBear yielded selections within 2 % of the optimal (oracle) choice, outperforming traditional scalar summaries such as area under the curve (AUC) by 10–15 %.
- Calibration: Posterior probabilities of dominance were well‑calibrated (Brier score < 0.05), giving developers confidence to set aggressive elimination thresholds without risking premature discarding.
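Calibration of this kind is typically checked with the Brier score, the mean squared gap between predicted probabilities and eventual 0/1 outcomes. A small sketch (the example numbers below are illustrative, not the paper’s data):

```python
def brier_score(predicted, observed):
    """Mean squared difference between predicted dominance probabilities
    and the 0/1 outcomes eventually observed; 0 is perfect calibration."""
    assert len(predicted) == len(observed)
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(predicted)
```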
Practical Implications
- Automated algorithm portfolios: DevOps pipelines can integrate PolarBear to build a portfolio of solvers that automatically adapts to the available compute time, delivering the best possible solution at any moment.
- Hyper‑parameter tuning services: Cloud‑based optimization APIs can use the Pareto set to expose a “time‑budget” slider to end‑users, internally mapping the request to the appropriate algorithm without extra benchmarking.
- Resource‑aware scheduling: In heterogeneous clusters where job runtimes are unpredictable, schedulers can query the posterior to pick an algorithm that minimizes expected regret given the remaining wall‑clock time.
- Risk‑aware decision making: The Bayesian posterior enables risk‑sensitive policies (e.g., “pick the algorithm that has at least 95 % chance of being within 5 % of the best possible solution after 10 s”).
- Reduced benchmarking cost: Companies can evaluate dozens of candidate solvers on a modest set of instances and still obtain a reliable anytime Pareto front, cutting down on expensive large‑scale benchmark runs.
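A risk‑sensitive policy like the one sketched in the bullets above can be phrased as a simple query against posterior draws of solution quality at the chosen budget. The helper name, data layout, and thresholds below are illustrative assumptions (qualities are assumed positive, higher = better):

```python
def risk_aware_pick(quality_samples, tol=0.05, min_conf=0.95):
    """Pick the algorithm most likely to land within `tol` of the best
    achievable quality at the chosen budget, provided that probability
    clears `min_conf`. quality_samples maps algorithm name -> list of
    posterior draws of (positive) solution quality at that budget."""
    algs = list(quality_samples)
    n_draws = len(quality_samples[algs[0]])
    # Per-draw best quality across all candidates.
    best = [max(quality_samples[a][s] for a in algs) for s in range(n_draws)]
    probs = {a: sum(quality_samples[a][s] >= (1 - tol) * best[s]
                    for s in range(n_draws)) / n_draws
             for a in algs}
    eligible = [a for a in algs if probs[a] >= min_conf]
    return (max(eligible, key=probs.get) if eligible else None), probs
```

Returning `None` when no algorithm clears the confidence bar lets the caller fall back to a default policy instead of silently accepting a risky choice.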
Limitations & Future Work
- Scalability to very large algorithm pools: While Bayesian racing cuts down runtime, the inference step still scales quadratically with the number of alive algorithms; sparse approximations or hierarchical modeling could help.
- Assumption of monotonic improvement: The method presumes that solution quality never degrades over time, which holds for most deterministic optimizers but may be violated by stochastic or restart‑heavy strategies.
- Temporal granularity: Choosing checkpoint intervals is a trade‑off; too coarse a grid may miss short‑term dominance shifts, while too fine a grid increases data volume. Adaptive checkpointing is a promising direction.
- Extension to multi‑objective problems: The current formulation handles a single objective; extending the ranking model to handle vector‑valued objectives (e.g., cost vs. latency) would broaden applicability.
PolarBear offers a principled, developer‑friendly way to answer the “which optimizer should I run when I have X seconds left?” question—turning a traditionally manual, plot‑driven process into an automated, statistically backed decision engine.
Authors
- Jonathan Wurth
- Helena Stegherr
- Neele Kemper
- Michael Heider
- Jörg Hähner
Paper Information
- arXiv ID: 2603.08493v1
- Categories: cs.NE, cs.LG
- Published: March 9, 2026