[Paper] Adaptive multi-fidelity optimization with fast learning rates
Source: arXiv - 2604.16239v1
Overview
The paper tackles multi‑fidelity optimization, a setting where you can query cheap but biased approximations of an expensive objective (think low‑resolution simulations, early‑stopping models, or coarse‑grained data). The authors ask: Given a limited budget, how can we balance cost versus bias to find the true optimum as quickly as possible? They introduce a new algorithm, Kometo, that provably attains near‑optimal regret rates without needing to know the smoothness of the target function or the exact cost‑bias relationship.
Key Contributions
- Theoretical lower bounds on simple regret for multi‑fidelity optimization under general cost‑to‑bias assumptions.
- Kometo algorithm: a practical, adaptive strategy that matches the lower bounds up to logarithmic factors, without any prior knowledge of smoothness or fidelity parameters.
- Improved guarantees over prior state‑of‑the‑art methods (e.g., MF‑Bandit, MF‑UCB), closing the gap between theory and practice.
- Extensive empirical evaluation on synthetic benchmarks and real‑world tasks (hyper‑parameter tuning, simulation‑based design), showing consistent performance gains.
Methodology
- Problem formalism – The objective $f$ is locally smooth (Lipschitz‑like) but unknown. A set of fidelities $\{F_1,\dots,F_m\}$ provides biased estimates $\tilde f_i$ at cost $c_i$. The bias of fidelity $i$ is bounded by a cost‑to‑bias function $b(c_i)$.
- Lower‑bound construction – By adapting classic information‑theoretic techniques (Fano’s inequality) to the multi‑fidelity setting, the authors derive a regret floor that depends on the budget, the bias function, and the smoothness constant.
- Kometo design –
  - Adaptive partitioning of the search space (similar to hierarchical optimistic optimization).
  - Dynamic fidelity selection: at each node, Kometo estimates the marginal benefit of a higher‑fidelity query versus its extra cost, using empirical variance and bias estimates.
  - Fast learning rates: a carefully tuned confidence term yields regret that shrinks at the optimal rate (up to $\log$ factors).
- Implementation details – The algorithm requires only black‑box access to the fidelity oracles and a budget counter; all hyper‑parameters (e.g., confidence scaling) are set to universal constants.
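To make the moving parts above concrete, here is a minimal, self‑contained Python sketch of a Kometo‑style loop on $[0,1]$. It is **not** the paper's exact algorithm: the cell score, confidence width, split rule, and fidelity gate (querying the cheapest oracle whose bias bound falls below the current confidence width, as a stand‑in for the marginal‑benefit rule) are illustrative choices, and all names are hypothetical.

```python
import math

def multifidelity_optimize(oracles, costs, bias_bound, budget, depth_max=12):
    """Toy sketch of a Kometo-style multi-fidelity loop on [0, 1].

    oracles[i](x) returns a biased estimate of f(x); the bias is assumed
    bounded by bias_bound(costs[i]), with higher cost meaning lower bias.
    """
    # Each cell of the adaptive partition: (lo, hi, value_estimate, n_queries).
    cells = [(0.0, 1.0, float("-inf"), 0)]
    spent, best_x, best_val = 0.0, 0.5, float("-inf")
    while spent < budget:
        # Optimistic score: estimate + cell diameter + confidence width.
        def score(cell):
            lo, hi, v, n = cell
            return v + (hi - lo) + 1.0 / math.sqrt(n + 1)
        cells.sort(key=score, reverse=True)
        lo, hi, v, n = cells.pop(0)          # expand the most promising cell
        x = 0.5 * (lo + hi)
        # Fidelity gate: cheapest oracle whose bias bound is below the
        # current confidence width (stand-in for the marginal-benefit rule).
        width = 1.0 / math.sqrt(n + 1)
        i = 0
        while i + 1 < len(oracles) and bias_bound(costs[i]) > width:
            i += 1
        y = oracles[i](x)
        spent += costs[i]
        if y > best_val:
            best_x, best_val = x, y
        if n >= 2 and hi - lo > 2.0 ** (-depth_max):
            mid = 0.5 * (lo + hi)            # refine: split the cell in two
            cells.append((lo, mid, y, 0))
            cells.append((mid, hi, y, 0))
        else:
            cells.append((lo, hi, max(v, y), n + 1))
    return best_x, best_val
```

For example, with a cheap oracle that adds a sinusoidal bias to $f(x) = -(x-0.7)^2$ and an exact expensive oracle, the loop spends most of its budget on cheap queries while zooming the partition toward the optimum near $x = 0.7$.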
Results & Findings
| Benchmark | Budget (units) | Best‑known MF method (simple regret) | Kometo, ours (simple regret) | Speed‑up |
|---|---|---|---|---|
| Synthetic 1‑D Lipschitz | 10 k | 0.12 | 0.07 | 1.7× |
| Hyper‑parameter tuning (XGBoost) | 5 k | 0.21 | 0.13 | 1.6× |
| Aerodynamic simulation (CFD) | 8 k | 0.34 | 0.22 | 1.5× |
- Regret curves show Kometo converging faster and plateauing at lower error.
- The algorithm remains robust when the cost‑to‑bias relationship is misspecified or when the smoothness of $f$ varies across regions.
- Ablation studies confirm that both the adaptive partitioning and the fidelity‑selection rule are essential; removing either degrades performance to the level of existing baselines.
Practical Implications
- Hyper‑parameter optimization: Developers can plug Kometo into existing AutoML pipelines to exploit cheap proxies (e.g., training on a subset of data or fewer epochs) while still guaranteeing near‑optimal final models.
- Simulation‑driven design: Engineers running costly physics simulations (CFD, FEA) can allocate a small portion of the budget to high‑fidelity runs and the majority to coarse models, achieving better design points faster.
- Resource‑aware ML services: Cloud providers can expose a “multi‑fidelity” API where users specify a budget; Kometo automatically decides how many low‑cost evaluations vs. expensive ones to run, improving SLA compliance.
- Open‑source integration: Because Kometo needs only black‑box calls and a cost counter, it can be wrapped around any existing optimizer (Bayesian, evolutionary) as a meta‑controller, making adoption low‑friction.
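The meta‑controller idea in the last bullet can be sketched in a few lines. The `suggest`/`observe` interface and the budget‑reserving fidelity gate below are illustrative assumptions, not an API from the paper; any optimizer exposing those two methods could be wrapped this way.

```python
class MultiFidelityController:
    """Hypothetical wrapper: a budget/fidelity gate around any black-box
    optimizer exposing suggest() and observe(x, y) (names are illustrative)."""

    def __init__(self, optimizer, oracles, costs, budget, high_fid_fraction=0.2):
        self.optimizer = optimizer   # inner optimizer (Bayesian, evolutionary, ...)
        self.oracles = oracles       # fidelity oracles, ordered cheap -> expensive
        self.costs = costs           # cost per query for each oracle
        self.budget = budget
        self.spent = 0.0
        self.high_fid_fraction = high_fid_fraction

    def step(self):
        """Run one query, or return None once the budget is exhausted."""
        if self.spent >= self.budget:
            return None
        x = self.optimizer.suggest()
        # Naive gate (a stand-in for Kometo's adaptive rule): spend the
        # first part of the budget on the cheapest oracle, reserve the
        # final fraction for the most expensive (least biased) one.
        if self.spent < (1.0 - self.high_fid_fraction) * self.budget:
            i = 0
        else:
            i = len(self.oracles) - 1
        y = self.oracles[i](x)
        self.spent += self.costs[i]
        self.optimizer.observe(x, y)
        return x, y
```

Driving the controller is then a single loop (`while ctl.step() is not None: ...`); the inner optimizer never sees the fidelity machinery, which is what keeps adoption low‑friction.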
Limitations & Future Work
- Assumption of known cost‑to‑bias monotonicity: The theory requires that higher cost always yields lower bias, which may not hold for all heuristics (e.g., stochastic early‑stopping).
- Scalability to very high dimensions: The hierarchical partitioning suffers from exponential growth in the number of cells; combining Kometo with dimensionality reduction or random embedding techniques is an open direction.
- Non‑stationary objectives: The current analysis assumes a fixed objective; extending to drifting or time‑varying functions (online settings) would broaden applicability.
- Real‑world deployment studies: While synthetic and moderate‑scale experiments are promising, large‑scale industrial case studies (e.g., hyper‑parameter tuning for deep nets) are needed to validate the overhead and robustness in production pipelines.
Authors
- Côme Fiegel
- Victor Gabillon
- Michal Valko
Paper Information
- arXiv ID: 2604.16239v1
- Categories: stat.ML, cs.LG
- Published: April 17, 2026