[Paper] Adaptive multi-fidelity optimization with fast learning rates

Published: April 17, 2026 at 12:59 PM EDT
4 min read

Source: arXiv - 2604.16239v1

Overview

The paper tackles multi‑fidelity optimization, a setting where you can query cheap but biased approximations of an expensive objective (think low‑resolution simulations, early‑stopping models, or coarse‑grained data). The authors ask: Given a limited budget, how can we balance cost versus bias to find the true optimum as quickly as possible? They introduce a new algorithm, Kometo, that provably attains near‑optimal regret rates without needing to know the smoothness of the target function or the exact cost‑bias relationship.
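To make the setting concrete, here is a minimal toy sketch of a multi‑fidelity problem. All names, costs, and bias numbers below are illustrative assumptions, not values from the paper: each fidelity level returns a biased estimate of the true objective at a fraction of the full cost.

```python
import math

def f_true(x):
    """Expensive ground-truth objective (to maximize); illustrative only."""
    return -(x - 0.3) ** 2

# fidelity level -> (cost per query, bias bound b(c_i)): paying more buys less bias
FIDELITIES = {
    1: (1.0, 0.20),   # coarse proxy, e.g. a low-resolution simulation
    2: (5.0, 0.05),   # medium fidelity
    3: (25.0, 0.00),  # full fidelity: exact, but 25x the cost of the coarse proxy
}

def query(x, fid):
    """Return a biased estimate of f_true(x) and the cost it incurred."""
    cost, bias = FIDELITIES[fid]
    # a deterministic bias keeps the example reproducible; real oracles are noisy
    return f_true(x) + bias * math.sin(7 * x), cost
```

The algorithmic question is then how to spend a fixed budget across these oracles: many cheap, biased calls to narrow the search, and a few expensive, accurate calls to certify the optimum.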

Key Contributions

  • Theoretical lower bounds on simple regret for multi‑fidelity optimization under general cost‑to‑bias assumptions.
  • Kometo algorithm: a practical, adaptive strategy that matches the lower bounds up to logarithmic factors, without any prior knowledge of smoothness or fidelity parameters.
  • Improved guarantees over prior state‑of‑the‑art methods (e.g., MF‑Bandit, MF‑UCB), closing the gap between theory and practice.
  • Extensive empirical evaluation on synthetic benchmarks and real‑world tasks (hyper‑parameter tuning, simulation‑based design), showing consistent performance gains.

Methodology

  1. Problem formalism – The objective f is locally smooth (Lipschitz‑like) but unknown. A set of fidelities {F_1, …, F_m} provides biased estimates f̃_i at cost c_i; the bias of fidelity i is bounded by a cost‑to‑bias function b(c_i).
  2. Lower‑bound construction – By adapting classic information‑theoretic techniques (Fano’s inequality) to the multi‑fidelity setting, the authors derive a regret floor that depends on the budget, the bias function, and the smoothness constant.
  3. Kometo design
    • Adaptive partitioning of the search space (similar to hierarchical optimistic optimization).
    • Dynamic fidelity selection: at each node, Kometo estimates the marginal benefit of a higher‑fidelity query versus its extra cost, using empirical variance and bias estimates.
    • Fast learning rates: a carefully tuned confidence term yields regret that shrinks at the optimal rate (up to log factors).
  4. Implementation details – The algorithm requires only black‑box access to the fidelity oracles and a budget counter; all hyper‑parameters (e.g., confidence scaling) are set to universal constants.
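The steps above can be sketched in miniature. This is an illustrative toy, not the paper's exact algorithm: it combines optimistic interval bisection (standing in for the adaptive partitioning) with a "cheapest fidelity whose bias bound fits the cell width" escalation rule (standing in for the dynamic fidelity selection). The function signatures and constants are assumptions for the example.

```python
import math

def kometo_sketch(query, fidelities, budget, lo=0.0, hi=1.0, depth_max=10):
    """Illustrative sketch only, NOT the paper's exact algorithm.

    query(x, fid) -> (biased_value, cost); fidelities: fid -> (cost, bias_bound).
    Maximizes over [lo, hi] under a total query-cost budget.
    """
    fid_order = sorted(fidelities, key=lambda f: fidelities[f][0])  # cheapest first
    cells = [(lo, hi)]                 # frontier of unexplored intervals
    best_x, best_ucb = None, -math.inf
    spent = 0.0
    while spent < budget and cells:
        a, b = cells.pop(0)
        x, width = (a + b) / 2, b - a
        # escalate fidelity only once the cell is small enough that a cheaper
        # oracle's bias bound would swamp the remaining spatial uncertainty
        fid = next((f for f in fid_order if fidelities[f][1] <= width),
                   fid_order[-1])
        val, cost = query(x, fid)
        spent += cost
        ucb = val + fidelities[fid][1]  # optimistic value: estimate + bias bound
        if ucb > best_ucb:
            best_ucb, best_x = ucb, x
            if width > (hi - lo) / 2 ** depth_max:
                cells = [(a, x), (x, b)] + cells  # refine the promising cell
    return best_x, spent
```

The key cost/bias tradeoff shows up in the fidelity rule: early on, wide cells tolerate large bias, so the cheap oracle suffices; near the optimum, only the expensive oracle can resolve the remaining gap.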

Results & Findings

| Benchmark | Budget (units) | Best-known MF method (regret) | Kometo, ours (regret) | Speed-up |
|---|---|---|---|---|
| Synthetic 1-D Lipschitz | 10 k | 0.12 | 0.07 | 1.7× |
| Hyper-parameter tuning (XGBoost) | 5 k | 0.21 | 0.13 | 1.6× |
| Aerodynamic simulation (CFD) | 8 k | 0.34 | 0.22 | 1.5× |
  • Regret curves show Kometo converging faster and plateauing at lower error.
  • The algorithm remains robust when the cost‑to‑bias relationship is misspecified or when the smoothness of f varies across regions.
  • Ablation studies confirm that both the adaptive partitioning and the fidelity‑selection rule are essential; removing either degrades performance to the level of existing baselines.

Practical Implications

  • Hyper‑parameter optimization: Developers can plug Kometo into existing AutoML pipelines to exploit cheap proxies (e.g., training on a subset of data or fewer epochs) while still guaranteeing near‑optimal final models.
  • Simulation‑driven design: Engineers running costly physics simulations (CFD, FEA) can allocate a small portion of the budget to high‑fidelity runs and the majority to coarse models, achieving better design points faster.
  • Resource‑aware ML services: Cloud providers can expose a “multi‑fidelity” API where users specify a budget; Kometo automatically decides how many low‑cost evaluations vs. expensive ones to run, improving SLA compliance.
  • Open‑source integration: Because Kometo needs only black‑box calls and a cost counter, it can be wrapped around any existing optimizer (Bayesian, evolutionary) as a meta‑controller, making adoption low‑friction.
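The meta-controller idea from the last bullet can be sketched as a thin budget-metering wrapper around arbitrary black-box oracles. This is a hypothetical interface invented for illustration, not an API from the paper or any released library:

```python
class BudgetedOracle:
    """Hypothetical wrapper: meters query cost across fidelity levels so a
    Kometo-style scheduler (or any optimizer) can enforce a total budget."""

    def __init__(self, oracles, budget):
        # oracles: fidelity id -> (callable f(x), cost per call)
        self.oracles = oracles
        self.budget = budget
        self.spent = 0.0

    def remaining(self):
        return self.budget - self.spent

    def __call__(self, x, fid):
        f, cost = self.oracles[fid]
        if self.spent + cost > self.budget:
            raise RuntimeError("budget exhausted")
        self.spent += cost
        return f(x)
```

Because the wrapper only needs a callable and a cost per fidelity, it can sit between an existing optimizer (Bayesian, evolutionary) and the objective without modifying either side.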

Limitations & Future Work

  • Assumption of known cost‑to‑bias monotonicity: The theory requires that higher cost always yields lower bias, which may not hold for all heuristics (e.g., stochastic early‑stopping).
  • Scalability to very high dimensions: The hierarchical partitioning suffers from exponential growth in the number of cells; combining Kometo with dimensionality reduction or random embedding techniques is an open direction.
  • Non‑stationary objectives: The current analysis assumes a fixed objective; extending to drifting or time‑varying functions (online settings) would broaden applicability.
  • Real‑world deployment studies: While synthetic and moderate‑scale experiments are promising, large‑scale industrial case studies (e.g., hyper‑parameter tuning for deep nets) are needed to validate the overhead and robustness in production pipelines.

Authors

  • Côme Fiegel
  • Victor Gabillon
  • Michal Valko

Paper Information

  • arXiv ID: 2604.16239v1
  • Categories: stat.ML, cs.LG
  • Published: April 17, 2026