[Paper] Landscape-aware Automated Algorithm Design: An Efficient Framework for Real-world Optimization

Published: February 4, 2026 at 08:18 AM EST
4 min read
Source: arXiv - 2602.04529v1

Overview

The paper introduces a landscape‑aware automated algorithm design framework that lets developers harness Large Language Models (LLMs) to invent optimization algorithms without repeatedly running costly real‑world evaluations. By pairing a genetic‑programming function generator with an LLM‑driven evolutionary designer, the authors create cheap proxy problems that mimic the “shape” of the target problem, dramatically cutting the computational budget while still delivering high‑performing solvers.

Key Contributions

  • Decoupled discovery pipeline: Separates algorithm synthesis from expensive problem evaluations, enabling massive exploration of the algorithmic space at low cost.
  • Landscape similarity guiding: Uses statistical descriptors of problem landscapes (e.g., modality, ruggedness, separability) to match generated proxy functions to real‑world problems, ensuring transferability of discovered algorithms.
  • Hybrid GP + LLM engine: A Genetic Programming (GP) module creates diverse test functions, while an LLM‑based evolutionary algorithm designer selects and refines algorithmic components (selection, mutation, crossover, etc.).
  • Empirical validation on real‑world benchmarks: Demonstrates up to 70 % reduction in expensive evaluations while achieving comparable or better solution quality than baseline LLM‑only or hand‑crafted algorithms.
  • Open‑source prototype: Provides a reusable Python library that integrates popular LLM APIs (e.g., OpenAI, Anthropic) with DEAP‑based GP and landscape analysis tools.

Methodology

  1. Landscape Characterisation – For each target optimization problem, the authors compute a set of inexpensive landscape metrics (e.g., fitness‑distance correlation, autocorrelation length, number of local optima).
  2. Proxy Function Generation (GP) – A GP system evolves mathematical functions whose landscape metrics are close to those of the target problem. The fitness of a GP individual is the distance between its metrics and the target’s metrics.
  3. LLM‑Driven Algorithm Designer – An LLM (prompted with the proxy function’s description) proposes a candidate optimization algorithm expressed in a domain‑specific language (e.g., a JSON schema of operators and parameters).
  4. Evolutionary Loop – The candidate algorithms are evaluated only on the cheap proxy functions. Their performance feeds back into a standard evolutionary loop that mutates and recombines algorithmic components.
  5. Final Validation – After a predefined budget of proxy evaluations, the best‑performing algorithm is run on the real problem once (or a few times) to confirm its quality.
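Steps 1–2 above can be sketched in a few lines. The metric set below (fitness‑distance correlation plus fitness spread) and the uniform sampling scheme are illustrative stand‑ins, not the paper's actual descriptor suite:

```python
import numpy as np

def fitness_distance_correlation(samples, fitnesses):
    """FDC: correlation between a point's fitness and its distance
    to the best sampled point (assuming minimisation)."""
    best = samples[np.argmin(fitnesses)]
    dists = np.linalg.norm(samples - best, axis=1)
    return np.corrcoef(dists, fitnesses)[0, 1]

def landscape_metrics(fn, dim, n_samples=500, seed=0):
    """Cheap metric vector from random sampling (illustrative subset)."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(-5, 5, size=(n_samples, dim))
    y = np.apply_along_axis(fn, 1, X)
    return np.array([fitness_distance_correlation(X, y), np.std(y)])

def proxy_fitness(candidate_fn, target_metrics, dim):
    """GP fitness for step 2: distance between a candidate function's
    metric vector and the target's metric vector (lower is better)."""
    return np.linalg.norm(landscape_metrics(candidate_fn, dim) - target_metrics)

# Example: how landscape-similar is a Rastrigin-like proxy to a sphere target?
sphere = lambda x: float(np.sum(x**2))
target = landscape_metrics(sphere, dim=5)
rastrigin = lambda x: float(np.sum(x**2 - 10 * np.cos(2 * np.pi * x) + 10))
print(proxy_fitness(rastrigin, target, dim=5))
```

In the actual framework this distance would drive a DEAP‑based GP search over symbolic expressions; here it only scores two fixed functions.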

The whole pipeline runs like a “sandbox”: developers can experiment with thousands of algorithmic ideas without ever touching the expensive real‑world objective.

Results & Findings

Benchmark                                 Real‑world evaluations saved   Best‑found solution quality*
Portfolio optimisation (10k assets)       68 %                           1.02 × baseline (hand‑tuned GA)
Hyper‑parameter tuning for a deep net     71 %                           0.98 × baseline (grid search)
Scheduling for a manufacturing line       65 %                           1.05 × baseline (commercial solver)

*Quality expressed as a ratio to the best known baseline (lower is better for minimisation).

  • Proxy fidelity: Landscape‑matched proxies yielded a 0.92 correlation between proxy performance and real‑world performance, far higher than random proxies (≈0.45).
  • Algorithm diversity: The framework discovered novel hybrid algorithms (e.g., differential evolution with adaptive mutation guided by LLM‑suggested heuristics) that were not present in the initial search space.
  • Speed: End‑to‑end runtime on a single GPU node dropped from ~48 h (full LLM‑only search) to ~14 h, mainly due to the cheap proxy evaluations.
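The proxy‑fidelity figure above is a correlation between candidate scores on the proxy and on the real objective; a faithful proxy ranks algorithms the same way both do. A minimal version of that check, with made‑up scores (lower is better):

```python
import numpy as np

# Hypothetical performance of six candidate algorithms on the cheap proxy
# vs. the real objective; the values below are illustrative, not the paper's.
proxy_scores = np.array([0.31, 0.12, 0.55, 0.20, 0.44, 0.08])
real_scores = np.array([0.35, 0.15, 0.60, 0.18, 0.41, 0.10])

# Pearson correlation close to 1 means proxy results transfer to the
# real problem; random proxies would sit far lower.
fidelity = np.corrcoef(proxy_scores, real_scores)[0, 1]
print(round(fidelity, 3))
```

The paper reports this kind of fidelity at 0.92 for landscape‑matched proxies versus roughly 0.45 for random ones.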

Practical Implications

  • Cost‑effective AutoML/AutoOpt: Teams can now let LLMs “design” custom optimizers for niche problems (e.g., supply‑chain routing, energy grid balancing) without burning cloud credits on thousands of trial runs.
  • Rapid prototyping: Developers can iterate on algorithmic ideas in minutes, test them on realistic proxies, and only commit to a full run once they have a high‑confidence candidate.
  • Integration with CI/CD: The open‑source library can be scripted into CI pipelines, automatically generating and benchmarking new solvers whenever a problem definition changes.
  • Domain‑specific solver libraries: Companies can build a catalog of LLM‑crafted optimizers tuned to their own data distributions, delivering “plug‑and‑play” performance boosts for downstream services.

Limitations & Future Work

  • Landscape metric selection: The current set of descriptors may not capture all nuances of highly constrained or discrete problems, limiting proxy fidelity in those domains.
  • LLM dependence: Quality of generated algorithms hinges on prompt engineering and LLM capabilities; newer models may be required for more complex designs.
  • Scalability of GP proxy generation: While cheap, GP still incurs overhead for very high‑dimensional landscapes; the authors suggest exploring surrogate‑based or neural landscape generators.
  • Future directions: Extending the framework to multi‑objective settings, incorporating reinforcement‑learning‑based LLM agents, and automating the metric‑selection process via meta‑learning.

Authors

  • Haoran Yin
  • Shuaiqun Pan
  • Zhao Wei
  • Jian Cheng Wong
  • Yew‑Soon Ong
  • Anna V. Kononova
  • Thomas Bäck
  • Niki van Stein

Paper Information

  • arXiv ID: 2602.04529v1
  • Categories: cs.NE
  • Published: February 4, 2026