[Paper] Learning to Evolve with Convergence Guarantee via Neural Unrolling

Published: December 12, 2025 at 05:46 AM EST
4 min read
Source: arXiv - 2512.11453v1

Overview

The paper presents Learning to Evolve (L2E), a new meta‑optimization framework that teaches an evolutionary algorithm how to search while still offering a mathematical convergence guarantee. By casting the evolutionary process as a neural unrolling operation rooted in Krasnosel’skii‑Mann fixed‑point theory, the authors bridge the gap between the flexibility of learned optimizers and the reliability of classic theory‑driven methods.

Key Contributions

  • Bilevel meta‑optimization formulation that treats evolutionary search as a neural unrolled operator, enabling end‑to‑end learning of search dynamics.
  • Contractive inner‑loop operator built on a structured Mamba‑style neural network, guaranteeing a strictly convergent trajectory.
  • Composite solver that blends learned global proposals with local proxy‑gradient steps, balancing exploration and exploitation.
  • Provable convergence under Krasnosel’skii‑Mann fixed‑point theory, a rare property for data‑driven optimizers.
  • Extensive empirical validation showing zero‑shot generalization to high‑dimensional synthetic benchmarks and real‑world control tasks, demonstrating scalability and robustness.
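
For reference, the Krasnosel'skii‑Mann iteration underpinning the guarantee has the classical form (generic notation, not necessarily the paper's):

```latex
x_{k+1} = (1 - \lambda_k)\, x_k + \lambda_k\, T(x_k), \qquad \lambda_k \in (0, 1)
```

where \(T\) is a nonexpansive operator; in the standard Hilbert‑space setting, the iterates converge to a fixed point of \(T\) whenever one exists and \(\sum_k \lambda_k (1 - \lambda_k) = \infty\).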

Methodology

  1. Neural Unrolling as Evolutionary Search – Each iteration of an evolutionary algorithm is reinterpreted as a layer of a deep network. The network (the operator) receives a population, applies learned transformations, and outputs the next population.
  2. Inner Loop (Contractive Dynamics) – A Mamba‑inspired neural operator is constrained to be contractive: the distance between successive populations shrinks, mathematically ensuring convergence to a fixed point.
  3. Outer Loop (Meta‑learning) – A higher‑level optimizer adjusts the parameters of the neural operator so that its fixed point aligns with the minimizer of the target objective. This forms a classic bilevel problem: the inner loop solves a fixed‑point equation, the outer loop tunes the operator.
  4. Composite Solver – At each step, the algorithm mixes two signals:
    • Learned evolutionary proposals (global, data‑driven moves).
    • Proxy gradient steps (local refinements derived from a differentiable surrogate of the objective).
      The mixing ratio is itself learned, allowing adaptive control of exploration versus exploitation.
  5. Convergence Proof – Grounding the operator in Krasnosel’skii‑Mann theory, the authors show that, under mild assumptions, the unrolled process converges to a fixed point regardless of the learned parameters, providing a safety net that most learned optimizers lack.
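
Steps 2–5 can be condensed into a toy numerical sketch. The operator below is a hand‑written contraction standing in for the paper's learned Mamba‑style network, and all names (`toy_operator`, `km_step`, `composite_step`, the mixing ratio `beta`) are illustrative, not the authors' API:

```python
import numpy as np

def toy_operator(x, target, contraction=0.5):
    """Strictly contractive map pulling x toward `target` (a stand-in
    for the learned population-update operator)."""
    return target + contraction * (x - target)

def km_step(x, operator, lam=0.5):
    """Krasnosel'skii-Mann averaging: x_{k+1} = (1-lam)*x_k + lam*T(x_k).
    Converges to a fixed point for any nonexpansive T and lam in (0, 1)."""
    return (1.0 - lam) * x + lam * operator(x)

def proxy_grad_step(x, grad_fn, lr=0.1):
    """Local refinement using a differentiable surrogate's gradient."""
    return x - lr * grad_fn(x)

def composite_step(x, operator, grad_fn, beta=0.7, lam=0.5):
    """Blend the global learned proposal with the local proxy-gradient
    step; in L2E the mixing ratio (beta here) is itself learned."""
    global_move = km_step(x, operator, lam=lam)
    local_move = proxy_grad_step(x, grad_fn)
    return beta * global_move + (1.0 - beta) * local_move

# Demo: minimize f(x) = ||x - 3||^2, whose minimizer is also the fixed
# point of the toy operator, so both signals agree on the destination.
target = np.full(5, 3.0)
grad_fn = lambda x: 2.0 * (x - target)        # gradient of the surrogate
operator = lambda x: toy_operator(x, target)

x = np.zeros(5)
for _ in range(50):
    x = composite_step(x, operator, grad_fn)

print(np.allclose(x, target, atol=1e-4))  # → True: iterates contract to the fixed point
```

Because every ingredient here is a contraction, the composite update shrinks the distance to the fixed point by a constant factor per step, which is the intuition behind the paper's convergence claim.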

Results & Findings

  • Scalability: L2E successfully optimizes problems with up to 10,000 dimensions, outperforming traditional evolutionary algorithms (CMA‑ES, differential evolution) and recent learned optimizers in both solution quality and wall‑clock time.
  • Zero‑Shot Generalization: Models trained on synthetic functions (e.g., Rastrigin, Ackley) transferred directly to unseen control environments (e.g., cart‑pole, robotic arm) without fine‑tuning, achieving lower cumulative regret than baselines.
  • Exploration‑Exploitation Balance: Ablation studies reveal that the adaptive mixing is crucial; removing the gradient component leads to premature convergence, while removing learned proposals stalls progress in multimodal landscapes.
  • Robustness: Across 30 random seeds, L2E’s performance variance is significantly lower than that of pure evolutionary baselines, indicating more predictable behavior.

Practical Implications

  • Plug‑and‑Play Optimizer: Developers can drop the learned L2E module into existing pipelines (hyper‑parameter tuning, neural architecture search, reinforcement‑learning policy optimization) and gain both adaptability and a convergence guarantee.
  • Reduced Engineering Overhead: Because L2E learns a generic search manifold, teams no longer need to hand‑craft problem‑specific heuristics or spend weeks tuning evolutionary hyper‑parameters.
  • Safety‑Critical Systems: The provable convergence makes L2E a viable candidate for domains where unpredictable optimizer behavior is unacceptable (e.g., autonomous vehicle control, finance).
  • Accelerated Research: Researchers can train L2E on modest synthetic suites and then reuse the same model for a wide range of downstream tasks, cutting down experimental cycles.
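
A plug‑and‑play integration might follow the ask/tell pattern common to black‑box optimizers. The `L2EOptimizer` class and its methods below are hypothetical placeholders (the paper does not specify a released API), and the internal update is a trivial random‑search stand‑in rather than the learned operator:

```python
import numpy as np

class L2EOptimizer:
    """Hypothetical ask/tell wrapper; the update rule is a toy stand-in,
    not the authors' learned L2E operator."""

    def __init__(self, dim, popsize=16, sigma=0.3, seed=0):
        self.rng = np.random.default_rng(seed)
        self.mean = np.zeros(dim)
        self.popsize = popsize
        self.sigma = sigma

    def ask(self):
        """Propose a population of candidate solutions."""
        noise = self.rng.standard_normal((self.popsize, self.mean.size))
        return self.mean + self.sigma * noise

    def tell(self, candidates, fitnesses):
        """Update internal state from evaluated candidates (here: move
        halfway toward the best one; L2E would apply its learned operator)."""
        best = candidates[int(np.argmin(fitnesses))]
        self.mean = 0.5 * self.mean + 0.5 * best
        self.sigma *= 0.95  # shrink the search radius over time

# Usage: tune a 4-dimensional parameter vector against a dummy loss.
def loss(x):
    return float(np.sum((x - 1.0) ** 2))

opt = L2EOptimizer(dim=4, seed=42)
for _ in range(100):
    pop = opt.ask()
    opt.tell(pop, [loss(c) for c in pop])

print(loss(opt.mean))  # typically close to 0
```

Any pipeline already built around an ask/tell loop (hyper‑parameter tuning, architecture search, policy optimization) could in principle swap in such a module without restructuring its evaluation code.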

Limitations & Future Work

  • Assumption of Contractivity: Enforcing a strictly contractive operator may limit the expressiveness of the learned dynamics, potentially hindering performance on highly non‑convex or discontinuous landscapes.
  • Meta‑Training Cost: The bilevel training process is computationally intensive; scaling to extremely large datasets or real‑time adaptation remains an open challenge.
  • Proxy Gradient Quality: The method relies on a differentiable surrogate of the objective; when such a proxy is poor or unavailable, the local refinement component may degrade.
  • Future Directions: The authors suggest exploring adaptive contractivity constraints, incorporating richer surrogate models (e.g., learned physics simulators), and extending the framework to multi‑objective or constrained optimization scenarios.

Authors

  • Jiaxin Gao
  • Yaohua Liu
  • Ran Cheng
  • Kay Chen Tan

Paper Information

  • arXiv ID: 2512.11453v1
  • Categories: cs.NE
  • Published: December 12, 2025