[Paper] Learning to Evolve with Convergence Guarantee via Neural Unrolling
Source: arXiv - 2512.11453v1
Overview
The paper presents Learning to Evolve (L2E), a new meta‑optimization framework that teaches an evolutionary algorithm how to search while still offering a mathematical convergence guarantee. By casting the evolutionary process as a neural unrolling operation rooted in Krasnosel’skii‑Mann fixed‑point theory, the authors bridge the gap between the flexibility of learned optimizers and the reliability of classic theory‑driven methods.
Key Contributions
- Bilevel meta‑optimization formulation that treats evolutionary search as a neural unrolled operator, enabling end‑to‑end learning of search dynamics.
- Contractive inner‑loop operator built on a structured Mamba‑style neural network, guaranteeing a convergent search trajectory.
- Composite solver that blends learned global proposals with local proxy‑gradient steps, balancing exploration and exploitation.
- Provable convergence under Krasnosel’skii‑Mann fixed‑point theory, a rare property for data‑driven optimizers.
- Extensive empirical validation showing zero‑shot generalization to high‑dimensional synthetic benchmarks and real‑world control tasks, demonstrating scalability and robustness.
Methodology
- Neural Unrolling as Evolutionary Search – Each iteration of an evolutionary algorithm is reinterpreted as a layer of a deep network. The network (the operator) receives a population, applies learned transformations, and outputs the next population.
- Inner Loop (Contractive Dynamics) – A Mamba‑inspired neural operator is constrained to be contractive: the distance between successive populations shrinks, mathematically ensuring convergence to a fixed point (a minimal contractive step is sketched after this list).
- Outer Loop (Meta‑learning) – A higher‑level optimizer adjusts the parameters of the neural operator so that its fixed point aligns with the minimizer of the target objective. This forms a classic bilevel problem: the inner loop solves a fixed‑point equation, the outer loop tunes the operator (see the toy bilevel loop after this list).
- Composite Solver – At each step, the algorithm mixes two signals (a minimal version is sketched after this list):
  - Learned evolutionary proposals (global, data‑driven moves).
  - Proxy‑gradient steps (local refinements derived from a differentiable surrogate of the objective).
  The mixing ratio is itself learned, allowing adaptive control of exploration versus exploitation.
- Convergence Proof – Grounding the operator in Krasnosel’skii‑Mann theory, the authors show that, under mild assumptions, the unrolled process converges to a fixed point regardless of the learned parameters, providing a safety net that most learned optimizers lack (the classical iteration is restated below).
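For reference, the classical Krasnosel’skii‑Mann iteration that the convergence argument builds on takes the following textbook form (a generic statement of the theory the paper invokes, not necessarily the authors’ exact notation):

```latex
x_{k+1} = (1 - \lambda_k)\, x_k + \lambda_k\, T(x_k),
\qquad \lambda_k \in (0, 1), \quad \sum_{k \ge 0} \lambda_k (1 - \lambda_k) = \infty
```

where T is nonexpansive, i.e. ||T(x) − T(y)|| ≤ ||x − y||, and has at least one fixed point; under these conditions the iterates converge to a fixed point of T. L2E constrains its learned operator so that this premise holds for any parameter setting.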
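A minimal sketch of one way such a contractive step can be enforced, assuming the simplest mechanism (rescaling a weight matrix by its spectral norm so the layer’s Lipschitz constant stays below 1); the paper’s structured Mamba‑style operator is considerably richer, and the names below (ContractiveStep, km_iterate) are illustrative, not the authors’ code:

```python
import numpy as np

rng = np.random.default_rng(0)

class ContractiveStep:
    """One learned update layer constrained to be a contraction.

    T(x) = rho * tanh((W x + b) / ||W||_2) has Lipschitz constant
    <= rho < 1, since tanh is 1-Lipschitz and the affine part is
    rescaled by the spectral norm of W.
    """

    def __init__(self, dim, rho=0.9):
        self.W = rng.standard_normal((dim, dim)) / np.sqrt(dim)
        self.b = np.zeros(dim)
        self.rho = rho

    def __call__(self, x):
        spec = np.linalg.norm(self.W, 2)  # largest singular value
        return self.rho * np.tanh((self.W @ x + self.b) / spec)


def km_iterate(T, x0, lam=0.5, steps=200):
    """Krasnoselskii-Mann averaging: x <- (1 - lam) x + lam T(x)."""
    x = x0
    for _ in range(steps):
        x = (1.0 - lam) * x + lam * T(x)
    return x


T = ContractiveStep(dim=8)
x_star = km_iterate(T, rng.standard_normal(8))
print("fixed-point residual:", np.linalg.norm(x_star - T(x_star)))
```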
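In the same hedged spirit, a sketch of the composite update on a toy quadratic: a learned global proposal and a local proxy‑gradient step are blended by a mixing coefficient beta. In L2E the mixing ratio is itself learned; a constant is used here only for brevity, and both callables are placeholders for the paper’s components:

```python
import numpy as np

def composite_step(x, learned_proposal, proxy_grad, beta=0.5, lr=0.1):
    """Blend a global, data-driven move with a local gradient refinement.

    beta near 1 favours exploitation (the gradient step); beta near 0
    favours exploration (the learned proposal).
    """
    global_move = learned_proposal(x)
    local_move = x - lr * proxy_grad(x)
    return (1.0 - beta) * global_move + beta * local_move

# Toy instantiation on the surrogate f(x) = 0.5 * ||x||^2.
rng = np.random.default_rng(1)
proposal = lambda x: 0.8 * x + 0.05 * rng.standard_normal(x.shape)  # noisy global move
grad = lambda x: x                                                  # proxy gradient of f

x = rng.standard_normal(5)
for _ in range(100):
    x = composite_step(x, proposal, grad)
print("final |x|:", np.linalg.norm(x))
```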
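Finally, a toy PyTorch sketch of the bilevel structure: the inner loop unrolls a nonexpansive operator with Krasnosel’skii‑Mann averaging, and the outer loop backpropagates a meta‑loss through the unrolled iterates to tune the operator’s parameters. The affine‑plus‑tanh operator and quadratic meta‑loss are stand‑ins, not the paper’s architecture:

```python
import torch

dim, inner_steps = 4, 10

# Toy inner-operator parameters; a stand-in for the Mamba-style network.
W = torch.randn(dim, dim, requires_grad=True)
b = torch.zeros(dim, requires_grad=True)

def inner_op(x):
    # Dividing by the spectral norm keeps the affine part 1-Lipschitz;
    # the 0.9 factor then makes the whole step contractive.
    return 0.9 * torch.tanh((x @ W.T + b) / torch.linalg.matrix_norm(W, 2))

def meta_loss(x):
    # Outer objective: a toy quadratic standing in for the task loss.
    return 0.5 * (x ** 2).sum()

opt = torch.optim.Adam([W, b], lr=1e-2)
for _ in range(200):
    x = torch.randn(dim)              # fresh initial point each episode
    for _ in range(inner_steps):      # unrolled inner loop (KM averaging)
        x = 0.5 * x + 0.5 * inner_op(x)
    loss = meta_loss(x)               # evaluated at the unrolled endpoint
    opt.zero_grad()
    loss.backward()                   # differentiate through the unrolling
    opt.step()
print("meta-trained loss:", float(loss))
```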
Results & Findings
- Scalability: L2E successfully optimizes problems with up to 10,000 dimensions, outperforming traditional evolutionary algorithms (CMA‑ES, DE) and recent learned optimizers in both solution quality and wall‑clock time.
- Zero‑Shot Generalization: Models trained on synthetic functions (e.g., Rastrigin, Ackley) transferred directly to unseen control environments (e.g., cart‑pole, robotic arm) without fine‑tuning, achieving lower cumulative regret than baselines.
- Exploration‑Exploitation Balance: Ablation studies reveal that the adaptive mixing is crucial; removing the gradient component leads to premature convergence, while removing learned proposals stalls progress in multimodal landscapes.
- Robustness: Across 30 random seeds, L2E’s performance variance is significantly lower than that of pure evolutionary baselines, indicating more predictable behavior.
Practical Implications
- Plug‑and‑Play Optimizer: Developers can drop the learned L2E module into existing pipelines (hyper‑parameter tuning, neural architecture search, reinforcement‑learning policy optimization) and gain both adaptability and a convergence guarantee.
- Reduced Engineering Overhead: Because L2E learns a generic search manifold, teams no longer need to hand‑craft problem‑specific heuristics or spend weeks tuning evolutionary hyper‑parameters.
- Safety‑Critical Systems: The provable convergence makes L2E a viable candidate for domains where unpredictable optimizer behavior is unacceptable (e.g., autonomous vehicle control, finance).
- Accelerated Research: Researchers can train L2E on modest synthetic suites and then reuse the same model for a wide range of downstream tasks, cutting down experimental cycles.
Limitations & Future Work
- Assumption of Contractivity: Enforcing a strictly contractive operator may limit the expressiveness of the learned dynamics, potentially hindering performance on highly non‑convex or discontinuous landscapes.
- Meta‑Training Cost: The bilevel training process is computationally intensive; scaling to extremely large datasets or real‑time adaptation remains an open challenge.
- Proxy Gradient Quality: The method relies on a differentiable surrogate of the objective; when such a proxy is poor or unavailable, the local refinement component may degrade.
- Future Directions: The authors suggest exploring adaptive contractivity constraints, incorporating richer surrogate models (e.g., learned physics simulators), and extending the framework to multi‑objective or constrained optimization scenarios.
Authors
- Jiaxin Gao
- Yaohua Liu
- Ran Cheng
- Kay Chen Tan
Paper Information
- arXiv ID: 2512.11453v1
- Categories: cs.NE
- Published: December 12, 2025
- PDF: https://arxiv.org/pdf/2512.11453v1