[Paper] EvoLattice: Persistent Internal-Population Evolution through Multi-Alternative Quality-Diversity Graph Representations for LLM-Guided Program Discovery
Source: arXiv - 2512.13857v1
Overview
EvoLattice is a framework that lets large language models (LLMs) evolve whole populations of programs, or even multi‑agent behaviors, inside a single directed acyclic graph (DAG). Because each node stores multiple persistent alternatives, every valid path through the graph is a distinct, runnable candidate; the search space expands combinatorially while the underlying structure stays compact and reusable.
Key Contributions
- Graph‑based population encoding – Represents an entire candidate pool in one DAG, avoiding the “single‑candidate overwrite” limitation of prior LLM‑guided evolution.
- Multi‑alternative nodes – Each node holds several interchangeable code fragments (or prompt pieces), yielding a combinatorial number of candidates without duplicating shared code (see the sketch after this list).
- Alternative‑level evaluation – Scores are collected for every alternative across all paths it participates in, yielding fine‑grained statistics about local design choices and their global impact.
- Deterministic self‑repair – A built‑in mechanism guarantees acyclicity and dependency consistency, automatically fixing structural errors introduced by the LLM.
- Implicit quality‑diversity dynamics – The multi‑alternative representation naturally drives both performance improvement and behavioral diversity, without needing an external archive.
- Unified program & agent evolution – The same graph can host source‑code snippets or prompt fragments, making the approach applicable to program synthesis, optimizer meta‑learning, and multi‑agent system design.
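The multi‑alternative encoding is easiest to see in code. The sketch below is illustrative rather than the paper's actual API: the `Node` class, the field names, and the toy fragments are my assumptions; only the idea that nodes hold interchangeable alternatives and every path‑plus‑choice combination is a runnable candidate comes from the paper.

```python
from dataclasses import dataclass, field
from itertools import product
from typing import Dict, List

@dataclass
class Node:
    """One DAG node holding several interchangeable code fragments."""
    name: str
    alternatives: List[str]                             # candidate fragments
    children: List[str] = field(default_factory=list)   # names of child nodes

def paths(nodes: Dict[str, Node], name: str) -> List[List[str]]:
    """All root-to-leaf paths through the DAG, as lists of node names."""
    node = nodes[name]
    if not node.children:
        return [[name]]
    return [[name] + rest
            for child in node.children
            for rest in paths(nodes, child)]

def candidates(nodes: Dict[str, Node], root: str) -> List[str]:
    """One runnable program per (path, alternative-choice) combination."""
    out = []
    for path in paths(nodes, root):
        alts = [nodes[n].alternatives for n in path]
        out.extend("\n".join(choice) for choice in product(*alts))
    return out

# A 3-node chain with 2, 2, and 1 alternatives -> 2 * 2 * 1 = 4 candidates.
graph = {
    "init": Node("init", ["x = 0", "x = 1"], children=["loop"]),
    "loop": Node("loop", ["for i in range(3):\n    x += i",
                          "x += sum(range(3))"], children=["ret"]),
    "ret":  Node("ret",  ["print(x)"]),
}
for program in candidates(graph, "init"):
    print(program, "\n---")
```

Shared fragments (here the `print(x)` leaf) exist once in the graph but appear in all four candidates, which is exactly the no‑duplication property the contribution list describes.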
Methodology
1. Graph Construction – Start with a root node representing an empty program. The LLM proposes new alternatives (e.g., a function definition, a loop, a prompt fragment) that are added as child nodes.
2. Path Extraction – Any acyclic path from the root to a leaf spells out a complete program/agent. Because each node may hold several alternatives, the number of possible candidates grows combinatorially (a chain of ten nodes with three alternatives each already encodes 3^10 ≈ 59,000 distinct candidates).
3. Evaluation Loop (see the sketch after this list) –
   - Execute each candidate path (or a sampled subset for scalability).
   - Record performance metrics (e.g., test‑case pass rate, reward in a simulated environment).
   - Propagate the scores back to every alternative that appeared in the evaluated paths, aggregating statistics such as mean, variance, and contribution to success.
4. LLM‑Guided Mutation & Recombination – The aggregated statistics become a dense feedback signal. The LLM is prompted to:
   - Mutate low‑scoring alternatives (replace or tweak code).
   - Recombine high‑scoring alternatives from different branches to create new paths.
5. Pruning & Self‑Repair – Alternatives that consistently underperform are pruned. The self‑repair routine checks the DAG for cycles or broken dependencies and automatically restructures it, ensuring every path remains executable.
6. Iterative Evolution – Steps 2–5 repeat for a fixed number of generations or until a performance threshold is met.
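A minimal sketch of the evaluation loop (step 3) and pruning (step 5). It assumes each candidate carries the IDs of the alternatives on its path; the `fitness` callable, the aggregation choices, and the pruning threshold are illustrative stand‑ins, not details taken from the paper.

```python
import statistics
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Set, Tuple

def evaluate(cands: Iterable[Tuple[str, List[str]]],
             fitness: Callable[[str], float]) -> Dict[str, dict]:
    """Score candidates, then propagate each path score back to every
    alternative that appeared on that path (alternative-level evaluation)."""
    per_alt: Dict[str, List[float]] = defaultdict(list)
    for program, alt_ids in cands:
        score = fitness(program)          # e.g. test-case pass rate or reward
        for alt in alt_ids:
            per_alt[alt].append(score)
    return {alt: {"mean": statistics.mean(s),
                  "var": statistics.pvariance(s),
                  "n": len(s)}
            for alt, s in per_alt.items()}

def prune(stats: Dict[str, dict],
          floor: float = 0.2, min_trials: int = 3) -> Set[str]:
    """Alternatives that consistently underperform become pruning targets."""
    return {alt for alt, st in stats.items()
            if st["n"] >= min_trials and st["mean"] < floor}

# Toy usage: two candidates sharing the alternative "loop:1".
def fitness(program: str) -> float:
    return 1.0 if "sum" in program else 0.3   # placeholder metric

stats = evaluate([("x = 0\nx += sum(range(3))", ["init:0", "loop:1"]),
                  ("x = 1\nx += sum(range(3))", ["init:1", "loop:1"])],
                 fitness)
print(stats["loop:1"])   # {'mean': 1.0, 'var': 0.0, 'n': 2}
```

These per‑alternative statistics are the dense feedback signal handed to the LLM in step 4: a low mean with low variance marks a fragment to mutate, while a high mean across many paths marks one to recombine.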
Results & Findings
| Benchmark | Baseline (single‑candidate LLM) | EvoLattice | Observations |
|---|---|---|---|
| Program synthesis (synthetic tasks) | 62 % success after 30 generations | 78 % success | Faster convergence, fewer catastrophic regressions |
| Optimizer meta‑learning | 0.71 average reward | 0.84 average reward | More stable improvement curve, less variance |
| Multi‑agent prompt composition | 48 % task completion | 66 % task completion | Emergent diversity of strategies without explicit archive |
- Stability: EvoLattice’s self‑repair prevented crashes that plagued overwrite‑based methods, resulting in smoother learning curves.
- Expressivity: The combinatorial path space allowed the discovery of solutions that combined previously unrelated code fragments, something single‑candidate approaches could never explore.
- Implicit QD behavior: Diversity metrics (e.g., number of distinct functional behaviors) rose naturally as alternatives diversified, mirroring explicit quality‑diversity algorithms (one possible proxy metric is sketched below).
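The summary does not spell out which diversity metric was used. One common proxy, assumed here rather than taken from the paper, is to fingerprint each candidate by its observable outputs on a fixed set of probe inputs and count distinct fingerprints:

```python
import contextlib
import io
from typing import Iterable, Optional, Tuple

def behavior_signature(program: str, probes: Iterable[int]) -> Optional[Tuple]:
    """Fingerprint a candidate by what it prints for each probe value bound
    to `x0`. Purely illustrative; a real system would sandbox execution."""
    outputs = []
    for x0 in probes:
        buf = io.StringIO()
        try:
            with contextlib.redirect_stdout(buf):
                exec(program, {"x0": x0})
        except Exception:
            return None                 # crashing candidates have no behavior
        outputs.append(buf.getvalue())
    return tuple(outputs)

def distinct_behaviors(programs: Iterable[str], probes=range(3)) -> int:
    sigs = {behavior_signature(p, probes) for p in programs}
    sigs.discard(None)
    return len(sigs)

print(distinct_behaviors(["print(x0 + 1)", "print(x0 * 2)", "print(x0+1)"]))  # 2
```

Syntactically different fragments that compute the same function collapse to one behavior, so the count tracks functional rather than textual diversity.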
Practical Implications
- Scalable code generation pipelines: Teams can integrate EvoLattice into CI/CD to continuously evolve utility scripts, configuration generators, or domain‑specific languages while preserving useful building blocks.
- Robust agent design: In reinforcement‑learning or chatbot contexts, developers can evolve prompt libraries or sub‑policies that automatically recombine, yielding more adaptable agents.
- Reduced LLM waste: Because alternatives are persisted and re‑used, each LLM call contributes a fragment that can appear in many candidate paths, lowering API costs compared to regenerating complete programs every iteration.
- Debug‑friendly evolution: The deterministic self‑repair gives developers confidence that generated candidates are at least structurally consistent and executable, simplifying downstream testing and deployment (see the sketch after this list).
- Plug‑and‑play with existing tools: EvoLattice’s DAG can be exported to common graph formats (e.g., GraphML, DOT) and visualized, making it compatible with version‑control diff tools and code‑review workflows.
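The summary describes self‑repair as deterministic but not how it works. The sketch below, reusing the `Node` structure from the earlier example, shows one plausible realization under my own assumptions: drop edges to missing nodes, then walk the graph in sorted order and discard any back‑edge that would close a cycle, so the same broken graph is always repaired the same way.

```python
from typing import Dict

def self_repair(nodes: Dict[str, "Node"]) -> None:
    """Restore dependency consistency and acyclicity in place (illustrative)."""
    # 1. Dependency consistency: remove edges pointing at missing nodes.
    for node in nodes.values():
        node.children = [c for c in node.children if c in nodes]

    # 2. Acyclicity: depth-first walk; an edge into a node still on the
    #    current stack would close a cycle, so it is deterministically dropped.
    on_stack, finished = set(), set()

    def visit(name: str) -> None:
        on_stack.add(name)
        kept = []
        for child in sorted(nodes[name].children):   # sorted => reproducible
            if child in on_stack:
                continue                             # back-edge: drop it
            if child not in finished:
                visit(child)
            kept.append(child)
        nodes[name].children = kept
        on_stack.discard(name)
        finished.add(name)

    for name in sorted(nodes):
        if name not in finished:
            visit(name)
```

Any repair policy would do as long as it is deterministic; reproducibility is what lets the evolution loop trust that every extracted path stays executable.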
Limitations & Future Work
- Scalability of exhaustive evaluation: While sampling mitigates the combinatorial blow‑up, very large graphs still require careful budget allocation; smarter path‑selection heuristics are needed.
- LLM dependence: The quality of mutations hinges on the underlying model; weaker LLMs may produce many low‑utility alternatives, increasing pruning overhead.
- Domain specificity: The current implementation focuses on imperative code and prompt fragments; extending to functional languages, hardware description languages, or graphics shaders may need custom node semantics.
- User‑guided constraints: Future work could expose APIs for developers to inject hard constraints (e.g., security policies) directly into the graph, guiding the evolution toward compliant solutions.
EvoLattice opens a promising avenue for turning LLMs into true evolutionary engineers—preserving what works, exploring what could work, and doing it all within a single, self‑healing graph structure. As the community builds richer evaluation metrics and tighter integration with development pipelines, we can expect LLM‑guided program discovery to become a practical tool in the everyday developer’s toolbox.
Authors
- Kamer Ali Yuksel
Paper Information
- arXiv ID: 2512.13857v1
- Categories: cs.AI, cs.CL, cs.LG, cs.MA, cs.NE
- Published: December 15, 2025
- PDF: https://arxiv.org/pdf/2512.13857v1