[Paper] Digital Red Queen: Adversarial Program Evolution in Core War with LLMs
Source: arXiv - 2601.03335v1
Overview
The paper introduces Digital Red Queen (DRQ), a lightweight self‑play framework that lets a large language model (LLM) continuously evolve assembly‑like programs—called warriors—to out‑compete every previously generated opponent in the classic Core War sandbox. By turning the optimization problem into an open‑ended “Red Queen” arms race, the authors show that LLM‑generated code can become increasingly general and converge toward robust strategies, offering a new lens on adversarial AI and potential lessons for security‑focused applications.
Key Contributions
- Red‑Queen self‑play loop: A simple algorithm where each new LLM‑generated warrior must defeat all earlier warriors, enforcing continual adaptation.
- LLM‑driven program synthesis: Uses a state‑of‑the‑art language model to write low‑level Core War assembly code from high‑level prompts.
- Empirical evidence of convergence: Across many generations, warriors become more general (perform better against unseen human‑crafted opponents) and less behaviorally diverse, mirroring convergent evolution.
- Core War as a testbed: Demonstrates that the Turing‑complete Core War VM is a tractable, controllable sandbox for studying adversarial co‑evolution and for benchmarking LLM‑based evolution methods.
- Broader vision: Shows how minimal self‑play setups could be transplanted to real‑world adversarial domains such as cybersecurity red‑team/blue‑team exercises or drug‑resistance modeling.
Methodology
- Environment: Core War—a virtual machine where two programs (warriors) battle for control of shared memory. Warriors are written in Redcode, an assembly‑like language; the environment is deterministic and fully observable (a heavily simplified toy model is sketched after this list).
- Initial population: A set of baseline warriors (including human‑written ones) seeds the competition.
- Self‑play loop (DRQ), sketched in code after this list:
  - At round t, the LLM receives a prompt describing the goal: “Write a Core War warrior that defeats every warrior generated in rounds 0 … t‑1.”
  - The model generates candidate code, which is compiled and tested against the full archive of previous warriors.
  - The first candidate that wins all matches becomes the new champion and is added to the archive.
- Evaluation: After many rounds, the authors test the evolved warriors against a held‑out suite of human‑crafted opponents and measure behavioral diversity using execution trace clustering.
- Analysis: Track win‑rates, generality (performance on unseen opponents), and diversity trends over independent runs.
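To make the environment concrete, here is a deliberately tiny, hypothetical toy model of a Core‑War‑style core in Python. It covers only three simplified instructions (MOV, JMP, DAT) with PC‑relative addressing; the `Insn`, `step`, and `CORE_SIZE` names are made up for this sketch, and the semantics are an illustrative assumption rather than the Redcode/MARS rules the paper actually uses.

```python
# Hypothetical toy model of a Core-War-style core (NOT real Redcode/MARS
# semantics): only MOV, JMP, and DAT, all with PC-relative direct addressing.
import copy
from dataclasses import dataclass

@dataclass
class Insn:
    op: str      # "MOV", "JMP", or "DAT" in this toy subset
    a: int = 0   # A-field, read as a PC-relative address
    b: int = 0   # B-field, read as a PC-relative address

CORE_SIZE = 8000  # the core is a circular shared memory

def step(core, pc):
    """Execute one instruction; return the next pc, or None if the process dies."""
    insn = core[pc % CORE_SIZE]
    src = (pc + insn.a) % CORE_SIZE
    dst = (pc + insn.b) % CORE_SIZE
    if insn.op == "DAT":              # executing data kills the process
        return None
    if insn.op == "MOV":              # copy the whole instruction at src into dst
        core[dst] = copy.copy(core[src])
    elif insn.op == "JMP":            # transfer control to src
        return src
    return (pc + 1) % CORE_SIZE

# The classic "Imp" warrior is the single instruction MOV 0, 1: it copies
# itself one cell ahead and then executes the copy, crawling through the core
# and overwriting whatever it touches.
core = [Insn("DAT") for _ in range(CORE_SIZE)]
core[0] = Insn("MOV", 0, 1)
pc = 0
for _ in range(5):
    pc = step(core, pc)               # pc advances 0 -> 1 -> 2 -> ...
```

A real match alternates execution between the two warriors' processes until one side has no live processes left; that scheduling is omitted here.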
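The self‑play loop itself is only a few lines. Below is a minimal sketch under stated assumptions: `generate_warrior` stands in for an LLM call and `run_match` for a Core War simulator wrapper, both hypothetical placeholders, and the prompt wording and retry budget are illustrative rather than the paper's exact settings.

```python
# Minimal sketch of the DRQ loop described above. generate_warrior and
# run_match are hypothetical placeholders; the prompt text and the retry
# budget are illustrative assumptions, not the paper's exact implementation.

def generate_warrior(prompt: str) -> str:
    """Placeholder for an LLM call that returns Redcode source text."""
    raise NotImplementedError("wire up an LLM client here")

def run_match(candidate: str, opponent: str) -> bool:
    """Placeholder: assemble both warriors, run them in a Core War simulator,
    and return True if `candidate` wins the match."""
    raise NotImplementedError("wire up a Core War simulator here")

def drq(initial_warriors, rounds, max_attempts=50):
    """Red Queen loop: each new champion must beat every archived warrior."""
    archive = list(initial_warriors)           # seed with baseline warriors
    for t in range(rounds):
        prompt = (
            "Write a Core War warrior (Redcode) that defeats every warrior "
            "generated in earlier rounds:\n\n" + "\n---\n".join(archive)
        )
        for _ in range(max_attempts):          # sample until a candidate sweeps the archive
            candidate = generate_warrior(prompt)
            if all(run_match(candidate, opponent) for opponent in archive):
                archive.append(candidate)      # new champion joins the archive
                break
    return archive
```

The essential property is that the archive only ever grows, so each new champion faces a strictly harder requirement than the last.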
Results & Findings
- Increasing generality: After ~200 generations, the DRQ warriors achieve higher win‑rates against a diverse set of human‑written warriors than warriors from any earlier generation in the run.
- Convergence of behavior: Independent DRQ runs produce warriors with remarkably similar execution patterns, indicating a strong attractor strategy in the fitness landscape (one way to quantify such behavioral similarity is sketched after this list).
- Efficiency: The entire evolution process runs on commodity hardware (single GPU) and completes within a few hours, showing that sophisticated adversarial dynamics need not require massive compute.
- Comparison to static optimization: A baseline where the LLM is asked to optimize against a fixed opponent plateaus quickly, whereas the Red‑Queen loop continues to push performance upward.
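One simple way to put a number on that convergence is to compare execution traces directly. The sketch below assumes each trace is recorded as the sequence of opcodes a warrior executed and uses opcode n‑gram profiles with cosine distance; the paper clusters execution traces, but these particular features and this distance are assumptions, not its stated method.

```python
# Rough sketch of one way to quantify behavioral similarity from execution
# traces (opcode n-gram profiles + cosine distance). These features and this
# distance are assumptions; the paper's own trace-clustering setup may differ.
from collections import Counter
from itertools import combinations
from math import sqrt

def ngram_profile(trace, n=3):
    """Count opcode n-grams in an execution trace (a list of opcode strings)."""
    return Counter(tuple(trace[i:i + n]) for i in range(len(trace) - n + 1))

def cosine_distance(p, q):
    """1 - cosine similarity between two n-gram count vectors."""
    dot = sum(p[k] * q[k] for k in set(p) | set(q))
    norm = sqrt(sum(v * v for v in p.values())) * sqrt(sum(v * v for v in q.values()))
    return 1.0 - (dot / norm if norm else 0.0)

def mean_pairwise_distance(traces):
    """Lower values mean more behaviorally similar (converged) warriors."""
    profiles = [ngram_profile(t) for t in traces]
    pairs = list(combinations(range(len(profiles)), 2))
    if not pairs:
        return 0.0
    return sum(cosine_distance(profiles[i], profiles[j]) for i, j in pairs) / len(pairs)
```

A falling mean pairwise distance across independent runs would indicate the kind of behavioral convergence reported above.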
Practical Implications
- Adversarial code generation for security testing: DRQ‑style self‑play could automate the creation of novel exploits or defensive payloads that continuously adapt to each other, providing richer red‑team/blue‑team training scenarios.
- Robust AI agents: The convergence toward general strategies suggests a pathway for training LLM‑based agents that remain effective even as opponents evolve, useful in competitive gaming, automated negotiation, or autonomous defense systems.
- Benchmark for LLM program synthesis: Core War offers a low‑overhead, reproducible benchmark for measuring how well LLMs can generate correct, performant low‑level code under adversarial pressure.
- Rapid prototyping of co‑evolutionary algorithms: The minimal DRQ loop can be transplanted to other sandboxed domains (e.g., network packet filters, smart contract fuzzing) to explore arms‑race dynamics without building large simulation infrastructures.
Limitations & Future Work
- Domain specificity: Core War, while expressive, is a toy environment; results may not directly transfer to high‑stakes real‑world systems without additional constraints.
- LLM dependence: The quality of evolved warriors hinges on the underlying model’s code‑generation capabilities; smaller or less‑trained models may stall early.
- Diversity loss: Convergent behavior, while indicating a strong strategy, also reduces exploration of alternative tactics that could be valuable in heterogeneous threat landscapes.
- Future directions: Extending DRQ to multi‑objective settings (e.g., stealth + speed), integrating reinforcement‑learning critics for finer‑grained feedback, and applying the framework to realistic cybersecurity sandboxes or drug‑resistance simulations.
Authors
- Akarsh Kumar
- Ryan Bahlous-Boldi
- Prafull Sharma
- Phillip Isola
- Sebastian Risi
- Yujin Tang
- David Ha
Paper Information
- arXiv ID: 2601.03335v1
- Categories: cs.AI, cs.NE
- Published: January 6, 2026