[Paper] AkiraRust: Re-thinking LLM-aided Rust Repair Using a Feedback-guided Thinking Switch
Source: arXiv - 2602.21681v1
Overview
The paper introduces AkiraRust, a framework that uses large language models (LLMs) to automatically detect and repair undefined‑behaviour (UB) bugs in Rust code. By coupling LLM reasoning with a runtime‑aware finite‑state machine (FSM), the system adapts its repair strategy on the fly and, according to the reported results, produces more semantically correct patches than prior static‑template approaches.
Key Contributions
- Feedback‑guided FSM controller – a lightweight finite‑state machine that monitors execution semantics and dynamically switches between detection, repair, and verification states.
- Dual‑mode reasoning – a “fast‑thinking” (quick pattern matching) and “slow‑thinking” (deep semantic analysis) pipeline coordinated across multiple LLM agents.
- Waveform‑driven transition logic – a controller that decides when to roll back, checkpoint, or advance based on runtime signals, ensuring context‑aware fixes.
- Empirical validation – on a benchmark of real‑world Rust UB cases, AkiraRust attains ~92 % semantic correctness and a 2.2× speed improvement over the current state‑of‑the‑art (SOTA) repair tools.
Methodology
- Problem framing – The authors treat UB repair as a sequential decision process: detect a potential UB, propose a fix, verify the fix, and iterate if needed.
- Finite‑State Machine (FSM) – Each step of the process is represented as a state (e.g., Detect, Propose, Validate, Rollback). The FSM receives “waveform” signals from the running program (e.g., panic traces, memory‑safety checks) that indicate whether the current state succeeded.
- LLM agents:
  - Fast‑thinking agent: uses a lightweight prompt template to quickly generate candidate patches based on syntactic clues.
  - Slow‑thinking agent: invokes a more detailed prompt that includes execution traces, type‑level information, and the FSM’s current context, allowing deeper semantic reasoning.
- Transition controller – A rule‑based module that interprets the waveform signals and decides whether to accept the fast agent’s patch, invoke the slow agent for refinement, or roll back to a previous state.
- Verification loop – After a patch is applied, the program is re‑executed under Rust’s safety checks (MIR‑based UB sanitizer). If the UB persists, the FSM cycles back, possibly trying a different reasoning mode.
The overall architecture keeps the LLM “grounded” in actual runtime behaviour, preventing it from proposing fixes that look plausible on paper but break semantics when executed.
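The control flow described in the methodology can be sketched as a small rule‑based state machine. This is a minimal illustration, not the authors' implementation: the `Signal` variants, the `Done` state, and the rule "switch to the slow agent after a rollback" are assumptions made for this sketch, standing in for the richer waveform signals and transition rules in the paper.

```rust
/// Repair states named in the summary (Done is a hypothetical terminal state).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum State {
    Detect,
    Propose,
    Validate,
    Rollback,
    Done,
}

/// Which reasoning agent produces the next candidate patch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Mode {
    Fast,
    Slow,
}

/// Hypothetical, simplified stand-in for the runtime "waveform" signals.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Signal {
    UbFound,
    NoUb,
    PatchReady,
    UbPersists,
    Restored,
}

/// Rule-based transition: returns the next state and reasoning mode.
fn step(state: State, mode: Mode, signal: Signal) -> (State, Mode) {
    match (state, signal) {
        (State::Detect, Signal::UbFound) => (State::Propose, mode),
        (State::Detect, Signal::NoUb) => (State::Done, mode),
        (State::Propose, Signal::PatchReady) => (State::Validate, mode),
        (State::Validate, Signal::NoUb) => (State::Done, mode),
        (State::Validate, Signal::UbPersists) => (State::Rollback, mode),
        // Assumed rule: once the checkpoint is restored, retry with the slow agent.
        (State::Rollback, Signal::Restored) => (State::Propose, Mode::Slow),
        // Any signal that does not apply leaves the machine unchanged.
        (other, _) => (other, mode),
    }
}

/// Replays a recorded signal sequence and counts how often a rollback occurred.
fn run(signals: &[Signal]) -> (State, Mode, usize) {
    let mut state = State::Detect;
    let mut mode = Mode::Fast;
    let mut rollbacks = 0usize;
    for &signal in signals {
        let (next_state, next_mode) = step(state, mode, signal);
        if next_state == State::Rollback {
            rollbacks += 1;
        }
        state = next_state;
        mode = next_mode;
    }
    (state, mode, rollbacks)
}
```

Feeding `run` the sequence `UbFound, PatchReady, UbPersists, Restored, PatchReady, NoUb` models a fast patch that fails validation, one rollback, and a slow‑agent patch that passes. Keeping `step` a pure function makes each transition rule easy to test in isolation.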
Results & Findings
| Metric | AkiraRust | Prior SOTA (e.g., RustFix, LLM‑Rust) |
|---|---|---|
| Semantic correctness (patches that truly eliminate UB) | 92 % | ~68 % |
| Average repair time per bug | 1.8 s | 4.0 s |
| Speed‑up factor | 2.2× | – |
| Number of rollback cycles needed (median) | 1 | 3 |
Key takeaways:
- The FSM‑guided feedback loop reduces the number of “blind” LLM guesses (the median number of rollback cycles drops from 3 to 1), which accompanies the higher correctness rate.
- Fast‑thinking agents handle the bulk of simple cases, while the slow‑thinking fallback is only triggered for the harder, semantics‑heavy bugs, saving time.
- The approach scales to larger codebases because the FSM isolates repair to the affected module rather than re‑analyzing the whole crate.
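The speed‑up factor in the table follows directly from the two average repair times. As a quick check, the helper below (a hypothetical name introduced here) divides the baseline time by the AkiraRust time:

```rust
/// Speed-up of a new tool relative to a baseline, given average times in seconds.
fn speedup(baseline_secs: f64, new_secs: f64) -> f64 {
    baseline_secs / new_secs
}
```

With the reported figures, `speedup(4.0, 1.8)` is about 2.2, which matches the stated 2.2× factor.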
Practical Implications
- Developer tooling – AkiraRust can be integrated as a VS Code extension or a `cargo` subcommand that offers on‑the‑fly suggestions for UB fixes, with confidence scores derived from the FSM state.
- CI/CD pipelines – Teams can automatically run the FSM‑driven repair step after a test suite fails due to UB, reducing mean‑time‑to‑repair (MTTR).
- Safety‑critical Rust projects – In domains such as embedded systems, aerospace, and finance, where UB is unacceptable, AkiraRust could help teams catch and repair UB earlier and reduce reliance on manual code audits.
- LLM cost efficiency – By limiting expensive “slow‑thinking” LLM calls to only the cases that truly need them, organizations can keep API usage (and associated costs) low while still benefiting from deep semantic analysis.
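The cost argument in the last bullet can be made concrete with a simple expected‑cost model. This is an illustrative sketch only: the function name, the parameters, and the assumption that every bug first goes through the fast agent and is escalated to the slow agent with some probability are not taken from the paper.

```rust
/// Expected LLM cost per bug when the slow agent is used only as a fallback.
/// All inputs are hypothetical; none are figures from the paper.
fn expected_cost(fast_cost: f64, slow_cost: f64, escalation_rate: f64) -> f64 {
    assert!(
        (0.0..=1.0).contains(&escalation_rate),
        "escalation_rate must be a probability"
    );
    // Every bug pays for one fast call; only escalated bugs also pay for a slow call.
    fast_cost + escalation_rate * slow_cost
}
```

For example, with a fast call costing 1 unit, a slow call costing 10 units, and 20 % of bugs escalated, the expected cost is 3 units per bug, compared with 10 units if every bug went straight to the slow agent.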
Limitations & Future Work
- Dependency on runtime traces – AkiraRust requires the program to be runnable in a test harness; bugs that only manifest under specific hardware conditions may evade detection.
- FSM rule engineering – The transition logic is handcrafted; extending it to new kinds of UB or other languages will need additional manual effort.
- LLM model constraints – The study used a single, high‑capacity LLM; performance may vary with smaller or open‑source models.
- Future directions suggested by the authors include: (1) learning the FSM transition policies from data instead of hand‑coding them, (2) expanding the framework to handle concurrency‑related UB, and (3) evaluating the approach on multi‑crate ecosystems and cross‑language interop scenarios.
Authors
- Renshuang Jiang
- Yichong Wang
- Pan Dong
- Xiaoxiang Fang
- Zhenling Duan
- Tinglue Wang
- Yuchen Hu
- Jie Yu
- Zhe Jiang
Paper Information
- arXiv ID: 2602.21681v1
- Categories: cs.SE
- Published: February 25, 2026