[Paper] AkiraRust: Re-thinking LLM-aided Rust Repair Using a Feedback-guided Thinking Switch

Published: February 25, 2026 at 03:34 AM EST

Source: arXiv - 2602.21681v1

Overview

The paper introduces AkiraRust, a framework that uses large language models (LLMs) to automatically detect and fix undefined‑behaviour (UB) bugs in Rust code. By coupling LLM reasoning with a runtime‑aware finite‑state machine (FSM), the system adapts its repair strategy on the fly and, in the reported evaluation, produces semantically correct patches more often than prior static‑template approaches (about 92 % versus roughly 68 %).
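To make the bug class concrete, here is a small illustration of a UB bug and one possible repair. It is not taken from the paper's benchmark; the function names are made up for this example. The only library facts relied on are that `slice::get_unchecked` is UB for an out‑of‑bounds index and that `slice::get` returns `None` in that case.

```rust
/// Buggy variant: the unsafe block skips the bounds check, so calling this
/// with `idx >= data.len()` is undefined behaviour even though the function
/// looks safe to its callers.
pub fn read_unchecked(data: &[u32], idx: usize) -> u32 {
    // BUG: nothing guarantees that `idx` is in range.
    unsafe { *data.get_unchecked(idx) }
}

/// Repaired variant: the bounds check is restored and the out-of-range case
/// becomes an explicit `None` instead of UB.
pub fn read_checked(data: &[u32], idx: usize) -> Option<u32> {
    data.get(idx).copied()
}
```

A semantically correct repair here changes the return type, which is the kind of fix a purely syntactic template may miss because every caller has to handle the `None` case.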

Key Contributions

Feedback‑guided FSM controller – a lightweight finite‑state machine that monitors execution semantics and dynamically switches between detection, repair, and verification states.
Dual‑mode reasoning – a “fast‑thinking” (quick pattern matching) and “slow‑thinking” (deep semantic analysis) pipeline coordinated across multiple LLM agents.
Waveform‑driven transition logic – a controller that decides when to roll back, checkpoint, or advance based on runtime signals, ensuring context‑aware fixes.
Empirical validation – on a benchmark of real‑world Rust UB cases, AkiraRust attains ~92 % semantic correctness and a 2.2× speed improvement over the current state‑of‑the‑art (SOTA) repair tools.
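The checkpoint and rollback operations mentioned in the contributions can be pictured as a stack of source snapshots. The sketch below is an assumption about how such a store might look; `PatchHistory` and its methods are hypothetical names, not the paper's API.

```rust
/// Hypothetical snapshot store for candidate patches (not from the paper).
#[derive(Debug)]
pub struct PatchHistory {
    /// Source snapshots the controller may return to, oldest first.
    checkpoints: Vec<String>,
}

impl PatchHistory {
    /// Start with the original, unpatched source as the first checkpoint.
    pub fn new(original: &str) -> Self {
        Self { checkpoints: vec![original.to_string()] }
    }

    /// Record a new snapshot after a step that the controller accepted.
    pub fn checkpoint(&mut self, source: &str) {
        self.checkpoints.push(source.to_string());
    }

    /// Drop the latest snapshot and return the one before it.
    /// The original source is never dropped, so there is always a result.
    pub fn rollback(&mut self) -> &str {
        if self.checkpoints.len() > 1 {
            self.checkpoints.pop();
        }
        self.current()
    }

    /// The most recent snapshot.
    pub fn current(&self) -> &str {
        self.checkpoints
            .last()
            .expect("history always holds the original source")
            .as_str()
    }
}
```

Keeping the original source as a permanent bottom entry means a rollback can never leave the tool without a compilable starting point.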

Methodology

Problem framing – The authors treat UB repair as a sequential decision process: detect a potential UB, propose a fix, verify the fix, and iterate if needed.
Finite‑State Machine (FSM) – Each step of the process is represented as a state (e.g., Detect, Propose, Validate, Rollback). The FSM receives “waveform” signals from the running program (e.g., panic traces, memory‑safety checks) that indicate whether the current state succeeded.
LLM agents –
- Fast‑thinking agent: uses a lightweight prompt template to quickly generate candidate patches based on syntactic clues.
- Slow‑thinking agent: invokes a more detailed prompt that includes execution traces, type‑level information, and the FSM’s current context, allowing deeper semantic reasoning.
Transition controller – A rule‑based module that interprets the waveform signals and decides whether to accept the fast agent’s patch, invoke the slow agent for refinement, or roll back to a previous state.
Verification loop – After a patch is applied, the program is re‑executed under Rust’s safety checks (MIR‑based UB sanitizer). If the UB persists, the FSM cycles back, possibly trying a different reasoning mode.
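The state machine and the rule‑based transition controller described above can be sketched as follows. The state names Detect, Propose, Validate, and Rollback come from the summary; everything else (the `Done` and `GiveUp` states, the `Signal` variants, the `Controller` type, and the specific rules) is an assumption made for illustration and may differ from the paper's actual design.

```rust
/// Repair states. The first four names come from the summary;
/// `Done` and `GiveUp` are hypothetical terminal states.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum State { Detect, Propose, Validate, Rollback, Done, GiveUp }

/// Hypothetical runtime signals standing in for the "waveform" input.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Signal { UbFound, NoUb, PatchReady, CheckPassed, CheckFailed, Restored }

/// Which reasoning mode the next proposal should use.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Mode { Fast, Slow }

/// Rule-based controller: holds the current state and the rollback budget.
#[derive(Debug)]
pub struct Controller {
    pub state: State,
    pub mode: Mode,
    pub rollbacks: u32,
    pub max_rollbacks: u32,
}

impl Controller {
    /// Start in `Detect` with the fast mode selected.
    pub fn new(max_rollbacks: u32) -> Self {
        Self { state: State::Detect, mode: Mode::Fast, rollbacks: 0, max_rollbacks }
    }

    /// Apply one signal and return the new state.
    /// Signals that do not apply to the current state leave it unchanged.
    pub fn step(&mut self, signal: Signal) -> State {
        self.state = match (self.state, signal) {
            (State::Detect, Signal::UbFound) => State::Propose,
            (State::Detect, Signal::NoUb) => State::Done,
            (State::Propose, Signal::PatchReady) => State::Validate,
            (State::Validate, Signal::CheckPassed) => State::Done,
            (State::Validate, Signal::CheckFailed) => State::Rollback,
            (State::Rollback, Signal::Restored) => {
                self.rollbacks += 1;
                if self.rollbacks > self.max_rollbacks {
                    State::GiveUp
                } else {
                    // A failed validation escalates to the slow mode.
                    self.mode = Mode::Slow;
                    State::Propose
                }
            }
            (unchanged, _) => unchanged,
        };
        self.state
    }
}
```

Matching on the `(state, signal)` pair keeps every rule on one line, so adding a rule for a new kind of UB is a local change.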

The overall architecture keeps the LLM “grounded” in actual runtime behaviour, preventing it from proposing fixes that look plausible on paper but break semantics when executed.
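One way to picture this grounding is a loop in which every candidate patch must pass a runtime check before it is accepted, and the failure trace feeds the next proposal. The sketch below is a simplified assumption: `repair_loop`, `Outcome`, and the closure signatures are hypothetical, and the agents and the verifier are passed in as plain closures. In practice `verify` might wrap a run under an interpreter such as Miri (`cargo miri test`), but whether the paper uses that exact tool is not stated here.

```rust
/// Outcome of a repair run (hypothetical type for this sketch).
#[derive(Debug, PartialEq, Eq)]
pub enum Outcome {
    /// A patch passed verification; `attempts` counts proposals tried.
    Fixed { patch: String, attempts: u32 },
    /// No candidate passed within the attempt budget.
    Unresolved,
}

/// Try the fast proposer first and fall back to the slow proposer once the
/// verifier has rejected a candidate. `verify` returns `Err(trace)` when UB
/// is still observed; that trace is handed to the slow proposer as feedback.
pub fn repair_loop<F, S, V>(
    source: &str,
    mut fast: F,
    mut slow: S,
    mut verify: V,
    max_attempts: u32,
) -> Outcome
where
    F: FnMut(&str) -> String,
    S: FnMut(&str, &str) -> String,
    V: FnMut(&str) -> Result<(), String>,
{
    let mut feedback: Option<String> = None;
    for attempt in 1..=max_attempts {
        // No feedback yet means no failure yet, so the cheap path is used.
        let candidate = match &feedback {
            None => fast(source),
            Some(trace) => slow(source, trace.as_str()),
        };
        match verify(&candidate) {
            Ok(()) => return Outcome::Fixed { patch: candidate, attempts: attempt },
            Err(trace) => feedback = Some(trace),
        }
    }
    Outcome::Unresolved
}
```

Because the verifier is the only thing that can end the loop successfully, a patch that merely looks plausible never reaches the caller.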

Results & Findings

Metric	AkiraRust	Prior SOTA (e.g., RustFix, LLM‑Rust)
Semantic correctness (patches that truly eliminate UB)	92 %	~68 %
Average repair time per bug	1.8 s	4.0 s
Speed‑up factor	2.2×	–
Number of rollback cycles needed (median)	1	3
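
The speed‑up row is consistent with the two timing rows: 4.0 s divided by 1.8 s is about 2.22, which rounds to the reported 2.2×. The small helper below just restates that arithmetic, along with the gap in correctness; the function names are illustrative, and because the prior‑SOTA figure is approximate (~68 %), the 24‑point gap is approximate too.

```rust
/// Ratio of baseline time to new time; values above 1.0 mean "faster".
pub fn speedup(baseline_secs: f64, new_secs: f64) -> f64 {
    baseline_secs / new_secs
}

/// Absolute gain in percentage points between two percentages.
pub fn gain_points(new_pct: f64, baseline_pct: f64) -> f64 {
    new_pct - baseline_pct
}
```

With the table's values, `speedup(4.0, 1.8)` evaluates to roughly 2.22 and `gain_points(92.0, 68.0)` to 24 percentage points.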

Key takeaways:

The FSM‑guided feedback loop reduces the number of “blind” LLM guesses (a median of 1 rollback cycle versus 3 for prior tools), leading to higher correctness.
Fast‑thinking agents handle the bulk of simple cases, while the slow‑thinking fallback is only triggered for the harder, semantics‑heavy bugs, saving time.
The approach scales to larger codebases because the FSM isolates repair to the affected module rather than re‑analyzing the whole crate.

Practical Implications

Developer tooling – AkiraRust can be integrated as a VS Code extension or a cargo subcommand that offers on‑the‑fly suggestions for UB fixes, with confidence scores derived from the FSM state.
CI/CD pipelines – Teams can automatically run the FSM‑driven repair step after a test suite fails due to UB, reducing mean‑time‑to‑repair (MTTR).
Safety‑critical Rust projects – Industries such as embedded systems, aerospace, and finance, where UB is unacceptable, could adopt AkiraRust to catch and repair UB earlier and to reduce their reliance on manual code audits.
LLM cost efficiency – By limiting expensive “slow‑thinking” LLM calls to only the cases that truly need them, organizations can keep API usage (and associated costs) low while still benefiting from deep semantic analysis.
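
The cost argument can be made concrete with a simple expected‑cost model. This is an assumption for illustration, not a formula from the paper: every bug gets one fast call, and a fraction of bugs (the escalation rate) also needs one slow call. The function names and any numbers plugged into them are hypothetical.

```rust
/// Expected cost per bug when every bug gets one fast call and a fraction
/// `escalation_rate` (between 0.0 and 1.0) also needs one slow call.
pub fn expected_cost(fast_cost: f64, slow_cost: f64, escalation_rate: f64) -> f64 {
    fast_cost + escalation_rate * slow_cost
}

/// Fraction of cost saved relative to sending every bug to the slow agent.
/// A negative result means the two-tier setup costs more than always-slow.
pub fn savings_vs_always_slow(fast_cost: f64, slow_cost: f64, escalation_rate: f64) -> f64 {
    1.0 - expected_cost(fast_cost, slow_cost, escalation_rate) / slow_cost
}
```

For example, with a made‑up fast cost of 1 unit, a slow cost of 10 units, and 20 % of bugs escalating, the expected cost is 3 units per bug, a 70 % saving over always using the slow agent. The saving shrinks as the escalation rate rises, so the benefit depends on how many bugs the fast agent resolves on its own.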

Limitations & Future Work

Dependency on runtime traces – AkiraRust requires the program to be runnable in a test harness; bugs that only manifest under specific hardware conditions may evade detection.
FSM rule engineering – The transition logic is handcrafted; extending it to new kinds of UB or other languages will need additional manual effort.
LLM model constraints – The study used a single, high‑capacity LLM; performance may vary with smaller or open‑source models.
Future directions suggested by the authors include: (1) learning the FSM transition policies from data instead of hand‑coding them, (2) expanding the framework to handle concurrency‑related UB, and (3) evaluating the approach on multi‑crate ecosystems and cross‑language interop scenarios.

Authors

Renshuang Jiang
Yichong Wang
Pan Dong
Xiaoxiang Fang
Zhenling Duan
Tinglue Wang
Yuchen Hu
Jie Yu
Zhe Jiang

Paper Information

arXiv ID: 2602.21681v1
Categories: cs.SE
Published: February 25, 2026
