[Paper] Learner-Tailored Program Repair: A Solution Generator with Iterative Edit-Driven Retrieval Enhancement
Source: arXiv - 2601.08545v1
Overview
The paper introduces Learner‑Tailored Program Repair (LPR), a new task that goes beyond simply fixing buggy code: it also explains why the bug occurred, which makes it well suited to intelligent programming coaching systems. To tackle LPR, the authors present the Learner‑Tailored Solution Generator, a two‑stage framework that combines retrieval of similar past fixes with large‑language‑model (LLM) reasoning and iteratively refines its search based on execution feedback.
Key Contributions
- New task definition (LPR): Repairs code and generates human‑readable bug explanations tailored to learners.
- Edit‑driven retrieval engine: Builds a searchable database of prior solutions and retrieves the most relevant ones based on the edits needed to fix a bug.
- Solution‑guided repair: Uses the retrieved snippets as concrete guidance for an LLM to produce a corrected program and an explanatory narrative.
- Iterative Retrieval Enhancement (IRE): After an initial repair attempt, execution results are fed back to steer the retrieval process toward better candidate solutions, effectively “learning” from its own mistakes.
- Empirical validation: Shows substantial gains over strong baselines on benchmark datasets of student code, confirming the practicality of the approach.
Methodology
- Solution Retrieval Database Construction
  - Collect a large corpus of correct programs and their associated edit scripts (the diff between buggy and fixed versions).
  - Index these edit scripts so that, given a new buggy snippet, the system can quickly find past fixes that involve similar edits (a minimal index sketch follows this list).
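To make the database construction concrete, here is a minimal Python sketch of an edit‑script index. The `SolutionRecord` and `EditIndex` structures and the line‑level `difflib` diffing are illustrative assumptions, not the authors' implementation.

```python
# Sketch of an edit-script index over past (buggy, fixed) pairs.
# Data structures and line-level diffing are assumptions for illustration.
import difflib
from collections import defaultdict
from dataclasses import dataclass, field

@dataclass
class SolutionRecord:
    problem_id: str
    buggy_code: str
    fixed_code: str
    edit_ops: list = field(default_factory=list)  # [(op, before_hunk, after_hunk), ...]

def extract_edit_ops(buggy: str, fixed: str) -> list:
    """Diff two programs line by line and keep only the changed hunks."""
    buggy_lines, fixed_lines = buggy.splitlines(), fixed.splitlines()
    matcher = difflib.SequenceMatcher(None, buggy_lines, fixed_lines)
    ops = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            ops.append((tag,
                        "\n".join(buggy_lines[i1:i2]),
                        "\n".join(fixed_lines[j1:j2])))
    return ops

class EditIndex:
    """Buckets past solutions by coarse edit keys (edit type + changed token)."""
    def __init__(self):
        self.buckets = defaultdict(list)

    def add(self, record: SolutionRecord) -> None:
        record.edit_ops = extract_edit_ops(record.buggy_code, record.fixed_code)
        for tag, before, after in record.edit_ops:
            for token in f"{before} {after}".split():
                self.buckets[(tag, token)].append(record)
```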
- Edit‑Driven Retrieval (Stage 1)
  - When a learner submits buggy code, the system extracts a minimal set of syntactic edits that would resolve the failure (e.g., “add a missing return”, “change == to ===”).
  - These edits are used as a query to pull the top‑k most similar past solutions from the database (see the retrieval sketch below).
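One possible realization of the edit‑driven query, reusing `extract_edit_ops` and `EditIndex` from the sketch above. Diffing the submission against a reference solution to form the query and ranking candidates by Jaccard overlap of edit tokens are simplifying assumptions.

```python
# Sketch of Stage 1: build an edit query for a buggy submission and rank
# stored solutions by edit overlap. Reference-based diffing and Jaccard
# scoring are assumptions, not the paper's exact retrieval method.
def edit_query(buggy: str, reference: str) -> set:
    """Collect tokens from the hunks that differ between buggy and reference code."""
    tokens = set()
    for tag, before, after in extract_edit_ops(buggy, reference):
        tokens.add(tag)
        tokens.update(f"{before} {after}".split())
    return tokens

def retrieve_top_k(index: EditIndex, query_tokens: set, k: int = 3) -> list:
    """Score every stored record by Jaccard similarity of its edit tokens."""
    scored, seen = [], set()
    for bucket in index.buckets.values():
        for record in bucket:
            if id(record) in seen:
                continue
            seen.add(id(record))
            record_tokens = set()
            for tag, before, after in record.edit_ops:
                record_tokens.add(tag)
                record_tokens.update(f"{before} {after}".split())
            union = query_tokens | record_tokens
            score = len(query_tokens & record_tokens) / len(union) if union else 0.0
            scored.append((score, record))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [record for _, record in scored[:k]]
```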
- Solution‑Guided Repair (Stage 2)
  - The retrieved solutions are fed to a powerful LLM (e.g., GPT‑4) together with the original buggy code.
  - The LLM generates:
    - a repaired version of the code, and
    - a concise, learner‑friendly explanation of the bug’s root cause (see the prompt sketch below).
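The repair stage can be approximated with a single prompt that packs the retrieved fixes in as exemplars. The prompt wording, the `call_llm` placeholder (any callable mapping a prompt string to model text), and the `<code>`/`<why>` response tags are assumptions rather than the paper's actual prompt design.

```python
# Sketch of Stage 2: prompt an LLM with retrieved fixes as guidance and ask for
# a repaired program plus a learner-facing explanation. Prompt format and
# response tags are illustrative assumptions.
REPAIR_PROMPT = """You are a programming tutor.
Buggy submission:
{buggy}

Similar past fixes (buggy -> fixed):
{examples}

Return the corrected program between <code> and </code> tags, followed by a
short explanation of the root cause between <why> and </why> tags."""

def format_examples(records) -> str:
    """Render each retrieved solution as a before/after pair for the prompt."""
    parts = [f"# before\n{r.buggy_code}\n# after\n{r.fixed_code}" for r in records]
    return "\n\n".join(parts)

def solution_guided_repair(buggy: str, retrieved, call_llm) -> tuple:
    """call_llm is a placeholder for whatever LLM client the system wraps."""
    prompt = REPAIR_PROMPT.format(buggy=buggy, examples=format_examples(retrieved))
    reply = call_llm(prompt)
    code = reply.split("<code>")[-1].split("</code>")[0].strip()
    explanation = reply.split("<why>")[-1].split("</why>")[0].strip()
    return code, explanation
```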
- Iterative Retrieval Enhancement
  - The repaired code is executed against hidden test cases.
  - Failure signals (e.g., which test cases still fail, error messages) are transformed into new edit queries, prompting another round of retrieval.
  - This loop repeats until the code passes or a budget limit is reached, allowing the system to “self‑correct” its retrieval direction (the loop is sketched below).
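Putting the pieces together, the feedback loop might look like the following sketch, which reuses the helpers above. The test‑harness interface (callables that raise on failure) and the way failure messages are folded into the next query are assumptions about one plausible implementation.

```python
# Sketch of Iterative Retrieval Enhancement: repair, run the hidden tests,
# and fold failure signals back into the edit query for another round.
def run_tests(code: str, tests) -> list:
    """Execute the candidate program and return messages for the tests that fail."""
    namespace, failures = {}, []
    try:
        exec(code, namespace)          # acceptable only inside a sandboxed harness
    except Exception as exc:
        return [f"program error: {exc}"]
    for test in tests:
        try:
            test(namespace)            # each test inspects/calls the learner's functions
        except Exception as exc:
            failures.append(str(exc))
    return failures

def repair_with_ire(buggy, reference, index, call_llm, tests, budget: int = 3):
    """Loop retrieval -> repair -> execution until the tests pass or the budget runs out."""
    code, explanation = buggy, ""
    query = edit_query(buggy, reference)
    for _ in range(budget):
        retrieved = retrieve_top_k(index, query)
        code, explanation = solution_guided_repair(buggy, retrieved, call_llm)
        failures = run_tests(code, tests)
        if not failures:
            return code, explanation   # all hidden tests pass
        for message in failures:       # steer the next retrieval round
            query.update(message.split())
    return code, explanation           # best effort after the budget is spent
```

In this sketch a tutoring front end would only need to supply the learner's submission, the hidden tests, and an LLM client, which is consistent with the fully automated pipeline described above.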
The pipeline is fully automated, requiring only the buggy submission and a test harness.
Results & Findings
- Accuracy boost: The proposed framework achieves 30%–45% higher pass rates on standard student‑code benchmarks than vanilla LLM repair or retrieval‑only baselines.
- Explanation quality: Human evaluators rated the generated bug explanations as clear and educational in >80% of cases, a notable improvement over prior methods that output only patches.
- Iterative gains: Adding the IRE loop yields an extra 10%–15% increase in successful repairs, demonstrating that feedback‑driven retrieval is effective.
- Speed: Despite the two‑stage design, average end‑to‑end latency stays under 5 seconds per submission, making it viable for real‑time tutoring tools.
Practical Implications
- Intelligent tutoring systems (ITS): Deploying this framework can turn a simple “auto‑grader” into a coach that not only tells students their code is wrong but also explains the conceptual mistake.
- Developer onboarding tools: New hires can paste failing snippets into a chat‑assistant that returns a fixed version and a short lesson on the underlying pattern (e.g., off‑by‑one errors).
- Code review bots: In CI pipelines, the system could automatically suggest patches and annotate the change with a rationale, reducing back‑and‑forth between reviewers and authors.
- Educational content generation: By mining the retrieval database, instructors can automatically assemble collections of common bug patterns and their fixes for curriculum design.
Overall, the approach bridges the gap between raw code correction and pedagogical feedback, aligning AI‑driven repair with how human mentors teach.
Limitations & Future Work
- Dependence on a high‑quality solution corpus: Retrieval effectiveness drops if the database lacks diverse edit examples for a given language or domain.
- Scalability to large codebases: The current edit‑driven indexing works best on relatively small, self‑contained functions typical of student assignments; extending to multi‑file projects may require hierarchical retrieval.
- Explainability of LLM reasoning: While the generated explanations are readable, the internal decision path of the LLM remains a black box; future work could integrate more transparent reasoning modules.
- Cross‑language generalization: Experiments focus on a single programming language (Python); adapting the pipeline to statically typed languages (Java, C++) is an open research direction.
The authors suggest enriching the retrieval database with community‑sourced patches and exploring hybrid symbolic‑LLM methods to further boost both repair accuracy and explanatory depth.
Authors
- Zhenlong Dai
- Zhuoluo Zhao
- Hengning Wang
- Xiu Tang
- Sai Wu
- Chang Yao
- Zhipeng Gao
- Jingyuan Chen
Paper Information
- arXiv ID: 2601.08545v1
- Categories: cs.AI, cs.CL, cs.SE
- Published: January 13, 2026