After 2 years of AI-assisted coding, I automated the one thing that actually improved quality: AI Pair Programming
Source: Dev.to
After nearly two years of AI‑assisted development—from ChatGPT 3.5 to Claude Code—I kept hitting the same problem: every model makes mistakes it can’t catch. Inspired by pair programming and the Ralph Loop, I built a dual‑agent workflow where one agent writes and another reviews. Last week, a PR written entirely by the two agents was merged into a 15 k‑star open‑source Electron project after three rounds of maintainer feedback. I don’t write TypeScript.
Background: AI‑Assisted Programming
I started with ChatGPT 3.5 generating snippets, then moved through Claude, Cursor, TRAE, and eventually fell in love with Claude Code. While Claude Code excels at architecture, it skips error handling when the context gets long and becomes sloppy on defensive code in later turns. The pattern is consistent: a single agent can’t reliably catch its own mistakes—it writes code and judges whether that code is good, like grading its own exam.
Pair Programming
Pair programming was formalized by Kent Beck as part of Extreme Programming (XP) in the late 1990s. The core idea is simple:
- Two developers share one workstation.
- One driver writes code.
- One navigator watches, catches mistakes in real time, questions design decisions, and keeps the big picture in focus.
Research consistently shows that pair programming produces fewer defects and better designs, despite the perception that it "wastes" one of the two developers.
The Ralph Loop
The Ralph Loop (by Geoffrey Huntley) wraps a coding agent in an external loop so it keeps iterating. This powerful idea gave me the push to automate my dual‑agent workflow.
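The core of the idea can be sketched in a few lines. This is a hypothetical stand-in, not Huntley's actual implementation: `runAgent` represents whatever shells out to the coding agent, and the loop simply keeps re-invoking it until the agent reports the task finished.

```typescript
// Minimal sketch of the Ralph Loop idea: re-invoke the same agent on the
// same prompt until it signals completion (or an iteration budget runs out).
// `runAgent` is a stand-in; a real loop would shell out to an agent CLI.
type AgentRun = (prompt: string) => { output: string; done: boolean };

function ralphLoop(
  runAgent: AgentRun,
  prompt: string,
  maxIterations = 100,
): string {
  for (let i = 0; i < maxIterations; i++) {
    const { output, done } = runAgent(prompt);
    if (done) return output; // agent finished the task
  }
  throw new Error("task not finished within iteration budget");
}
```

The external loop is what matters: the agent gets no special powers, it just gets another attempt every time it falls short.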
Dual‑Agent Workflow
- Author agent analyzes feedback, writes fixes, and runs tests.
- Reviewer agent reviews the diffs, verifies correctness, and confirms type safety.
The agents iterate until the reviewer has nothing left to flag.
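The control flow above can be sketched as a small review-gated loop. The agent functions here are plain stand-ins so the structure is visible; in the actual workflow each call shells out to a separate coding agent.

```typescript
// Sketch of the dual-agent loop: the author applies the reviewer's feedback,
// the reviewer re-inspects the result, and the loop exits only when the
// reviewer has nothing left to flag.
type Author = (code: string, feedback: string[]) => string;
type Reviewer = (code: string) => string[]; // findings; empty means approved

function dualAgentLoop(
  author: Author,
  reviewer: Reviewer,
  code = "",
  maxRounds = 10,
): string {
  let feedback: string[] = [];
  for (let round = 0; round < maxRounds; round++) {
    code = author(code, feedback);   // write or fix
    feedback = reviewer(code);       // independent review of the diff
    if (feedback.length === 0) return code; // nothing left to flag
  }
  throw new Error(`unresolved findings after ${maxRounds} rounds`);
}
```

The key design choice is that the reviewer, not the author, decides when the loop ends: the agent that wrote the code never gets to grade its own exam.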
Example Issues Fixed
- Double `super.kill()` race condition → added an idempotent guard.
- Swallowed errors (`.catch(() => {})`) → now log warnings.
- `treeKill` discrepancy → PR description aligned with the upstream implementation.
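The first two fixes can be sketched together. `ManagedProcess` is a hypothetical class for illustration, not the actual AionUI code: the guard makes a second `kill()` a no-op, and the error handler logs instead of silently swallowing.

```typescript
// Hypothetical sketch of an idempotent kill guard plus a logged (not
// swallowed) error path, as in the fixes described above.
class ManagedProcess {
  private killed = false;

  constructor(private readonly doKill: () => Promise<void>) {}

  // Idempotent guard: calling kill() twice no longer races; the second
  // call returns immediately.
  async kill(): Promise<void> {
    if (this.killed) return;
    this.killed = true;
    // Instead of `.catch(() => {})`, surface the failure as a warning.
    await this.doKill().catch((err) => {
      console.warn("process kill failed:", err);
    });
  }
}
```

Both changes are small, which is exactly the kind of defensive detail a single agent tends to drop late in a long session and a dedicated reviewer tends to catch.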
Case Study: AionUI PR
I applied this workflow to fork AionUI (≈ 15 k ⭐ Electron + React app) into an internal AI assistant for my company:
- 30 commits, zero manual code.
- Full rebrand, core engine rewrite, database migration, CI/CD rebuild—all done through the dual‑agent loop.
When the maintainer provided feedback, I pointed the two agents at the issues. The author agent wrote the fixes, the reviewer agent validated them, and after a few rounds the PR merged with 133/133 tests passing. I watched but didn’t write a single line of code.
Community Feedback
After I posted about this, another developer—Hwee‑Boon Yar, an indie dev with 30 years of experience—shared a similar approach: a skill that shells out to a second agent for review, looping until the reviewer has nothing left to flag. It’s lighter, works within a single session, and trades off some automation for simplicity. The core insight is the same.
Limitations & What Doesn’t Work
- The AI coding conversation is often too focused on generation and not enough on review.
- Benchmarks tend to measure how fast and how much code models can produce, but rarely ask who checks it.
Getting Started
`npm i -g ralph-lisa-loop`
The tool is early‑stage but already used daily for real work, not just demos.
Conclusion
If you’ve been doing AI coding and hitting that frustrating “almost right, but not quite” problem—you’re not alone. A dual‑agent loop can help, or at least inspire your own approach. The failure modes are often more interesting than the successes, and I’m happy to discuss them.