Who Said What to Whom
Source: Dev.to

Incident Overview
On February 20th, a developer named Burkov posted a screenshot on X.
He had submitted an error message to Claude Code. Claude responded: “Commit these changes?”
He didn’t know what changes. He asked, “What changes?”
Claude started committing.
It wasn’t a malfunction in any obvious sense. Claude had lost track of who said what to whom: it read its own question as the developer’s instruction and executed it. When stopped, it surfaced an approval modal; he rejected it, but it kept asking.
Shown a screenshot of its own behavior, Claude explained it had confused the turn order—a side‑effect of sub‑agents asynchronously updating conversation history.
That explanation is documented:
- GitHub issue #7881 – multiple sub‑agents share a session ID, making it architecturally impossible to track which agent said what to whom.
- GitHub issue #25000 – sub‑agents bypass permission rules.
- GitHub issue #22900 – the VS Code extension doesn’t persist the main conversation transcript, only sub‑agent ones.
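The turn‑order confusion has a simple mechanical shape. Here is a hypothetical sketch — the message structure, role names, and merge logic are assumptions for illustration, not Claude Code’s actual internals. If a sub‑agent writes its result into shared history under the wrong role, the orchestrator re‑reads its own question as a user instruction:

```python
# Hypothetical sketch of the turn-order failure. The message format and
# merge behavior are illustrative assumptions, not Claude Code's internals.

history = [
    {"role": "user", "content": "Here is the error message: ..."},
    {"role": "assistant", "content": "Commit these changes?"},
]

def subagent_merge(history, update):
    # A sub-agent writing asynchronously appends into shared history
    # without preserving authorship or the turn boundary.
    history.append({"role": "user", "content": update})  # BUG: wrong role

# The sub-agent finishes mid-turn and writes into shared history.
subagent_merge(history, "Commit these changes?")

# The orchestrator now sees its own question attributed to the user
# and treats it as an instruction to execute.
last = history[-1]
if last["role"] == "user":
    print(f"Executing instruction: {last['content']}")
    # prints: Executing instruction: Commit these changes?
```

Nothing here is broken in the narrow sense: every function does what it was written to do. The failure lives entirely in the missing authorship metadata.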
The developer’s experience wasn’t an edge case; it was the architecture working as designed in a situation the design didn’t anticipate.
This is not a bug report. It’s what happens when accountability becomes structurally impossible.
This is part of a series on what AI actually changes in software development. Previous pieces: The Gatekeeping Panic, The Meter Was Always Running.
The Accountability Assumption
The argument building across this space is that accountability is the answer—not AI detectors, not gatekeeping, not slowing down generation. Accountability means knowing who made the decision, who owns the commit, who gets paged at 3 am.
That argument holds when there is:
- One human in the loop.
- One codebase.
- One commit history.
Accountability was possible because the chain of decisions was traceable.
Agentic systems dissolve that chain.
When sub‑agents write asynchronously to shared state, when session IDs don’t distinguish which agent took which action, and when the orchestrator loses coherence mid‑task, the question “who decided this?” stops having a clean answer. Not because nobody wants to be accountable, but because the system makes accountability architecturally impossible.
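Issue #7881’s claim can be made concrete. A minimal sketch, assuming an event log keyed only by session ID — the field names are hypothetical, not Claude Code’s schema:

```python
# Hypothetical event log in which every action carries only a session ID,
# as issue #7881 describes. Field names are illustrative assumptions.

events = [
    {"session_id": "abc123", "action": "edit_file", "target": "auth.py"},
    {"session_id": "abc123", "action": "git_commit", "target": "auth.py"},
    {"session_id": "abc123", "action": "edit_file", "target": "db.py"},
]

def who_did(events, action):
    """Attempt to attribute an action to a specific agent."""
    matches = [e for e in events if e["action"] == action]
    # Every event carries the same session_id and there is no agent field,
    # so attribution cannot be recovered from the log at all.
    return {e["session_id"] for e in matches}

# Several agents may have acted, but the log can only answer "abc123".
print(who_did(events, "git_commit"))  # {'abc123'}
```

The query runs, returns an answer, and the answer is useless: “which agent committed?” collapses into “which session?”, and there is only one session.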
Apogee Watcher, in the comments on my Gatekeeping piece, named it precisely: the risk isn’t just that accountability is avoided—it’s that accountability becomes ceremonial. You can put a name on the commit, but you cannot guarantee the person whose name is there understood what they approved.
That gap is where the next wave of production failures will come from.
The Trace Problem
Harrison Chase, co‑founder of LangChain, made an argument worth taking seriously.
In agentic systems, code is scaffolding. The real decisions happen at runtime, inside the model. Traditional debugging assumes you can read the code and understand the behavior. With agents, you can’t.
His conclusion: traces become the source of truth—the sequence of tool calls, inputs and outputs, reasoning steps—your audit trail.
It’s a compelling argument. Burkov’s screenshot shows exactly where it breaks.
When sub‑agents write asynchronously to shared conversation history, the trace itself becomes unreliable. Issue #22900 confirms the VS Code extension only persists sub‑agent transcripts—not the main conversation. The orchestrator’s reasoning, the turn‑order confusion, the moment the system misread its own question as a user instruction—none of that is guaranteed to survive in the trace.
- You can’t audit what wasn’t recorded.
- You can’t assign accountability from a trace that’s missing the decisions that mattered.
Chase is right that traces should be the source of truth. The problem is the architecture isn’t built to make traces trustworthy yet. The audit trail has gaps, and gaps in the audit trail are where accountability dies.
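What a trustworthy trace would need to record can be sketched as a hypothetical entry format — none of these field names come from LangChain or Claude Code; they are assumptions about the minimum information an auditable record requires:

```python
import datetime
import json
import uuid

def trace_entry(agent_id, parent_turn, kind, payload):
    """Hypothetical trace record: one entry per decision, attributable
    to a specific agent and ordered relative to the conversation."""
    return {
        "entry_id": str(uuid.uuid4()),
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "agent_id": agent_id,        # which agent acted -- exactly what is
                                     # lost when sub-agents share a session ID
        "parent_turn": parent_turn,  # where this lands in turn order -- lost
                                     # when history is merged asynchronously
        "kind": kind,                # e.g. "tool_call", "reasoning", "approval"
        "payload": payload,
    }

entry = trace_entry(
    agent_id="orchestrator",
    parent_turn=2,
    kind="approval",
    payload={"question": "Commit these changes?", "answer": "rejected"},
)
print(json.dumps(entry, indent=2))
```

The point of the sketch is the two commented fields: agent identity and turn position are precisely the pieces the GitHub issues above say the current architecture drops.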
Who Said What to Whom
The accountability argument isn’t wrong; it’s incomplete.
- Knowing who owns the commit matters.
- Knowing who gets paged at 3 am matters.
- Building cultures of documented rejection, public reckoning, verification before ship—all of that matters.
But it assumes the system can tell you what happened.
Agentic systems are being deployed into production before that assumption holds:
- Sub‑agents share session IDs.
- Transcripts don’t persist.
- Turn‑order confusion makes the system execute on its own questions.
- Sub‑agents bypass permission rules.
- The trace architecture isn’t yet reliable enough to be the source of truth.
Developers who understand this aren’t panicking about AI replacing them. They’re asking harder questions:
- What did this agent actually do?
- Which sub‑agent made this decision?
- Where’s the trace for the part that went wrong?
- Can I reconstruct the reasoning, or is it gone?
Those questions require judgment, verification, and institutional memory—the same skills this series has been arguing for. But they also require new skills: understanding agentic architecture well enough to know where the blind spots are and how to compensate for them.
Enough to know where the accountability gaps live before something fails in production.
There's a kind of accountability that works — one human, one codebase, one commit history, receipts that are fully yours. A nine‑day timeline you can reconstruct. A public reckoning you can point to.
That worked because there was one human in the loop.
Add sub‑agents. Share the session ID. Let the transcripts fail to persist.
Now try to produce the receipt. 