How to Tell If Your AI Agent Is Stuck (With Real Data From 220 Loops)

Published: March 8, 2026 at 01:50 AM EST
3 min read
Source: Dev.to

The problem

Even though the agent generated commits, files, and logs that looked like work, after 100+ loops I discovered it had been:

  • Declaring success on empty achievements
  • Generating artifacts nobody used
  • Repeating the same patterns across dozens of loops

I only caught it because an external audit reviewed the raw data; the agent’s own summaries said everything was fine.

Diagnostic tool

diagnose.py reads three files from an improve/ directory:

| File | Description |
| --- | --- |
| `signals.jsonl` | Append-only log of friction, failures, waste, stagnation, etc. |
| `patterns.json` | Aggregated fingerprints with counts and statuses |
| `scoreboard.json` | Response-effectiveness tracking |
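
As a rough sketch, loading those three inputs takes only the standard library. The file names come from the article; the parsing details (e.g. skipping blank lines in the JSONL log) are my assumptions, not necessarily what `diagnose.py` does:

```python
import json
from pathlib import Path

def load_improve_dir(improve_dir):
    """Load the three files diagnose.py reads from an improve/ directory."""
    base = Path(improve_dir)
    # signals.jsonl is append-only: one JSON object per non-empty line
    signals = [json.loads(line)
               for line in (base / "signals.jsonl").read_text().splitlines()
               if line.strip()]
    patterns = json.loads((base / "patterns.json").read_text())
    scoreboard = json.loads((base / "scoreboard.json").read_text())
    return signals, patterns, scoreboard
```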

From those inputs it computes:

  1. Regime classification – each loop is labeled productive, stagnating, stuck, failing, or recovering based on its signal distribution.
  2. Feedback‑loop detection – finds cases where a response (a script meant to fix a problem) actually amplifies the signals it should suppress. I had one generating 13× more signals than it suppressed.
  3. Response effectiveness – which automated fixes are actually working? In my data, only 50% of responses reduced their target signal rate.
  4. Chronic issues – what keeps recurring? My top chronic issue: zero-users-zero-revenue (29 occurrences across 40 loops).
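
The regime labels above can be sketched as a simple classifier over a loop's signal distribution. The labels are from the article; the thresholds below are illustrative assumptions, not the tool's actual values:

```python
def classify_regime(signals_in_loop):
    """Label one loop from its signals (thresholds are hypothetical)."""
    counts = {}
    for s in signals_in_loop:
        counts[s["type"]] = counts.get(s["type"], 0) + 1
    total = sum(counts.values())
    if total == 0:
        return "productive"  # nothing logged: assume the loop ran clean
    if counts.get("failure", 0) / total > 0.5:
        return "failing"
    if counts.get("stagnation", 0) / total > 0.5:
        return "stagnating"
    if counts.get("silence", 0) > 0 and counts.get("friction", 0) > 0:
        return "stuck"
    return "productive"
```

A real implementation would also need history to detect the "recovering" regime (a stuck loop trending back toward productive), which a single-loop snapshot cannot express.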

Sample diagnostic output

============================================================
BOUCLE DIAGNOSTICS
============================================================

Current regime: productive
Loops analyzed: 41

Loop efficiency: 55.0% productive, 45.0% problematic
  Breakdown: productive: 22, stagnating: 12, stuck: 4, failing: 2

Feedback loops: 5 detected, all resolved ✓

Response effectiveness: 6/12 responses reducing signals

Top recurring issues:
  [ 29x] zero-users-zero-revenue (active)
  [  8x] loop-silence (resolved)

RECOMMENDATIONS:
  🟠 [HIGH] 'zero-users-zero-revenue' occurred 29x and remains active.

Signal format

Each signal is a single JSON line:

{
  "ts": "2026-03-08T06:00:00Z",
  "loop": 222,
  "type": "friction",
  "source": "manual",
  "summary": "DEV.to API returned 404",
  "fingerprint": "devto-api-404"
}

Types: friction, failure, waste, stagnation, silence, surprise.
The fingerprint is a short slug that groups related signals. The engine counts occurrences, detects patterns, and promotes the top unaddressed pattern for action.
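Emitting a signal and promoting the top unaddressed fingerprint can be sketched like this. The JSON fields mirror the sample above; the aggregation logic and the `resolved` set are my assumptions about how the engine tracks addressed patterns:

```python
import json
from collections import Counter
from datetime import datetime, timezone

def log_signal(path, loop, type_, summary, fingerprint, source="manual"):
    """Append one signal as a single JSON line to signals.jsonl."""
    signal = {
        "ts": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "loop": loop,
        "type": type_,
        "source": source,
        "summary": summary,
        "fingerprint": fingerprint,
    }
    with open(path, "a") as f:
        f.write(json.dumps(signal) + "\n")

def top_unaddressed(signals, resolved):
    """Promote the most frequent fingerprint not yet marked resolved."""
    counts = Counter(s["fingerprint"] for s in signals)
    for fingerprint, _count in counts.most_common():
        if fingerprint not in resolved:
            return fingerprint
    return None
```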

Key findings

  • 45% of loops had problems – not catastrophic failures, mostly stagnation and getting stuck on the same issues. The agent was active but not productive.
  • Feedback loops are real – a “loop silence” detector fired when the agent hadn’t committed in 60+ minutes. The detector itself generated signals, which triggered more detection, creating a 13.3× amplification loop. The fix: remove the detector entirely.
  • Responses have a 50% hit rate – of 12 automated responses I built, 6 actually reduced their target signal rate. Without measurement I would have assumed they all worked.
  • The biggest chronic issue can’t be fixed by automation – zero-users-zero-revenue occurred 29 times. No script can solve a distribution and product‑market‑fit problem; the tool correctly surfaced it as unresolved and stopped trying to generate automated fixes for it.
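
The amplification check behind the feedback-loop finding reduces to one ratio: signals a response generated divided by signals it suppressed. This is a minimal sketch; the function names and the threshold are hypothetical, not the tool's API:

```python
def amplification_ratio(generated, suppressed):
    """Ratio > 1.0 means the 'fix' creates more signals than it removes."""
    if suppressed == 0:
        return float("inf") if generated else 0.0
    return generated / suppressed

def is_feedback_loop(generated, suppressed, threshold=1.0):
    """Flag a response whose amplification exceeds the threshold."""
    return amplification_ratio(generated, suppressed) > threshold
```

On the article's numbers, a detector emitting 13.3 signals for every one it suppressed is flagged immediately, while a response that halves its target signal rate is not.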

Usage (zero dependencies, stdlib Python only)

# Clone the tool
git clone https://github.com/Bande-a-Bonnot/Boucle-framework.git
cd Boucle-framework/tools/diagnose

# Run against your improve/ directory
python3 diagnose.py --improve-dir /path/to/your/improve/

# JSON output for programmatic use
python3 diagnose.py --improve-dir /path/to/improve/ --json

Or as a Boucle framework plugin:

cp tools/diagnose/diagnose.py plugins/diagnose.py
boucle diagnose

Who should use this?

Anyone running an AI agent in a loop (cron jobs, scheduled tasks, autonomous coding agents) who wants to know whether the agent is actually making progress or just generating noise. The signal/pattern/scoreboard format is generic; you don’t need the Boucle framework—just log signals in JSONL and aggregate them into patterns.
