I Almost Went Broke Letting AI Agents Work for Me
Source: Dev.to
Introduction
AI agents are powerful, but they can also be expensive in a very quiet way.
When I use a normal chatbot, I send one message and get one answer—the cost is easy to understand.
When I let an AI coding agent work, it may read files, edit code, run tests, fail, retry, send more context, and call the model again and again.
Sometimes that is useful. Sometimes it is just stuck in a loop.
Most LLM dashboards only tell you what you spent after the money is already gone. I wanted something that could stop a dangerous agent run before the next provider call happens.
AgentCostFirewall
I built AgentCostFirewall, a local‑first OpenAI‑compatible proxy that sits between your AI agent and your model provider.

Cursor / Continue / OpenClaw / local agent
↓
AgentCostFirewall
↓
OpenAI‑compatible provider
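Because the proxy speaks the OpenAI-compatible API, wiring an agent through it is usually just a base-URL change. A minimal sketch, assuming the firewall listens on `http://localhost:8402/v1` (the port and path here are illustrative, not from the project):

```python
import os

# Hypothetical setup: AgentCostFirewall is assumed to listen at
# http://localhost:8402/v1. Most OpenAI-compatible agents and SDKs read
# the provider URL from an environment variable, so routing traffic
# through the proxy is one line:
os.environ["OPENAI_BASE_URL"] = "http://localhost:8402/v1"

print(os.environ["OPENAI_BASE_URL"])
```

From the agent's point of view nothing changes; the proxy forwards allowed requests to the real provider and blocks the rest.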
How it works
The idea is simple: detect risky or over‑budget agent runs before they burn your API budget.
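The core of a pre-call check can be sketched in a few lines. This is a simplified illustration, not the project's actual code: the flat per-token price, the budget, and the rough characters-per-token estimate are all assumptions for the example.

```python
# Illustrative pre-call budget check (not AgentCostFirewall's real logic).
PRICE_PER_1K_TOKENS = 0.01   # assumed flat rate, USD
RUN_BUDGET_USD = 2.00        # assumed per-run ceiling

def estimate_cost(prompt: str, max_output_tokens: int) -> float:
    # Crude token estimate: roughly 4 characters per token.
    prompt_tokens = len(prompt) / 4
    return (prompt_tokens + max_output_tokens) / 1000 * PRICE_PER_1K_TOKENS

def allow_call(spent_so_far: float, prompt: str, max_output_tokens: int) -> bool:
    """Decide *before* the provider call whether the projected total spend
    for this run would stay under the budget."""
    projected = spent_so_far + estimate_cost(prompt, max_output_tokens)
    return projected <= RUN_BUDGET_USD

print(allow_call(1.99, "refactor this file" * 100, 4000))  # over budget -> False
```

The point is the ordering: the check runs against the request itself, so an over-budget call is rejected instead of being billed and reported later.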
Current features
- Pre‑call budget checks
- Over‑budget blocking
- Basic runaway‑loop detection
- Exact cache with savings metrics
- Local dashboard
- Password authentication
- Streaming passthrough
- Tool‑call passthrough
- No‑key demo mode
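As one example of what "basic runaway-loop detection" can mean, here is a sketch that fingerprints recent requests and flags an agent that keeps resending (near-)identical ones. The window size and repeat threshold are illustrative assumptions, not the project's actual values:

```python
import hashlib
from collections import deque

# Illustrative loop detector: flag a run when the same request body shows
# up repeatedly within a sliding window of recent calls.
class LoopDetector:
    def __init__(self, window: int = 10, max_repeats: int = 3):
        self.recent = deque(maxlen=window)   # fingerprints of recent requests
        self.max_repeats = max_repeats

    def looks_stuck(self, request_body: str) -> bool:
        fingerprint = hashlib.sha256(request_body.encode()).hexdigest()
        repeats = self.recent.count(fingerprint)
        self.recent.append(fingerprint)
        return repeats >= self.max_repeats

detector = LoopDetector()
verdicts = [detector.looks_stuck("run tests; fix error X") for _ in range(5)]
print(verdicts)  # early repeats pass, later ones are flagged
```

A proxy is a natural place for this check, since every provider call from the agent already flows through it.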
Repository
GitHub:
Call for feedback
I am looking for feedback from people using Cursor, Continue.dev, OpenClaw, Codex API‑key mode, Cline, Roo Code, or custom local agents.
Would you put something like this in front of your AI agent?