Repo Optimizer: I Let a KISS AI Agent Optimize Itself Overnight. It Cut Its Own Cost by 98%.
Source: Dev.to
The Setup
I maintain KISS, a minimalist multi‑agent framework built on one principle: keep it simple.
The framework’s flagship coding agent, RelentlessCodingAgent, is a single‑agent system with smart auto‑continuation—it runs sub‑sessions of an LLM‑powered coding loop, tracks progress across sessions, and keeps hammering at a task until it succeeds or exhausts its budget. The agent was self‑evolved to run relentlessly.
It works, but it was expensive. A single run with Claude Sonnet 4.5 cost $3–5 and took 600–800 s. For a framework that preaches simplicity and efficiency, that felt like hypocrisy.
So I built a 69‑line Python script and told it, in plain English, to fix the problem.

The Tool: repo_optimizer.py
The entire optimizer is a RelentlessCodingAgent pointed at its own source code. Below is the core of the script:
from kiss.agents.coding_agents.relentless_coding_agent import RelentlessCodingAgent
TASK = """
Can you run 'uv run src/kiss/agents/coding_agents/relentless_coding_agent.py'
in the background so that I can see its output and you monitor the output in real time?
If you observe any repeated errors in the output, please fix them and run the command again.
Once the command succeeds, analyze the output and optimize
src/kiss/agents/coding_agents/relentless_coding_agent.py
so that it runs reliably, faster, and with less cost.
Keep repeating the process until the running time and the cost are reduced significantly,
such 99%.
...
"""
# PROJECT_ROOT is defined elsewhere in the script (not shown in this excerpt).
agent = RelentlessCodingAgent("RepoAgent")
result = agent.run(
    prompt_template=TASK,
    model_name="claude-opus-4-6",
    work_dir=PROJECT_ROOT,
)
That’s it. The agent runs itself, watches the output, diagnoses problems, edits its own code, and runs again — in a loop — until the numbers drop.
No gradient descent. No hyper‑parameter grid search. No reward model. Just an LLM reading logs and rewriting source files.
What the Optimizer Actually Does
The feedback loop works like this:
- Run the target agent on a benchmark task and capture the output.
- Monitor the logs in real time. If the agent crashes or hits repeated errors, fix the code and rerun.
- Analyze a successful run: wall‑clock time, token count, dollar cost.
- Optimize the source code using strategies specified in plain English — compress prompts, switch models, eliminate wasted steps.
- Repeat until the metrics plateau or the target reduction is hit.
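In the real optimizer this loop is carried out by the LLM following the task prompt, not by hand-written control flow. Still, the stopping logic can be made concrete with a minimal Python sketch. Everything here is hypothetical and for illustration only: `RunMetrics`, `optimize_until_plateau`, the `min_gain` plateau threshold, and the toy stand-in functions are not part of KISS, and the crash-repair step is left out for brevity.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class RunMetrics:
    seconds: float   # wall-clock time of one benchmark run
    cost_usd: float  # dollar cost of one benchmark run


def optimize_until_plateau(
    run_benchmark: Callable[[], RunMetrics],
    apply_optimization: Callable[[RunMetrics], None],
    target_reduction: float = 0.99,  # stop once cost is down by this fraction
    min_gain: float = 0.01,          # below this per-round gain, call it a plateau
    max_rounds: int = 20,            # hard bound so the loop always terminates
) -> List[RunMetrics]:
    """Run, measure, optimize, repeat; stop on target, plateau, or round limit."""
    history = [run_benchmark()]      # baseline run (assumes a non-zero cost)
    baseline = history[0]
    for _ in range(max_rounds):
        previous = history[-1]
        apply_optimization(previous)  # e.g. compress prompts, switch models
        current = run_benchmark()
        history.append(current)
        reduction = 1 - current.cost_usd / baseline.cost_usd
        gain = (previous.cost_usd - current.cost_usd) / previous.cost_usd
        if reduction >= target_reduction or gain < min_gain:
            break
    return history


# Toy stand-ins: each "optimization" halves the cost of the next run.
state = {"cost": 4.0}


def fake_run() -> RunMetrics:
    return RunMetrics(seconds=100.0, cost_usd=state["cost"])


def fake_optimize(_: RunMetrics) -> None:
    state["cost"] /= 2


history = optimize_until_plateau(fake_run, fake_optimize, target_reduction=0.9)
print([m.cost_usd for m in history])  # [4.0, 2.0, 1.0, 0.5, 0.25]
```

The `max_rounds` bound plays the same role as the agent's budget: even if neither the target nor a plateau is reached, the loop ends.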
The strategies themselves are just bullet points in the task prompt:
- Shorter system prompts that preserve meaning
- Remove redundant instructions
- Minimize conversation turns
- Batch operations, use early termination
- Search the web for agentic patterns that improve efficiency and reliability
The optimizer isn’t hard‑coded to apply any particular technique. It reads, reasons, experiments, and iterates. Which techniques it picks depend on what the logs reveal.
The Results
After running overnight, the optimizer produced this report:
| Metric | Before (Claude Sonnet 4.5) | After (Gemini 2.5 Flash) | Reduction |
|---|---|---|---|
| Time | ~600–800 s | 169.5 s | ~75 % |
| Cost | ~$3–5 | $0.12 | 96–98 % |
| Tokens | millions | 300,729 | massive |
All three benchmark tests passed after optimization: diamond dependency resolution, circular detection, and failure propagation.
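The reduction column can be checked directly from the before and after figures. The short sketch below recomputes it for both ends of each "before" range; `reduction_pct` is a helper name introduced here for illustration.

```python
def reduction_pct(before: float, after: float) -> float:
    """Percentage reduction when going from `before` to `after`."""
    return (1 - after / before) * 100


# Time: 600-800 s before, 169.5 s after.
time_low = reduction_pct(600, 169.5)   # about 71.75
time_high = reduction_pct(800, 169.5)  # about 78.81

# Cost: $3-5 before, $0.12 after.
cost_low = reduction_pct(3.0, 0.12)    # about 96.0
cost_high = reduction_pct(5.0, 0.12)   # about 97.6

print(f"Time reduction: {time_low:.0f}-{time_high:.0f} %")  # Time reduction: 72-79 %
print(f"Cost reduction: {cost_low:.0f}-{cost_high:.0f} %")  # Cost reduction: 96-98 %
```

So "~75 %" is the midpoint of a 72–79 % range, and the headline 98 % corresponds to the upper end of the original cost range.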
What the Optimizer Changed
The optimizer made nine concrete modifications, all discovered autonomously:
- Model switch – Claude Sonnet 4.5 ($3/$15 per M tokens) → Gemini 2.5 Flash ($0.30/$2.50 per M tokens) — 10× cheaper input, 6× cheaper output.
- Compressed prompts – Stripped verbose CODING_INSTRUCTIONS boilerplate, shortened TASK_PROMPT and CONTINUATION_PROMPT without losing meaning.
- Added Write() tool – The original agent only had Edit(), which fails on uniqueness conflicts. Each failure wasted 2–3 steps; adding Write() eliminated that.
- Stronger finish instruction – “IMMEDIATELY call finish once tests pass. NO extra verification.” – stopped the agent from burning tokens on redundant confirmation runs.
- Bash timeout guidance – “set timeout_seconds=120 for test runs” – prevented hangs on parallel bash execution.
- Bounded poll loops – “use bounded poll loops, never unbounded waits” – eliminated infinite‑loop risks on background processes.
- Reduced max_steps – 25 → 15. This forces the agent to be efficient while still leaving enough steps to complete the task.
- Simplified step threshold – Always max_steps - 2 instead of a complex adaptive calculation.
- Removed CODING_INSTRUCTIONS import – Eliminated unnecessary token overhead loaded into every prompt.
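Two of the tool-level changes above are easier to picture in code. The sketch below is not KISS's implementation: `edit_file`, `write_file`, and `wait_bounded` are hypothetical stand-ins. It assumes that "uniqueness conflict" means the text to replace must occur exactly once, and it reuses the 120-second figure from the timeout guidance as a default.

```python
import subprocess
import sys
import tempfile
import time
from pathlib import Path
from typing import Optional


def edit_file(path: Path, old: str, new: str) -> None:
    """Replace `old` with `new`, but only if `old` occurs exactly once."""
    text = path.read_text()
    count = text.count(old)
    if count != 1:
        raise ValueError(f"expected exactly 1 match for {old!r}, found {count}")
    path.write_text(text.replace(old, new))


def write_file(path: Path, content: str) -> None:
    """Overwrite the whole file; there is no uniqueness requirement."""
    path.write_text(content)


def wait_bounded(proc: subprocess.Popen, timeout_seconds: float = 120.0,
                 poll_interval: float = 0.1) -> Optional[int]:
    """Poll `proc` until it exits or the deadline passes; never wait forever.

    Returns the exit code, or None if the process was killed on timeout.
    """
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        code = proc.poll()   # non-blocking status check
        if code is not None:
            return code
        time.sleep(poll_interval)
    proc.kill()              # deadline passed: stop the process
    proc.wait()              # reap it so no zombie is left behind
    return None


# Edit fails when the target text is ambiguous; Write always succeeds.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "example.py"
    write_file(target, "x = 1\nx = 1\n")
    try:
        edit_file(target, "x = 1", "x = 2")
        edit_failed = False
    except ValueError:
        edit_failed = True
    write_file(target, "x = 2\nx = 1\n")
    final_text = target.read_text()

# A fast process returns its exit code; a slow one is killed at the deadline.
fast = subprocess.Popen([sys.executable, "-c", "pass"])
fast_code = wait_bounded(fast, timeout_seconds=10)
slow = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(30)"])
slow_code = wait_bounded(slow, timeout_seconds=0.5)

print(edit_failed, fast_code, slow_code)  # True 0 None
```

Under these assumptions, an ambiguous edit costs a failed call plus whatever steps the agent needs to recover, while a whole-file write has no such failure mode. The bounded wait trades a possibly premature kill for a guarantee that the loop ends.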
None of these changes are exotic; each is obvious in hindsight. Together they compound into a 96–98 % cost reduction. The point is that no human sat down and applied them: the optimizer discovered and validated each one through experimentation.
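A rough decomposition shows how much of the saving the model switch alone can explain. The sketch assumes cost is simply tokens times the listed per-token price, ignoring caching discounts and other pricing details. The variable names are illustrative, and the prices and run costs are the ones quoted in this article.

```python
# Price ratios between the two models (USD per million tokens).
input_price_ratio = 3.00 / 0.30    # 10x cheaper input
output_price_ratio = 15.00 / 2.50  # 6x cheaper output

# Overall cost factor: $3-5 before, $0.12 after.
overall_low = 3.0 / 0.12   # about 25x
overall_high = 5.0 / 0.12  # about 41.7x

# Whatever the input/output mix, the price change alone explains between
# 6x and 10x. The rest must come from using fewer tokens.
min_token_factor = overall_low / input_price_ratio    # about 2.5x
max_token_factor = overall_high / output_price_ratio  # about 6.9x

print(f"Overall: {overall_low:.0f}x to {overall_high:.1f}x cheaper")
print(f"Token usage: {min_token_factor:.1f}x to {max_token_factor:.1f}x lower")
```

Under that simple cost model, the cheaper model accounts for at most a 10× saving, so the prompt and step-count changes must have cut token usage by at least 2.5×.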
Why This Works
The RelentlessCodingAgent is a general‑purpose coding loop: it gets a task in natural language, has access to Bash, Read, Edit, and Write tools, and runs sub‑sessions until it succeeds. The repo_optimizer.py simply reuses this same loop, pointed inward.
This is possible because of three properties of the KISS framework:
- Agents are just Python functions. There’s no config c
(The original content ends here; the remainder of the explanation was truncated in the source.)
The Bigger Picture: repo_agent.py
The optimizer is actually a specialization of an even simpler tool: repo_agent.py. This is a 28‑line script that takes any task as a command‑line argument and executes it against your project root:
uv run python -m kiss.agents.coding_agents.repo_agent "Add retry logic to the API client."
The repo agent and the repo optimizer share the same engine (RelentlessCodingAgent) and the same interface (a string). The only difference is the task. The optimizer’s task happens to be “optimize this agent for speed and cost.” It could just as easily be “add comprehensive test coverage” or “migrate from REST to GraphQL.”
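The "interface is a string" idea can be sketched as a tiny command-line wrapper. This is not the actual repo_agent.py: `main` and `fake_runner` are hypothetical, and the `run_task` callable stands in for whatever the real script does with RelentlessCodingAgent.

```python
import argparse
from typing import Callable, Sequence


def main(argv: Sequence[str], run_task: Callable[[str], str]) -> str:
    """Parse one plain-English task from the command line and hand it to a runner."""
    parser = argparse.ArgumentParser(
        description="Run one plain-English task against a project."
    )
    parser.add_argument("task", help="the task, as a single quoted string")
    args = parser.parse_args(argv)
    return run_task(args.task)


# Hypothetical runner: a real one would invoke the coding agent instead.
def fake_runner(task: str) -> str:
    return f"done: {task}"


result = main(["Add retry logic to the API client."], run_task=fake_runner)
print(result)  # done: Add retry logic to the API client.
```

Because the runner only sees a string, swapping the optimizer's task for any other task needs no change to the wrapper.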
The agents in KISS don’t care what you ask them to do. They care about doing it relentlessly until it’s done.
Try It Yourself
# Install KISS
# https://github.com/ksenxx/kiss_ai/README.md
# Run the repo optimizer on your own codebase
uv run python -m kiss.agents.coding_agents.repo_optimizer
# Or give the repo agent any task in plain English
uv run python -m kiss.agents.coding_agents.repo_agent "Refactor the database layer for connection pooling."
The framework, the agents, and the optimizer are all open source:
github.com/ksenxx/kiss_ai
KISS is built by Koushik Sen. Contributions welcome.