Repo Optimizer: I Let a KISS AI Agent Optimize Itself Overnight. It Cut Its Own Cost by 98%.

Published: February 12, 2026 at 10:32 AM EST

Source: Dev.to

The Setup

I maintain KISS, a minimalist multi‑agent framework built on one principle: keep it simple.
The framework’s flagship coding agent, RelentlessCodingAgent, is a single‑agent system with smart auto‑continuation—it runs sub‑sessions of an LLM‑powered coding loop, tracks progress across sessions, and keeps hammering at a task until it succeeds or exhausts its budget. The agent was self‑evolved to run relentlessly.

It worked, but it was expensive. A single run with Claude Sonnet 4.5 cost $3–5 and took 600–800 s. For a framework that preaches simplicity and efficiency, that felt like hypocrisy.

So I built a 69‑line Python script and told it, in plain English, to fix the problem.

The Tool: `repo_optimizer.py`

The entire optimizer is a RelentlessCodingAgent pointed at its own source code. Below is the core of the script:

from kiss.agents.coding_agents.relentless_coding_agent import RelentlessCodingAgent

TASK = """
Can you run 'uv run src/kiss/agents/coding_agents/relentless_coding_agent.py'
in the background so that I can see its output and you monitor the output in real time?
If you observe any repeated errors in the output, please fix them and run the command again.
Once the command succeeds, analyze the output and optimize
src/kiss/agents/coding_agents/relentless_coding_agent.py
so that it runs reliably, faster, and with less cost.
Keep repeating the process until the running time and the cost are reduced significantly,
such 99%.
...
"""

# PROJECT_ROOT is defined elsewhere in the full script; this excerpt omits it.
agent = RelentlessCodingAgent("RepoAgent")
result = agent.run(
    prompt_template=TASK,
    model_name="claude-opus-4-6",
    work_dir=PROJECT_ROOT,
)

That’s it. The agent runs itself, watches the output, diagnoses problems, edits its own code, and runs again — in a loop — until the numbers drop.

No gradient descent. No hyper‑parameter grid search. No reward model. Just an LLM reading logs and rewriting source files.

What the Optimizer Actually Does

The feedback loop works like this:

1. Run the target agent on a benchmark task and capture the output.
2. Monitor the logs in real time. If the agent crashes or hits repeated errors, fix the code and rerun.
3. Analyze a successful run: wall‑clock time, token count, dollar cost.
4. Optimize the source code using strategies specified in plain English: compress prompts, switch models, eliminate wasted steps.
5. Repeat until the metrics plateau or the target reduction is hit.
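
The loop above can be sketched in a few lines of Python. This is only an illustration of the control flow, not the optimizer's real code: in the actual tool the LLM does all of this from a plain-English prompt. The names `RunMetrics`, `optimize_until_target`, and the three callables are hypothetical, and the plateau check is left out for brevity.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class RunMetrics:
    """Metrics from one successful benchmark run (hypothetical structure)."""
    seconds: float
    cost_usd: float


def optimize_until_target(
    run_benchmark: Callable[[], Optional[RunMetrics]],
    fix_errors: Callable[[], None],
    apply_optimization: Callable[[RunMetrics], None],
    target_reduction: float = 0.9,
    max_rounds: int = 10,
) -> list[RunMetrics]:
    """Run, repair on failure, measure, optimize, and repeat.

    run_benchmark returns None when the run crashed or hit repeated errors.
    The loop stops once cost has dropped by target_reduction relative to the
    first successful run, or after max_rounds rounds.
    """
    history: list[RunMetrics] = []
    for _ in range(max_rounds):
        metrics = run_benchmark()
        if metrics is None:
            fix_errors()              # failed run: repair the code, then rerun
            continue
        history.append(metrics)       # successful run: record its metrics
        baseline = history[0]
        reduction = 1.0 - metrics.cost_usd / baseline.cost_usd
        if reduction >= target_reduction:
            break                     # target reduction reached
        apply_optimization(metrics)   # otherwise edit the source and go again
    return history
```

The `max_rounds` cap matters as much as the target: without it, a target the agent can never reach would keep the loop running until the budget is gone.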

The strategies themselves are just bullet points in the task prompt:

Shorter system prompts that preserve meaning
Remove redundant instructions
Minimize conversation turns
Batch operations, use early termination
Search the web for agentic patterns that improve efficiency and reliability

The optimizer isn’t hard‑coded to apply any particular technique. It reads, reasons, experiments, and iterates. Which techniques it picks depend on what the logs reveal.

The Results

After running overnight, the optimizer produced this report:

Metric	Before (Claude Sonnet 4.5)	After (Gemini 2.5 Flash)	Reduction
Time	~600–800 s	169.5 s	~75 %
Cost	~$3–5	$0.12	96–98 %
Tokens	millions	300,729	massive
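
The reduction column follows directly from the before and after figures in the table. A quick check, using only those numbers (the helper name `reduction_pct` is mine):

```python
def reduction_pct(before: float, after: float) -> float:
    """Percentage reduction going from `before` to `after`."""
    return (1.0 - after / before) * 100.0


# Time: 600-800 s before, 169.5 s after.
time_low = reduction_pct(600, 169.5)
time_high = reduction_pct(800, 169.5)

# Cost: $3-5 before, $0.12 after.
cost_low = reduction_pct(3.0, 0.12)
cost_high = reduction_pct(5.0, 0.12)

print(f"time: {time_low:.0f}% to {time_high:.0f}%")  # time: 72% to 79%
print(f"cost: {cost_low:.0f}% to {cost_high:.0f}%")  # cost: 96% to 98%
```

So "~75%" is the midpoint of the time range, and the 98% in the headline is the upper end of the cost range, measured against a $5 baseline.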

All three benchmark tests passed after optimization: diamond dependency resolution, circular detection, and failure propagation.

What the Optimizer Changed

The optimizer made nine concrete modifications, all discovered autonomously:

Model switch – Claude Sonnet 4.5 ($3/$15 per M tokens) → Gemini 2.5 Flash ($0.30/$2.50 per M tokens) — 10× cheaper input, 6× cheaper output.
Compressed prompts – Stripped verbose CODING_INSTRUCTIONS boilerplate, shortened TASK_PROMPT and CONTINUATION_PROMPT without losing meaning.
Added Write() tool – The original agent only had Edit(), which fails on uniqueness conflicts. Each failure wasted 2–3 steps; adding Write() eliminated that.
Stronger finish instruction – “IMMEDIATELY call finish once tests pass. NO extra verification.” – stopped the agent from burning tokens on redundant confirmation runs.
Bash timeout guidance – “set timeout_seconds=120 for test runs” – prevented hangs on parallel bash execution.
Bounded poll loops – “use bounded poll loops, never unbounded waits” – eliminated infinite‑loop risks on background processes.
Reduced max_steps – 25 → 15. Forces the agent to be efficient while still leaving enough steps to complete the task.
Simplified step threshold – Always max_steps - 2 instead of a complex adaptive calculation.
Removed CODING_INSTRUCTIONS import – Eliminated unnecessary token overhead loaded into every prompt.
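
The timeout and bounded-poll items in the list above are prompt instructions, but the pattern they describe is easy to show in code. Below is a minimal sketch of a bounded poll loop around a background process, using only the standard library. The function names are hypothetical, and the 120-second default simply mirrors the `timeout_seconds=120` guidance quoted above; none of this is the agent's own Bash tool.

```python
import subprocess
import sys
import time


def python_cmd(code: str) -> list:
    """Build a command that runs a Python snippet with the current interpreter."""
    return [sys.executable, "-c", code]


def run_with_bounded_poll(cmd, timeout_seconds=120.0, poll_interval=0.5):
    """Start cmd in the background and poll it until a deadline.

    Returns the exit code, or None if the deadline passed. In that case the
    process is killed so it cannot hang the caller.
    """
    proc = subprocess.Popen(cmd)
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        code = proc.poll()      # non-blocking status check
        if code is not None:
            return code
        time.sleep(poll_interval)
    proc.kill()                 # deadline hit: stop waiting
    proc.wait()                 # reap the killed process
    return None
```

Calling `run_with_bounded_poll(python_cmd("pass"))` returns the exit code of a process that finishes quickly, while a process that sleeps past the deadline comes back as `None` instead of blocking forever.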

None of these changes are exotic; each is obvious in hindsight. Together they compound into a 96–98% cost reduction. The point is that no human sat down and applied them: the optimizer discovered and validated each one through experimentation.
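
To make the Edit() uniqueness problem from the list concrete, here is a small sketch of why an edit-only toolset wastes steps. It assumes Edit() works by replacing a snippet that must occur exactly once in the file, which is how I read "fails on uniqueness conflicts". The functions below are illustrative stand-ins, not the framework's actual tools.

```python
import tempfile
from pathlib import Path


class AmbiguousEditError(ValueError):
    """Raised when the text to replace does not occur exactly once."""


def edit_file(path: Path, old: str, new: str) -> None:
    """Replace old with new, but only if old occurs exactly once."""
    text = path.read_text()
    count = text.count(old)
    if count != 1:
        raise AmbiguousEditError(f"expected 1 match for {old!r}, found {count}")
    path.write_text(text.replace(old, new))


def write_file(path: Path, content: str) -> None:
    """Overwrite the whole file; no uniqueness check is needed."""
    path.write_text(content)


def demo() -> str:
    """Show an ambiguous edit failing and a full write succeeding."""
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "example.py"
        write_file(target, "x = 1\nx = 1\n")
        try:
            edit_file(target, "x = 1", "x = 2")   # two matches: rejected
        except AmbiguousEditError:
            write_file(target, "x = 2\nx = 1\n")  # fall back to a full rewrite
        return target.read_text()
```

With only the edit path available, the agent has to retry with more surrounding context until the snippet becomes unique. With a write path, one call settles it.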

Why This Works

The RelentlessCodingAgent is a general‑purpose coding loop: it gets a task in natural language, has access to Bash, Read, Edit, and Write tools, and runs sub‑sessions until it succeeds. The repo_optimizer.py script simply reuses this same loop, pointed inward.

This is possible because of three properties of the KISS framework:

Agents are just Python functions. There’s no config c

(The original content ends here; the remainder of the explanation was truncated in the source.)

The Bigger Picture: `repo_agent.py`

The optimizer is actually a specialization of an even simpler tool: repo_agent.py. This is a 28‑line script that takes any task as a command‑line argument and executes it against your project root:

uv run python -m kiss.agents.coding_agents.repo_agent "Add retry logic to the API client."

The repo agent and the repo optimizer share the same engine (RelentlessCodingAgent) and the same interface (a string). The only difference is the task. The optimizer’s task happens to be “optimize this agent for speed and cost.” It could just as easily be “add comprehensive test coverage” or “migrate from REST to GraphQL.”
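
A script of that shape might look roughly like the sketch below. This is not the real repo_agent.py, which I have not reproduced here: the `--work-dir` flag, the injectable `runner` parameter, and the choice of model name are assumptions for illustration. The only calls taken from the framework are the `RelentlessCodingAgent` constructor and `run(...)` keyword arguments shown in the optimizer excerpt earlier, and passing `work_dir` as a string is also an assumption.

```python
import argparse
from pathlib import Path
from typing import Callable, Optional, Sequence


def run_with_kiss(task: str, work_dir: str) -> object:
    """Hand the task to RelentlessCodingAgent, mirroring the optimizer's call."""
    # Imported lazily so the argument parsing below works without KISS installed.
    from kiss.agents.coding_agents.relentless_coding_agent import RelentlessCodingAgent

    agent = RelentlessCodingAgent("RepoAgent")
    return agent.run(
        prompt_template=task,
        model_name="claude-opus-4-6",  # assumed; the real script may choose differently
        work_dir=work_dir,
    )


def main(
    argv: Optional[Sequence[str]] = None,
    runner: Callable[[str, str], object] = run_with_kiss,
) -> object:
    """Parse one plain-English task from the command line and run it."""
    parser = argparse.ArgumentParser(
        description="Run one plain-English task against a repo."
    )
    parser.add_argument("task", help="what the agent should do")
    parser.add_argument(
        "--work-dir",
        default=str(Path.cwd()),
        help="project root (hypothetical flag)",
    )
    args = parser.parse_args(argv)
    return runner(args.task, args.work_dir)
```

Calling `main(["Add retry logic to the API client."])` would hand that string to the agent unchanged, which is the whole interface: the task is just text.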

The agents in KISS don’t care what you ask them to do. They care about doing it relentlessly until it’s done.

Try It Yourself

# Install KISS
# https://github.com/ksenxx/kiss_ai/README.md

# Run the repo optimizer on your own codebase
uv run python -m kiss.agents.coding_agents.repo_optimizer

# Or give the repo agent any task in plain English
uv run python -m kiss.agents.coding_agents.repo_agent "Refactor the database layer for connection pooling."

The framework, the agents, and the optimizer are all open source:
github.com/ksenxx/kiss_ai

KISS is built by Koushik Sen. Contributions welcome.
