Your AI Agent Is Dumpster Diving Through Your Code

Published: March 9, 2026 at 03:54 PM EDT
6 min read
Source: Dev.to


Introduction

…and we built something to stop it.

There’s a pattern every developer using AI agents eventually notices. You ask the agent to find where authentication is handled. It opens a file, skims 2,000 lines, opens another file, skims that, opens a third. By the time it answers, it’s consumed 40,000 tokens — most of them irrelevant — and your context window is half‑gone before the real work starts.

We call this dumpster diving. The agent isn’t reading strategically; it’s digging through everything looking for something edible.

We’ve been watching this happen across millions of sessions with jCodeMunch and jDocMunch. And we built something to fix it: jMRI — the jMunch Retrieval Interface.

Today we’re publishing the spec, the benchmark, and an open‑source SDK. All Apache 2.0.

The Numbers

We ran the benchmark against two real codebases: FastAPI and Flask. Three methods were compared: naive file reading, chunk RAG, and jMRI retrieval via jCodeMunch. Ten queries per repo. Here’s what came out.

FastAPI (~950K source tokens)

| Method                 | Avg Tokens | Cost/Query | Precision |
|------------------------|------------|------------|-----------|
| Naive (read all files) | 949,904    | $2.85      | 100%      |
| Chunk RAG              | 330,372    | $0.99      | 74%       |
| jMRI                   | 480        | $0.0014    | 96%       |

Flask (~148K source tokens)

| Method                 | Avg Tokens | Cost/Query | Precision |
|------------------------|------------|------------|-----------|
| Naive (read all files) | 147,854    | $0.44      | 100%      |
| Chunk RAG              | 55,251     | $0.17      | 80%       |
| jMRI                   | 480        | $0.0014    | 96%       |

jMRI uses 1,979× fewer tokens than naive reading on FastAPI. It also beats chunk RAG on precision: 96% vs. 74%.

That last point matters. The usual assumption is that precision is the trade‑off you make for efficiency. Chunk RAG is cheaper than naive but misses more. jMRI is cheaper than both and misses less. That’s not a coincidence — it’s a consequence of using structure instead of text similarity.

Reproduce it yourself in under 5 minutes

git clone https://github.com/jgravelle/mcp-retrieval-spec
cd mcp-retrieval-spec/benchmark
python benchmark.py --all

Why Chunk RAG Loses on Precision

Chunk RAG splits files into overlapping windows of text and ranks them by keyword overlap or embedding similarity. A chunk boundary might fall in the middle of a function. The top‑ranked chunk might contain the right words but not the right code. The retrieval is approximate by design.
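To make the boundary problem concrete, here is a minimal sketch of fixed‑size sliding‑window chunking. The chunk size, overlap, and sample source are arbitrary illustrations, not any particular RAG pipeline:

```python
# Illustration: fixed-size character chunking has no awareness of syntax,
# so a chunk boundary can land in the middle of a function.
source = '''def authenticate(token):
    user = decode(token)
    if user is None:
        raise AuthError("invalid token")
    return user

def decode(token):
    return TOKENS.get(token)
'''

def chunk(text, size=80, overlap=20):
    """Naive sliding-window chunker over raw characters."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]

chunks = chunk(source)
# The first window ends partway through `authenticate`: the chunk contains
# the right keywords but not the whole function body.
print(chunks[0])
```

A retriever ranking these windows can return a chunk that mentions `authenticate` yet omits its return path entirely, which is exactly the "right words, wrong code" failure described above.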

jMRI retrieval is structurally exact. jCodeMunch parses source files into an AST‑derived index: every function, class, and method is a named, addressable symbol with a stable ID. When you search for "OAuth2 password bearer authentication", you get back IDs like fastapi/security/oauth2.py::OAuth2PasswordBearer#class. When you retrieve that ID, you get exactly the class — no more, no less. No boundary accidents. No half‑functions.
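As an illustration of the idea (not jCodeMunch's actual implementation), a toy AST‑derived index over Python source can be built with the standard `ast` module, using the `path::Name#kind` ID shape shown above:

```python
# Sketch: build stable symbol IDs from an AST and map each ID to the exact
# source span of that definition. Illustrative only.
import ast

def index_symbols(path, source):
    """Map 'path::Name#kind' IDs to the exact source of each top-level symbol."""
    tree = ast.parse(source)
    lines = source.splitlines(keepends=True)
    symbols = {}
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            kind = "class" if isinstance(node, ast.ClassDef) else "function"
            symbol_id = f"{path}::{node.name}#{kind}"
            # end_lineno is inclusive, so slice up to and including it.
            symbols[symbol_id] = "".join(lines[node.lineno - 1:node.end_lineno])
    return symbols

source = "class OAuth2PasswordBearer:\n    pass\n\ndef get_db():\n    pass\n"
index = index_symbols("security/oauth2.py", source)
# Retrieval by ID returns exactly the named definition: no boundary accidents.
print(index["security/oauth2.py::OAuth2PasswordBearer#class"])
```

Because the index is keyed on parsed definitions rather than character offsets, a retrieved ID can never come back as half a function.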

The 96% precision figure reflects cases where the top search result was the correct symbol for the query. The remaining 4% were genuinely ambiguous queries, ones where even a human would have debated the right answer.

What Is jMRI?

jMRI (jMunch Retrieval Interface) is an open specification for MCP servers that do retrieval right.

Four operations, one response envelope, two compliance levels:

Agent
 ├─ discover()    → What knowledge sources are available?
 ├─ search(query) → Which symbols/sections are relevant? (IDs + summaries only)
 ├─ retrieve(id)  → Give me the exact source for this ID.
 └─ metadata(id?) → What would naive reading have cost?
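A minimal in‑memory sketch of those four operations might look like this in Python. The class name, backing store, and return shapes are assumptions for illustration, not the normative spec:

```python
# Toy jMRI-style server: four operations over a dict-backed symbol index.
class InMemoryMRIServer:
    def __init__(self, symbols):
        # symbols: {symbol_id: {"summary": str, "source": str}}
        self._symbols = symbols

    def discover(self):
        """What knowledge sources are available?"""
        return {"sources": ["in-memory-index"], "symbol_count": len(self._symbols)}

    def search(self, query):
        """Which symbols are relevant? IDs + summaries only, never source."""
        q = query.lower()
        return [
            {"id": sid, "summary": meta["summary"]}
            for sid, meta in self._symbols.items()
            if q in meta["summary"].lower() or q in sid.lower()
        ]

    def retrieve(self, symbol_id):
        """Give me the exact source for this ID."""
        return {"id": symbol_id, "source": self._symbols[symbol_id]["source"]}

    def metadata(self, symbol_id=None):
        """What would naive reading have cost? (Characters here, tokens in practice.)"""
        return {"naive_chars": sum(len(m["source"]) for m in self._symbols.values())}

server = InMemoryMRIServer({
    "security/oauth2.py::OAuth2PasswordBearer#class": {
        "summary": "OAuth2 password bearer authentication",
        "source": "class OAuth2PasswordBearer: ...",
    }
})
hits = server.search("oauth2")
print(server.retrieve(hits[0]["id"])["source"])
```

Note the division of labor: `search` is cheap because it never returns source, and `retrieve` is exact because it only accepts IDs that `search` handed out.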

Every response includes a _meta block:

{
  "source": "def get_db():\n    db = SessionLocal()\n    try:\n        yield db\n    finally:\n        db.close()\n",
  "_meta": {
    "tokens_saved": 42318,
    "total_tokens_saved": 1284950,
    "cost_avoided": { "claude-sonnet-4-6": 0.127 },
    "timing_ms": 12
  }
}

The agent doesn’t have to guess whether it’s being efficient. It knows, on every call.

The spec is deliberately minimal. We’re not trying to build a platform; we’re trying to name a pattern that already works at scale and make it easy for others to implement.

The Implementations

The spec is open. The best implementations are commercial.

| Implementation | Domain                          | Stars | Install            |
|----------------|---------------------------------|-------|--------------------|
| jCodeMunch     | Code (30+ languages)            | 900+  | uvx jcodemunch-mcp |
| jDocMunch      | Docs (MD, RST, HTML, notebooks) | 45+   | uvx jdocmunch-mcp  |

Both implement jMRI‑Full — the complete spec including batch retrieval, hash‑based drift detection, byte‑offset addressing, and the full _meta envelope.

The two servers have collectively saved over 18 billion tokens across user sessions in the first week of March 2026. That number is computed on‑device from real session telemetry — every participating response reports tokens_saved via os.stat, no estimation.
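One way a server might derive tokens_saved from os.stat without reading the file is sketched below. The 4‑characters‑per‑token heuristic is an assumption of this sketch, not necessarily what jCodeMunch uses:

```python
# Sketch: estimate tokens avoided by returning one symbol instead of the
# whole file. os.stat gives the file's byte size without reading it.
import os
import tempfile

CHARS_PER_TOKEN = 4  # rough heuristic; an assumption, not the spec's value

def tokens_saved(file_path, returned_source):
    """Whole-file token cost (from os.stat) minus tokens actually returned."""
    file_tokens = os.stat(file_path).st_size // CHARS_PER_TOKEN
    returned_tokens = len(returned_source) // CHARS_PER_TOKEN
    return max(file_tokens - returned_tokens, 0)

# Demo against a throwaway file standing in for a large source module
with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
    f.write("x = 1\n" * 1000)
snippet = "x = 1\n"  # the one symbol actually retrieved
print(tokens_saved(f.name, snippet))
os.unlink(f.name)
```

Since `os.stat` touches only filesystem metadata, the accounting adds no read I/O to the retrieval path.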

Getting Started

Claude Code

Add to ~/.claude.json:

{
  "mcpServers": {
    "jcodemunch-mcp": {
      "command": "uvx",
      "args": ["jcodemunch-mcp"]
    },
    "jdocmunch-mcp": {
      "command": "uvx",
      "args": ["jdocmunch-mcp"]
    }
  }
}

Python SDK

pip install jmri-sdk

from jmri.client import MRIClient

client = MRIClient()

# What’s indexed?
sources = client.discover()

# Find it
results = client.search(
    "database session dependency injection",
    repo="fastapi/fastapi"
)

# Get exactly that
symbol = client.retrieve(results[0]["id"], repo="fastapi/fastapi")
print(symbol["source"])
print(f"Tokens saved this call: {symbol['_meta']['tokens_saved']:,}")

TypeScript SDK

npm install @jmri/sdk

import { MRIClient } from "@jmri/sdk";

const client = new MRIClient();

// Discover available sources
const sources = await client.discover();

// Search for a symbol
const results = await client.search(
  "database session dependency injection",
  { repo: "fastapi/fastapi" }
);

// Retrieve the exact source
const symbol = await client.retrieve(results[0].id, { repo: "fastapi/fastapi" });
console.log(symbol.source);
console.log(`Tokens saved this call: ${symbol._meta.tokens_saved.toLocaleString()}`);

Example Usage (TypeScript)

import { MRIClient } from "@jmri/sdk";

const client = new MRIClient();
const results = await client.search("OAuth2 bearer auth", { repo: "fastapi/fastapi" });
const symbol = await client.retrieve(results[0].id, { repo: "fastapi/fastapi" });

The Open Spec

Everything is hosted at github.com/jgravelle/mcp-retrieval-spec.

  • SPEC.md – the full jMRI v1.0 specification (Apache 2.0)
  • sdk/python/ – Python client helper
  • sdk/typescript/ – TypeScript client
  • reference/server.py – minimal jMRI‑compliant MCP server
  • examples/ – Claude Code, Cursor, and generic agent integrations

The spec is intentionally minimal. PRs that improve examples or add language SDKs are welcome. PRs that extend the core interface need a strong argument.

If you’re building a retrieval MCP server, implement jMRI‑Core. Your users’ agents will thank you.
— J. Gravelle, March 2026

Benchmark source:
github.com/jgravelle/mcp-retrieval-spec/benchmark

SDK installation:

pip install jmri-sdk      # Python
npm install @jmri/sdk     # TypeScript / JavaScript

Spec repository:
github.com/jgravelle/mcp-retrieval-spec
