Prompt-caching – auto-injects Anthropic cache breakpoints (90% token savings)

Published: (March 13, 2026 at 07:38 AM EDT)
2 min read
Source: Hacker News

Source: Hacker News

MCP Plugin · Open Source · MIT
Built with Claude, for Claude.

How prompt‑caching works

Anthropic’s caching API stores stable content server‑side for 5 minutes. Cache reads cost 0.1× instead of 1×. This plugin places the breakpoints automatically.

🐛 BugFix Mode

Detects stack traces in your messages. Caches the buggy file + error context once. Every follow‑up only pays for the new question.

♻️ Refactor Mode

Detects refactor keywords + file lists. Caches the before‑pattern, style guides, and type definitions. Only per‑file instructions are re‑sent.

📂 File Tracking

Tracks read counts per file. On the second read, injects a cache breakpoint. All future reads cost 0.1× instead of 1×. (always on — all modes)

🧊 Conversation Freeze

After N turns, freezes all messages before turn (N − 3) as a cached prefix. Only the last 3 turns are sent fresh. Savings compound.

Benchmarks

Measured on real Claude Code sessions with Sonnet. Break‑even at turn 2.

Session typeTurnsWithout cachingWith cachingSavings
Bug fix (single file)20184 000 tokens28 400 tokens85 %
Refactor (5 files)15310 000 tokens61 200 tokens80 %
General coding40890 000 tokens71 200 tokens92 %
Repeated file reads (5 × 5)50 000 tokens5 100 tokens90 %

Cache creation costs 1.25× normal. Cache reads cost 0.1×. Every turn after the first is pure savings.

Install prompt‑caching

⏳ Pending approval in the official Claude Code plugin marketplace. Install directly from GitHub in the meantime:

/plugin marketplace add https://github.com/flightlesstux/prompt-caching
/plugin install prompt-caching@ercan-ermis

Claude Code’s plugin system handles everything automatically. The get_cache_stats tool is available immediately after install.

Install globally via npm

npm install -g prompt-caching-mcp

Add to your client’s MCP config

{
  "mcpServers": {
    "prompt-caching-mcp": {
      "command": "prompt-caching-mcp"
    }
  }
}

Supported MCP‑compatible clients include Cursor, Windsurf, ChatGPT, Perplexity, Zed, Continue.dev, and any other MCP client.


Open source · MIT · Zero lock‑in

Ready to cut your Claude Code token costs by 90%?

0 views
Back to Blog

Related posts

Read more »

Claude March 2026 usage promotion

We're offering a limited-time promotion that doubles usage limits for Claude users outside 8 AM‑2 PM ET / 5‑11 AM PT. This promotion is available for Free, Pro,...