Prompt-caching – auto-injects Anthropic cache breakpoints (90% token savings)
Source: Hacker News
MCP Plugin · Open Source · MIT
Built with Claude, for Claude.
How prompt‑caching works
Anthropic’s caching API stores stable content server‑side for 5 minutes. Cache reads cost 0.1× instead of 1×. This plugin places the breakpoints automatically.
🐛 BugFix Mode
Detects stack traces in your messages. Caches the buggy file + error context once. Every follow‑up only pays for the new question.
♻️ Refactor Mode
Detects refactor keywords + file lists. Caches the before‑pattern, style guides, and type definitions. Only per‑file instructions are re‑sent.
📂 File Tracking
Tracks read counts per file. On the second read, injects a cache breakpoint. All future reads cost 0.1× instead of 1×. (always on — all modes)
🧊 Conversation Freeze
After N turns, freezes all messages before turn (N − 3) as a cached prefix. Only the last 3 turns are sent fresh. Savings compound.
Benchmarks
Measured on real Claude Code sessions with Sonnet. Break‑even at turn 2.
| Session type | Turns | Without caching | With caching | Savings |
|---|---|---|---|---|
| Bug fix (single file) | 20 | 184 000 tokens | 28 400 tokens | 85 % |
| Refactor (5 files) | 15 | 310 000 tokens | 61 200 tokens | 80 % |
| General coding | 40 | 890 000 tokens | 71 200 tokens | 92 % |
| Repeated file reads (5 × 5) | — | 50 000 tokens | 5 100 tokens | 90 % |
Cache creation costs 1.25× normal. Cache reads cost 0.1×. Every turn after the first is pure savings.
Install prompt‑caching
Claude Code (recommended)
⏳ Pending approval in the official Claude Code plugin marketplace. Install directly from GitHub in the meantime:
/plugin marketplace add https://github.com/flightlesstux/prompt-caching
/plugin install prompt-caching@ercan-ermisClaude Code’s plugin system handles everything automatically. The get_cache_stats tool is available immediately after install.
Install globally via npm
npm install -g prompt-caching-mcpAdd to your client’s MCP config
{
"mcpServers": {
"prompt-caching-mcp": {
"command": "prompt-caching-mcp"
}
}
}Supported MCP‑compatible clients include Cursor, Windsurf, ChatGPT, Perplexity, Zed, Continue.dev, and any other MCP client.
Open source · MIT · Zero lock‑in
Ready to cut your Claude Code token costs by 90%?