The Wipe & Inject Pattern: Full Context for Implementation After Long Planning Sessions
Source: Dev.to

The Wall
If you use Claude Code (or any agentic tool) for serious development, you have probably hit “The Wall.”
Scenario
Phase 1 – Planning
- Spend ~45 minutes debating the architecture.
- Ask Claude to read 20 files, check dependencies, and plan the auth system.
- Cost: ~150 k tokens.
- Result: A perfect plan.
Phase 2 – Implementation
- You say: “Great, write the code.”
- Claude responds: “I need to compact my memory to proceed.”
The Workaround Everyone Uses
1. “OK, save the plan to a `.md` file first.”
2. Claude compacts (loses context).
3. “Now read the `.md` file you just created.”
4. Claude re‑explores the files mentioned in the plan.
5. “Update the `.md` with your progress as you go.”
6. Repeat steps 2–5 whenever context fills up.
The Problem
When the agent “compacts” context, it retains the WHAT (e.g., “We are building auth”) but discards the WHY (e.g., “We chose cookies over headers because of XSS concerns”).
The implementation phase starts with a low‑resolution view, causing the agent to forget constraints, re‑ask questions, and produce buggy code.
The Solution – “Wipe & Inject” Pattern
We faced this daily while building Grov. To fix it, we built an orchestration flow called Planning CLEAR, which turns a “run‑on‑sentence” session into a “chapter‑book” session.
How It Works (The Logic)
1. Detect Completion
   - A small model (Claude Haiku) monitors the session.
   - When it detects a switch from Planning to Implementation, it triggers a CLEAR event.
2. Extract the Signal
   - Before wiping memory, two data points are extracted into a JSON structure:

```json
{
  "key_decisions": ["Use Zod for validation"],
  "reasoning_trace": ["Because Joi doesn't support type inference"]
}
```
3. The “Wipe” (Reset)
   - The `messages[]` array is emptied completely.
   - Old context usage: 150 k tokens → new context usage: 0 tokens.
4. The “Inject”
   - The structured summary is injected directly into the `system_prompt` of the new session, giving the agent “full recall” of architectural constraints.
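The wipe-and-inject step can be sketched in a few lines. This is an illustrative sketch, not Grov's actual code: the `Session`, `Summary`, and `wipeAndInject` names are made up here, and the injection format is one plausible choice.

```typescript
// Hypothetical shapes for a session and the extracted summary.
type Summary = {
  key_decisions: string[];
  reasoning_trace: string[];
};

type Message = { role: "user" | "assistant" | "system"; content: string };

interface Session {
  systemPrompt: string;
  messages: Message[];
}

// Turn the summary into a system-prompt block so the fresh session
// keeps both the WHAT (decisions) and the WHY (reasoning).
function buildInjection(summary: Summary): string {
  return [
    "## Prior decisions (do not re-litigate)",
    ...summary.key_decisions.map(
      (d, i) => `- ${d} (why: ${summary.reasoning_trace[i] ?? "n/a"})`
    ),
  ].join("\n");
}

// Empty the messages[] array and append the summary to the system prompt.
function wipeAndInject(session: Session, summary: Summary): Session {
  return {
    systemPrompt: session.systemPrompt + "\n\n" + buildInjection(summary),
    messages: [], // context usage drops back to ~0 tokens
  };
}
```

The key design point is that the summary lives in the system prompt rather than in the message history, so it survives every subsequent wipe.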
The Result
When you start typing code, you aren’t fighting for the last 50 k tokens of space. You have a fresh ~195 k token window, but the agent still remembers all decisions and their rationale.
Bonus – The “Heartbeat” (Solving the 5‑Minute Timeout)
Anthropic’s prompt cache expires after 5 minutes of inactivity. A 10‑minute coffee break kills the “warm cache,” causing the next prompt to cost full price and take longer.
The Fix
We added a --extended-cache flag to Grov. It runs a background heartbeat that sends a minimal token (a single .) to the API every 4 minutes while idle.
- Cost: ≈ $0.002 per keep‑alive request (roughly every 4 minutes).
- Value: Keeps the session “hot” indefinitely.
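The heartbeat itself is just a timer that fires inside the cache's TTL. A minimal sketch, assuming a caller-supplied `sendMinimalRequest` that posts the single-`.` keep-alive to the API (the actual Grov internals may differ):

```typescript
const FOUR_MINUTES_MS = 4 * 60 * 1000;

// Start a background heartbeat that pings before the ~5-minute
// prompt-cache TTL expires. Returns a function that stops it.
function startHeartbeat(
  sendMinimalRequest: () => Promise<void>,
  intervalMs: number = FOUR_MINUTES_MS
): () => void {
  const timer = setInterval(() => {
    // Fire-and-forget: a failed keep-alive just means a cold cache.
    void sendMinimalRequest();
  }, intervalMs);
  return () => clearInterval(timer);
}
```

The interval only needs to be shorter than the cache TTL; 4 minutes leaves a safety margin against network latency.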
Try It Out (Open Source)
We built these workflows into Grov, our open‑source proxy for Claude Code.
- Repository:
- Install: `npm install -g grov`
If you’re tired of running out of tokens or losing context mid‑implementation, give it a shot and let us know whether this pattern improves your workflow!