The Wipe & Inject Pattern: Full Context for Implementation After Long Planning Sessions

Published: December 10, 2025 at 06:30 PM EST
3 min read
Source: Dev.to

The Wall

If you use Claude Code (or any agentic tool) for serious development, you have probably hit “The Wall.”

Scenario

Phase 1 – Planning

  • Spend ~45 minutes debating the architecture.
  • Ask Claude to read 20 files, check dependencies, and plan the auth system.
  • Cost: ~150 k tokens.
  • Result: A perfect plan.

Phase 2 – Implementation

  • You say: “Great, write the code.”
  • Claude responds: “I need to compact my memory to proceed.”

The Workaround Everyone Uses

  1. “OK, save the plan to a .md file first.”
  2. Claude compacts (loses context).
  3. “Now read the .md file you just created.”
  4. Claude re‑explores files mentioned in the plan.
  5. “Update the .md with your progress as you go.”
  6. Repeat steps 2‑5 whenever context fills up.

The Problem

When the agent “compacts” context, it retains the WHAT (e.g., “We are building auth”) but discards the WHY (e.g., “We chose cookies over headers because of XSS concerns”).
The implementation phase starts with a low‑resolution view, causing the agent to forget constraints, re‑ask questions, and produce buggy code.

The Solution – “Wipe & Inject” Pattern

We faced this daily while building Grov. To fix it, we built an orchestration flow called Planning CLEAR, which turns a “run‑on‑sentence” session into a “chapter‑book” session.

How It Works (The Logic)

  1. Detect Completion

    • A small model (Claude Haiku) monitors the session.
    • When it detects a switch from Planning to Implementation, it triggers a CLEAR event.
  2. Extract the Signal

    • Before wiping memory, two data points are extracted into a JSON structure:
    {
      "key_decisions": [
        "Use Zod for validation"
      ],
      "reasoning_trace": [
        "Because Joi doesn't support type inference"
      ]
    }
  3. The “Wipe” (Reset)

    • The messages[] array is emptied completely.
    • Old Context Usage: 150 k tokens → New Context Usage: 0 tokens.
  4. The “Inject”

    • The structured summary is injected directly into the system_prompt of the new session, giving the agent “full recall” of architectural constraints.
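The four steps above can be sketched in a few lines. This is a hypothetical simplification for illustration only, not Grov's actual implementation; the `Session` and `ExtractedSignal` shapes are assumptions:

```typescript
// Hypothetical shapes; Grov's real types will differ.
interface Session {
  systemPrompt: string;
  messages: { role: string; content: string }[];
}

interface ExtractedSignal {
  key_decisions: string[];
  reasoning_trace: string[];
}

// Steps 3 + 4: empty the messages[] array, then inject the structured
// summary into the system prompt of the fresh session.
function wipeAndInject(session: Session, signal: ExtractedSignal): Session {
  const summary = [
    "Key decisions:",
    ...signal.key_decisions.map((d) => `- ${d}`),
    "Reasoning trace:",
    ...signal.reasoning_trace.map((r) => `- ${r}`),
  ].join("\n");

  return {
    // The agent starts with zero conversational history...
    messages: [],
    // ...but keeps "full recall" of the decisions and their rationale.
    systemPrompt: `${session.systemPrompt}\n\n## Prior planning context\n${summary}`,
  };
}
```

The key design point is that the summary goes into the system prompt rather than back into `messages[]`, so it survives any future compaction of the conversation itself.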

The Result

When you start typing code, you aren’t fighting for the last 50 k tokens of space. You have a fresh ~195 k token window, but the agent still remembers all decisions and their rationale.

Bonus – The “Heartbeat” (Solving the 5‑Minute Timeout)

Anthropic’s prompt cache expires after 5 minutes of inactivity. A 10‑minute coffee break kills the “warm cache,” causing the next prompt to cost full price and take longer.

The Fix

We added a --extended-cache flag to Grov. It runs a background heartbeat that sends a minimal token (a single .) to the API every 4 minutes while idle.

  • Cost: ≈ $0.002 per keep‑alive request (roughly every 4 minutes).
  • Value: Keeps the session “hot” indefinitely.
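A minimal version of such a heartbeat might look like the sketch below. The `sendPing` callback is a placeholder for whatever minimal API call keeps the cache alive, not Grov's real code:

```typescript
// Hypothetical keep-alive: ping every 4 minutes (inside the 5-minute
// cache TTL) so the cached prompt prefix stays warm while the user is idle.
const FOUR_MINUTES_MS = 4 * 60 * 1000;

// `sendPing` is assumed to issue a minimal request (e.g. a single "."),
// whatever that looks like for the proxy in question.
function startHeartbeat(
  sendPing: () => Promise<void>,
  intervalMs: number = FOUR_MINUTES_MS,
): () => void {
  const timer = setInterval(() => {
    void sendPing().catch(() => {
      // Ignore transient network errors; the next tick retries.
    });
  }, intervalMs);

  // Caller stops the heartbeat once the user resumes typing.
  return () => clearInterval(timer);
}
```

Running the pings slightly inside the TTL (4 of 5 minutes) leaves headroom for network latency, so the cache never lapses between keep-alives.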

Try It Out (Open Source)

We built these workflows into Grov, our open‑source proxy for Claude Code.

  • Repository:
  • Install: npm install -g grov

If you’re tired of running out of tokens or losing context mid‑implementation, give it a shot and let us know whether this pattern improves your workflow!
