Refactoring Agent Skills: From Context Explosion to a Fast, Reliable Workflow
Source: Dev.to
1️⃣ The Root Cause: Treating Skills Like Docs
The first trap is incredibly human:
“If I include everything, the model will always have what it needs.”
So you create one Skill per tool, and each Skill becomes a documentation dump:
- setup steps
- API references
- exhaustive examples
- “don’t do X” lists
- every edge case since 2017
Then a task like “deploy a serverless function with a small UI” pulls in:
- your Cloudflare Skill
- your Docker Skill
- your UI‑styling Skill
- your web‑framework Skill …
…and the model starts its job already half‑drowned.
Claude Code’s own docs warn that Skills share the context window with the conversation, the request, and other Skills. Uncontrolled loading is therefore a direct performance tax: you feel it as slowness, drift, and “why is it ignoring the obvious part?”
Bottom line: your problem isn’t “lack of info.” It’s “too much irrelevant info.”
2️⃣ The Fix: Progressive Disclosure (Three Layers)
Claude Code docs explicitly recommend progressive disclosure: keep essential info in SKILL.md, and store the heavy stuff in separate files that get loaded only when the task requires them.
This maps cleanly to a three‑layer system:
Layer 1 – Metadata (always loaded)
A short YAML front‑matter with name, description, and a routing signal. Think of it like a book cover and blurb—you’re not teaching, you’re helping the model decide whether to open the book.
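To make Layer 1 concrete, here is a minimal sketch of reading only the front-matter from a Skill file. It is an illustration, not how Claude Code actually loads Skills: the `read_front_matter` helper is hypothetical, and it handles flat `key: value` pairs only rather than full YAML.

```python
def read_front_matter(text: str) -> dict[str, str]:
    """Parse flat `key: value` pairs between the leading '---' fences.

    Hypothetical helper for illustration; not a full YAML parser.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return meta  # closing fence reached
        key, sep, value = line.partition(":")
        if sep and key.strip():
            meta[key.strip()] = value.strip()
    return {}  # no closing fence: treat as "no front-matter"


example = """---
name: ui-styling
description: Apply consistent UI styling across the app. Use when building or refactoring UI.
---
# UI Styling Skill
"""

meta = read_front_matter(example)
print(meta["name"], "->", meta["description"])
```

Everything below the closing `---` is ignored here, which is the point: the "cover and blurb" can be inspected without paying for the body.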
Layer 2 – Entry point: SKILL.md (loaded on activation)
Your navigation map:
- what the Skill is for
- when to use it
- high‑level steps to follow
- which reference files to open next
Not a tutorial, not a wiki.
Layer 3 – References & scripts (loaded only when needed)
Small, focused files:
- one topic per file
- ~200–300 lines per file is a good target
- scripts do deterministic work so the model doesn’t burn tokens “describing” actions
Example folder layout
.claude/skills/devops/
├── SKILL.md
├── references/
│   ├── serverless-cloudflare.md
│   ├── containers-docker.md
│   └── ci-cd-basics.md
└── scripts/
    ├── validate_env.py
    └── deploy_helper.sh
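The ~200–300 line target for reference files can be checked mechanically. The sketch below assumes the folder layout shown above; `oversized_references` and `REFERENCE_LINE_TARGET` are hypothetical names, and the limit is simply the upper end of the target, not an official threshold.

```python
from pathlib import Path

REFERENCE_LINE_TARGET = 300  # assumption: upper end of the ~200-300 line target


def oversized_references(
    skill_dir: Path, limit: int = REFERENCE_LINE_TARGET
) -> list[tuple[str, int]]:
    """Return (file name, line count) for reference files longer than `limit`."""
    results = []
    for ref in sorted((skill_dir / "references").glob("*.md")):
        with ref.open("r", encoding="utf-8", errors="ignore") as fh:
            count = sum(1 for _ in fh)
        if count > limit:
            results.append((ref.name, count))
    return results


if __name__ == "__main__":
    # Prints nothing if the directory is missing or every file is within the target.
    for name, count in oversized_references(Path(".claude/skills/devops")):
        print(f"{name}: {count} lines, consider splitting by topic")
```

A file that trips this check is usually covering more than one topic, so splitting it tends to be a better fix than trimming it.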
3️⃣ The “200‑Line Rule”: Brutal, Slightly Arbitrary, Weirdly Effective
In the community refactor story, the author landed on a hard constraint:
Keep SKILL.md under ~200 lines.
If you can’t, you’re putting too much in the entry point.
Anthropic’s Skill‑authoring best practices recommend keeping the SKILL.md body under 500 lines and splitting content into separate files as you approach that limit. “200 lines” is a sharper knife: it forces you to write a table of contents, not a textbook.
Why it works
- The model can scan the entry quickly.
- It can decide which reference file to load next.
- The total “initial load” stays small enough that the conversation still has room to breathe.
Quick test you can steal
- Start a fresh session (cold start).
- Trigger your Skill.
- If the first activation loads more than ~500 lines of content, your design is likely leaking scope.
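The quick test above boils down to simple arithmetic once you have noted which files were read on first activation. A minimal sketch follows; the `cold_start_report` helper is hypothetical and the line counts in the example are made-up numbers you would record by hand.

```python
COLD_START_BUDGET = 500  # lines; the rough threshold from the quick test


def cold_start_report(loaded: dict[str, int], budget: int = COLD_START_BUDGET) -> str:
    """Summarise first-activation load. `loaded` maps file name -> line count."""
    total = sum(loaded.values())
    verdict = "within budget" if total <= budget else "over budget"
    return f"{total} lines on first activation ({verdict}, budget {budget})"


# Hypothetical observations from one cold-start run.
print(cold_start_report({"SKILL.md": 180, "references/tailwind-patterns.md": 240}))
```

Treat the budget as a tripwire rather than a precise limit: the useful signal is a Skill that suddenly jumps well past it.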
4️⃣ The Real Mental Shift: From Tool‑Centric to Workflow‑Centric
This is the part most people miss.
Tool‑centric Skills (problematic)
cloudflare-skill
tailwind-skill
postgres-skill
kubernetes-skill
They’re encyclopedias. They don’t compose well.
Workflow‑centric Skills (recommended)
devops (deploy + environments + CI/CD)
ui-styling (design rules + component patterns)
web-frameworks (routing + project structure + SSR pitfalls)
databases (schema design + migrations + query patterns)
They map to what you actually do during development.
A workflow Skill answers:
“When I’m in this stage of work, what does the agent need to know to act correctly?”
—not—
“What is everything this tool can do?”
That reframing prevents context blow‑ups almost by itself.
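One way to plan the regrouping is to write the mapping down before touching any files. The sketch below is purely illustrative: the assignment of each tool‑centric Skill to a workflow Skill is an assumption based on the names above, and `group_by_workflow` is a hypothetical helper.

```python
from collections import defaultdict

# Assumed mapping: which workflow Skill absorbs each tool-centric Skill.
TOOL_TO_WORKFLOW = {
    "cloudflare-skill": "devops",
    "kubernetes-skill": "devops",
    "tailwind-skill": "ui-styling",
    "postgres-skill": "databases",
}


def group_by_workflow(mapping: dict[str, str]) -> dict[str, list[str]]:
    """Invert tool -> workflow into workflow -> sorted list of tools."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for tool, workflow in sorted(mapping.items()):
        grouped[workflow].append(tool)
    return dict(grouped)


for workflow, tools in group_by_workflow(TOOL_TO_WORKFLOW).items():
    print(f"{workflow}: {', '.join(tools)}")
```

Each tool's material would then become one or more reference files inside the workflow Skill, not a Skill of its own.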
5️⃣ A Minimal, Production‑Grade SKILL.md (Example)
Below is a deliberately small entry point you can copy and customise. Notice what’s missing: long examples, full docs, and “everything you might ever need.”
---
name: ui-styling
description: Apply consistent UI styling across the app (Tailwind + component conventions). Use when building or refactoring UI.
---
# UI Styling Skill
## When to use
- Starting a new UI component.
- Refactoring existing components for consistency.
- Updating the design system.
## High‑level workflow
1. **Identify** the component or page that needs styling.
2. **Run** `scripts/apply_tailwind.sh` to scaffold Tailwind classes.
3. **Reference** `references/tailwind-utilities.md` for utility‑class guidance.
4. **Validate** with `scripts/check_style.py` to ensure no lint errors.
## Reference files (load on demand)
- `references/tailwind-utilities.md` – list of approved utilities and patterns.
- `references/component-conventions.md` – naming, folder structure, and composition rules.
## Scripts (load on demand)
- `scripts/apply_tailwind.sh` – injects Tailwind classes into a file.
- `scripts/check_style.py` – runs style linting and reports violations.
## Quick tip
If a component already follows the design system, skip step 2 and go straight to validation.
TL;DR Checklist
- Metadata only in the YAML front‑matter.
- Keep SKILL.md ≤ 200 lines.
- Store heavy content in references/ and scripts/.
- Design Skills around workflows, not individual tools.
- Test with a cold start; aim for ≤ 500 lines on first load.
Apply this playbook, and you’ll watch your context window breathe again. 🚀
When to use
- You are building UI components or pages.
- You need consistent spacing, typography, and responsive behavior.
- You need to align with existing design conventions.
Workflow
- Identify the UI surface (page/component) and its constraints (responsive, dark mode, accessibility).
- Apply styling rules from the references—pick only what you need.
- Validate the output against the checklist.
References (load only if needed)
- references/design-tokens.md: Spacing, font scale, colour usage
- references/tailwind-patterns.md: Layouts, common utility combos
- references/accessibility-checklist.md: Keyboard, focus, contrast
Output contract
- Use UK English in UI strings.
- Prefer reusable components over copy‑paste blocks.
- Keep className readable (extract when it gets messy).
That’s it.
The Skill’s job is to route the agent to the right file at the right moment — not to become an on‑page encyclopedia.
6️⃣ Measuring Improvements (Without Lying to Yourself)
If you want repeatable results, track metrics that actually matter:
- Initial lines loaded on activation.
- Time to activation (roughly: how “snappy” it feels).
- Relevance ratio (how much of the loaded content is used).
- Context overflow frequency (how often long tasks crash).
You don’t need a full observability stack; a simple repository‑audit script is enough.
Tiny Python audit: count lines per Skill
from pathlib import Path

skills_dir = Path(".claude/skills")


def count_lines(p: Path) -> int:
    """Return the number of lines in a file, ignoring decode errors."""
    # A context manager closes the file handle promptly.
    with p.open("r", encoding="utf-8", errors="ignore") as fh:
        return sum(1 for _ in fh)


# Report each Skill's entry-point size against the 200-line rule.
for skill in sorted(skills_dir.iterdir()):
    skill_md = skill / "SKILL.md"
    if skill_md.exists():
        lines = count_lines(skill_md)
        status = "OK" if lines <= 200 else "TOO LONG"
        print(f"{skill.name}: {lines} lines – {status}")
Run this weekly and you’ll catch “documentation creep” before it becomes a crisis.
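The relevance ratio from the metric list is harder to automate, because "used" is a judgement call. A rough sketch, assuming you estimate by hand how many of the loaded lines the agent actually drew on (`relevance_ratio` is a hypothetical helper, not an existing tool):

```python
def relevance_ratio(loaded_lines: int, used_lines: int) -> float:
    """Fraction of loaded lines the agent actually drew on (0.0 to 1.0)."""
    if loaded_lines <= 0:
        return 0.0
    # Clamp so an over-generous estimate cannot exceed 100%.
    return min(max(used_lines, 0), loaded_lines) / loaded_lines


# Hand-estimated numbers from a single session, for illustration only.
print(f"relevance: {relevance_ratio(loaded_lines=420, used_lines=150):.0%}")
```

Even a crude estimate is useful as a trend: a ratio that keeps falling suggests the entry point or its references are accumulating material nobody needs.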
7️⃣ Common Failure Modes (And How to Avoid Them)
Failure mode: Claude writes “a doc” instead of “a Skill”
LLMs love expanding markdown into tutorials.
Fix:
- Explicitly tell the model: this is not documentation.
- Remove “beginner” filler.
- Keep examples short; push detail into reference files.
Failure mode: Entry point bloats because the Skill scope is too wide
Fix:
- Split the Skill by workflow stage.
- Or move decision trees into reference files.
Failure mode: Too many references, still hard to navigate
Fix:
- Add a short “map” section in SKILL.md.
- Keep reference files single‑topic and named by intent, not by tool.
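A “map” only helps if it matches what is on disk. The following sketch compares the reference paths mentioned in a SKILL.md with the files that exist; `map_gaps` is a hypothetical helper, the regular expression assumes paths of the form `references/<name>.md`, and `legacy-notes.md` is an invented file name used to show the mismatch case.

```python
import re


def map_gaps(skill_md: str, reference_files: set[str]) -> tuple[set[str], set[str]]:
    """Compare references mentioned in SKILL.md with files that exist.

    Returns (mentioned but missing on disk, on disk but never mentioned).
    """
    mentioned = set(re.findall(r"references/[\w.-]+\.md", skill_md))
    return mentioned - reference_files, reference_files - mentioned


skill_md = """
## Reference files (load on demand)
- `references/tailwind-utilities.md` - approved utilities and patterns.
- `references/component-conventions.md` - naming and composition rules.
"""
# In a real audit this set would come from listing the references/ folder.
on_disk = {"references/tailwind-utilities.md", "references/legacy-notes.md"}

missing, unlisted = map_gaps(skill_md, on_disk)
print("mentioned but missing:", sorted(missing))
print("on disk but unlisted:", sorted(unlisted))
```

Unlisted files are the more common problem: the agent has no signal that they exist, so they are either dead weight or a reason to update the map.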
8️⃣ A Copyable Refactor Checklist
- Audit – list Skills + line counts; flag any SKILL.md > 200 lines.
- Group by workflow – merge tool‑specific Skills into capability Skills.
- Create references – move detailed info out of SKILL.md.
- Enforce entry constraints – keep SKILL.md lean and navigational.
- Cold‑start test – ensure the first activation stays under your chosen budget.
- Keep scripts deterministic – offload “do the thing” to code where possible.
- Re‑check monthly – Skills drift over time; treat them like code.
Final take: Context engineering is “right info, right time”
The big lesson isn’t “200 lines” or “three layers”.
It’s this:
Context is a budget.
The best Skill design spends it like an engineer, not like a librarian.
Don’t load everything. Load what matters — when it matters — and keep the rest one file away.