AGENTS.md vs. Skills: How We Refactored OpenClaw to Fix AI Hallucinations

Published: February 2, 2026 at 10:15 AM EST
4 min read
Source: Dev.to

I bet everyone has had this experience.

You ask your AI to use the new Gemini 3.0 Pro model, and it argues with you: “That model is invalid, I will use 1.5 Pro instead.”
Or you are working on a Next.js project, and the AI keeps debating you, insisting on using old getStaticProps syntax when you are clearly using the App Router.

It is exhausting. You enforce rules, you add docs, you install MCP servers, you build custom “Skills”… and it still hallucinates. You feel like you are just piling rule after rule on top of a broken foundation.

I was stuck in this loop for weeks. I built complex “Research Skills” designed to force the AI to be smart, but they just turned into black boxes. I pushed a button, the AI disappeared into a script, and it came back with wrong answers (like telling me a £716 visa cost £70k).

Then, last week, I saw an article that solved everything.
Vercel’s AI team published research that completely flipped my perspective. They found that simply dividing your project knowledge into Indices (in a markdown file) vs. Skills (executable code) changed the game.

I immediately tried it on my OpenClaw agent. I deleted my complex “Black Box” skills and replaced them with a simple AGENTS.md index.

The result? It worked perfectly. The hallucinations stopped. The “syntax debates” ended. Here is why—and how you can do it too.

The Vercel Wake‑Up Call

Vercel’s AI SDK team tested this exact problem on coding agents. They compared two methods for teaching an AI about Next.js 16:

| Method | Description |
| --- | --- |
| Skills (Tools) | Giving the AI a tool to “look up documentation.” |
| Context (AGENTS.md) | Putting the documentation index in a markdown file in the root directory. |

Results

  • Skills: 53% pass rate (the AI often forgot to use the tool or used it incorrectly).
  • Context (AGENTS.md): 100% pass rate.

Why? Because Skills require a decision – the AI has to stop and think, “Should I check the docs?” Often it gets lazy and guesses. Context is passive – the instructions are just there. The AI doesn’t have to choose to be smart; it has no choice but to see the map.

Refactoring OpenClaw: The “Hands vs. Brains” Split

We took this data and immediately refactored our entire agent stack. We realized we were making a fundamental architecture mistake: we were building Skills for things that should have been Context.

The Old Way (Black Box)

  • Task: “Research this.”
  • Mechanism: Call Tool: Research_Skill().
  • Reality: The AI offloads thinking to a hidden script. It stops being an intelligence and becomes a button‑pusher.

The New Way (The Hybrid Stack)

We split our architecture into two distinct layers: Hands and Brains.

1. Brains (AGENTS.md + docs/)

This layer holds knowledge, rules, and logic.

We deleted the Research.ts skill entirely. In its place we added a simple markdown file: docs/research.md.

```markdown
# Research Protocol
1. **Source of Truth:** Always check official docs (.gov, .org) first.
2. **Citation:** You must link every claim.
3. **Limit:** Max 5 searches per topic.
```

In AGENTS.md (the file the AI always sees) we added a single line:

For research tasks, READ docs/research.md first.

2. Hands (skills/)

This layer is for execution only – actions the AI cannot perform with its brain alone.

We kept skills for things the AI physically cannot do:

  • git – running terminal commands
  • whatsapp – sending API requests
  • remindctl – talking to macOS
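As a sketch, a “Hands” skill can be nothing more than a typed wrapper around a shell command, with zero decision logic inside. This TypeScript example is illustrative only; the `Skill` interface and `gitSkill` object are hypothetical names, not OpenClaw’s actual API:

```typescript
// Hypothetical "Hands" skill: a thin wrapper that only executes.
// All reasoning stays in the model; the skill just runs a command.
import { execFileSync } from "node:child_process";

interface Skill {
  name: string;
  description: string;
  run(args: string[]): string;
}

const gitSkill: Skill = {
  name: "git",
  description: "Run a git subcommand (execution only; no decision logic).",
  run(args: string[]): string {
    // Return stdout verbatim so the model sees exactly what the tool saw.
    return execFileSync("git", args, { encoding: "utf8" });
  },
};
```

The point of the design: if you can delete the skill’s body and replace it with a one-line shell call, it belongs in `skills/`; if it contains rules or judgment, it belongs in markdown.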

The Result: Transparency

Now, when I ask: “Research the cost of a UK Global Talent Visa.”

  1. The AI reads AGENTS.md and sees the rule: “Read docs/research.md.”
  2. It reads the protocol: “Check official sources.”

I see it work: it generates the search query site:gov.uk global talent visa fee and returns:

“The application fee is £716. Note: Some consultants charge £70k, but that is a service fee, not the visa cost.”

It worked not because I wrote better code, but because I stopped trying to code the thinking process.
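The “passive context” mechanism is simple to sketch: instead of exposing a lookup tool, the harness prepends AGENTS.md to every prompt, so the model never has to decide whether to consult it. The `buildPrompt` helper below is a hypothetical illustration, not OpenClaw code:

```typescript
// Sketch of the "Brains" layer: AGENTS.md is injected into every prompt.
// Context is passive -- it is simply always in the window.
import { readFileSync, existsSync } from "node:fs";
import { join } from "node:path";

function buildPrompt(userTask: string, root = "."): string {
  const parts: string[] = [];
  const agentsPath = join(root, "AGENTS.md");
  if (existsSync(agentsPath)) {
    parts.push(readFileSync(agentsPath, "utf8"));
  }
  parts.push(`Task: ${userTask}`);
  return parts.join("\n\n");
}
```

Because the file rides along on every turn, the instruction “READ docs/research.md first” reaches the model before it can start guessing.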

The Guide: When to Use What?

| Requirement | Use This | Why? |
| --- | --- | --- |
| “I need you to know X” | AGENTS.md | Knowledge should be passive. Don’t make the AI “search” for your coding style. |
| “I need you to follow process Y” | docs/Y.md | Rules belong in markdown. They are easier to edit and easier for the AI to read. |
| “I need you to touch Z” | Skill | If it needs an API key or a CLI command, wrap it in a tool. |

Start Small: The “Agile” Agent

Don’t over‑engineer. Start with a single AGENTS.md file in your root:

  • Add your project structure.
  • Add your preferred tech stack.
  • Add a link to your docs.
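As a starting point, a minimal AGENTS.md covering those three bullets might look like this (the project details below are placeholders, not a prescribed format):

```markdown
# AGENTS.md

## Project structure
- src/: application code
- skills/: executable tools ("Hands")
- docs/: protocols and rules ("Brains")

## Tech stack
- Next.js (App Router; do not use getStaticProps)
- TypeScript

## Protocols
For research tasks, READ docs/research.md first.
```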

Watch your agent’s IQ double overnight. The best tool you can give your AI isn’t a Python script; it’s a good README.
