I Built a Context7 Local-First Alternative With Claude Code

Published: February 8, 2026 at 01:01 PM EST
8 min read
Source: Dev.to


Context — A Local‑First Documentation Tool for AI Agents

I built this after Context7 slashed its free tier and added rate limits. The result is a portable, offline‑first docs database you can share with your whole team.

The “Aha” Moment: Why Not Just Store Docs Locally?

Cloud doc services (Context7, Deepcon, etc.) typically do three things:

  1. Clone a library’s docs repository.
  2. Index the markdown into searchable chunks.
  3. Serve results via an API.

Steps 1 and 2 only need to happen once per library version, yet the services run them on their servers and charge you per query for step 3—every single time.

Solution: do steps 1 and 2 locally, store the result as a file, and skip the network entirely.

context add https://github.com/vercel/next.js
  • context add clones the repo, parses the docs, indexes everything into a SQLite database, and stores it at ~/.context/packages/nextjs@16.0.db.
  • The resulting .db file contains every piece of Next.js 16 documentation, pre‑indexed and ready for instant queries.
  • No internet, no rate limits, no monthly bill.
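Before any of that can happen, `context add` has to decide what kind of source it was given. A minimal sketch of that classification, with hypothetical names (the real CLI also handles monorepo URL patterns and more hosts), might look like:

```javascript
// Illustrative sketch: classify the argument as a git URL, a pre-built
// .db package, or a local directory. detectSource is a hypothetical name.
const GIT_HOSTS = /^(https?:\/\/)?(github|gitlab|bitbucket|codeberg)\./;
const SSH_SHORTHAND = /^git@[^:]+:[^/]+\/.+$/; // e.g. git@host:user/repo

function detectSource(arg) {
  if (arg.endsWith(".db")) return "package";
  if (GIT_HOSTS.test(arg) || SSH_SHORTHAND.test(arg)) return "git";
  return "directory";
}

console.log(detectSource("https://github.com/vercel/next.js")); // git
console.log(detectSource("git@github.com:vercel/ai.git"));      // git
console.log(detectSource("./packages/nextjs@16.0.db"));         // package
```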

Building It With Claude Code

I built the entire thing using Claude Code as my primary development partner—not just a “generate‑boilerplate‑and‑fix‑it” assistant, but a true collaborator on architecture, implementation, and debugging.

The Stack

| Component | Why It's Used |
| --- | --- |
| better-sqlite3 | Embedded database; no server, no config. |
| SQLite FTS5 | Full-text search with BM25 ranking and Porter stemming. |
| @modelcontextprotocol/sdk | MCP server SDK that lets Claude, Cursor, VS Code Copilot, etc., query the docs. |
| remark-parse + unified | Markdown AST parsing for intelligent chunking. |
| commander + @inquirer/prompts | CLI framework with interactive prompts for tag selection. |

How the Build Pipeline Works

Running context add <source> performs the following steps:

  1. Source detection – Determines whether the argument is a git URL, a local directory, or a pre‑built .db file. Handles GitHub, GitLab, Bitbucket, Codeberg, SSH shorthand (git@host:user/repo), and monorepo URL patterns.
  2. Shallow clone – Executes git clone --depth 1 (only the docs, not the full history). The CLI fetches tags and lets you pick a version interactively, or you can pass --tag v16.0.0 for automation.
  3. Docs folder detection – Auto‑scans for docs/, documentation/, or doc/ directories, respects .gitignore, and filters by language (default: English; --lang all for multilingual repos).
  4. Markdown parsing & chunking
    • Extracts YAML front‑matter for titles and descriptions.
    • Chunks content by H2 headings (the natural unit of documentation).
    • Targets ~800 tokens per chunk with a hard limit of 1,200.
    • Splits oversized sections first at code‑block boundaries, then at paragraph boundaries.
    • Filters out table‑of‑contents sections (detected by link‑ratio > 50 %).
    • Strips MDX‑specific React component tags.
    • Deduplicates identical sections using content hashing.
  5. SQLite packaging – Everything is stored in a single .db file:
CREATE TABLE chunks (
  id            INTEGER PRIMARY KEY,
  doc_path      TEXT NOT NULL,
  doc_title     TEXT NOT NULL,
  section_title TEXT NOT NULL,
  content       TEXT NOT NULL,
  tokens        INTEGER NOT NULL,
  has_code      INTEGER DEFAULT 0
);

CREATE VIRTUAL TABLE chunks_fts USING fts5(
  doc_title, section_title, content,
  content='chunks', content_rowid='id',
  tokenize='porter unicode61'
);
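The H2-based chunking in step 4 can be sketched roughly as follows. This is a simplified illustration, assuming a chars/4 token estimate; the real tool parses a remark AST rather than raw lines, and the function names here are hypothetical:

```javascript
// Rough sketch of step 4: split markdown into chunks at H2 headings.
const TARGET_TOKENS = 800;  // soft target per chunk
const HARD_LIMIT = 1200;    // sections beyond this get split further

// Assumption: ~4 characters per token is close enough for budgeting.
const estimateTokens = (text) => Math.ceil(text.length / 4);

function chunkByH2(markdown) {
  const sections = [];
  let current = { title: "", lines: [] };
  for (const line of markdown.split("\n")) {
    if (line.startsWith("## ")) {
      if (current.lines.length) sections.push(current);
      current = { title: line.slice(3).trim(), lines: [] };
    } else {
      current.lines.push(line);
    }
  }
  if (current.lines.length) sections.push(current);
  return sections.map((s) => ({
    section_title: s.title,
    content: s.lines.join("\n").trim(),
    tokens: estimateTokens(s.lines.join("\n")),
    // Oversized sections would be split at code-block, then paragraph
    // boundaries (omitted here).
    needsSplit: estimateTokens(s.lines.join("\n")) > HARD_LIMIT,
  }));
}
```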

The FTS5 virtual table with Porter stemming lets queries like “authentication middleware” match “authenticating in middleware” without any extra NLP. BM25 ranking weights section titles × 10 and doc titles × 5 over body content, giving relevant results without embeddings.

The Search Pipeline: Keeping It Simple

When an MCP client (e.g., Claude) calls:

get_docs({ library: "nextjs@16.0", topic: "middleware" })

the in‑process pipeline runs:

FTS5 query → BM25 ranking → Relevance filter → Token budget → Merge adjacent → Format
  • Relevance filter – Drops any result scoring below 50 % of the top hit.
  • Token budget – Caps output at 2 000 tokens (enough to be useful without flooding the context window).
  • Merge adjacent – Joins neighboring chunks from the same document so the LLM receives coherent sections instead of fragments.
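The three post-processing stages can be sketched as one function. This is a hedged illustration, not the actual implementation; the result shape and field names (`id`, `doc_path`, `score`, `tokens`) are assumptions, and results are assumed to arrive sorted by BM25 score descending:

```javascript
// Illustrative sketch of the post-FTS5 pipeline: relevance filter,
// token budget, then merging adjacent chunks from the same document.
const TOKEN_BUDGET = 2000;

function postProcess(results) {
  if (results.length === 0) return [];
  // Relevance filter: drop anything below 50% of the top score.
  const top = results[0].score;
  let kept = results.filter((r) => r.score >= top * 0.5);
  // Token budget: stop accumulating past 2,000 tokens.
  let used = 0;
  kept = kept.filter((r) => (used += r.tokens) <= TOKEN_BUDGET);
  // Merge adjacent: join neighboring chunks so the LLM gets coherent
  // sections instead of fragments.
  kept.sort((a, b) => a.doc_path.localeCompare(b.doc_path) || a.id - b.id);
  const merged = [];
  for (const r of kept) {
    const last = merged[merged.length - 1];
    if (last && last.doc_path === r.doc_path && r.id === last.endId + 1) {
      last.content += "\n\n" + r.content;
      last.endId = r.id;
    } else {
      merged.push({ ...r, endId: r.id });
    }
  }
  return merged;
}
```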

Total latency: under 10 ms, compared with 100–500 ms for a cloud round‑trip plus the AI’s waiting time. This speed matters when the assistant is in the middle of a coding session.

TL;DR

  • Local‑first docs → no rate limits, no monthly cost.
  • SQLite + FTS5 → fast, zero‑config storage and search.
  • Claude Code → built the whole thing in a week of pair‑programming.

Give it a try: Context on GitHub.

AI Coding Agents and Local‑First Documentation

AI coding agents make dozens of tool calls per session. If each doc lookup adds ~300 ms of network latency, that’s seconds of dead time per interaction. Locally, it’s effectively free.

The Real Win: Build Once, Share Everywhere

When you build a documentation package, the result is a single .db file. The file is completely self‑contained—metadata, content, search index, everything. You can:

# Build and export
context add https://github.com/your-org/design-system \
  --name design-system --pkg-version 3.1 --save ./packages/

# The result: a portable file
ls -la packages/design-system@3.1.db
# 2.4 MB – your entire design‑system docs, indexed and ready

Now share that file however you want: upload it to an S3 bucket, commit it to a repo, or put it on a shared drive. Your teammates install it with:

context add https://your-cdn.com/design-system@3.1.db

No build step on their end. No cloning repos. No waiting for indexing. The pre‑built package installs instantly because it’s already indexed.

This is the key architectural advantage of local‑first: with cloud services every user pays the query cost; with local packages you pay the build cost once and distribute the result. It’s the same principle as compiled binaries vs. interpreted scripts—do the expensive work ahead of time.

For internal libraries this is huge. Document your internal APIs, build a package in CI, publish it alongside your npm package, and every developer on the team gets instant, private, offline access to up‑to‑date docs. No cloud service sees your proprietary API queries.

What I Learned Building With Claude Code

A few honest observations from using Claude Code as my primary development tool:

  • It’s genuinely good at plumbing code. Git URL parsing, CLI argument handling, SQLite schema design—the kind of code that’s tedious but must be correct. Claude Code knocked these out quickly and accurately. The git module handles edge cases I wouldn’t have thought of: monorepo tag formats like @ai-sdk/gateway@1.2.3, SSH shorthand URLs, stripping -docs suffixes from repo names.
  • It struggles with “taste” decisions. Things like chunk size, how aggressively to filter low‑relevance results, or what BM25 weights feel right need human judgment and iteration. I tried values, tested against real docs, adjusted, and repeated. Claude Code helped implement each variation quickly, but the decision of which one felt right was mine.
  • Iteration speed is the real super‑power. The whole project—CLI, build pipeline, search engine, MCP server, tests—came together in about a week. Not because the code is trivial (the markdown parser alone handles a dozen edge cases), but because the feedback loop was tight. Describe what you want, review what you get, adjust, and move on.
  • Test‑driven prompting works well. I’d describe the desired behavior in terms of test cases: “this markdown input should produce these chunks.” Claude Code wrote both the implementation and the tests. When they didn’t match, we figured out why together.

The Numbers

| Feature | Context7 | Deepcon | Neuledge |
| --- | --- | --- | --- |
| Price | $10/month | $8/month | Free |
| Free tier | 1,000 req/month | 100 req/month | Unlimited |
| Rate limits | 60 req/hour | Throttled | None |
| Latency | 100–500 ms | 100–300 ms | <10 ms |
| Works offline | No | No | Yes |
| Privacy | Cloud | Cloud | 100% local |
| Private repos | $15/1M tokens | No | Free |

Setting It Up

# Install
npm install -g @neuledge/context

# Add some docs
context add https://github.com/vercel/next.js
context add https://github.com/vercel/ai

# Connect to your AI agent (Claude Code example)
claude mcp add context -- context serve

It works with Claude Desktop, Cursor, VS Code Copilot, Windsurf, Zed, Goose, and any MCP‑compatible agent. The MCP server exposes a single get_docs tool with a dynamic enum of installed libraries—the AI sees exactly what’s available and queries it when relevant.
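The dynamic-enum idea is plain JSON Schema underneath the MCP SDK's helpers. A minimal sketch of building the tool definition (function and field names here are my assumptions, not the actual server code):

```javascript
// Illustrative: expose a single get_docs tool whose `library` parameter
// enumerates whatever packages are currently installed.
function buildGetDocsTool(installedLibraries) {
  return {
    name: "get_docs",
    description: "Search locally installed documentation packages.",
    inputSchema: {
      type: "object",
      properties: {
        library: { type: "string", enum: installedLibraries },
        topic: { type: "string" },
      },
      required: ["library", "topic"],
    },
  };
}

const tool = buildGetDocsTool(["nextjs@16.0", "ai@5.0"]);
console.log(tool.inputSchema.properties.library.enum); // tracks installs
```

Because the enum is rebuilt from the installed packages, the AI sees exactly what's available without any extra discovery step.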

What’s Next

The search is currently keyword‑based (FTS5 + BM25). It works well for direct queries like “middleware authentication” or “ai sdk agent loop,” but it doesn’t understand semantic similarity. “How do I protect routes?” won’t match a section titled “Authentication Guards” unless the words overlap.

Planned improvements:

  1. Local embeddings for semantic search – still fully offline, probably using ONNX Runtime with a small model. The SQLite architecture makes this straightforward: add an embeddings table, compute vectors at build time, query with cosine similarity at search time.
  2. GraphRAG‑style relations table – traverse connected documentation. When you ask about middleware, you probably also want authentication, routing, and error handling. A relations graph could surface those automatically.
  3. Package registry – a GitHub‑based index where the community can discover and share pre‑built documentation packages. Instead of everyone independently building the same Next.js docs, build it once and publish it.
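The semantic-search plan in item 1 boils down to a cosine-similarity scan over precomputed vectors. A sketch of that search-time step, purely illustrative (the embedding model and storage details are still open):

```javascript
// Cosine similarity between a query vector and precomputed chunk vectors.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank chunks by similarity to the query vector; vectors would be
// computed once at build time and stored in an embeddings table.
function topK(queryVec, chunks, k = 5) {
  return chunks
    .map((c) => ({ ...c, score: cosineSimilarity(queryVec, c.vector) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}
```

For doc-sized collections (a few thousand chunks per package), a brute-force scan like this is fast enough that no vector index is needed.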

The Takeaway

The core lesson from this project: not everything needs to be a cloud service.

Documentation for AI agents is a perfect case for local‑first. The data changes infrequently (per library version), the queries need to be fast (agents make lots of them), privacy matters (you’re asking about your codebase), and the “build once, use forever” model is a natural fit.

If you’re frustrated with rate limits, latency, or paying monthly for something that should be a static file—give it a try.

Context MCP is open source at github.com/neuledge/context. Published on npm as @neuledge/context.
