MCP Server Testing Is Fragmented. I Built One CLI for Record, Replay, Mock, Audit, and CI
Source: Dev.to
The Problem with MCP Server Testing
Building MCP servers is straightforward, but testing them end‑to‑end is fragmented.
- You can manually test or write a few scripts, ship the server, and later change something (input schema, response format, a dependency) without any regression safety net.
- Teammates need API keys and a running instance to develop against your server.
- CI pipelines often don’t verify that the server actually works.
- Auditing tool descriptions for malicious content is rarely automated.
Each of these pain points has a separate solution, but stitching them together requires a lot of glue code that rarely survives into production.
Existing Tools (A Quick Survey)
| Tool | Strength | Limitation |
|---|---|---|
| MCP Inspector (Anthropic) | Interactive debugging and exploration | Not suited for CI or automated testing |
| MCP‑Scan (Invariant Labs / Snyk) | Security scanning (tool poisoning, rug‑pull detection) | Focused only on security |
| Promptfoo | LLM red‑team testing with recent MCP support | Primarily prompt‑level testing, not full server workflows |
| MCP Protocol Validator | Spec compliance checks | Narrow scope |
| Ad‑hoc SDK scripts | Fully custom | Doesn’t scale; you maintain everything yourself |
None of these tools cover the full loop: record → replay → mock → audit → score → CI.
Introducing MCPSpec
MCPSpec is an open‑source CLI that aims to close the gap by handling the entire testing lifecycle in a single tool.
Core Features
- Record a real interaction session with your server.
- Replay the session against a new server version and get a diff of every response.
- Mock generation: produce a standalone JavaScript mock server that can be used in CI without API keys or a live instance.
- Audit for security issues such as tool poisoning, excessive agency, path traversal, injection, etc.
- Score the server across documentation, schema quality, error handling, responsiveness, and security.
- CI integration: generate GitHub Actions, GitLab CI, or shell scripts that run record/replay, audit, and scoring checks automatically.
- Optional YAML‑based test collections with 10 assertion types, environment variables, tags, and parallel execution.
Recording and Replaying Sessions
# Record a session against the current server
mcpspec record start "npx my-server"
# ...interactively call tools, then save the session
mcpspec record save my-session
# Replay the recorded session against a new version
mcpspec record replay my-session "npx my-server-v2"
Sample output
Replaying 3 steps...
1/3 get_user (id=1)... [OK] 42ms
2/3 list_items... [CHANGED] 38ms
3/3 create_item (name=test) [OK] 51ms
Summary: 2 matched, 1 changed, 0 added, 0 removed
Generating a Mock Server
mcpspec mock my-session --generate ./mocks/server.js
The generated server.js only depends on @modelcontextprotocol/sdk. Commit it to your repository and use it in CI or local development without any external dependencies.
Auditing for Security Issues
# Passive mode (metadata only, safe for production)
mcpspec audit "npx my-server"
# Active mode (sends test payloads, skips destructive tools)
mcpspec audit "npx my-server" --mode active
The audit runs eight rules that detect real problems, including:
- Tool Poisoning – hidden instructions that LLMs might follow blindly.
- Excessive Agency – tools that can perform destructive actions without confirmation.
- Path traversal, injection, input validation, info disclosure, resource exhaustion, auth bypass.
Scoring the Server
# Get a 0‑100 score across five categories
mcpspec score "npx my-server"
# Enforce a minimum score (e.g., fail CI if below 80)
mcpspec score "npx my-server" --min-score 80
The score can be displayed as a badge in your README or used to gate merges.
CI Integration
mcpspec ci-init
This command scaffolds a GitHub Actions workflow, GitLab CI configuration, or a generic shell script that runs the record/replay, audit, and scoring steps automatically.
Installation & Quick Start
npm install -g mcpspec
Run a pre‑built test collection without any setup:
mcpspec test examples/collections/servers/filesystem.yaml
MCPSpec ships with 70 ready‑to‑run tests for seven popular MCP servers (filesystem, memory, time, fetch, everything, GitHub, Chrome DevTools).
A web dashboard is also available:
mcpspec ui
All of this is free, fast, repeatable, and MIT‑licensed.
- GitHub: https://github.com/light-handle/mcpspec
- Documentation: https://light-handle.github.io/mcpspec
What’s Next?
Future work includes:
- Contract snapshots – automatically detect breaking schema changes.
- Schema drift detection for CI pipelines.
Ideas and feedback are welcome!