MCP Server Testing Is Fragmented. I Built One CLI for Record, Replay, Mock, Audit, and CI

Published: (March 7, 2026 at 01:10 PM EST)
4 min read
Source: Dev.to

Source: Dev.to

The Problem with MCP Server Testing

Building MCP servers is straightforward, but testing them end‑to‑end is fragmented.

  • You can manually test or write a few scripts, ship the server, and later change something (input schema, response format, a dependency) without any regression safety net.
  • Teammates need API keys and a running instance to develop against your server.
  • CI pipelines often don’t verify that the server actually works.
  • Auditing tool descriptions for malicious content is rarely automated.

Each of these pain points has a separate solution, but stitching them together requires a lot of glue code that rarely survives into production.

Existing Tools (A Quick Survey)

ToolStrengthLimitation
MCP Inspector (Anthropic)Interactive debugging and explorationNot suited for CI or automated testing
MCP‑Scan (Invariant Labs / Snyk)Security scanning (tool poisoning, rug‑pull detection)Focused only on security
PromptfooLLM red‑team testing with recent MCP supportPrimarily prompt‑level testing, not full server workflows
MCP Protocol ValidatorSpec compliance checksNarrow scope
Ad‑hoc SDK scriptsFully customDoesn’t scale; you maintain everything yourself

None of these tools cover the full loop: record → replay → mock → audit → score → CI.

Introducing MCPSpec

MCPSpec is an open‑source CLI that aims to close the gap by handling the entire testing lifecycle in a single tool.

Core Features

  • Record a real interaction session with your server.
  • Replay the session against a new server version and get a diff of every response.
  • Mock generation: produce a standalone JavaScript mock server that can be used in CI without API keys or a live instance.
  • Audit for security issues such as tool poisoning, excessive agency, path traversal, injection, etc.
  • Score the server across documentation, schema quality, error handling, responsiveness, and security.
  • CI integration: generate GitHub Actions, GitLab CI, or shell scripts that run record/replay, audit, and scoring checks automatically.
  • Optional YAML‑based test collections with 10 assertion types, environment variables, tags, and parallel execution.

Recording and Replaying Sessions

# Record a session against the current server
mcpspec record start "npx my-server"
# ...interactively call tools, then save the session
mcpspec record save my-session
# Replay the recorded session against a new version
mcpspec record replay my-session "npx my-server-v2"

Sample output

Replaying 3 steps...

  1/3 get_user (id=1)...       [OK] 42ms
  2/3 list_items...            [CHANGED] 38ms
  3/3 create_item (name=test) [OK] 51ms

Summary: 2 matched, 1 changed, 0 added, 0 removed

Generating a Mock Server

mcpspec mock my-session --generate ./mocks/server.js

The generated server.js only depends on @modelcontextprotocol/sdk. Commit it to your repository and use it in CI or local development without any external dependencies.

Auditing for Security Issues

# Passive mode (metadata only, safe for production)
mcpspec audit "npx my-server"

# Active mode (sends test payloads, skips destructive tools)
mcpspec audit "npx my-server" --mode active

The audit runs eight rules that detect real problems, including:

  • Tool Poisoning – hidden instructions that LLMs might follow blindly.
  • Excessive Agency – tools that can perform destructive actions without confirmation.
  • Path traversal, injection, input validation, info disclosure, resource exhaustion, auth bypass.

Scoring the Server

# Get a 0‑100 score across five categories
mcpspec score "npx my-server"

# Enforce a minimum score (e.g., fail CI if below 80)
mcpspec score "npx my-server" --min-score 80

The score can be displayed as a badge in your README or used to gate merges.

CI Integration

mcpspec ci-init

This command scaffolds a GitHub Actions workflow, GitLab CI configuration, or a generic shell script that runs the record/replay, audit, and scoring steps automatically.

Installation & Quick Start

npm install -g mcpspec

Run a pre‑built test collection without any setup:

mcpspec test examples/collections/servers/filesystem.yaml

MCPSpec ships with 70 ready‑to‑run tests for seven popular MCP servers (filesystem, memory, time, fetch, everything, GitHub, Chrome DevTools).

A web dashboard is also available:

mcpspec ui

All of this is free, fast, repeatable, and MIT‑licensed.

What’s Next?

Future work includes:

  • Contract snapshots – automatically detect breaking schema changes.
  • Schema drift detection for CI pipelines.

Ideas and feedback are welcome!

0 views
Back to Blog

Related posts

Read more »