TracePact: Catch AI agent tool-call regressions before production

Published: (March 8, 2026 at 04:03 AM EDT)
2 min read
Source: Dev.to

Source: Dev.to

What TracePact does

TracePact is a behavioral testing framework for AI agents. It works at the tool‑call level, not the text level.

1. Write behavior contracts

import { TraceBuilder } from '@tracepact/vitest';

const trace = new TraceBuilder()
  .addCall('read_file', { path: 'src/service.ts' }, '...')
  .addCall('write_file', { path: 'src/service.ts', content: '...' })
  .addCall('run_tests', {}, 'PASS')
  .build();

// Did it read before writing?
expect(trace).toHaveCalledToolsInOrder([
  'read_file',
  'write_file',
  'run_tests'
]);

// Did it avoid shell?
expect(trace).toNotHaveCalledTool('bash');

No API calls. No tokens. Runs in milliseconds.

2. Record & replay

# Record a baseline (one‑time, live)
npx tracepact run --live --record

# Replay without API calls (instant, deterministic)
npx tracepact run --replay ./cassettes

3. Diff runs to catch drift

npx tracepact diff baseline.json latest.json --fail-on warn

Sample output

3 changes detected:
- read_file (seq 1) (removed)
+ write_file (seq 3) (added)
~ bash.cmd: "npm test" -> "npm run build"

Summary: 1 removed, 1 added, 1 arg changed

Filter noisy args and irrelevant tools

npx tracepact diff baseline.json latest.json \
  --ignore-keys timestamp,requestId \
  --ignore-tools read_file

Severity levels

  • none – identical
  • warn – arguments changed
  • block – tools added/removed

Use --fail-on in CI to gate deployments.

Good fit

  • Coding agents – read before write, run tests before finishing, never edit restricted files
  • Ops agents – inspect before restarting, check evidence before acting
  • Workflow agents – validate before mutation, avoid duplicate side effects
  • Internal assistants – use the correct system for the correct task

Less useful for

Pure chatbots, style evaluation, creative tasks, or systems where only text output matters. TracePact is for behavioral guarantees, not response quality.

MCP server for IDEs

TracePact ships an MCP server that works with Claude Code, Cursor, and Windsurf:

{
  "mcpServers": {
    "tracepact": {
      "command": "npx",
      "args": ["@tracepact/mcp-server"]
    }
  }
}

Available tools: tracepact_audit, tracepact_run, tracepact_capture, tracepact_replay, tracepact_diff, tracepact_list_tests.

Get started

npm install @tracepact/core @tracepact/vitest @tracepact/cli
npx tracepact init
npx tracepact

GitHub:

We built this because prompt or model changes silently break agent behavior while the output still looks fine. If you’re testing AI agents, we’d love to hear how you’re handling tool‑call regressions today.

0 views
Back to Blog

Related posts

Read more »