I Built Tautest: A Mutation Testing Workflow for AI-Written Tests

Published: May 10, 2026 at 08:13 PM EDT

Source: Dev.to

Introduction

AI coding agents are getting good at writing tests, but a passing suite does not prove that the tests are strong.
Often an AI‑generated test only confirms that the current implementation runs, without proving that the intended behavior is protected.

What is Tautest?

Tautest is an open‑source CLI and GitHub Action that runs mutation testing on the lines changed in a pull request, identifies weak tests, and generates an AI‑ready fix prompt for Claude Code, Cursor, Codex, or human reviewers.

GitHub:
npm package:

Tautest itself is not a mutation‑testing engine; it leverages StrykerJS and adds a workflow layer around it.

How Tautest Works

Reads changed source lines from git diff.
Runs StrykerJS mutation testing only on those lines.
Parses surviving mutants.
Generates Markdown, JSON, and terminal reports.
Writes an AI‑ready fix prompt (.tautest/fix-prompt.md).
(Optional) Posts a sticky GitHub PR comment.
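To make the first step concrete, here is a minimal sketch of how changed lines could be pulled out of a zero-context diff (the output of git diff --unified=0). The helper name parseChangedLines and the sample diff are made up for illustration; this is not Tautest's actual implementation, and it ignores edge cases such as renamed files or added lines whose content happens to start with "++ ".

```typescript
// Hypothetical helper, not Tautest's real code.
// Maps each changed file to the new-side line numbers touched by the diff.
export function parseChangedLines(diff: string): Map<string, number[]> {
  const changed = new Map<string, number[]>();
  let currentFile: string | null = null;

  for (const line of diff.split("\n")) {
    // "+++ b/src/discount.ts" names the new version of the file.
    if (line.startsWith("+++ ")) {
      const path = line.slice(4).trim();
      currentFile = path === "/dev/null" ? null : path.replace(/^b\//, "");
      continue;
    }
    // Hunk header: "@@ -oldStart,oldCount +newStart,newCount @@".
    // A missing count means 1; a count of 0 means a pure deletion.
    const hunk = /^@@ -\d+(?:,\d+)? \+(\d+)(?:,(\d+))? @@/.exec(line);
    if (hunk === null || currentFile === null) continue;

    const start = Number(hunk[1]);
    const count = hunk[2] === undefined ? 1 : Number(hunk[2]);
    if (count === 0) continue;

    const lines = changed.get(currentFile) ?? [];
    for (let i = 0; i < count; i++) lines.push(start + i);
    changed.set(currentFile, lines);
  }
  return changed;
}

// Made-up sample diff: one old line replaced by three new lines starting at line 2.
const sample = [
  "diff --git a/src/discount.ts b/src/discount.ts",
  "--- a/src/discount.ts",
  "+++ b/src/discount.ts",
  "@@ -2 +2,3 @@",
  "-  if (age > 60) {",
  "+  if (age >= 65) {",
  "+    return subtotal * 0.2;",
  "+  }",
].join("\n");

// src/discount.ts maps to lines 2, 3 and 4.
console.log(parseChangedLines(sample));
```

Using zero context lines keeps the hunk ranges equal to the lines that actually changed, so no per-line bookkeeping of "+" and " " prefixes is needed.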

Supported test runners

Vitest (full support)
Jest (beta)

Example

Code under test

// src/discount.ts
if (age >= 65) {
  return subtotal * 0.2;
}

Normal test suite (passes)

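// test/discount.test.ts
// Vitest imports; with Jest, test and expect are globals.
import { expect, test } from "vitest";
import { calculateDiscount } from "../src/discount";

test("applies senior discount", () => {
  expect(calculateDiscount(70, 100)).toBe(20);
});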

Mutated code

// Mutant: change >= to >
if (age > 65) {
  return subtotal * 0.2;
}

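The existing test only checks age 70, where >= and > behave identically, so the suite still passes against the mutant. The boundary at 65 is not protected, which makes this a weak test.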
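A small sketch of the original and the mutant side by side shows where they diverge. The snippet above does not show what happens below 65, so the return 0 fallback here is my assumption, and calculateDiscountMutant is just an illustrative name for the mutated version.

```typescript
// Assumption: no discount (0) is returned when the condition is false.
function calculateDiscount(age: number, subtotal: number): number {
  if (age >= 65) {
    return subtotal * 0.2;
  }
  return 0;
}

// The mutated version: >= replaced with >.
function calculateDiscountMutant(age: number, subtotal: number): number {
  if (age > 65) {
    return subtotal * 0.2;
  }
  return 0;
}

// Age 70 is well past the boundary, so both versions agree.
console.log(calculateDiscount(70, 100), calculateDiscountMutant(70, 100)); // 20 20

// Only an age of exactly 65 separates them.
console.log(calculateDiscount(65, 100), calculateDiscountMutant(65, 100)); // 20 0
```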

Running Tautest reveals the surviving mutant:

Tautest: MIXED (75.00%, threshold 60.00%)
Killed: 3 | Survived: 1 | No coverage: 0

Top surviving mutants:
- src/discount.ts:2 EqualityOperator
  age >= 65  →  age > 65
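For context, a list like this can be derived from the JSON report that StrykerJS writes in the mutation-testing-report-schema format. The sketch below models only the handful of fields it needs; listSurvivors and the sample report are hypothetical and are not taken from Tautest's source.

```typescript
// Minimal subset of the mutation-testing-report-schema (only the fields used here).
interface Position {
  line: number;
  column: number;
}

interface MutantResult {
  id: string;
  mutatorName: string;
  replacement?: string;
  status: string; // e.g. "Killed", "Survived", "NoCoverage", "Timeout"
  location: { start: Position; end: Position };
}

interface MutationReport {
  files: Record<string, { source: string; mutants: MutantResult[] }>;
}

// Hypothetical helper: one "file:line MutatorName" entry per surviving mutant.
function listSurvivors(report: MutationReport): string[] {
  const entries: string[] = [];
  for (const [file, result] of Object.entries(report.files)) {
    for (const mutant of result.mutants) {
      if (mutant.status === "Survived") {
        entries.push(`${file}:${mutant.location.start.line} ${mutant.mutatorName}`);
      }
    }
  }
  return entries;
}

// Made-up report with one surviving and one killed mutant.
const report: MutationReport = {
  files: {
    "src/discount.ts": {
      source: "",
      mutants: [
        {
          id: "1",
          mutatorName: "EqualityOperator",
          replacement: "age > 65",
          status: "Survived",
          location: { start: { line: 2, column: 7 }, end: { line: 2, column: 16 } },
        },
        {
          id: "2",
          mutatorName: "ConditionalExpression",
          replacement: "true",
          status: "Killed",
          location: { start: { line: 2, column: 7 }, end: { line: 2, column: 16 } },
        },
      ],
    },
  },
};

const survivors = listSurvivors(report);
console.log(survivors); // one entry: "src/discount.ts:2 EqualityOperator"
```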

After adding a boundary test:

test("applies senior discount at exactly 65", () => {
  expect(calculateDiscount(65, 80)).toBe(16);
});

Tautest reports a perfect score:

Tautest: STRONG (100.00%, threshold 60.00%)
Killed: 4 | Survived: 0
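The percentages in both reports are consistent with a simple ratio of killed mutants to all mutants. The sketch below reproduces that arithmetic; the formula is my reading of the numbers shown (StrykerJS itself also counts timeouts as detected), and mutationScore is a hypothetical name. The cutoffs behind labels such as MIXED and STRONG are not modeled here.

```typescript
interface Counts {
  killed: number;
  survived: number;
  noCoverage: number;
}

// Assumed formula: killed / (killed + survived + noCoverage), as a percentage.
function mutationScore({ killed, survived, noCoverage }: Counts): number {
  const total = killed + survived + noCoverage;
  // With no mutants there is nothing to score; NaN makes that case explicit.
  return total === 0 ? NaN : (killed / total) * 100;
}

const before = mutationScore({ killed: 3, survived: 1, noCoverage: 0 });
const after = mutationScore({ killed: 4, survived: 0, noCoverage: 0 });

console.log(before.toFixed(2), after.toFixed(2)); // 75.00 100.00

// Both runs clear the 60% threshold; only the second has no survivors.
console.log(before >= 60, after >= 60); // true true
```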

Generated AI Fix Prompt

Tautest creates .tautest/fix-prompt.md containing rules such as:

Do not change production code.
Only edit or add test files.
Every new test must pass against the original code.
Every new test must fail against the mutant behavior.
Do not weaken existing assertions.
Avoid filler tests like expect(true).toBe(true).
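The two middle rules can be stated as a check: a new test is only worth adding if it passes on the original code and fails on the mutant. Here is a rough sketch of that check using plain functions instead of a test runner. All names (killsMutant, expectEqual, the three sample tests) are illustrative, and the 0 fallback below 65 is an assumption on my part.

```typescript
type Impl = (age: number, subtotal: number) => number;
type TestCase = (impl: Impl) => void; // throws when an assertion fails

function passes(test: TestCase, impl: Impl): boolean {
  try {
    test(impl);
    return true;
  } catch {
    return false;
  }
}

// A useful new test passes on the original and fails on the mutant.
function killsMutant(test: TestCase, original: Impl, mutant: Impl): boolean {
  return passes(test, original) && !passes(test, mutant);
}

function expectEqual(actual: number, expected: number): void {
  if (actual !== expected) throw new Error(`expected ${expected}, got ${actual}`);
}

// Assumption: 0 is returned when no discount applies.
const original: Impl = (age, subtotal) => (age >= 65 ? subtotal * 0.2 : 0);
const mutant: Impl = (age, subtotal) => (age > 65 ? subtotal * 0.2 : 0);

const weakTest: TestCase = (impl) => expectEqual(impl(70, 100), 20);
const boundaryTest: TestCase = (impl) => expectEqual(impl(65, 80), 16);
const fillerTest: TestCase = () => expectEqual(1, 1);

console.log(killsMutant(weakTest, original, mutant)); // false
console.log(killsMutant(boundaryTest, original, mutant)); // true
console.log(killsMutant(fillerTest, original, mutant)); // false
```

The filler test fails the check for the same reason the weak test does: it passes regardless of which implementation is under test.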

Workflow

Run tautest.
Open .tautest/fix-prompt.md.
Paste the prompt into Claude Code, Cursor, Codex, or use it yourself.
Add the missing test(s).
Run your normal test suite.
Run tautest again to verify the mutation score.

Installation

Vitest projects

pnpm add -D tautest @stryker-mutator/core @stryker-mutator/vitest-runner
pnpm exec tautest init --yes --runner vitest --no-install
pnpm exec tautest doctor
pnpm exec tautest run --base origin/main

Jest projects (beta)

pnpm add -D tautest @stryker-mutator/core @stryker-mutator/jest-runner
pnpm exec tautest init --yes --runner jest --no-install

GitHub Action

Add the following workflow file (e.g., .github/workflows/tautest.yml):

name: Tautest
on:
  pull_request:

permissions:
  contents: read
  pull-requests: write

jobs:
  tautest:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0   # required for git history

      - uses: actions/setup-node@v4
        with:
          node-version: 20

      - uses: pnpm/action-setup@v4
        with:
          version: 10

      - run: pnpm install --frozen-lockfile
      - run: pnpm build

      - uses: canblmz1/tautest/packages/github-action@v1
        with:
          base: ${{ github.base_ref }}
          threshold: 60
          comment: changes
          cache: true

Important notes

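fetch-depth: 0 is required because Tautest diffs against the base branch, and the default shallow checkout (a single commit) does not contain the history needed for that diff.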
pull-requests: write permission is needed for sticky PR comments.

Limitations

Tautest deliberately does not:

Implement its own mutation engine (it relies on StrykerJS).
Replace StrykerJS.
Call any LLM API.
Claim that your tests are perfect.
Fully support monorepos in v1.
Classify AI‑written tests with certainty.

Instead, it provides a deterministic pipeline:

changed source lines → mutation testing → surviving mutants → report → fix prompt
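As an illustration of the first arrow, changed line numbers can be collapsed into file:start-end strings, which is the line-range form that StrykerJS documents for its mutate option. Whether Tautest hands lines to StrykerJS this way is an assumption on my part, and toRanges and toMutateSpecs are hypothetical helpers.

```typescript
// Collapses line numbers (in any order, duplicates allowed) into contiguous [start, end] ranges.
function toRanges(lines: number[]): Array<[number, number]> {
  const sorted = [...new Set(lines)].sort((a, b) => a - b);
  const ranges: Array<[number, number]> = [];
  for (const line of sorted) {
    const last = ranges[ranges.length - 1];
    if (last !== undefined && line === last[1] + 1) {
      last[1] = line; // extend the current range by one line
    } else {
      ranges.push([line, line]); // start a new range
    }
  }
  return ranges;
}

// Formats each range as "file:start-end".
function toMutateSpecs(file: string, lines: number[]): string[] {
  return toRanges(lines).map(([start, end]) => `${file}:${start}-${end}`);
}

const specs = toMutateSpecs("src/discount.ts", [4, 2, 3, 9, 3]);
console.log(specs); // "src/discount.ts:2-4" and "src/discount.ts:9-9"
```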

Future Improvements

Migrate to Node 24 runtime for the GitHub Action.
Better cache observability.
Beta support for monorepos.
Standalone GitHub Action repository.
PR line annotations.
More Jest fixtures.

Feedback Requested

I would love feedback on:

Clarity of the README and demo.
Reasonableness of the GitHub Action workflow.
Usefulness of the AI fix‑prompt workflow.
Whether the project should stay focused on JavaScript/TypeScript for now.