Stop 'Vibe Checking' Your AI. Use Snapshot Testing Instead.

Published: January 16, 2026 at 06:08 AM EST
2 min read
Source: Dev.to

Why aren’t we doing snapshot testing for AI?

Most of us are still “vibe checking”: manually running the prompt, reading the output, and saying, “Yeah, seems okay.”

I built a tool to fix this.

Introducing SafeStar

SafeStar is a zero‑dependency CLI tool that brings the “Snapshot & Diff” workflow to AI engineering. It treats your AI as a black box, so it works with Python, Node, curl, or any other interface, and it answers one question:

“Did the behavior change compared to last time?”

How it works

SafeStar follows a Git‑like workflow:

  1. Snapshot a baseline of “good” behavior.
  2. Run your current code.
  3. Diff the results to detect drift.
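
To make the three steps concrete, here is a minimal Python sketch of the general snapshot-and-diff idea. It is not SafeStar's implementation: the function names, the JSON baseline file, and the exact-string comparison are all assumptions chosen for illustration.

```python
import json
from pathlib import Path


def snapshot(outputs, baseline_path):
    """Step 1: freeze a list of known-good outputs as the baseline."""
    # Hypothetical storage format: a sorted JSON array of strings.
    Path(baseline_path).write_text(json.dumps(sorted(outputs), indent=2))


def diff(current_outputs, baseline_path):
    """Step 3: compare fresh outputs (collected in step 2) with the baseline."""
    baseline = set(json.loads(Path(baseline_path).read_text()))
    current = set(current_outputs)
    added = sorted(current - baseline)    # outputs never seen in the baseline
    removed = sorted(baseline - current)  # baseline outputs that disappeared
    return {"drifted": bool(added or removed), "added": added, "removed": removed}
```

Exact string matching like this is usually too strict for LLM output, which varies from run to run. Comparing aggregate metrics such as length and variance is a more forgiving way to apply the same idea.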

Quick Start

You can try SafeStar right now without changing your code.

1. Install

npm install --save-dev safestar

2. Define a Scenario

Create a file named scenarios/refund.yaml and use the exec key to tell SafeStar how to run your script.

name: refund_bot
prompt: "I want a refund immediately."

# Your actual code command
exec: "python3 my_agent.py"

# Run it 5 times to catch randomness/instability
runs: 5

# Simple guardrails
checks:
  max_length: 200
  must_not_contain:
    - "I am just an AI"
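
One plausible reading of the two guardrails above, sketched in Python. The function name is hypothetical, and two details are assumptions rather than documented SafeStar behavior: that max_length counts characters, and that must_not_contain is a case-sensitive substring match.

```python
def run_checks(output: str, max_length: int, must_not_contain: list[str]) -> list[str]:
    """Return a list of human-readable failures; an empty list means the output passed."""
    failures = []
    # Assumption: length is measured in characters, not tokens or words.
    if len(output) > max_length:
        failures.append(f"max_length: {len(output)} > {max_length}")
    # Assumption: plain, case-sensitive substring search.
    for phrase in must_not_contain:
        if phrase in output:
            failures.append(f"must_not_contain: {phrase!r}")
    return failures
```

With the scenario's values, a short reply such as "Your refund is on its way." passes both checks. Any reply containing "I am just an AI" fails the second one.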

3. Create a Baseline

Run it until you get an output you like, then “freeze” it:

npx safestar baseline refund_bot

4. Check for Drift in CI

Whenever you change your prompt or model, run:

npx safestar diff scenarios/refund.yaml

If your model drifts, SafeStar alerts you:

--- SAFESTAR REPORT ---
Status: FAIL

Metrics:
  Avg Length: 45 chars -> 120 chars
  Drift:      +166% vs baseline (WARNING)
  Variance:   0.2 -> 9.8 (High instability)
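
The drift figure in this report is plain arithmetic: (120 - 45) / 45 ≈ 1.667, or about 166.7%. The sketch below reproduces that calculation. The helper names are hypothetical, and computing variance over output lengths is an assumption, since the report does not say what the variance measures.

```python
import statistics


def length_metrics(outputs):
    """Average length and population variance of output lengths across runs."""
    lengths = [len(o) for o in outputs]
    return statistics.mean(lengths), statistics.pvariance(lengths)


def percent_drift(baseline_avg, current_avg):
    """Relative change of the current average versus the baseline, in percent."""
    return (current_avg - baseline_avg) / baseline_avg * 100


# Averages taken from the report above: 45 chars -> 120 chars.
drift = percent_drift(45, 120)
print(f"Drift: {drift:+.1f}% vs baseline")
```

The report's +166% matches this value if the tool truncates rather than rounds, though that is a guess.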

Why I built this

I was tired of complex evaluation dashboards that give a “correctness score” of 87/100. I don’t care about the score; I care about regressions. If my bot was working yesterday, I just want to know whether it behaves differently today.

SafeStar is open source, local‑first, and fits right into GitHub Actions.

  • NPM:
  • GitHub:
  • Full blog post:

Let me know if you find it useful!
