I Built ac-trace to Check What Tests Actually Protect

Published: March 9, 2026 at 10:45 AM EDT
5 min read
Source: Dev.to


# AI‑assisted coding is making one part of software development much faster than another

It is now easier than ever to generate implementation code, unit tests, fixtures, mocks, and even test structure.  
But while output is getting faster, confidence is not automatically getting deeper. In fact, the opposite can happen: the more quickly code and tests are produced, the easier it becomes to confuse visible testing activity with real protection.

That gap is exactly why I built **[ac‑trace](https://github.com/DmytroHuzz/ac-trace)**, a new open‑source tool.

---

## The core problem

Passing tests are often a weaker signal than teams think. Coverage is not enough either. A test suite can be green, a code path can be exercised, and the intended behavior can still be only weakly defended.

What I care about is not just whether code ran, or whether assertions passed. The harder question is this:

> **Are the acceptance criteria actually protected?**

### The problem: green tests do not prove much by themselves

In many teams, these ideas get blended together:

- tests are passing  
- code is covered  
- therefore the requirement is safe  

But those are different signals.

- A passing test tells you that **some expectation held in one scenario**.  
- Coverage tells you that **code executed**.  

Neither one, by itself, proves that the important business behavior is strongly defended against breakage.

This becomes more important with AI‑assisted coding.

AI is good at producing plausible implementations and plausible tests very quickly. That is useful, but it also lowers the cost of producing code that *looks* well‑tested. You get more test files, more green checks, more visible structure — and sometimes only shallow confidence underneath.
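As a toy illustration (hypothetical code written for this post, not from any real project), a generated test can execute the code, pass, and still assert almost nothing:

```python
def apply_discount(price: float, rate: float) -> float:
    """Apply a fractional discount to a price."""
    return price * (1 - rate)

def test_apply_discount():
    # Green, and it "covers" the function -- but it never checks
    # the actual discounted value, so almost any implementation passes.
    result = apply_discount(100.0, 0.2)
    assert result is not None
```

This test contributes coverage and a green check while defending essentially nothing about the discount behavior.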

---

## A concrete example

Imagine a billing service with this acceptance criterion:

> **Premium users must never be charged above their contractual monthly cap.**

Now imagine the code has:

- tests for invoice creation  
- tests for the premium‑user billing flow  
- good coverage around the billing function  

The relevant lines all execute. The pipeline is green. Looks fine.

But now remove the cap check, flip the comparison, or mutate the billing logic in a way that breaks the intended behavior.

**Do the tests fail?**  

If they do not, then the acceptance criterion was never really protected. The system had tests. The code was covered. But the thing that mattered was still weakly defended.
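To make the gap concrete, here is a hypothetical sketch (function names, tests, and the cap value are illustrative, not from a real codebase). The first test exercises the billing function and passes, but only with an amount below the cap, so removing the cap check entirely would not make it fail:

```python
MONTHLY_CAP = 100.0  # illustrative contractual cap

def charge_premium_user(amount: float) -> float:
    """Charge a premium user, never exceeding the monthly cap."""
    if amount > MONTHLY_CAP:
        return MONTHLY_CAP  # the cap check the criterion depends on
    return amount

def test_premium_billing_flow():
    # Exercises the billing function and passes, but only with an
    # amount below the cap. Delete the cap check entirely and this
    # test is still green.
    assert charge_premium_user(50.0) == 50.0

def test_cap_is_enforced():
    # This is the test that actually defends the criterion:
    # it fails the moment the cap check is removed.
    assert charge_premium_user(250.0) == MONTHLY_CAP
```

Only the second test gives the acceptance criterion teeth; the first one only gives it coverage.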

That is the gap I wanted to make more visible.

![schema](https://media2.dev.to/dynamic/image/width=800,height=,fit=scale-down,gravity=auto,format=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3mrdyj9j1754j74laft5.jpg)

---

## That is why I built **ac‑trace**

**ac‑trace** ([repo](https://github.com/DmytroHuzz/ac-trace)) is an open‑source tool that:

1. Maps acceptance criteria to code and tests.  
2. Mutates the mapped code to verify whether the tests actually catch the breakage.  

In plain terms:

> **It tries to answer whether the tests defend the behavior they are supposed to defend.**

This is not just traceability for documentation. The point is not only to show links between requirements, code, and tests, but also to test whether those links have *teeth*.

---

## How it works

The current workflow is intentionally simple:

1. **Define acceptance criteria**  
2. **Map them to relevant source code and tests**  
3. **Infer some links from annotated tests**  
4. **Mutate the mapped implementation**  
5. **Run the relevant tests**  
6. **Generate a report showing what failed and what survived**
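As a sketch of what the mapping step could look like, here is an illustrative YAML shape I made up for this post — it is **not** ac‑trace's actual manifest schema (see the repo for the real format):

```yaml
# Illustrative manifest only -- field names are hypothetical,
# not the actual ac-trace schema.
acceptance_criteria:
  - id: AC-1
    description: Premium users are never charged above their monthly cap.
    code:
      - billing/charges.py::charge_premium_user
    tests:
      - tests/test_billing.py::test_premium_cap_enforced
```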

So the flow is roughly:

acceptance criteria → code → tests → mutation → report


- If the mapped code is changed **and the linked tests fail**, that is a useful sign.  
- If the mapped code is changed **and the linked tests still pass**, that is also useful — it reveals a confidence gap that might otherwise stay hidden behind a green suite.
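The mutate-and-rerun loop is essentially mutation testing scoped to the mapped code. Here is a self-contained Python sketch of the idea — not ac‑trace's implementation, just the billing example from above reduced to its core:

```python
# Sketch of the mutate-and-rerun idea, not ac-trace's actual code.

def charge(amount: float, cap: float) -> float:
    """Original mapped implementation: enforces the cap."""
    return cap if amount > cap else amount

def charge_mutant(amount: float, cap: float) -> float:
    """Mutated implementation: the cap check has been removed."""
    return amount

def shallow_suite(fn) -> bool:
    """Linked tests that only exercise the below-cap path."""
    try:
        assert fn(50.0, 100.0) == 50.0
        return True
    except AssertionError:
        return False

def protecting_suite(fn) -> bool:
    """Linked tests that also defend the cap itself."""
    try:
        assert fn(50.0, 100.0) == 50.0
        assert fn(250.0, 100.0) == 100.0
        return True
    except AssertionError:
        return False

def verdict(suite, mutant) -> str:
    """'killed' means the tests caught the mutation; 'survived' is the gap."""
    return "killed" if not suite(mutant) else "survived"

print(verdict(shallow_suite, charge_mutant))     # survived -> confidence gap
print(verdict(protecting_suite, charge_mutant))  # killed -> criterion defended
```

Both suites are green against the original code; only the mutation run tells them apart.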

---

## Why this matters more now

AI‑assisted coding is not the problem by itself.  
The problem is that AI increases output faster than it increases justified confidence.

When implementation and tests both become cheap to generate, teams need better ways to distinguish between:

- code that looks tested  
- code that is covered  
- code whose important behavior is actually defended  

Without that distinction, it becomes very easy to over‑trust green pipelines. That is the broader reason for **ac‑trace**: a practical tool that pushes on this exact point.

---

## Current scope

**ac‑trace** is still early and intentionally narrow. Right now it focuses on:

- Python  
- pytest  
- YAML manifests  
- Inferred links from annotated tests  
- Generated reports  

I kept the scope small on purpose. I would rather build a narrow tool around one precise question than make broad claims too early. This is an experiment in making one software‑quality problem more concrete.
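The "inferred links from annotated tests" part suggests that tests carry a marker naming the criterion they defend. Here is a plain-Python stand-in for that idea — the `ac` decorator and its mechanism are mine for illustration, not ac‑trace's actual annotation syntax:

```python
# Plain-Python stand-in for a test annotation; ac-trace's real
# annotation mechanism may differ -- this only illustrates the idea.
def ac(criterion_id: str):
    def mark(test_fn):
        test_fn.acceptance_criterion = criterion_id
        return test_fn
    return mark

MONTHLY_CAP = 100.0

def charge_premium_user(amount: float) -> float:
    return min(amount, MONTHLY_CAP)

@ac("AC-1")  # a tool can infer the AC-1 -> test link from a tag like this
def test_premium_cap_enforced():
    assert charge_premium_user(250.0) == MONTHLY_CAP
```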

---

## Launch note

This post is also the announcement: **ac‑trace** is now open source → [github.com/DmytroHuzz/ac-trace](https://github.com/DmytroHuzz/ac-trace).

If you work on backend systems, care about software quality, or are thinking seriously about how AI changes testing and confidence, I think this problem is worth exploring.

I built **ac‑trace** because I kept coming back to the same thought:

> **Passing tests are useful, but they do not necessarily mean the acceptance criteria are protected.**

I want a more direct way to inspect that gap.

---

## Conclusion

**ac‑trace** is my open‑source attempt to make the gap between green tests and justified confidence more visible.

---

## Call to Action

Check out the repo, try it on a small Python project, and tell me where the idea is useful, naive, or worth pushing further.
