PromptLedger v0.3 — Turning prompt history into a practical review workflow.

Published: March 28, 2026 at 09:37 AM EDT

Source: Dev.to

Devlog — Part 3

Turning prompt history into a practical review workflow

In Part 1, I introduced PromptLedger as a deliberately small, local‑first tool for treating prompts like code.
In Part 2, I added release semantics: labels, label history, and status views that made it easier to answer questions like “what is in production right now?”

With v0.3, the next question became harder:

Even if I can diff two prompt versions, can I review them in a way that feels closer to a real release workflow?

That is the focus of this release.

What’s new in v0.3?

PromptLedger v0.3 adds a small but practical Prompt Review layer on top of the existing history model—while still staying local‑first, SQLite‑backed, and intentionally limited in scope.

After the release‑semantics work in v0.2, the project could already answer questions like:

Which prompt does prod currently point to?
When was that label changed?
How does prod differ from staging?

But another gap became obvious. A raw diff is useful, yet in practice people often want a slightly higher‑level review:

Did the prompt become stricter?
Did the tone change?
Was the output format changed from bullets to JSON?
Did safety or refusal wording get stronger or weaker?
Is this a release change or a likely regression risk?

These are review questions, not execution or observability questions.
So instead of adding prompt execution, external APIs, or any hosted layer, I kept the project focused and added a review workflow built entirely on top of the existing local data.

New CLI Commands

`promptledger review`

promptledger review --id onboarding --from prod --to staging

Compares two refs (versions or labels) and produces a structured review output that includes:

Resolved refs and versions
A semantic summary
Metadata changes
Label context
Warning flags
A few conservative notes

This is deliberately not an evaluation system. It does not score prompts, call a model, or guess too much. It simply makes a prompt diff easier to interpret.
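To make "resolved refs" concrete, here is a minimal sketch of how a ref such as `prod` or `7` might be turned into a version number. The `labels` table layout and the `resolve_ref` function are assumptions made for illustration; they are not taken from PromptLedger's actual schema or API.

```python
import sqlite3


def resolve_ref(conn: sqlite3.Connection, prompt_id: str, ref: str) -> int:
    """Resolve a ref (numeric version or label name) to a version number."""
    if ref.isdigit():
        # A purely numeric ref is treated as an explicit version.
        return int(ref)
    row = conn.execute(
        "SELECT version FROM labels WHERE prompt_id = ? AND name = ?",
        (prompt_id, ref),
    ).fetchone()
    if row is None:
        raise KeyError(f"unknown label {ref!r} for prompt {prompt_id!r}")
    return row[0]


# Hypothetical in-memory data standing in for the local SQLite file.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE labels (prompt_id TEXT, name TEXT, version INTEGER)")
conn.executemany(
    "INSERT INTO labels VALUES (?, ?, ?)",
    [("onboarding", "prod", 7), ("onboarding", "staging", 9)],
)

resolved = (
    resolve_ref(conn, "onboarding", "prod"),
    resolve_ref(conn, "onboarding", "staging"),
)
```

Resolving labels up front means the rest of a review can work with plain version numbers, and the review output can show both the ref the user typed and the version it pointed to at that moment.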

`promptledger diff --mode summary`

Traditional diffs are still useful, and PromptLedger keeps all previous diff modes.
v0.3 adds a new summary‑oriented mode:

promptledger diff --id onboarding --from 7 --to 9 --mode summary

This produces a heuristic, rule‑based semantic summary instead of a raw line diff.

Design Goals for the Summary

Local – no network calls
Deterministic – same input → same output
Transparent – rules are visible in the source
Intentionally conservative – only says something when the change is clear enough

Current summary categories include:

Tone changes
Tighter or looser constraints
Output format changes
Broader vs. more specific prompts
Safety wording changes
Length requirement changes
Refusal or policy wording changes

The summary is not meant to replace reading the actual prompt. It is also deliberately rule‑based rather than model‑based: using an external model for review would introduce network dependence, nondeterministic behavior, more configuration, harder testing, and less trust in the output, which is exactly the opposite of PromptLedger’s philosophy.
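As a rough illustration of what a local, deterministic, rule-based summary can look like, the sketch below counts keyword matches before and after a change and only reports a category when the count moves. The rule list, the category names, and the `summarize` function are hypothetical and much cruder than anything a real rule set would need; they are not PromptLedger's actual rules.

```python
import re

# Hypothetical rules as (category, pattern) pairs, checked in a fixed order.
RULES = [
    ("output format", re.compile(r"\bjson\b", re.IGNORECASE)),
    ("constraints", re.compile(r"\b(?:must|never|always|only)\b", re.IGNORECASE)),
    ("refusal/policy wording", re.compile(r"\b(?:refuse|decline)\b", re.IGNORECASE)),
]


def summarize(old: str, new: str) -> list[str]:
    """Return one note per category whose keyword count changed."""
    notes = []
    for category, pattern in RULES:
        before = len(pattern.findall(old))
        after = len(pattern.findall(new))
        if after > before:
            notes.append(f"{category}: more matching wording ({before} -> {after})")
        elif after < before:
            notes.append(f"{category}: less matching wording ({before} -> {after})")
        # Equal counts produce no note: the summary stays silent when unsure.
    return notes


old_prompt = "Answer in short bullet points."
new_prompt = "Answer in JSON only. You must never include commentary."
summary = summarize(old_prompt, new_prompt)
```

Because the rules are plain data and the function has no hidden state, the same pair of prompts always yields the same notes, and a reader can check any note against the rule that produced it.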

Exporting Reviews

Another practical gap was sharing review output. Reading a diff in the terminal is fine, but you often need a portable document.

promptledger export review \
  --id onboarding \
  --from prod \
  --to staging \
  --format md \
  --out review.md

The exported Markdown is deterministic and structured, containing:

Title
Compared refs
Semantic summary
Text‑diff note
Metadata changes
Warnings
Label information
A reviewer‑notes placeholder

This makes PromptLedger more useful in real workflows without adding any collaboration backend—the file is still just a file.
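A deterministic export mostly comes down to emitting sections in a fixed order with no timestamps or random identifiers. The following is a minimal sketch of that idea; the `render_review_md` function and the dictionary keys are assumptions for illustration and cover only a subset of the sections listed above.

```python
def render_review_md(review: dict) -> str:
    """Render a review dictionary as Markdown with a fixed section order."""
    lines = [f"# Prompt review: {review['prompt_id']}", ""]
    lines += [
        "## Compared refs",
        f"- from: {review['from_ref']}",
        f"- to: {review['to_ref']}",
        "",
    ]
    # Hypothetical keys; each list section is always emitted, even when empty.
    for title, key in [("Semantic summary", "summary"), ("Warnings", "warnings")]:
        lines.append(f"## {title}")
        items = review.get(key) or []
        lines += [f"- {item}" for item in items] or ["- (none)"]
        lines.append("")
    # Placeholder a human reviewer fills in after export.
    lines += ["## Reviewer notes", "", "_Add notes here._", ""]
    return "\n".join(lines)


markdown = render_review_md(
    {
        "prompt_id": "onboarding",
        "from_ref": "prod",
        "to_ref": "staging",
        "summary": ["output format: bullets to JSON"],
        "warnings": [],
    }
)
```

The resulting string can be written with `pathlib.Path("review.md").write_text(markdown)`, and since identical input gives identical text, exported reviews diff cleanly when committed next to the prompts.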

Metadata‑Aware Reviews

Prompt text is only part of the story. A release change may also involve metadata updates such as:

reason
author
tags
env
metrics

Earlier versions could already diff metadata, but v0.3 makes metadata changes part of the review object itself. This matters because some changes are metadata‑only: the prompt text is identical, so a text diff alone would show nothing to review.
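A metadata diff of this kind can be sketched as a key-by-key comparison of two dictionaries. The `diff_metadata` function and the sample values below are hypothetical; the field names come from the list above.

```python
def diff_metadata(old: dict, new: dict) -> list[tuple[str, object, object]]:
    """Return (key, before, after) for every metadata key whose value changed."""
    changes = []
    # Sorting the keys keeps the output order stable across runs.
    for key in sorted(old.keys() | new.keys()):
        before, after = old.get(key), new.get(key)
        # Note: a missing key and an explicit None are treated the same here.
        if before != after:
            changes.append((key, before, after))
    return changes


old_meta = {"author": "sam", "env": "staging", "tags": ["onboarding"]}
new_meta = {
    "author": "sam",
    "env": "prod",
    "tags": ["onboarding"],
    "reason": "promote after review",
}
changes = diff_metadata(old_meta, new_meta)
```

Here only `env` and `reason` are reported. If the prompt text were unchanged at the same time, this list would be the entire substance of the review.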

Warning Flags

v0.3 adds simple warning flags for cases such as:

Comparing the same version to itself
Environment changes
Metadata‑only changes
Policy or refusal wording changes that may cause behavior drift

These warnings are deliberately low‑key. For example, a wording change around refusal or safety does not automatically mean the prompt got worse, but it probably means a reviewer should read it more carefully.
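The four cases above can each be expressed as a simple check. The sketch below is one possible shape for them, assuming metadata changes arrive as `(key, before, after)` tuples; the flag names, the keyword list, and the `collect_warnings` function are all hypothetical.

```python
import re

# Hypothetical keyword list used to spot refusal or policy wording.
POLICY_WORDS = re.compile(r"\b(?:refuse|decline|policy|unsafe)\b", re.IGNORECASE)


def policy_lines(text: str) -> list[str]:
    """Return the lines that mention refusal or policy wording."""
    return [line.strip() for line in text.splitlines() if POLICY_WORDS.search(line)]


def collect_warnings(from_version, to_version, old_text, new_text, metadata_changes):
    """Return warning flags in a fixed order so the output stays deterministic."""
    warnings = []
    if from_version == to_version:
        warnings.append("same-version")
    if any(key == "env" for key, _before, _after in metadata_changes):
        warnings.append("env-change")
    if old_text == new_text and metadata_changes:
        warnings.append("metadata-only")
    if policy_lines(old_text) != policy_lines(new_text):
        warnings.append("policy-wording")
    return warnings


text = "Refuse requests for medical advice."
flags = collect_warnings(7, 9, text, text, [("env", "staging", "prod")])
```

Each flag only says that something deserves a closer look. None of them blocks a change or claims that the new prompt is worse.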

API Improvements

The review workflow is not just a CLI feature. The Python API now exposes review results as structured domain objects rather than just formatted strings. Callers can programmatically access:

Resolved refs
Semantic summary items
Metadata changes
Warnings
Notes
Label context

This keeps the CLI and the API aligned while also making formatting a separate concern. The separation turned out to be one of the cleaner changes in this version:

Review logic lives in one place
Rendering logic lives elsewhere
Markdown export and terminal rendering both use the same review result
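The separation described above might look roughly like the following: an immutable result object that carries the data, plus a renderer that only formats it. The `ReviewResult` class, its fields, and `render_terminal` are illustrative assumptions, not PromptLedger's actual Python API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReviewResult:
    """Hypothetical structured review result; carries data, not formatting."""

    prompt_id: str
    from_version: int
    to_version: int
    summary: tuple[str, ...] = ()
    warnings: tuple[str, ...] = ()
    notes: tuple[str, ...] = ()


def render_terminal(result: ReviewResult) -> str:
    """One possible renderer; a Markdown exporter would take the same object."""
    lines = [f"{result.prompt_id}: v{result.from_version} -> v{result.to_version}"]
    lines += [f"  summary: {item}" for item in result.summary]
    lines += [f"  warning: {item}" for item in result.warnings]
    lines += [f"  note: {item}" for item in result.notes]
    return "\n".join(lines)


result = ReviewResult(
    prompt_id="onboarding",
    from_version=7,
    to_version=9,
    summary=("output format: bullets to JSON",),
    warnings=("env-change",),
)

# Callers can branch on structured fields instead of parsing rendered text.
needs_attention = bool(result.warnings)
text = render_terminal(result)
```

With this shape, a script or CI step can inspect `result.warnings` directly, while the terminal and Markdown outputs stay thin formatting layers that cannot drift apart in content.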

UI Updates

The Streamlit UI remains read‑only, but the comparison view now surfaces review information more clearly:

Semantic summary
Warnings
Metadata diff
Side‑by‑side prompt comparison
Line diff

This keeps the UI aligned with the CLI review flow without turning it into an editor—the constraint still matters.

What didn’t change

Just as important as the new features is what was left out. v0.3 does not add:

A hosted registry
Prompt execution APIs
Agent tooling
Telemetry pipelines
Tracing dashboards
Cloud sync
Automatic scoring
Evaluation harnesses

There are already plenty of tools going in those directions. PromptLedger is still trying to do one narrower thing well:

Store, version, and review prompts locally—nothing more, nothing less.

Release Highlights

The review workflow did not require turning the database into something more complicated.
SQLite remains the single source of truth, which keeps the implementation smaller and the migration story simpler.
Not every useful feature requires a bigger schema.

v0.3 Overview

The release did not try to make PromptLedger smarter in a flashy way; it tried to make it more reviewable.
The result is still a local tool, but now it is easier to answer a more realistic question:
not just “What changed?” but “How should I review this change before I move it forward?”
That is a better place for the project to be.