Sem – Semantic version control. Entity-level diffs on top of Git

Published: (March 8, 2026 at 01:01 AM EST)
Source: Hacker News

Overview

Semantic version control. Entity‑level diffs on top of Git.

Instead of "line 43 changed", sem tells you "function validateToken was added in src/auth.ts".

sem diff

┌─ src/auth/login.ts ──────────────────────────────────
│  ⊕ function  validateToken          [added]
│  ∆ function  authenticateUser       [modified]
│  ⊖ function  legacyAuth             [deleted]
└──────────────────────────────────────────────────────

┌─ config/database.yml ─────────────────────────────────
│  ∆ property  production.pool_size   [modified]
│    - 5
│    + 20
└──────────────────────────────────────────────────────

Summary: 1 added, 2 modified, 1 deleted across 2 files

Install

Build from source (requires Rust):

git clone https://github.com/Ataraxy-Labs/sem
cd sem/crates
cargo install --path sem-cli

Or grab a binary from GitHub Releases.

Usage

Works in any Git repo. No setup required.

  • Semantic diff of working changes

    sem diff
  • Staged changes only

    sem diff --staged
  • Specific commit

    sem diff --commit abc1234
  • Commit range

    sem diff --from HEAD~5 --to HEAD
  • JSON output (for AI agents, CI pipelines)

    sem diff --format json
  • Read file changes from stdin (no git repo needed)

    echo '[{"filePath":"src/main.rs","status":"modified","beforeContent":"...","afterContent":"..."}]' \
      | sem diff --stdin --format json
  • Only specific file types

    sem diff --file-exts .py .rs
  • Entity dependency graph

    sem graph
  • Impact analysis (what breaks if this entity changes?)

    sem impact validateToken
  • Entity‑level blame

    sem blame src/auth.ts
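
The --stdin payload shown in the list above can also be built programmatically. The following is a minimal sketch, assuming only the four fields that appear in that example (filePath, status, beforeContent, afterContent) and a sem binary on PATH. The helper names build_stdin_payload and run_sem_on_payload are hypothetical and not part of sem.

```python
import json
import subprocess


def build_stdin_payload(changes):
    """Serialize (path, status, before, after) tuples into the JSON array
    shape used by the --stdin example."""
    return json.dumps([
        {
            "filePath": path,
            "status": status,
            "beforeContent": before,
            "afterContent": after,
        }
        for path, status, before, after in changes
    ])


def run_sem_on_payload(payload):
    """Pipe the payload to `sem diff --stdin --format json` and parse the result.

    Assumes `sem` is installed and prints a single JSON document on stdout.
    """
    proc = subprocess.run(
        ["sem", "diff", "--stdin", "--format", "json"],
        input=payload,
        capture_output=True,
        text=True,
        check=True,  # raise if sem exits with a non-zero status
    )
    return json.loads(proc.stdout)


# Build a payload for one modified file; nothing is executed until
# run_sem_on_payload(payload) is called.
payload = build_stdin_payload([
    ("src/main.rs", "modified", "fn main() {}\n", "fn main() { run(); }\n"),
])
```

Building the payload with json.dumps rather than by hand avoids quoting mistakes when file contents include quotes or newlines.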

What it parses

Programming languages (13)

Language    Extensions          Entities
TypeScript  .ts .tsx            functions, classes, interfaces, types, enums, exports
JavaScript  .js .jsx .mjs .cjs  functions, classes, variables, exports
Python      .py                 functions, classes, decorated definitions
Go          .go                 functions, methods, types, vars, consts
Rust        .rs                 functions, structs, enums, impls, traits, mods, consts
Java        .java               classes, methods, interfaces, enums, fields, constructors
C           .c .h               functions, structs, enums, unions, typedefs
C++         .cpp .cc .hpp       functions, classes, structs, enums, namespaces, templates
C#          .cs                 classes, methods, interfaces, enums, structs, properties
Ruby        .rb                 methods, classes, modules
PHP         .php                functions, classes, methods, interfaces, traits, enums
Fortran     .f90 .f95 .f        functions, subroutines, modules, programs

Structured data formats

Format    Extensions  Entities
JSON      .json       properties, objects (RFC 6901 paths)
YAML      .yml .yaml  sections, properties (dot paths)
TOML      .toml       sections, properties
CSV       .csv .tsv   rows (first column as identity)
Markdown  .md .mdx    heading‑based sections

Everything else falls back to chunk‑based diffing.
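
To illustrate the two path styles named for structured data, here is a small sketch that flattens a nested mapping into RFC 6901 JSON Pointer paths and into dot paths. The escaping rules (~ becomes ~0, / becomes ~1) come from RFC 6901. How sem itself derives entity identities is not described in this document, so treat the function names and the traversal as assumptions.

```python
def json_pointer_paths(value, prefix=""):
    """Yield (pointer, leaf) pairs using RFC 6901 escaping."""
    if isinstance(value, dict) and value:
        for key, child in value.items():
            # Order matters: escape "~" first, then "/".
            token = str(key).replace("~", "~0").replace("/", "~1")
            yield from json_pointer_paths(child, f"{prefix}/{token}")
    elif isinstance(value, list) and value:
        for index, child in enumerate(value):
            yield from json_pointer_paths(child, f"{prefix}/{index}")
    else:
        # Scalars and empty containers are treated as leaves.
        yield prefix, value


def dot_paths(value, prefix=""):
    """Yield (dot.path, leaf) pairs; lists are kept whole as leaves."""
    if isinstance(value, dict) and value:
        for key, child in value.items():
            path = f"{prefix}.{key}" if prefix else str(key)
            yield from dot_paths(child, path)
    else:
        yield prefix, value


config = {"production": {"pool_size": 20, "hosts": ["a", "b"]}, "a/b": 1}
pointers = dict(json_pointer_paths(config))
dots = dict(dot_paths(config))
```

With this input, pointers contains keys such as /production/pool_size and /a~1b, while dots contains production.pool_size, matching the property name in the sample diff. Dot paths in this sketch are ambiguous when a key itself contains a dot.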

How matching works

Three‑phase entity matching:

  1. Exact ID match – the same entity ID appears before and after → modified or unchanged.
  2. Structural hash match – same AST structure under a different name → renamed or moved (whitespace and comments are ignored).
  3. Fuzzy similarity – more than 80% token overlap → probable rename.

This allows sem to detect renames and moves, not just additions and deletions. Structural hashing also distinguishes cosmetic changes (whitespace, formatting) from real logic changes.
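
The three phases can be sketched in a few lines. This is a toy illustration, not sem's implementation. It makes several assumptions: a regex tokenizer stands in for tree-sitter ASTs, SHA-256 stands in for xxhash, token overlap is measured as Jaccard similarity over token sets, and cosmetic-only edits are reported as unchanged. The Entity type and all function names are hypothetical.

```python
import hashlib
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    id: str    # e.g. "auth.ts::function::validateToken"
    body: str  # entity source text, excluding its own name


def tokens(body):
    """Drop // and # line comments, then split into word and symbol tokens."""
    return re.findall(r"\w+|[^\w\s]", re.sub(r"(//|#).*", "", body))


def structural_hash(body):
    """Hash the token stream, so whitespace and comments do not matter."""
    return hashlib.sha256(" ".join(tokens(body)).encode()).hexdigest()


def overlap(a, b):
    """Jaccard similarity of the two token sets (assumed metric)."""
    ta, tb = set(tokens(a)), set(tokens(b))
    return len(ta & tb) / len(ta | tb) if ta | tb else 1.0


def match(before, after, threshold=0.8):
    old = {e.id: e for e in before}
    new = {e.id: e for e in after}
    results = {}

    # Phase 1: exact ID match.
    for eid in old.keys() & new.keys():
        same = structural_hash(old[eid].body) == structural_hash(new[eid].body)
        results[eid] = "unchanged" if same else "modified"
    old_left = [e for e in before if e.id not in results]
    new_left = [e for e in after if e.id not in results]

    # Phase 2: structural hash match among the leftovers.
    by_hash = {structural_hash(e.body): e for e in new_left}
    for e in list(old_left):
        hit = by_hash.pop(structural_hash(e.body), None)
        if hit is not None:
            results[e.id] = f"renamed -> {hit.id}"
            old_left.remove(e)
            new_left.remove(hit)

    # Phase 3: fuzzy similarity above the threshold.
    for e in list(old_left):
        best = max(new_left, key=lambda n: overlap(e.body, n.body), default=None)
        if best is not None and overlap(e.body, best.body) > threshold:
            results[e.id] = f"probably renamed -> {best.id}"
            old_left.remove(e)
            new_left.remove(best)

    # Whatever is still unmatched was deleted or added.
    results.update({e.id: "deleted" for e in old_left})
    results.update({e.id: "added" for e in new_left})
    return results


before = [
    Entity("auth.ts::function::login", "return check(user, password);"),
    Entity("auth.ts::function::verify", "const t = parse(token); return t.valid;"),
    Entity("auth.ts::function::fetchUser",
           "const u = db.find(id); if (!u) throw err; return u;"),
    Entity("auth.ts::function::legacyAuth", "return old(user);"),
]
after = [
    Entity("auth.ts::function::login", "return check( user,password );  // tidy"),
    Entity("auth.ts::function::validateToken", "const t = parse(token); return t.valid;"),
    Entity("auth.ts::function::loadUser",
           "const u = db.find(id); if (!u) throw err; audit(id); return u;"),
    Entity("auth.ts::function::refresh", "return issue(session);"),
]
result = match(before, after)
```

In this example login is reformatted only and stays unchanged, verify is matched to validateToken by hash, fetchUser is matched to loadUser by token overlap, legacyAuth is deleted, and refresh is added. Running the cheap exact phases first means the quadratic fuzzy comparison only sees entities that nothing else could pair up.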

JSON output

{
  "summary": {
    "fileCount": 2,
    "added": 1,
    "modified": 1,
    "deleted": 1,
    "total": 3
  },
  "changes": [
    {
      "entityId": "src/auth.ts::function::validateToken",
      "changeType": "added",
      "entityType": "function",
      "entityName": "validateToken",
      "filePath": "src/auth.ts"
    }
  ]
}
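
A consumer such as a CI script could read this output roughly as follows. The sketch relies only on the fields visible in the sample above and embeds that sample as a string. Whether real output carries additional fields is not stated here, and the summarize helper is a hypothetical name.

```python
import json
from collections import Counter

# The sample document shown above, embedded verbatim for the example.
SAMPLE = """
{
  "summary": {"fileCount": 2, "added": 1, "modified": 1, "deleted": 1, "total": 3},
  "changes": [
    {
      "entityId": "src/auth.ts::function::validateToken",
      "changeType": "added",
      "entityType": "function",
      "entityName": "validateToken",
      "filePath": "src/auth.ts"
    }
  ]
}
"""


def summarize(report_text):
    """Return the summary block, a per-type count, and per-file change lines."""
    report = json.loads(report_text)
    by_type = Counter(change["changeType"] for change in report["changes"])
    by_file = {}
    for change in report["changes"]:
        line = f'{change["changeType"]} {change["entityType"]} {change["entityName"]}'
        by_file.setdefault(change["filePath"], []).append(line)
    return report["summary"], by_type, by_file


summary, by_type, by_file = summarize(SAMPLE)
```

A pipeline could, for instance, fail the build when by_type["deleted"] is non-zero for files under a protected path.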

As a library

sem-core can be used as a Rust library dependency:

[dependencies]
sem-core = { git = "https://github.com/Ataraxy-Labs/sem", version = "0.3" }

sem-core is used by weave (a semantic merge driver) and inspect (entity‑level code review).

Architecture

  • tree‑sitter for code parsing (native Rust, not WASM)
  • git2 for Git operations
  • rayon for parallel file processing
  • xxhash for structural hashing
  • Plugin system for adding new languages and formats

License

MIT OR Apache‑2.0
