Baboon: Data Modeling with Automatic Evolutions and Tagless Binary Codecs

Published: (November 29, 2025 at 03:12 PM EST)
Source: Hacker News

Baboon is a minimal Data Modeling Language and compiler that provides ergonomic, declarative schemas and enforces reliable schema evolution. The compiler is structured as a fast, immutable, multi‑phase DAG transform, which keeps it easy to understand and maintain.

Essentially, you define your data structures and Baboon generates implementations for you. When you define a new version, Baboon generates the new structures and the conversions from the old versions to the new ones, and forces you to provide any conversions that cannot be derived automatically. It also comes with an extremely efficient tagless binary encoding for all your structures.
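
To make "tagless" concrete, here is a minimal Python sketch of the general idea: both sides share an ordered schema, so the payload carries only values, with no field names or tag numbers. This is not the UEBA wire format; the schema list, the 4-byte length prefix, and all function names are assumptions made for illustration.

```python
import struct

# Hypothetical schema: ordered (field name, type) pairs known to writer and reader.
WALLET_SCHEMA = [("provider", "str"), ("token", "str")]

def encode_str(value: str) -> bytes:
    raw = value.encode("utf-8")
    # Assumed layout: 4-byte little-endian length, then the UTF-8 bytes.
    return struct.pack("<I", len(raw)) + raw

def encode_tagless(schema, record: dict) -> bytes:
    # Values are written strictly in schema order; no names or tags are emitted.
    return b"".join(encode_str(record[name]) for name, _type in schema)

def decode_tagless(schema, payload: bytes) -> dict:
    record, offset = {}, 0
    for name, _type in schema:
        (length,) = struct.unpack_from("<I", payload, offset)
        offset += 4
        record[name] = payload[offset:offset + length].decode("utf-8")
        offset += length
    return record

wallet = {"provider": "acme", "token": "t-123"}
payload = encode_tagless(WALLET_SCHEMA, wallet)
assert decode_tagless(WALLET_SCHEMA, payload) == wallet
```

Because the reader relies entirely on the schema to interpret the bytes, a format like this only stays safe across versions if conversions between schema versions are tracked explicitly, which is the role the generated migrations play.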

Baboon currently generates C# and Scala; more backends are on the way.

Highlights

  • Automatic codec derivation for JSON and UEBA (Ultra‑Efficient Binary Aggregate, a custom tagless binary format)
  • Evolution‑aware codegen: derives migrations when possible, emits stubs when manual work is required
  • Set‑based structural inheritance with +, -, and ^ operators
  • Algebraic data types (adt), DTOs (data) and enums
  • Basic form of nominal inheritance (contract)
  • Namespaces, includes, and imports
  • Collections (opt, lst, set, map) and timestamps/UID primitives
  • Deduplicated C# output (reuses as much code as possible to lower binary footprint)
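
The set‑based structural inheritance operators can be pictured as set operations over fields. The Python sketch below assumes that + adds a parent's fields, - removes them, and ^ keeps only the fields shared with the parent; the helper names and the dictionary representation are hypothetical, and the exact semantics are documented in docs/language-features.md.

```python
# Field sets modeled as ordered name -> type mappings (an illustrative choice).
Fields = dict[str, str]

def add_fields(own: Fields, parent: Fields) -> Fields:
    # "+ Parent": union of own fields and the parent's fields.
    return {**own, **parent}

def remove_fields(own: Fields, parent: Fields) -> Fields:
    # "- Parent": drop every field that the parent also declares with the same type.
    return {n: t for n, t in own.items() if parent.get(n) != t}

def intersect_fields(own: Fields, parent: Fields) -> Fields:
    # "^ Parent": keep only fields that both sides declare with the same type.
    return {n: t for n, t in own.items() if parent.get(n) == t}

token = {"token": "str"}
wallet = add_fields({"provider": "str"}, token)

assert wallet == {"provider": "str", "token": "str"}
assert remove_fields(wallet, token) == {"provider": "str"}
assert intersect_fields(wallet, token) == {"token": "str"}
```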

Example model (v1.0.0)

model acme.billing
version "1.0.0"

root adt PaymentMethod {
  data Card {
    pan: str
    holder: str
  }
  data Wallet {
    provider: str
    token: str
  }
}

Refactored model (v2.0.0)

model acme.billing
version "2.0.0"

data Token {
  token: str
}

root adt PaymentMethod {
  data Card {
    pan: str
    holder: str
  }
  // refactored, but same structure as before
  data Wallet {
    provider: str
    + Token
  }

  // new ADT member
  data BankTransfer { iban: str }
}

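Baboon generates conversions (migrations) from version 1.0.0 to 2.0.0. In this particular case, all the migrations are generated automatically, because every v1 field still exists in v2 and the new BankTransfer branch has no v1 counterpart to convert.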
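
As a rough picture of what such a derived conversion does, the Python sketch below maps each v1 branch of PaymentMethod to its v2 counterpart by copying fields one to one. It is not Baboon's generated code (Baboon emits C# and Scala); the class names with V1/V2 suffixes and the function convert_payment_method are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CardV1:
    pan: str
    holder: str

@dataclass(frozen=True)
class WalletV1:
    provider: str
    token: str

@dataclass(frozen=True)
class CardV2:
    pan: str
    holder: str

@dataclass(frozen=True)
class WalletV2:
    provider: str
    token: str  # contributed by "+ Token" in the v2 model

@dataclass(frozen=True)
class BankTransferV2:
    iban: str  # new in v2; no v1 value ever converts into this branch

def convert_payment_method(old):
    # Every v1 branch has a v2 branch with the same fields, so the copy is mechanical.
    if isinstance(old, CardV1):
        return CardV2(pan=old.pan, holder=old.holder)
    if isinstance(old, WalletV1):
        return WalletV2(provider=old.provider, token=old.token)
    raise TypeError(f"not a v1 PaymentMethod: {old!r}")

migrated = convert_payment_method(WalletV1(provider="acme", token="t-123"))
assert migrated == WalletV2(provider="acme", token="t-123")
```

A change such as adding a required field with no default would break this one-to-one copy, which is the kind of case where a manual conversion has to be supplied.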

A detailed language walkthrough with copy‑paste examples is available in the repository’s docs/language-features.md.

Editor Support

  • IntelliJ IDEA Plugin – source: baboon-intellij
  • VS Code Extension – source: baboon-vscode
  • VSCodium Extension – same as VS Code

Limitations

  • No templates
  • Only enums, DTOs and ADTs are supported
  • Nominal inheritance support is limited to the trait model
  • Generic/type‑constructor support is limited to built‑in collections
  • This is a DML, not an IDL; service/interface definitions support is extremely limited
  • Comments are not preserved in the transpiler output
  • Structural inheritance information is not preserved in the transpiler output
  • Only integer constants may be associated with enum members
  • No newtypes/type aliases
  • No inheritance‑based lenses/projections/conversions

Points marked with an asterisk may be improved in the future.

CLI

See build configuration in .mdl/defs/actions.md and test configuration in .mdl/defs/tests.md.

Notes

  • All types that are not transitively referenced by root types are eliminated from the compiler output.
  • Usages in structural inheritance are not considered references, so structural parents that are not directly referenced as fields and not marked as root will be eliminated.
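
The elimination rule in these notes amounts to a reachability walk from the root types that follows field references only. A minimal Python sketch, using the v2 model above as input and assuming that ADT members count as references from the ADT (the function name and the graph encoding are hypothetical):

```python
from collections import deque

def reachable_types(roots, field_refs):
    # Breadth-first walk from the roots over field references only.
    seen = set(roots)
    queue = deque(roots)
    while queue:
        current = queue.popleft()
        for ref in field_refs.get(current, ()):
            if ref not in seen:
                seen.add(ref)
                queue.append(ref)
    return seen

# Field references in the v2 model; ADT branches are treated as referenced by the ADT.
field_refs = {
    "PaymentMethod": ["Card", "Wallet", "BankTransfer"],
    "Card": [],
    "Wallet": [],  # "+ Token" is structural inheritance, so it is not listed here
    "BankTransfer": [],
    "Token": [],
}

kept = reachable_types(["PaymentMethod"], field_refs)
eliminated = set(field_refs) - kept
assert eliminated == {"Token"}
```

Under this reading, Token contributes its fields to Wallet but would not appear in the output unless it is marked root or used as a field type somewhere.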

Foreign Types

Be careful about foreign types; it is your responsibility to wire codecs correctly.

For every foreign type:

  1. Create a custom codec.
  2. Override the generated dummy codec with BaboonCodecs#Register.
  3. Override the generated dummy codec using the setter on ${Foreign_Type_Name}_UEBACodec#Instance.
  4. Override the generated dummy codec using the setter on ${Foreign_Type_Name}_JsonCodec#Instance.

Make sure your foreign types are not primitive types or other generated types. Foreign types may appear anywhere in generics, but you must ensure correctness.
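
The override pattern described in the steps above can be sketched in Python as follows. This is not the Baboon runtime API: CodecRegistry, DummyCodec, and IsoDateCodec are hypothetical stand-ins, and the assumption that an unreplaced dummy codec fails loudly is made only for illustration.

```python
from datetime import date

class DummyCodec:
    # Stand-in for a generated placeholder codec (assumed here to fail on use).
    def encode(self, value):
        raise NotImplementedError("foreign type codec was not registered")

    def decode(self, payload):
        raise NotImplementedError("foreign type codec was not registered")

class IsoDateCodec:
    # Step 1: a custom codec for a hypothetical foreign type mapped to datetime.date.
    def encode(self, value: date) -> bytes:
        return value.isoformat().encode("ascii")

    def decode(self, payload: bytes) -> date:
        return date.fromisoformat(payload.decode("ascii"))

class CodecRegistry:
    def __init__(self):
        self._codecs = {}

    def register(self, type_name: str, codec) -> None:
        # Step 2: replace the placeholder for this foreign type.
        self._codecs[type_name] = codec

    def find(self, type_name: str):
        return self._codecs.get(type_name, DummyCodec())

registry = CodecRegistry()
registry.register("ForeignDate", IsoDateCodec())

codec = registry.find("ForeignDate")
assert codec.decode(codec.encode(date(2025, 11, 29))) == date(2025, 11, 29)
```

A round-trip check like the final assertion is a cheap way to confirm that every foreign type has been wired before any real data is encoded.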

Development

Build Commands

This project uses mudyla for build orchestration.

# Format code
direnv exec . mdl :fmt

# Build the compiler
direnv exec . mdl :build

# Run complete test suite
direnv exec . mdl :build :test

# Run full build pipeline (format, build, test)
direnv exec . mdl :full-build

# Run specific test suites
direnv exec . mdl :build :test-gen-regular-adt :test-cs-regular :test-scala-regular
direnv exec . mdl :build :test-gen-wrapped-adt :test-cs-wrapped :test-scala-wrapped
direnv exec . mdl :build :test-gen-manual :test-gen-compat-scala :test-gen-compat-cs :test-manual-cs :test-manual-scala

# Create distribution packages
direnv exec . mdl :build :mkdist

# Build with custom distribution paths
direnv exec . mdl --mkdist-source=./custom/path --mkdist-target=./output :build :mkdist

Setting Up the Environment

# Enter the Nix development shell
nix develop

# Or use direnv for automatic shell activation
direnv allow