What is a build system, anyway?

Published: December 13, 2025 at 02:58 PM EST

Source: Hacker News

big picture

At a high level, build systems are tools or libraries that provide a way to define and execute a series of transformations from input data to output data, memoizing the results by caching them in an object store.

Transformations are called steps or rules[1] and define how to execute a task that generates zero or more outputs from zero or more inputs.
A rule is usually the unit of caching; i.e. the cache points are the outputs of a rule, and cache invalidations must happen on the inputs of a rule.
Rules can have dependencies on previous outputs, forming a directed graph called a dependency graph.
Dependencies that form a cyclic graph are called circular dependencies and are usually banned.[2]
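As a rough illustration of why cycles are banned: a dependency graph with a cycle has no valid execution order. The sketch below uses Python's standard `graphlib` module; the rule names and the `deps` mapping are made up for the example.

```python
from graphlib import TopologicalSorter, CycleError

# Hypothetical rules: each key maps a rule to the rules it depends on.
deps = {
    "app": {"main.o", "util.o"},
    "main.o": {"main.c"},
    "util.o": {"util.c"},
    "main.c": set(),
    "util.c": set(),
}

def build_order(graph):
    """Return an order in which every rule comes after its dependencies."""
    return list(TopologicalSorter(graph).static_order())

def has_cycle(graph):
    """Report whether the graph contains a circular dependency."""
    try:
        build_order(graph)
    except CycleError:
        return True
    return False

order = build_order(deps)
cyclic = {"a": {"b"}, "b": {"a"}}  # a needs b, b needs a: no valid order
```

Here `has_cycle(cyclic)` is true, while `deps` yields an order in which `main.c` precedes `main.o`, which precedes `app`.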

Outputs that are only used by other rules, but not “interesting” to the end‑user, are called intermediate outputs.

An output is outdated, dirty, or stale if one of its dependencies is modified, or, transitively, if one of its dependencies is outdated. Stale outputs invalidate the cache and require the outputs to be rebuilt. An output that is cached and not dirty is up‑to‑date. Rules are outdated if any of their outputs are outdated. If a rule has no outputs, it is always outdated.
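The transitive part of that definition can be sketched as a small recursive check. This is only an illustration, assuming an acyclic graph; `deps` and the file names are hypothetical, and anything that never appears as a key is treated as a source input.

```python
def stale_outputs(deps, modified):
    """Return every output with a modified dependency, directly or transitively.

    deps maps each output to its direct dependencies.
    modified is the set of inputs that changed since the last build.
    """
    memo = {}

    def is_stale(node):
        if node not in deps:  # a source input, never "stale" itself
            return False
        if node not in memo:
            memo[node] = any(d in modified or is_stale(d) for d in deps[node])
        return memo[node]

    return {output for output in deps if is_stale(output)}

deps = {
    "main.o": ["main.c"],
    "util.o": ["util.c"],
    "app": ["main.o", "util.o"],
}
stale = stale_outputs(deps, modified={"util.c"})
```

With only `util.c` modified, `util.o` is stale directly and `app` is stale transitively, while `main.o` stays up-to-date.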

Each invocation of the build tool is called a build.

  • A full build or clean build occurs when the cache is empty and all transformations are executed as a batch job.
  • A cache is full if all its rules are up‑to‑date.
  • An incremental build occurs when the cache is partially full but some outputs are outdated and need to be rebuilt.
  • Deleting the cache is called cleaning.
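To make these terms concrete, here is a toy in-memory build system. It is a sketch under simplifying assumptions, not how any real tool is implemented: rules are plain functions, the "object store" is a dict, and the rule names are invented.

```python
class Build:
    def __init__(self, rules):
        self.rules = rules   # name -> (dependency names, function of dependency values)
        self.cache = {}      # name -> (input values seen last time, output)
        self.ran = []        # rules executed so far, in order

    def build(self, target, sources):
        if target in sources:            # source values are inputs, not rules
            return sources[target]
        deps, fn = self.rules[target]
        inputs = tuple(self.build(d, sources) for d in deps)
        hit = self.cache.get(target)
        if hit is not None and hit[0] == inputs:
            return hit[1]                # up-to-date: reuse the cached output
        self.ran.append(target)
        output = fn(*inputs)
        self.cache[target] = (inputs, output)
        return output

    def clean(self):
        self.cache.clear()

rules = {
    "upper": (["text"], str.upper),
    "banner": (["title"], lambda t: f"== {t} =="),
    "report": (["banner", "upper"], lambda b, u: f"{b}\n{u}"),
}

b = Build(rules)
full = b.build("report", {"title": "notes", "text": "hi"})
ran_full = list(b.ran)          # empty cache: every rule runs

b.ran.clear()
incremental = b.build("report", {"title": "log", "text": "hi"})
ran_incremental = list(b.ran)   # only rules downstream of "title" rerun

b.clean()
b.ran.clear()
from_scratch = b.build("report", {"title": "log", "text": "hi"})
```

The incremental build reruns `banner` and `report` but not `upper`, and its result matches the build done after cleaning.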

A build is correct or sound if all possible incremental builds have the same result as a full build.[3]
A build is minimal (occasionally optimal) if rules are rerun at most once per build, and only run if necessary for soundness (Build Systems à la Carte, Pluto).

In order for a build to be sound, all possible cache invalidations must be tracked as dependencies.

A build system without caching is called a task runner or batch compiler. Note that task runners often still support dependencies even if they don’t support caching. Build systems with caching can emulate a task runner by only defining tasks with zero outputs, but they are usually not designed for this use case.[4]

Examples of build systems: make, docker build, rustc.
Examples of task runners: just, shell scripts, gcc.

specifying dependencies

A build can be either inter‑process, where the task is usually a single process execution with input and output files, or intra‑process, where a task is usually a single function call with arguments and return values.

To track dependencies, either all inputs and outputs must be declared in source code ahead of time, or it must be possible to infer them from the execution of a task.

Build systems that track changes to a rule definition are called self‑tracking. Past versions of the rule are called its history (Build Systems à la Carte).
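One plausible way to get self-tracking is to fold the rule definition into the cache key, so that editing the rule invalidates its outputs just like editing an input would. The sketch below assumes a rule is defined by a command-line string and that inputs are identified by content digests; both are illustrative choices, not a description of any particular tool.

```python
import hashlib

def rule_key(command, input_digests):
    """Cache key covering the rule definition (here, its command line) and its inputs."""
    h = hashlib.sha256()
    h.update(command.encode())
    for name in sorted(input_digests):
        # NUL separators keep adjacent fields from running together.
        h.update(b"\0" + name.encode() + b"\0" + input_digests[name].encode())
    return h.hexdigest()

inputs = {"main.c": "abc123"}   # hypothetical file -> content digest
old = rule_key("cc -O0 -c main.c", inputs)
new = rule_key("cc -O2 -c main.c", inputs)
```

Changing only the compiler flag produces a different key, so the cached output for the old rule definition is no longer reused.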

The act of inferring dependencies from runtime behavior is called tracing.
If a traced rule depends on a dependency that hasn’t been built yet, the build system may either error, suspend the task and resume it later once the dependency is built, or abort the task and restart it later once the dependency is built (Build Systems à la Carte).
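For an intra-process build, tracing can be sketched by handing each task a `fetch` callback and recording what it asks for. In this toy version (all names are hypothetical), a task that needs an unbuilt dependency is effectively suspended on the Python call stack while the dependency is built; the restart strategy is not shown.

```python
class Tracer:
    def __init__(self, tasks):
        self.tasks = tasks   # name -> function taking a fetch callback
        self.values = {}     # name -> computed value
        self.traced = {}     # name -> dependencies observed at runtime

    def build(self, name):
        if name in self.values:
            return self.values[name]
        seen = []

        def fetch(dep):
            seen.append(dep)
            # The calling task waits here until the dependency is built.
            return self.build(dep)

        self.values[name] = self.tasks[name](fetch)
        self.traced[name] = seen
        return self.values[name]

tasks = {
    "a": lambda fetch: 2,
    "b": lambda fetch: 3,
    "sum": lambda fetch: fetch("a") + fetch("b"),
}
t = Tracer(tasks)
total = t.build("sum")
```

After the build, `t.traced["sum"]` lists `a` and `b` even though no dependencies were declared up front.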

Inter‑process builds often declare their inputs and outputs, and intra‑process builds often infer them, but this is not inherent to the definition.[5]

Examples of intra‑process builds: spreadsheets, the wild linker, and memoization libraries such as Python’s functools.cache.
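As a small illustration of the memoization case: with `functools.cache`, the "rule" is a function, its inputs are the call arguments, and the cached output is the return value. The `calls` list is only there to show which calls actually ran.

```python
from functools import cache

calls = []  # records every argument the function body actually ran for

@cache
def fib(n):
    calls.append(n)
    return n if n < 2 else fib(n - 1) + fib(n - 2)

result = fib(20)
runs_after_first = len(calls)
fib(20)                       # served from the cache; the body does not run
runs_after_second = len(calls)
```

Each distinct argument is computed once. Note that `functools.cache` has no notion of invalidation beyond `fib.cache_clear()`, so it only behaves like a sound build system for pure functions.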

applicative and monadic structure

A build graph is applicative if all inputs, outputs, and rules are declared ahead of time. In this case the graph is statically known. Very few build systems are purely applicative; almost all have an escape hatch.

The graph is monadic if not all outputs are known ahead of time, or if rules can generate other rules dynamically at runtime. Inputs that aren’t known ahead of time are called dynamic dependencies. Dynamic dependencies are weaker than a fully monadic build system, in the sense that they can express fewer build graphs (the capability‑tractability tradeoff).[6]

Build systems that do not require declaring build rules are always monadic.
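The difference can be sketched in a few lines. In the applicative style, dependencies are data that can be read without running anything; with dynamic dependencies, which input is needed depends on a value only available at runtime. The rule names and the `fetch` callback here are assumptions made for the example.

```python
# Applicative: dependencies are declared as data, so the graph is statically known.
applicative_rules = {
    "out": (["a", "b"], lambda a, b: a + b),
}

def static_deps(rules, target):
    """Read a rule's dependencies without executing it."""
    return rules[target][0]

# Dynamic dependency: the next input depends on a value fetched at runtime.
def monadic_out(fetch):
    if fetch("mode") == "fast":
        return fetch("a")
    return fetch("b")

def run_traced(task, store):
    """Run a task against a store of values and report which inputs it used."""
    used = []

    def fetch(name):
        used.append(name)
        return store[name]

    return task(fetch), used

fast = run_traced(monadic_out, {"mode": "fast", "a": 1, "b": 2})
slow = run_traced(monadic_out, {"mode": "slow", "a": 1, "b": 2})
```

The same task depends on `a` in one run and on `b` in the other, which is why its dependencies cannot be listed before it executes.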

Examples of monadic build systems: Shake, ninja’s dyndeps, Cargo build scripts.
Examples of applicative build systems: make (with recursive make disallowed), Bazel (excluding native rules), and map/reduce libraries with memoization (the original article links to a Unison program as an example).

early cutoff

If a dirty rule R has an outdated output, reruns, and creates a new output that matches the old one, the build system has an opportunity to avoid running later rules that depend on R. Taking advantage of that opportunity is called early cutoff.
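A minimal sketch of early cutoff, assuming each rule is cached on the values of its inputs rather than on a dirty flag. The two rules are invented for the example: one strips comment lines, the other counts the remaining lines.

```python
def strip_comments(src):
    """Drop lines that start with '#'."""
    return "\n".join(
        line for line in src.splitlines() if not line.lstrip().startswith("#")
    )

class Cutoff:
    def __init__(self):
        self.cache = {}   # rule name -> (inputs, output)
        self.ran = []     # rules that actually executed

    def run(self, name, fn, *inputs):
        hit = self.cache.get(name)
        if hit is not None and hit[0] == inputs:
            return hit[1]             # inputs unchanged: skip the rule
        self.ran.append(name)
        output = fn(*inputs)
        self.cache[name] = (inputs, output)
        return output

def build(c, source):
    code = c.run("strip", strip_comments, source)
    return c.run("count", lambda s: len(s.splitlines()), code)

c = Cutoff()
first = build(c, "# v1\nx = 1\ny = 2")
c.ran.clear()
second = build(c, "# v2\nx = 1\ny = 2")   # only the comment changed
ran_second = list(c.ran)
```

On the second build `strip` reruns because its input changed, but its output is identical, so `count` is never rerun. A system that only propagated dirtiness through the graph would have rerun both.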

See the rustc‑dev‑guide for much more information about early cutoff.[7]

rebuild detection

In unsound build systems, it’s possible that the system does not accurately detect that it needs to rebuild. Such systems sometimes offer a way to force‑rerun a target: keeping the existing cache but rerunning a single rule. For inter‑process build systems, this often involves touch‑ing a file to set its modification date to the current time.
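The modification-time trick can be sketched as follows. This mimics a make-style check in a simplified form (the function name and files are made up), and it sets timestamps explicitly with `os.utime` so the example does not depend on clock resolution.

```python
import os
import tempfile
from pathlib import Path

def is_outdated(output, inputs):
    """Timestamp check: rebuild if the output is missing or older than any input."""
    if not output.exists():
        return True
    out_mtime = output.stat().st_mtime_ns
    return any(i.stat().st_mtime_ns > out_mtime for i in inputs)

with tempfile.TemporaryDirectory() as d:
    src, out = Path(d, "in.txt"), Path(d, "out.txt")
    src.write_text("data")
    out.write_text("built")

    # Pretend the input was written at t=1s and the output at t=2s.
    os.utime(src, ns=(1_000_000_000, 1_000_000_000))
    os.utime(out, ns=(2_000_000_000, 2_000_000_000))
    before = is_outdated(out, [src])

    # Like `touch`, but with a fixed time: the input now looks newer.
    os.utime(src, ns=(3_000_000_000, 3_000_000_000))
    after = is_outdated(out, [src])

    missing = is_outdated(Path(d, "absent.txt"), [src])
```

The file contents never change, yet the rule flips from up-to-date to outdated, which is exactly what makes `touch` usable as a force-rerun.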

the executor

A build executor runs tasks and schedules them in an order that respects all dependencies, often using heuristics such as dependency depth or the time taken by the task on the last run.
It also detects whether rule inputs have been modified, making the rule outdated; this is called rebuild detection.
The executor may restart or suspend tasks in build systems that support it, provide progress reporting, and sometimes allow querying the dependency graph.
Occasionally executors trace the inputs used by a task to enforce they match the declared dependencies or to automatically add them to an internal dependency graph.
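The scheduling part of an executor might look roughly like this. The sketch uses the standard `graphlib.TopologicalSorter` to respect dependencies and, among tasks that are ready at the same time, starts the one that took longest on the previous run. The task names and durations are hypothetical, and tasks are only recorded rather than executed.

```python
from graphlib import TopologicalSorter

def schedule(deps, last_duration):
    """Return an execution order that respects deps, slowest ready task first."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    order = []
    while sorter.is_active():
        # Every task returned here has all of its dependencies finished.
        ready = sorted(sorter.get_ready(), key=lambda t: -last_duration.get(t, 0))
        for task in ready:
            order.append(task)   # a real executor would run these in parallel
            sorter.done(task)
    return order

deps = {"link": {"a.o", "b.o"}, "a.o": set(), "b.o": set()}
order = schedule(deps, last_duration={"a.o": 1.0, "b.o": 5.0})
```

Starting the slow task first is a common heuristic because it tends to shorten the overall build when tasks run in parallel.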

inter‑process builds

In the context of inter‑process builds, an artifact is an output file generated by a rule.[8]
A source file is an input file that is specific to the current project (sometimes repository or workspace) as opposed to a system dependency that is reused across multiple projects.[9]


Footnotes

  1. See footnote 14 in the original article.

  2. See footnote 5 in the original article.

  3. See footnote 1 in the original article.

  4. See footnote 7 in the original article.

  5. See footnote 8 in the original article.

  6. See footnote 15 in the original article.

  7. See footnote 9 in the original article.

  8. See footnote 6 in the original article.

  9. See footnote 11 in the original article.
