Be intentional about how AI changes your codebase
Source: Hacker News
Code should be self-documenting
How you split logic into functions and shape the data they pass around determines how well a codebase holds up over time.
Semantic Functions
Semantic functions are the building blocks of any codebase. A good semantic function should be as minimal as possible to prioritize correctness. It should take all required inputs to complete its goal and return all necessary outputs directly.
- Semantic functions can wrap other semantic functions to describe desired flows and usage.
- Side effects are generally undesirable unless they are the explicit goal, because semantic functions should be safe to reuse without needing to understand their internals.
- When logic is complicated, break the flow into a series of self‑describing semantic functions that take what they need, return the data for the next step, and do nothing else.
Examples of good semantic functions:
fn quadratic_formula(a: f64, b: f64, c: f64) -> (f64, f64) {
    // implementation …
}

fn retry_with_exponential_backoff_and_run_y_in_between(x: X, y: Y) {
    // implementation …
}
Even if these functions are never used again, future humans and agents will appreciate the clear indexing of information.
Semantic functions should not need surrounding comments; the code itself should be self‑describing. They should also be extremely unit‑testable because a good semantic function is a well‑defined one.
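To make the unit-testability point concrete, here is a Python sketch of the quadratic example above. The real-roots error handling is an assumption, not part of the original signature:

```python
import math

def quadratic_formula(a: float, b: float, c: float) -> tuple[float, float]:
    """Real roots of ax^2 + bx + c = 0; raises ValueError when none exist."""
    d = b * b - 4 * a * c
    if d < 0:
        raise ValueError("no real roots")
    sq = math.sqrt(d)
    return ((-b + sq) / (2 * a), (-b - sq) / (2 * a))

# A well-defined semantic function needs no fixtures or mocks to test:
assert quadratic_formula(1, -3, 2) == (2.0, 1.0)
```

Because the function takes everything it needs and returns everything it produces, the test is a single line with no setup.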
Pragmatic Functions
Pragmatic functions act as wrappers around a series of semantic functions and unique logic. They represent the complex processes of your codebase.
- Use them when production logic gets messy; they help organize that mess.
- They should generally appear in only a few places. If they appear in many places, consider extracting the repeated logic into semantic functions.
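The second bullet can be sketched in Python; all names here (normalize_email and the two handlers) are hypothetical:

```python
# Before: the same normalization logic copy-pasted into several handlers.
# After: one semantic function that each pragmatic handler reuses.

def normalize_email(raw: str) -> str:
    """Lowercase and strip an email address; pure, reusable, testable."""
    return raw.strip().lower()

def handle_signup(form: dict) -> dict:   # pragmatic: one call site of many
    return {"email": normalize_email(form["email"]), "source": "signup"}

def handle_invite(form: dict) -> dict:   # pragmatic: another call site
    return {"email": normalize_email(form["email"]), "source": "invite"}

assert handle_signup({"email": "  Ada@Example.COM "})["email"] == "ada@example.com"
```

Once the repeated logic lives in a semantic function, each pragmatic wrapper shrinks to the part that is genuinely unique.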
Examples:
def provision_new_workspace_for_github_repo(repo, user):
    # implementation …

def handle_user_signup_webhook():
    # implementation …
Testing pragmatic functions falls into the realm of integration testing, often done within the context of testing whole‑app functionality. Because pragmatic functions are expected to evolve dramatically over time, it’s helpful to include doc comments that note non‑obvious behavior (e.g., “fails early when balance < 10”). Treat such comments as hints, not guarantees—verify them when necessary.
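A minimal Python sketch of such a doc-comment hint; the charge_subscription function and its threshold are hypothetical:

```python
MIN_BALANCE = 10  # hypothetical billing-policy threshold

def charge_subscription(balance: float, price: float) -> float:
    """Charge a subscription against a balance.

    Note: fails early when balance < 10. Treat this comment as a hint,
    not a guarantee -- verify against the code when it matters.
    """
    if balance < MIN_BALANCE:
        raise ValueError("balance below minimum")
    return balance - price

assert charge_subscription(50.0, 15.0) == 35.0
```

The doc comment records the non-obvious early-failure path so a future reader (or agent) doesn't have to rediscover it from the inside out.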
Models
The shape of your data should make wrong states impossible.
- If a model allows a combination of fields that should never coexist, the model isn’t doing its job.
- Every optional field forces the rest of the codebase to answer “is this set?” each time it touches the data.
- Loosely typed fields invite callers to pass values that look right but aren’t.
When models enforce correctness, bugs surface at the point of construction rather than deep inside unrelated flows.
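A Python sketch of this idea, using hypothetical email-verification states. Instead of one model with mutually exclusive optional fields, each state gets its own type, so an invalid combination cannot be constructed:

```python
from dataclasses import dataclass

# Anti-pattern: a single Email model where `verification_token` and
# `verified_at` should never coexist, forcing every consumer to re-check.

# One type per state: the wrong combination of fields is unrepresentable.
@dataclass(frozen=True)
class UnverifiedEmail:
    address: str
    verification_token: str

@dataclass(frozen=True)
class VerifiedEmail:
    address: str
    verified_at: str  # ISO-8601 timestamp

def send_verification(email: UnverifiedEmail) -> str:
    # Only callable with the state that actually has a token.
    return f"verify {email.address} with {email.verification_token}"

assert "tok123" in send_verification(UnverifiedEmail("a@b.com", "tok123"))
```

A bug that would have surfaced deep inside a flow ("why is verified_at unset?") now fails at construction, where the mistake was made.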
Naming
A model’s name should be precise enough that it is obvious which fields belong to it. If the name doesn’t convey this, the model is trying to be too many things.
struct UserAndWorkspace {
    user: User,
    workspace: Workspace,
}
Good names like UnverifiedEmail, PendingInvite, and BillingAddress tell you exactly which fields belong. Seeing a phone_number field on BillingAddress signals a problem.
Branding Types
Values with identical shapes can represent completely different domain concepts. Brand types wrap primitives in distinct types so the compiler treats them as separate.
struct DocumentId(UUID);
struct MessagePointer(UUID);
With branding, accidentally swapping two IDs becomes a compile‑time error instead of a silent runtime bug.
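In Python, the same idea can be sketched with frozen dataclass wrappers: the two IDs are distinct types at runtime, and a type checker such as mypy catches the swap statically:

```python
from dataclasses import dataclass
from uuid import UUID, uuid4

@dataclass(frozen=True)
class DocumentId:
    value: UUID

@dataclass(frozen=True)
class MessagePointer:
    value: UUID

def load_document(doc_id: DocumentId) -> str:
    return f"document {doc_id.value}"

u = uuid4()
# Identical underlying shape, but the types are not interchangeable:
assert DocumentId(u) != MessagePointer(u)
# load_document(MessagePointer(u))  # flagged by a type checker
```

Python's typing.NewType offers a lighter-weight variant when only static checking is needed.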
Where Things Break
Semantic → Pragmatic Drift
Breaks often occur when a semantic function morphs into a pragmatic one for convenience, and other code that relies on it starts behaving unexpectedly. To avoid this, name functions based on where they’re used, not just what they do. Clear naming signals that the function’s internal behavior isn’t a contract to be relied upon.
Model Bloat
Models degrade similarly but more slowly. A focused model may acquire “just one more” optional field because it’s easier than creating a new model. Over time the model becomes a loose bag of half‑related data, forcing every consumer to guess which fields are set and why. When a model’s fields no longer cohere around its name, it’s a signal to split it into distinct, well‑defined models.
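A Python sketch of such a split; the account and billing names are hypothetical:

```python
from dataclasses import dataclass
from typing import Optional

# Bloated: optional fields accrete, and every consumer must guess which are set.
@dataclass
class Account:
    email: str
    card_last4: Optional[str] = None
    billing_address: Optional[str] = None

# Split: each model's fields cohere around its name, and nothing is optional.
@dataclass(frozen=True)
class AccountV2:
    email: str

@dataclass(frozen=True)
class BillingProfile:
    account_email: str
    card_last4: str
    billing_address: str

profile = BillingProfile("a@b.com", "4242", "1 Main St")
assert profile.card_last4 == "4242"
```

After the split, code that needs billing data takes a BillingProfile and never has to ask "is this set?", while code that doesn't need it stops carrying the fields around.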