The Impossible Normalization: Why Your Database Hates Biology 🧬

Published: 1 month ago (December 29, 2025 at 11:00 AM EST)

4 min read

Source: Dev.to

User Objects

We need to talk about User Objects.

If you’ve been in web dev for more than five minutes, you’ve probably built an e‑commerce backend or a todo app. The data modeling usually goes something like this:

A User exists.
A Product exists (it has a SKU, a price, and a description).
The user buys the product.

It’s clean and deterministic. If I buy a Red T‑Shirt (Size L), that object doesn’t change. It sits in a warehouse, gets shipped, and arrives as a Red T‑Shirt.

Now, imagine that Red T‑Shirt could spontaneously change its size depending on the weather, or describe itself as “sort of reddish‑pink but hurts when you touch it.”

Welcome to HealthTech.

The “Simple” Symptom Trap

Let’s say you’re building an app for patients to track their symptoms. The junior dev in us immediately thinks: “Easy. SQL table.”

CREATE TABLE symptoms (
  id SERIAL PRIMARY KEY,
  user_id INT,
  name VARCHAR(255),   -- "Headache"
  severity INT,        -- 1 to 10
  created_at TIMESTAMP
);

Ship it, right?

Wrong.
Two days later, a user writes: “My headache is a 7/10, but it’s specifically behind my left eye and it pulses when I stand up.”

Your name column just broke. You need location (left eye), quality (pulsating), and aggravating factors (standing up).

So you think, “Okay, I’ll switch to a NoSQL document store. I’ll just dump a JSON blob!”

{
  "symptom": "Headache",
  "meta": {
    "location": "Left retro-orbital",
    "quality": "Pulsating",
    "trigger": "Orthostatic"
  }
}

Great. Now try to query that across 10,000 users to find everyone with “head issues.” You can’t, because User A wrote “Headache,” User B wrote “Migraine,” and User C wrote “My head hurts.”

Enter the Ontologies: SNOMED‑CT and LOINC

To solve this, the industry invented standard terminologies. The big boss here is SNOMED‑CT, which assigns a unique code to basically every medical concept in existence. A headache isn’t “Headache”; it’s SCTID: 25064002.

This sounds like a developer’s dream (normalization!), but it turns into an implementation nightmare. Biology is a graph, not a list. A specific type of headache is a child of “Pain,” which is a child of “Clinical Finding.”

To store this properly, we usually look toward FHIR (Fast Healthcare Interoperability Resources). Here’s a simplified JSON for an observation of body temperature:

{
  "resourceType": "Observation",
  "status": "final",
  "category": [
    {
      "coding": [
        {
          "system": "http://terminology.hl7.org/CodeSystem/observation-category",
          "code": "vital-signs",
          "display": "Vital Signs"
        }
      ]
    }
  ],
  "code": {
    "coding": [
      {
        "system": "http://loinc.org",
        "code": "8310-5",
        "display": "Body temperature"
      }
    ]
  },
  "valueQuantity": {
    "value": 39.1,
    "unit": "degrees C",
    "system": "http://unitsofmeasure.org",
    "code": "Cel"
  }
  // ... timestamps, performers, device used ...
}

We went from temperature: 39.1 to a 30‑line JSON object because in biology, context is everything. A temperature of 39.1 °C taken orally is different from one taken axillary (armpit). If you don’t store the metadata, the data is medically dangerous.

The Map Is Not The Territory

The biggest philosophical hurdle we face as engineers is that human health is continuous, but databases are discrete.

When a user selects “Moderate Pain” from a dropdown, we collapse a complex, fluid biological sensation into a distinct enum, losing resolution.

I wrote more about this loss of data resolution on my personal blog—if you’re into deeper architectural theory, check out my other articles.

The challenge is creating a UI that feels human (e.g., “Where does it hurt?”) while mapping it to rigid, complex ontologies like FHIR and SNOMED without the user ever needing to know those standards.

So, What’s the Solution?

There isn’t a silver bullet, but the following patterns tend to work in the wild:

Store the Raw Intent: Keep what the user actually typed or clicked. Never discard the source of truth.
Map Later: Use an ingestion pipeline to translate “My head hurts” to SCTID: 25064002. Don’t force the user to speak SNOMED.
Hybrid Schemas: Use relational columns for high‑level data (patient ID, date) and JSONB columns for the clinical payload (the FHIR resource).

It’s messy, frustrating, and definitely harder than selling T‑shirts. But at least it’s never boring.

Anyone else suffering through HL7/FHIR integrations right now? Let’s commiserate in the comments. 😭

The Impossible Normalization: Why Your Database Hates Biology 🧬

User Objects

The “Simple” Symptom Trap

Enter the Ontologies: SNOMED‑CT and LOINC

The Map Is Not The Territory

So, What’s the Solution?

Related posts

Dual write problem in distributed systems

Academic Suite Database Design

Supabase vs Firebase: What I Chose and Why

easy-query: The Most Powerful ORM Subquery for Java