Consuming APIs from a Backend POV: Normalizing Data Across Multiple Endpoints
Source: Dev.to
Where It All Started
The first API I used was the Open Library API. It worked… but not consistently.
- Some books came back without ISBNs.
- Others had no descriptions.
- In some cases, the author data felt a bit off or incomplete.
At first, I thought:
“Maybe this is just how it is.”
But I wanted richer data, so I added a second source: Google Books API. My thinking was simple:
“If one API is missing something, the other one probably has it.”
And that part was true. What I didn’t anticipate was the new set of problems that came with it.
Where Things Started Getting Messy
Once I started consuming data from both APIs, I noticed a few things almost immediately:
- The same book showed up more than once.
- Author names were formatted differently.
- ISBNs existed in one response but not the other.
- Descriptions didn’t always match.
Same book. Different versions of the truth.
A Simplified Example
Open Library response
{
"title": "The Hobbit",
"authors": [{ "name": "J.R.R. Tolkien" }],
"isbn_10": ["0345339681"]
}
Google Books response
{
"volumeInfo": {
"title": "The Hobbit",
"authors": ["J. R. R. Tolkien"],
"industryIdentifiers": [
{ "type": "ISBN_13", "identifier": "9780345339683" }
]
}
}
Both are correct and both describe the same book. But if you store this data as‑is, you’re asking for trouble.
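To see the trouble concretely, here's a minimal sketch of the ad-hoc branching you end up writing if you store both shapes as-is (the function name is mine, purely for illustration):

def get_title(record):
    # Open Library puts the title at the top level;
    # Google Books nests it under "volumeInfo".
    # One branch per source today, three branches per source next month.
    if "volumeInfo" in record:
        return record["volumeInfo"].get("title")
    return record.get("title")

Every field you touch grows a branch like this, and every new source multiplies them.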
The Real Problem (That Took Me a While to See)
The problem wasn’t Open Library, and it certainly wasn’t Google Books. The problem was me assuming external APIs would agree with each other. They don’t.
Each API has its own structure, priorities, and idea of what “complete” data looks like. That’s when I ran into the concept that quietly fixed everything: Normalization.
So… What Is Normalization?
In the simplest terms:
Normalization is deciding what your data should look like, then forcing everything else to conform to it.
For non‑techies
- It’s cleaning and standardizing information before saving it.
- It’s making sure one book doesn’t end up with five slightly different identities.
For techies
- It’s mapping external API responses into a single internal schema.
Either way, the idea is the same:
One system. One structure. One source of truth.
Why Normalization Actually Matters
Before normalization I had:
- Duplicate books in my database.
- Inconsistent author names.
- Unreliable ISBN lookups.
After normalization I got:
- One book = one record.
- Predictable fields.
- Much cleaner logic downstream.
It’s not flashy, but it quietly saves you hours of debugging later.
Achieving Normalization
Step One: Decide What a “Book” Means to You
Before touching any API logic, I had to answer a simple question:
“What does a book look like inside my system?”
Here’s the structure I settled on:
from typing import TypedDict

class Book(TypedDict):
    title: str
    authors: list[str]
    isbn_10: str | None
    isbn_13: str | None
    description: str | None
This became my reference point. Anything coming from outside had to be reshaped to fit this.
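As written above, the schema is documentation plus static typing. If you also want it enforced at runtime, a validation layer such as Pydantic does the same job at the boundary; this is my addition, not something the setup strictly requires:

from pydantic import BaseModel

class Book(BaseModel):
    title: str
    authors: list[str]
    isbn_10: str | None = None
    isbn_13: str | None = None
    description: str | None = None

# Anything that doesn't fit fails loudly at the boundary:
# Book(**normalized_record) raises a ValidationError instead of
# letting a malformed record reach the database.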
Step Two: Normalize Each API Separately
Instead of mixing logic, I treated each API independently.
Open Library Normalization
def normalize_openlibrary(data):
    # ISBNs arrive as lists that can be missing *or* empty,
    # so guard before taking the first element.
    return {
        "title": data.get("title"),
        "authors": [a.get("name") for a in data.get("authors", [])],
        "isbn_10": (data.get("isbn_10") or [None])[0],
        "isbn_13": (data.get("isbn_13") or [None])[0],
        "description": data.get("description"),
    }
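A quick sanity check against the Open Library sample from earlier (the dict literal mirrors that response; the variable name is my own):

openlibrary_sample = {
    "title": "The Hobbit",
    "authors": [{"name": "J.R.R. Tolkien"}],
    "isbn_10": ["0345339681"],
}

print(normalize_openlibrary(openlibrary_sample))
# {'title': 'The Hobbit', 'authors': ['J.R.R. Tolkien'],
#  'isbn_10': '0345339681', 'isbn_13': None, 'description': None}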
Google Books Normalization
def normalize_googlebooks(data):
    info = data.get("volumeInfo", {})
    # Google Books buries ISBNs in a list of typed identifiers,
    # so pull each type out explicitly.
    isbn_10 = None
    isbn_13 = None
    for identifier in info.get("industryIdentifiers", []):
        if identifier.get("type") == "ISBN_10":
            isbn_10 = identifier.get("identifier")
        elif identifier.get("type") == "ISBN_13":
            isbn_13 = identifier.get("identifier")
    return {
        "title": info.get("title"),
        "authors": info.get("authors", []),
        "isbn_10": isbn_10,
        "isbn_13": isbn_13,
        "description": info.get("description"),
    }
At this point, both APIs were finally speaking the same language.
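The same check for the Google Books sample shows it landing in exactly the same shape:

googlebooks_sample = {
    "volumeInfo": {
        "title": "The Hobbit",
        "authors": ["J. R. R. Tolkien"],
        "industryIdentifiers": [
            {"type": "ISBN_13", "identifier": "9780345339683"}
        ],
    }
}

print(normalize_googlebooks(googlebooks_sample))
# {'title': 'The Hobbit', 'authors': ['J. R. R. Tolkien'],
#  'isbn_10': None, 'isbn_13': '9780345339683', 'description': None}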
Step Three: Merging Without Duplicating
Normalization gets your data into the same shape. Merging decides which data wins.
My rules were simple:
- Prefer ISBN‑13 when available.
- Use Google Books as a fallback for missing descriptions.
def merge_books(primary, fallback):
    # `or` falls through on None, "" and [] alike, which is exactly
    # the "use the other source if this one is missing it" rule.
    return {
        "title": primary["title"] or fallback["title"],
        "authors": primary["authors"] or fallback["authors"],
        "isbn_10": primary["isbn_10"] or fallback["isbn_10"],
        "isbn_13": primary["isbn_13"] or fallback["isbn_13"],
        "description": primary["description"] or fallback["description"],
    }
Nothing fancy—just clear rules.
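Running both samples through the whole pipeline (reusing the openlibrary_sample and googlebooks_sample dicts from the sanity checks above):

merged = merge_books(
    normalize_openlibrary(openlibrary_sample),
    normalize_googlebooks(googlebooks_sample),
)
# {'title': 'The Hobbit', 'authors': ['J.R.R. Tolkien'],
#  'isbn_10': '0345339681', 'isbn_13': '9780345339683', 'description': None}

One record, both ISBNs, no duplicate. Treating Open Library as primary here is my choice for the example; the function doesn't care which side wins as long as you pick an order and stick to it.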
The mental model that helped me was: APIs are raw ingredients, normalization is the recipe, and the database is the final dish. Skip the recipe, and you still get food, just not something you’d confidently serve.
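One piece I haven't shown is how "one book = one record" gets enforced at write time. My approach, a convention rather than anything either API prescribes, was to key the store on ISBN-13 when present and merge on collision:

def book_key(book):
    # ISBN-13 is the most stable identity available; fall back to a
    # normalized title for records that arrive without one.
    # (Simplified: a record missing its ISBN-13 keys by title, so
    # cross-source matching then depends on titles agreeing.)
    return book["isbn_13"] or (book["title"] or "").strip().lower()

def upsert(store, book):
    key = book_key(book)
    if key in store:
        # Seen this book before: merge instead of inserting a duplicate.
        store[key] = merge_books(store[key], book)
    else:
        store[key] = book

Here store is just a dict standing in for the database table.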
What I Took Away From This
- Never assume external APIs agree.
- Define a single source of truth early.
- Normalize each source before merging.
- Keep merging logic simple and deterministic.
- Invest in data hygiene now; it pays off later.
- APIs don't owe you consistency.
- More data sources = more responsibility.
- Normalization isn't optional once you scale.
Most importantly, I learned that backend work isn’t just about fetching data. It’s about deciding what truth looks like in your system and enforcing it.
If you’re consuming multiple APIs and things feel slightly off, normalization is probably the missing piece.
Happy building 🚀