I ship a lot of API/webhook integrations. Here’s how I make them NOT hurt in production 🔥

Published: February 27, 2026 at 08:19 AM EST

Source: Dev.to

If you do freelance backend long enough, you notice a pattern:

Clients don’t pay for “beautiful code”.
They pay for it working tomorrow.

Webhook integrations are the fastest way to get random chaos:

duplicate events
out‑of‑order delivery
retries that DDoS you
the classic “it worked yesterday 🤡”

Below is a practical checklist + a simple architecture that scales. No theory, just what works in production.

1️⃣ Assume the webhook will be duplicated. Because it will. ✅

Rule: every webhook must be idempotent.

How:

Extract event_id from the payload (or generate a hash from stable fields).
Store it with a status.
On a repeat, return 200 OK and do nothing.

Returning a 500 (or any other non‑2xx status) for a duplicate only tells the provider the delivery failed, which triggers more retries.
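A minimal sketch of the dedup step above, assuming SQLite as the store. The table name `seen_events`, the helper names, and the fallback fields (`type`, `object_id`, `occurred_at`) are all hypothetical; use whatever fields your provider keeps stable across retries.

```python
import hashlib
import json
import sqlite3

# Hypothetical table: the primary key is what makes duplicates impossible.
SCHEMA = (
    "CREATE TABLE seen_events ("
    "dedup_key TEXT PRIMARY KEY, status TEXT NOT NULL)"
)


def dedup_key(payload: dict) -> str:
    """Prefer the provider's event_id; otherwise hash stable fields."""
    if "event_id" in payload:
        return str(payload["event_id"])
    # Assumed stable fields; they must not change between retries.
    stable = {k: payload.get(k) for k in ("type", "object_id", "occurred_at")}
    blob = json.dumps(stable, sort_keys=True).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()


def record_once(conn: sqlite3.Connection, payload: dict) -> bool:
    """Return True for a new event, False for a duplicate."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO seen_events (dedup_key, status) "
        "VALUES (?, 'received')",
        (dedup_key(payload),),
    )
    conn.commit()
    # rowcount is 0 when the insert was ignored because the key exists.
    return cur.rowcount == 1
```

The handler returns 200 OK in both cases; it just skips the enqueue step when `record_once` returns False. Letting the database enforce uniqueness avoids the race you get with "SELECT first, then INSERT" when two retries arrive at the same time.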

2️⃣ Acknowledge fast. Process async. ⚡

A webhook handler that does real work inside the HTTP request is a trap.

My default flow:

Receive webhook.
Validate signature / basic checks.
Save the raw payload + metadata to the DB.
Return 200 OK immediately.
Process the event in a worker / job queue.

This keeps the endpoint calm even when the real work is slow (a heavy DB query, a third‑party API that times out): the sender gets its 200 long before its own delivery timeout.
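The flow above, sketched without any web framework. Everything here is a stand-in: `stored` plays the role of the events table, an in-process `queue.Queue` plays the role of the job queue, and signature validation is left out to keep the example short. In a real deployment you would likely want a persistent queue so jobs survive a restart.

```python
import queue
import threading

jobs: "queue.Queue[int]" = queue.Queue()   # stand-in for a real job queue
stored: dict[int, bytes] = {}              # stand-in for the events table
processed: list[int] = []                  # what the worker has handled


def handle_webhook(raw_body: bytes) -> int:
    """HTTP-layer work only: persist, enqueue, acknowledge."""
    event_pk = len(stored) + 1
    stored[event_pk] = raw_body   # save the raw payload first
    jobs.put(event_pk)            # hand off to the worker
    return 200                    # acknowledge immediately


def worker() -> None:
    """Runs outside the request; slow work here never delays the ACK."""
    while True:
        event_pk = jobs.get()
        try:
            processed.append(event_pk)   # business logic would go here
        finally:
            jobs.task_done()


threading.Thread(target=worker, daemon=True).start()
```

Calling `handle_webhook(b'{"id": 1}')` returns 200 right away; `jobs.join()` blocks until the worker has caught up, which is handy in tests.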

3️⃣ Store raw payloads. Future you will thank you 🧠

When something breaks, the client will say: “I don’t know, it just didn’t send.”
If you don’t store raw payloads, you have no evidence and no replay.

Store:

Full raw JSON payload
Relevant headers
Provider name
Received timestamp
Processing status
Error message (if failed)

Having this data lets you:

Replay events
Debug edge cases
Prove what happened

It turns “guessing” into “knowing”.
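One possible shape for that event record, again assuming SQLite. The table and column names are illustrative, not a standard. The body is stored exactly as received (not re-serialized) so it can be replayed, or re-checked against a signature, byte for byte.

```python
import json
import sqlite3
from datetime import datetime, timezone

# Hypothetical schema covering the fields listed above.
EVENTS_SCHEMA = """
CREATE TABLE webhook_events (
    id            INTEGER PRIMARY KEY,
    provider      TEXT NOT NULL,
    raw_payload   TEXT NOT NULL,
    headers       TEXT NOT NULL,
    received_at   TEXT NOT NULL,
    status        TEXT NOT NULL DEFAULT 'received',
    error_message TEXT
)
"""


def store_event(conn, provider: str, raw_body: bytes, headers: dict) -> int:
    """Persist the untouched body plus metadata; return the row id."""
    cur = conn.execute(
        "INSERT INTO webhook_events "
        "(provider, raw_payload, headers, received_at) VALUES (?, ?, ?, ?)",
        (
            provider,
            raw_body.decode("utf-8"),
            json.dumps(headers),
            datetime.now(timezone.utc).isoformat(),
        ),
    )
    conn.commit()
    return cur.lastrowid


def mark_failed(conn, event_pk: int, message: str) -> None:
    """Record the failure next to the event itself."""
    conn.execute(
        "UPDATE webhook_events SET status = 'failed', error_message = ? "
        "WHERE id = ?",
        (message, event_pk),
    )
    conn.commit()


def failed_events(conn, provider: str) -> list:
    """Replay candidates: every failed event for one provider."""
    return conn.execute(
        "SELECT id, raw_payload FROM webhook_events "
        "WHERE provider = ? AND status = 'failed' ORDER BY id",
        (provider,),
    ).fetchall()
```

Replay then becomes "select the failed rows and push their ids back onto the queue".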

4️⃣ Security: verify signatures or don’t pretend it’s secure 🔒

If the provider supports signatures, verify them right away, not “later” or “after the MVP”.
Otherwise you expose a public endpoint that can be abused for spam or worse.
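A sketch of signature verification, assuming the provider signs the raw request body with HMAC‑SHA256 and sends the hex digest in a header. That scheme is common, but it is an assumption here: providers differ in encoding, prefixes, and whether a timestamp is part of the signed string, so check the provider's docs for the exact recipe.

```python
import hashlib
import hmac


def verify_signature(secret: bytes, raw_body: bytes, signature_hex: str) -> bool:
    """Compare the received hex digest against one computed locally."""
    expected = hmac.new(secret, raw_body, hashlib.sha256).hexdigest()
    # compare_digest runs in constant time, which avoids timing leaks.
    return hmac.compare_digest(
        expected.encode("utf-8"), signature_hex.encode("utf-8")
    )
```

Verify against the raw bytes you received, before any JSON parsing; re-serializing the payload can change whitespace or key order and break the comparison.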

5️⃣ Rate limits and backoff: retries are not your enemy, your implementation is 😅

When processing fails, don’t retry instantly. Back off with increasing delays, e.g.:

1 min
5 min
30 min
2 h
dead‑letter queue (manual review)

Most integration failures are temporary (provider downtime, DB hiccups, network glitches). Backoff makes the system survive like a tank.
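The schedule above as a small lookup, sketched under the assumption that the worker tracks how many attempts have failed so far. `RETRY_DELAYS` and `next_retry_at` are hypothetical names; returning `None` is this sketch's way of saying "send it to the dead‑letter queue".

```python
from datetime import datetime, timedelta
from typing import Optional

# Delays from the list above: 1 min, 5 min, 30 min, 2 h.
RETRY_DELAYS = [
    timedelta(minutes=1),
    timedelta(minutes=5),
    timedelta(minutes=30),
    timedelta(hours=2),
]


def next_retry_at(failed_attempts: int, now: datetime) -> Optional[datetime]:
    """Return when to retry next, or None once the schedule is used up."""
    if failed_attempts < 1:
        raise ValueError("failed_attempts must be at least 1")
    if failed_attempts > len(RETRY_DELAYS):
        return None   # out of retries: dead-letter for manual review
    return now + RETRY_DELAYS[failed_attempts - 1]
```

If many events fail at once (say, during a provider outage), adding a little random jitter to each delay keeps them from all retrying at the same second.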

6️⃣ Logging that actually helps, not “we logged something” 📝

I log at two layers:

Request layer

request ID
provider
event ID
status returned

Job (worker) layer

event ID
job attempt number
result
full error stack (if any)

Extra rule: if a job fails, save a short human‑readable error on the event record itself. This lets you scan the DB later and instantly spot patterns.
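A rough sketch of the two layers using only the standard library, with one JSON object per log line so the fields stay searchable. The function names, the logger name, and the `layer` field are made up for illustration.

```python
import json
import logging
import traceback
from typing import Optional

log = logging.getLogger("webhooks")


def log_request(request_id: str, provider: str, event_id: str,
                status_returned: int) -> str:
    """Request layer: one line per incoming HTTP call."""
    line = json.dumps({
        "layer": "request",
        "request_id": request_id,
        "provider": provider,
        "event_id": event_id,
        "status_returned": status_returned,
    })
    log.info(line)
    return line


def log_job(event_id: str, attempt: int, result: str,
            exc: Optional[BaseException] = None) -> str:
    """Job layer: one line per processing attempt, with the full stack."""
    record = {"layer": "job", "event_id": event_id,
              "attempt": attempt, "result": result}
    if exc is not None:
        record["error_stack"] = "".join(
            traceback.format_exception(type(exc), exc, exc.__traceback__)
        )
    line = json.dumps(record)
    log.log(logging.ERROR if exc is not None else logging.INFO, line)
    return line


def short_error(exc: BaseException, limit: int = 200) -> str:
    """Short human-readable summary to store on the event record."""
    return f"{type(exc).__name__}: {exc}"[:limit]
```

The event ID appears in both layers on purpose: it is the key that lets you join "what we returned to the provider" with "what the worker did afterwards".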

7️⃣ Minimal scalable structure (simple but powerful)

webhook_controller → accepts HTTP, validates, stores event, returns fast
event_store        → saves raw payloads, dedup keys, statuses
processor          → business logic (“what do we do with this event?”)
adapters           → provider‑specific mapping (CRM A vs CRM B)
queue / worker     → runs processing asynchronously with retry rules

Adding a new integration is just a matter of creating a new adapter; the rest stays untouched.
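To make the adapter idea concrete, here is a small sketch. The two payload shapes for "CRM A" and "CRM B" are invented for the example, as are all class and function names; the point is only that the processor sees one normalized shape no matter who sent the event.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass
class NormalizedEvent:
    provider: str
    event_id: str
    kind: str
    data: dict


class Adapter(Protocol):
    def normalize(self, payload: dict) -> NormalizedEvent: ...


class CrmAAdapter:
    # Hypothetical shape: {"id": ..., "type": ..., "data": {...}}
    def normalize(self, payload: dict) -> NormalizedEvent:
        return NormalizedEvent(
            "crm_a", str(payload["id"]), payload["type"], payload["data"]
        )


class CrmBAdapter:
    # Hypothetical shape: {"eventId": ..., "action": ..., "record": {...}}
    def normalize(self, payload: dict) -> NormalizedEvent:
        return NormalizedEvent(
            "crm_b", str(payload["eventId"]), payload["action"], payload["record"]
        )


# Adding an integration means adding one entry here.
ADAPTERS: dict[str, Adapter] = {"crm_a": CrmAAdapter(), "crm_b": CrmBAdapter()}


def process(provider: str, payload: dict) -> str:
    """Processor: provider-agnostic logic on the normalized event."""
    event = ADAPTERS[provider].normalize(payload)
    return f"{event.kind}:{event.event_id}"
```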

Common production “gotchas” (learned the annoying way) 🤝

Out‑of‑order events

You might receive an “updated” event before a “created” one.

Solution: allow upserts, store event history, and apply an event only if it is newer than the state you already hold, so a late “created” can’t overwrite a fresher “updated”.
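A sketch of that upsert‑with‑guard logic using plain dicts in place of tables. It assumes each event carries a `version` that only increases per object (a provider timestamp can play the same role); the field names `object_id`, `version`, and `data` are hypothetical.

```python
state: dict[str, dict] = {}   # current state per object (stand-in for a table)
history: list[dict] = []      # every event ever received, in arrival order


def apply_event(event: dict) -> bool:
    """Upsert the object unless we already hold a newer or equal version.

    Returns True if the state changed, False if the event was stale.
    """
    history.append(event)     # always keep the event, even if stale
    current = state.get(event["object_id"])
    if current is not None and current["version"] >= event["version"]:
        return False          # older than what we have: do not overwrite
    # Upsert: works whether or not a "created" event arrived first.
    state[event["object_id"]] = {**event["data"], "version": event["version"]}
    return True
```

With this in place, an “updated” (version 2) that arrives before “created” (version 1) simply creates the record, and the late “created” is logged in the history but ignored.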

Provider sends partial data

Sometimes only IDs are sent and you must fetch the full details.

Solution: add a “hydration step” in the worker (API pull) and cache if needed.
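The hydration step could look roughly like this. `fetch` stands for whatever call pulls the full object from the provider's API; it is passed in as a parameter so the sketch stays provider‑neutral and easy to test. The event field `object_id` is again an assumption.

```python
from typing import Callable


def make_hydrator(fetch: Callable[[str], dict]) -> Callable[[dict], dict]:
    """Build a hydrate() function that caches API pulls per object id."""
    cache: dict[str, dict] = {}

    def hydrate(event: dict) -> dict:
        object_id = event["object_id"]
        if object_id not in cache:
            cache[object_id] = fetch(object_id)   # the API pull
        # Return a new dict so the stored raw event stays untouched.
        return {**event, "data": cache[object_id]}

    return hydrate
```

This cache never expires, which is fine for a short‑lived worker batch but can serve stale data in a long‑running process; add a TTL or skip caching if freshness matters more than API quota.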

Webhook timeouts

Processing inside the request leads to timeouts.

Solution: fast ACK, async processing (as described above).

TL;DR 🧾

To build webhook integrations that survive production:

Idempotency is mandatory.
Acknowledge fast, process asynchronously.
Store raw payloads for replay and debugging.
Verify signatures immediately.
Implement sane retry/backoff strategies.
Log with enough context to debug later.

If you’ve ever shipped webhooks in production, you already know: it’s never “done”; it’s “stable enough to survive real traffic” 😄

Drop your worst webhook horror story below 👇