Why MQTT Last Will and Testament Isn't Enough for Production IoT (And What We Built Instead)

Published: February 22, 2026 at 01:10 AM EST
2 min read
Source: Dev.to

I spent 7 years building cloud backends — but when I tried connecting real hardware (ESP32s in my home), I hit a wall:

“My device shows ‘connected’ in AWS IoT Core… but hasn’t reported data in 4 hours. Is it hung? Dead? Or just offline?”

Turns out: MQTT’s Last Will and Testament (LWT) lies to you.

The Lie: “Connected” ≠ Alive

LWT fires only when the broker detects that the connection is gone. Real devices can fail silently:

  • Wi‑Fi drops but the TCP socket stays open (NAT timeout = 5+ minutes)
  • Device freezes but doesn’t reboot (watchdog failed)
  • Sensor loop crashes while the MQTT client still reports “connected”

Result? Your dashboard shows ✅ Online while the device hasn’t sent data since yesterday.

[Image: AWS IoT Core showing a connected device with stale data]

Our Fix: Application‑Level Heartbeats + Stateful ACKs

We built a lightweight Spring Boot backend (hear‑beat) that treats telemetry as heartbeat pulses — not just data.

Device → [temp=28°C, ts=1708512000] → Backend
Backend → "ACK @ 1708512000" → Device
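
The exchange above can be sketched in Java roughly as follows. The payload format (`temp=28,ts=1708512000`), the `HeartbeatCodec` class, and its method names are assumptions made for illustration; the wire format that hear‑beat actually uses may differ.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical codec for the telemetry/ACK exchange; not taken from the hear-beat repo. */
final class HeartbeatCodec {

    /** Parses a payload such as "temp=28,ts=1708512000" into key/value pairs. */
    static Map<String, String> parse(String payload) {
        Map<String, String> fields = new HashMap<>();
        for (String pair : payload.split(",")) {
            String[] kv = pair.split("=", 2);
            if (kv.length != 2) {
                throw new IllegalArgumentException("Malformed field: " + pair);
            }
            fields.put(kv[0].trim(), kv[1].trim());
        }
        return fields;
    }

    /** Builds the ACK that echoes the device's own timestamp back to it. */
    static String ackFor(String payload) {
        String ts = parse(payload).get("ts");
        if (ts == null) {
            throw new IllegalArgumentException("Telemetry has no ts field");
        }
        // Long.parseLong rejects a non-numeric timestamp instead of echoing it blindly
        return "ACK @ " + Long.parseLong(ts);
    }
}
```

Echoing the timestamp lets the device match each ACK to the exact message it sent, so a late ACK for an older reading is not mistaken for confirmation of the newest one.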

Offline detection = missed heartbeat window

// DeviceRegistry.java
if (System.currentTimeMillis() - lastHeartbeat > OFFLINE_THRESHOLD) {
    markDeviceOffline(deviceId); // Not TCP disconnect — actual silence
}
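
For context, here is a minimal self-contained sketch of a registry built around that check. The class name `HeartbeatRegistry`, its methods, and the choice to pass the current time in as a parameter (which makes it easy to test) are assumptions, not the code from the repository.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Illustrative registry; field and method names are assumptions, not the repo's code. */
final class HeartbeatRegistry {
    private final long offlineThresholdMillis;
    private final Map<String, Long> lastHeartbeat = new ConcurrentHashMap<>();

    HeartbeatRegistry(long offlineThresholdMillis) {
        this.offlineThresholdMillis = offlineThresholdMillis;
    }

    /** Call on every telemetry message; the message itself is the heartbeat. */
    void recordHeartbeat(String deviceId, long nowMillis) {
        lastHeartbeat.put(deviceId, nowMillis);
    }

    /** A device is offline if it is unknown or has been silent longer than the threshold. */
    boolean isOffline(String deviceId, long nowMillis) {
        Long last = lastHeartbeat.get(deviceId);
        return last == null || nowMillis - last > offlineThresholdMillis;
    }

    /** Returns all known devices whose heartbeat window has lapsed. */
    List<String> findOffline(long nowMillis) {
        List<String> offline = new ArrayList<>();
        for (Map.Entry<String, Long> e : lastHeartbeat.entrySet()) {
            if (nowMillis - e.getValue() > offlineThresholdMillis) {
                offline.add(e.getKey());
            }
        }
        return offline;
    }
}
```

In a Spring Boot service, something like `findOffline(System.currentTimeMillis())` could be called from a method annotated with `@Scheduled`, so silence is detected even when no new message arrives to trigger the check.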

Command safety via ACK loop

// CommandService.java
sendCommand(deviceId, "REBOOT");
waitForAck(deviceId, 30_000); // timeout in ms. Did it *execute*? Not just "received"
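
One way the waiting side of this loop might be implemented is with a `CompletableFuture` per pending command. This is a sketch under assumptions: the `AckTracker` class, its method names, and the restriction to one pending command per device are all hypothetical simplifications.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical ACK tracker; one pending command per device for simplicity. */
final class AckTracker {
    private final Map<String, CompletableFuture<Void>> pending = new ConcurrentHashMap<>();

    /** Register before publishing the command so a fast ACK cannot be missed. */
    void expectAck(String deviceId) {
        pending.put(deviceId, new CompletableFuture<>());
    }

    /** Call from the MQTT message handler when the device confirms execution. */
    void onAck(String deviceId) {
        CompletableFuture<Void> f = pending.get(deviceId);
        if (f != null) {
            f.complete(null);
        }
    }

    /** Blocks until the ACK arrives or the timeout elapses; true means confirmed. */
    boolean waitForAck(String deviceId, long timeoutMillis) {
        CompletableFuture<Void> f = pending.get(deviceId);
        if (f == null) {
            return false; // nothing was registered for this device
        }
        try {
            f.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return false;
        } catch (TimeoutException | ExecutionException e) {
            return false; // caller decides: retry or fall back to a fail-safe
        } finally {
            pending.remove(deviceId, f);
        }
    }
}
```

The intended order is `expectAck`, then publish the command, then `waitForAck`. A `false` result is the signal to retry or to put the device into a fail‑safe state.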

REST control plane + MQTT data plane

  • Mobile apps talk REST (POST /devices/{id}/command)
  • Devices talk MQTT (iot/device/{id}/cmd)
  • Backend bridges both → clean separation
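
The mapping between the two planes can be sketched as a pair of small helpers. The REST path and topic pattern come from the list above; the `CommandRoutes` class, its method names, and the validation rules are assumptions added for illustration.

```java
/** Hypothetical helper that maps REST command requests to MQTT topics. */
final class CommandRoutes {

    /** Builds the topic from the pattern iot/device/{id}/cmd. */
    static String commandTopic(String deviceId) {
        // Reject ids that would change the topic structure: '/' is the level
        // separator, and '+' and '#' are MQTT wildcards that are not valid
        // in a topic name used for publishing.
        if (deviceId == null || deviceId.isEmpty()
                || deviceId.contains("/") || deviceId.contains("+") || deviceId.contains("#")) {
            throw new IllegalArgumentException("Invalid device id: " + deviceId);
        }
        return "iot/device/" + deviceId + "/cmd";
    }

    /** Extracts {id} from a REST path of the form /devices/{id}/command. */
    static String deviceIdFromPath(String path) {
        String[] parts = path.split("/");
        // "/devices/abc/command" splits into ["", "devices", "abc", "command"]
        if (parts.length != 4 || !parts[0].isEmpty()
                || !parts[1].equals("devices") || !parts[3].equals("command")) {
            throw new IllegalArgumentException("Unexpected path: " + path);
        }
        return parts[2];
    }
}
```

Validating the id at the boundary keeps a malformed REST request from publishing to a topic that belongs to a different device.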

Why This Matters for Real Deployments

| Scenario | LWT Says | Our Heartbeat Says |
|---|---|---|
| Device froze (no reboot) | ✅ Connected | ❌ Offline (no heartbeat in 90 s) |
| Wi‑Fi dropped in a rural field | ✅ Connected (TCP alive) | ❌ Offline (no data in 2 min) |
| Command sent but device crashed mid‑execution | ✅ Command delivered | ❌ No ACK → retry/fail‑safe |

This isn’t theory. I run this for my home sensors, and every day it catches failures that LWT misses.

Try It Yourself

git clone https://github.com/AnilSaithana/hear-beat
cd hear-beat
docker-compose up   # Runs Spring Boot + MQTT broker

ESP32 firmware example is included in the /firmware folder.

I built this because production IoT fails in the gaps between cloud and hardware.
