From Spaghetti Code to the Lazarus Protocol

Published: February 4, 2026 at 11:18 AM EST
4 min read
Source: Dev.to

I didn’t start by wanting to build software. I started by needing reliable systems where the internet fails.

This is a technical post‑mortem on building a production offline‑first engineering app under real‑world constraints: unreliable networks, low‑end devices, strict budgets, and legal responsibility.

Why Offline‑First Was Non‑Negotiable

K‑First started with a simple observation:

Construction sites don’t have reliable internet, but engineers still need reliable data.

Most apps assume:

  1. network first
  2. cache later
  3. sync as an optimization

That model breaks down when:

  • connectivity drops mid‑form
  • background processes are killed
  • devices reboot unexpectedly
  • data loss has legal or financial consequences

So the core constraint from day one was clear:

The app must function correctly even if the network never comes back.

Offline‑first wasn’t a feature. It was the architecture.

V1: The “Zero‑Burn” Monolith (December 2025)

The first version of K‑First had one brutal constraint: Zero burn.

  • No servers
  • No paid infrastructure
  • No background sync unless a user explicitly paid

The Initial Architecture (and the Trap)

To move fast, we built a single large controller—effectively a God Class—that handled:

  • navigation
  • state
  • database access
  • UI orchestration

All project data was eagerly loaded into memory at app startup.
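
For illustration, here is roughly the shape of that trap in Dart (class, table, and method names are hypothetical, not the actual K‑First code): one controller owns everything and pulls the whole dataset into memory before the first frame is drawn.

```dart
import 'package:sqflite/sqflite.dart';

class Project {
  final int id;
  final String name;
  const Project(this.id, this.name);

  factory Project.fromMap(Map<String, Object?> row) =>
      Project(row['id'] as int, row['name'] as String);
}

// The "God Class" shape: navigation, state, and storage all live here.
class AppController {
  final List<Project> allProjects = [];

  Future<void> bootstrap(Database db) async {
    // Eager load at startup: fine with ten projects, hostile to a
    // low-end phone holding a thousand of them.
    final rows = await db.query('projects');
    allProjects.addAll(rows.map(Project.fromMap));
  }
}
```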

On paper it worked. In reality it created three serious problems.

1. Memory Pressure

Large projects meant large in‑memory state. Low‑end Android devices didn’t appreciate that.

2. “Passive Builder” Navigation

Forms returned data using Navigator.pop(result). This failed when:

  • Android killed background activities
  • Users deep‑linked back into the app
  • The process restarted mid‑flow
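
A minimal sketch of the fragile pattern, with a hypothetical LogFormScreen and saveLogEntry standing in for the real ones: persistence only happens if the awaited pop ever completes.

```dart
import 'package:flutter/material.dart';

// Hypothetical stand-ins for the real form screen and persistence call.
class LogFormScreen extends StatelessWidget {
  const LogFormScreen({super.key});

  @override
  Widget build(BuildContext context) => const Placeholder();
}

Future<void> saveLogEntry(Map<String, Object?> entry) async {}

// The V1 pattern, simplified: the caller awaits whatever the form pops.
// If Android kills the background activity, the user deep-links back in,
// or the process restarts mid-flow, the result never arrives and nothing
// is persisted.
Future<void> openLogForm(BuildContext context) async {
  final result = await Navigator.push<Map<String, Object?>>(
    context,
    MaterialPageRoute<Map<String, Object?>>(
      builder: (_) => const LogFormScreen(),
    ),
  );
  if (result != null) {
    await saveLogEntry(result); // Persistence happens only if the pop survives.
  }
}
```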

3. The “Ghost Data” Bug

Users would save logs… and later discover they never actually persisted.
That’s unacceptable for an engineering logbook.

By late December the monolith was already collapsing under real usage.

Phase 2: The Lazarus Refactor (January 2026)

We stopped feature work and initiated what we internally called Operation Clean House.

The goal wasn’t elegance. The goal was survivability.

Breaking the God Class

We migrated to a strict MVVM structure:

Layer      | Responsibility
Repository | Persistence
ViewModel  | State & lifecycle safety
UI         | Pure rendering

  • State stopped flowing through navigation.
  • Navigation stopped being a data transport.
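
A minimal sketch of that layering in Dart (illustrative names, not the actual K‑First classes): the repository owns persistence, the view model owns state and lifecycle safety, and widgets only listen and render.

```dart
import 'package:flutter/foundation.dart';

class LogEntry {
  final String id;
  final String note;
  const LogEntry(this.id, this.note);
}

// Repository layer: persistence only.
abstract class LogRepository {
  Future<List<LogEntry>> fetchAll();
  Future<void> save(LogEntry entry);
}

// ViewModel layer: state, exposed to the UI via change notifications.
class LogViewModel extends ChangeNotifier {
  LogViewModel(this._repo);

  final LogRepository _repo;
  List<LogEntry> entries = const [];
  bool saving = false;

  Future<void> load() async {
    entries = await _repo.fetchAll();
    notifyListeners();
  }

  Future<void> addEntry(LogEntry entry) async {
    saving = true;
    notifyListeners();
    await _repo.save(entry);          // Persist first...
    entries = await _repo.fetchAll(); // ...then rebuild state from storage.
    saving = false;
    notifyListeners();
  }
}
```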

The Lazarus Protocol: Self‑Healing Local Storage

The biggest risk in an offline‑first app is silent database corruption.

Typical causes:

  • Crashes
  • Battery pulls
  • OEM‑specific quirks

The protocol works like this:

  1. Every critical database write is checkpointed.
  2. On startup, the app validates the SQLite file.
  3. If corruption is detected:
    • The database is quarantined.
    • The last known‑good backup is restored.
    • The user is informed, but never left with an empty app.

Principle: A partially correct logbook is better than a wiped one.
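
A minimal sketch of that startup check, using plain sqflite for brevity where K‑First actually uses SQLCipher; the paths and backup policy here are illustrative, not the shipped implementation.

```dart
import 'dart:io';

import 'package:sqflite/sqflite.dart';

Future<Database> openWithLazarus(String dbPath) async {
  final backupPath = '$dbPath.bak';
  try {
    final db = await openDatabase(dbPath);

    // A healthy SQLite file answers PRAGMA integrity_check with a single 'ok'.
    final check = await db.rawQuery('PRAGMA integrity_check');
    if (check.isNotEmpty && check.first.values.first == 'ok') {
      // Refresh the known-good checkpoint. (A production version would
      // checkpoint the WAL or close the database before copying.)
      await File(dbPath).copy(backupPath);
      return db;
    }

    await db.close();
    throw const FormatException('integrity_check failed');
  } catch (_) {
    // Quarantine the damaged file instead of deleting it...
    final quarantinePath =
        '$dbPath.corrupt.${DateTime.now().millisecondsSinceEpoch}';
    if (await File(dbPath).exists()) {
      await File(dbPath).rename(quarantinePath);
    }
    // ...restore the last known-good backup if one exists...
    if (await File(backupPath).exists()) {
      await File(backupPath).copy(dbPath);
    }
    // ...and reopen. The caller tells the user what happened; they are
    // never dropped into an empty app.
    return openDatabase(dbPath);
  }
}
```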

Android 15 and the 16 KB Page‑Size Wall

In January 2026 Google Play rejected our builds. The reason had nothing to do with Flutter.

  • Android 15 added support for 16 KB memory page sizes, and Google Play began requiring native libraries to be compatible with it.
  • Our encryption stack (SQLCipher) shipped native binaries that were not.

The Fix

  • Forced upgrade of sqflite_sqlcipher.
  • Migration safety nets to prevent existing users from being locked out.
  • Careful handling of encrypted database headers.
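
A sketch of the safety-net idea, assuming sqflite_sqlcipher's openDatabase with a password parameter (the recovery path is a placeholder): after a native-library upgrade, an open failure is treated as "investigate and recover", never as "wipe and start fresh".

```dart
import 'package:sqflite_sqlcipher/sqflite.dart';

Future<Database> openEncrypted(String path, String passphrase) async {
  try {
    return await openDatabase(path, password: passphrase);
  } catch (e) {
    // Right after the upgrade, a failure here is more likely a packaging or
    // header issue than a bad passphrase. Keep the file intact and hand off
    // to a recovery flow instead of reinitialising the database.
    throw StateError('Encrypted DB failed to open after upgrade: $e');
  }
}
```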

This reinforced a painful truth: Mobile platforms are not stable targets; they are moving ground. If you build offline‑first, you inherit that responsibility.

The Samsung “Zombie Key” Incident (S23 / S24)

Symptom

Samsung users updated the app and saw:

“0 Projects”

No crash, no error—just an empty state.

Root Cause

Samsung’s hardware‑backed keystore (Knox) is sometimes not ready during cold start.

Our app:

  1. Requested the encryption key.
  2. Received a new key.
  3. Attempted to open the existing database.
  4. Failed silently.

We called this a Zombie Key:

  • Valid (it exists)
  • Real (it’s the correct type)
  • Completely wrong (it doesn’t match the stored DB)

Fix: The Samsung Patience Protocol

Instead of assuming the keystore is instantly available, we implemented:

  • Retry loops with exponential back‑off (up to ~7.5 seconds).
  • Only after exhausting retries do we treat the app as a fresh install.
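
A minimal sketch of that retry loop; the callback stands in for whatever asks the hardware-backed keystore for the existing key (returning null while it can't find one), and the exact delays in the shipped app may differ. 0.5 s + 1 s + 2 s + 4 s gives the ~7.5 s budget.

```dart
Future<String?> readKeyWithPatience(
    Future<String?> Function() readStoredKey) async {
  const delays = [
    Duration(milliseconds: 500),
    Duration(seconds: 1),
    Duration(seconds: 2),
    Duration(seconds: 4),
  ];
  for (final delay in delays) {
    final key = await readStoredKey();
    if (key != null) return key;       // Keystore answered: use the existing key.
    await Future<void>.delayed(delay); // Knox not ready yet: back off and retry.
  }
  // One last attempt after the final back-off. Only if this also returns null
  // do we treat the device as a genuinely fresh install.
  return readStoredKey();
}
```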

Lesson learned: Never assume hardware security modules wake up on time.

The Split‑Brain Problem & the Unified Core Decision

Original Plan

  • sqflite for free users (offline only)
  • PowerSync for paid users (sync enabled)

It looked clever, but it became a maintenance nightmare.

Two engines meant:

  • Double migrations
  • Double testing
  • Double failure modes

The Pivot

We chose a Unified Core:

  • PowerSync everywhere (the sync engine).
  • SQLite as the single source of truth.
  • Sync toggled by capability, not by architecture.
  • Free users run PowerSync in offline‑only mode.
  • Paid users simply enable connect().

This eliminated an entire class of future migrations.
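
A minimal sketch of the unified core, assuming the PowerSync Dart SDK's PowerSyncDatabase, initialize(), and connect() API; the schema, path, and backend connector are placeholders supplied by the caller.

```dart
import 'package:powersync/powersync.dart';

Future<PowerSyncDatabase> openCore({
  required Schema schema,
  required String path,
  required bool syncEnabled,
  required PowerSyncBackendConnector connector,
}) async {
  // Every tier gets the same engine and the same local SQLite file.
  final db = PowerSyncDatabase(schema: schema, path: path);
  await db.initialize();

  if (syncEnabled) {
    // Paid capability: attach the backend connector and start syncing.
    db.connect(connector: connector);
  }
  // Free tier: same database, offline-only. There is no second code path.
  return db;
}
```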

Engineering Ethics: Trust Over Features

During compliance review we identified calculators whose results depended on:

  • Subjective land values
  • Inconsistent local rules

We removed them—not because we couldn’t implement them—but because shipping legally risky math is irresponsible.

We also added:

  • Global disclaimers
  • Consent‑gated analytics (India's DPDP Act); the gating is sketched below
  • Hard kill switches where confidence was insufficient
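
A minimal sketch of that consent gate, assuming shared_preferences for the stored flag and using Crashlytics to stand in for any telemetry that needs gating; the key name and flow are hypothetical. Nothing is collected until the stored flag is explicitly true.

```dart
import 'package:firebase_crashlytics/firebase_crashlytics.dart';
import 'package:shared_preferences/shared_preferences.dart';

const String kTelemetryConsentKey = 'telemetry_consent_v1';

// Call early at startup, before any telemetry is initialised.
Future<void> applyStoredConsent() async {
  final prefs = await SharedPreferences.getInstance();
  final consented = prefs.getBool(kTelemetryConsentKey) ?? false; // Opted out by default.
  await FirebaseCrashlytics.instance.setCrashlyticsCollectionEnabled(consented);
}

// Call from the consent dialog when the user makes an explicit choice.
Future<void> recordConsent(bool consented) async {
  final prefs = await SharedPreferences.getInstance();
  await prefs.setBool(kTelemetryConsentKey, consented);
  await FirebaseCrashlytics.instance.setCrashlyticsCollectionEnabled(consented);
}
```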

Engineering responsibility doesn’t end at correctness. It includes consequences.

The Final Stack (Early 2026)

Mobile

  • Flutter (Dart)
  • MVVM architecture
  • SQLite + SQLCipher
  • Argon2id key derivation
  • Firebase (Crashlytics, Auth)

Web

  • Astro (SSG, zero‑JS default)
  • Tailwind CSS
  • React islands (calculators only)
  • Motion One (mechanical animations)
  • Vercel (CI/CD)
  • Consent‑first analytics loading

What This Journey Taught Me

  • Offline‑first changes everything.
  • Architecture must be the first feature, not an afterthought.
  • Platform volatility demands defensive programming.
  • Ethical engineering is non‑negotiable.

The road was rough, but the result is a resilient, trustworthy tool that engineers can rely on—even when the network can’t.

Core Principles

  • Storage is not an optimization — it is the product
  • Hardware is unpredictable
    • Especially when security modules are involved
  • Architecture debt compounds faster than feature debt
  • Removing features can be a sign of maturity
  • Trust is the most expensive thing to lose — and the hardest to earn back

Closing

This architecture now powers K‑First, an offline‑first engineering logbook built for real site conditions where reliability matters more than polish.

If you’re building tools for the physical world:
Assume failure first — and design so your users never pay for it.
