From Spaghetti Code to the Lazarus Protocol
Source: Dev.to
I didn’t start by wanting to build software. I started by needing reliable systems where the internet fails.
This is a technical post‑mortem on building a production offline‑first engineering app under real‑world constraints: unreliable networks, low‑end devices, strict budgets, and legal responsibility.
Why Offline‑First Was Non‑Negotiable
K‑First started with a simple observation:
Construction sites don’t have reliable internet, but engineers still need reliable data.
Most apps assume:
- network first
- cache later
- sync as an optimization
That model breaks down when:
- connectivity drops mid‑form
- background processes are killed
- devices reboot unexpectedly
- data loss has legal or financial consequences
So the core constraint from day one was clear:
The app must function correctly even if the network never comes back.
Offline‑first wasn’t a feature. It was the architecture.
V1: The “Zero‑Burn” Monolith (December 2025)
The first version of K‑First had one brutal constraint: Zero burn.
- No servers
- No paid infrastructure
- No background sync unless a user explicitly paid
The Initial Architecture (and the Trap)
To move fast, we built a single large controller—effectively a God Class—that handled:
- navigation
- state
- database access
- UI orchestration
All project data was eagerly loaded into memory at app startup.
On paper it worked. In reality it created three serious problems.
1. Memory Pressure
Large projects meant large in‑memory state. Low‑end Android devices didn’t appreciate that.
2. “Passive Builder” Navigation
Forms returned data using Navigator.pop(result). This failed when:
- Android killed background activities
- Users deep‑linked back into the app
- The process restarted mid‑flow
3. The “Ghost Data” Bug
Users would save logs… and later discover they never actually persisted.
That’s unacceptable for an engineering logbook.
By late December the monolith was already collapsing under real usage.
Phase 2: The Lazarus Refactor (January 2026)
We stopped feature work and initiated what we internally called Operation Clean House.
The goal wasn’t elegance. The goal was survivability.
Breaking the God Class
We migrated to a strict MVVM structure:
| Layer | Responsibility |
|---|---|
| Repository | Persistence |
| ViewModel | State & lifecycle safety |
| UI | Pure rendering |
- State stopped flowing through navigation.
- Navigation stopped being a data transport.
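A minimal sketch of that separation, written in Python with the standard `sqlite3` module rather than the app's Dart code. The class names, table, and columns are hypothetical. It illustrates one idea: the form writes through the repository and hands nothing back through navigation, and the list screen re-reads from storage, so a process killed between the two screens loses nothing.

```python
import sqlite3


class LogRepository:
    """Persistence layer: the only code that touches the database."""

    def __init__(self, path: str) -> None:
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS logs ("
            "id INTEGER PRIMARY KEY, project_id INTEGER NOT NULL, note TEXT NOT NULL)"
        )
        self._db.commit()

    def add(self, project_id: int, note: str) -> int:
        # The connection context manager commits on success, rolls back on error.
        with self._db:
            cur = self._db.execute(
                "INSERT INTO logs (project_id, note) VALUES (?, ?)",
                (project_id, note),
            )
        return cur.lastrowid

    def recent(self, project_id: int, limit: int = 50) -> list:
        # Loads one bounded page on demand instead of everything at startup.
        rows = self._db.execute(
            "SELECT note FROM logs WHERE project_id = ? ORDER BY id DESC LIMIT ?",
            (project_id, limit),
        )
        return [note for (note,) in rows]


class LogListViewModel:
    """State layer: holds what the UI renders and reloads it from storage."""

    def __init__(self, repo: LogRepository, project_id: int) -> None:
        self._repo = repo
        self._project_id = project_id
        self.notes: list = []

    def save(self, note: str) -> None:
        self._repo.add(self._project_id, note)  # committed before we return
        self.refresh()

    def refresh(self) -> None:
        self.notes = self._repo.recent(self._project_id)
```

After a restart, a fresh view model pointed at the same file sees the same notes after `refresh()`, which is the property the `Navigator.pop(result)` flow could not guarantee.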
The Lazarus Protocol: Self‑Healing Local Storage
The biggest risk in an offline‑first app is silent database corruption.
Typical causes:
- Crashes
- Battery pulls
- OEM‑specific quirks
Lazarus Protocol
- Every critical database write is checkpointed.
- On startup, the app validates the SQLite file.
- If corruption is detected:
  - The database is quarantined.
  - The last known‑good backup is restored.
  - The user is informed, but never left with an empty app.
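The steps above can be sketched roughly as follows. This is an illustration in Python's standard `sqlite3` module, not the app's Dart code. It assumes that "checkpointed" means refreshing a known-good backup copy and that validation means SQLite's `PRAGMA integrity_check`. The function names and the quarantine naming scheme are hypothetical.

```python
import os
import shutil
import sqlite3
import time

SIDECARS = ("", "-journal", "-wal", "-shm")  # main file plus SQLite side files


def is_healthy(path: str) -> bool:
    """True if SQLite can read the file and integrity_check reports ok."""
    try:
        db = sqlite3.connect(path)
        try:
            rows = db.execute("PRAGMA integrity_check").fetchall()
        finally:
            db.close()
    except sqlite3.DatabaseError:
        return False
    return rows == [("ok",)]


def checkpoint(path: str, backup_path: str) -> None:
    """Refresh the backup through SQLite's online backup API."""
    tmp = backup_path + ".tmp"
    if os.path.exists(tmp):
        os.remove(tmp)
    src, dst = sqlite3.connect(path), sqlite3.connect(tmp)
    try:
        src.backup(dst)
    finally:
        dst.close()
        src.close()
    os.replace(tmp, backup_path)  # swap in the finished copy in one step


def open_or_recover(path: str, backup_path: str) -> str:
    """Return 'ok', 'restored', or 'empty' so the UI can inform the user."""
    if not os.path.exists(path) or is_healthy(path):
        return "ok"
    stamp = int(time.time())
    for suffix in SIDECARS:
        # Quarantine instead of deleting: the damaged file may still be salvageable.
        if os.path.exists(path + suffix):
            os.replace(path + suffix, f"{path}.corrupt-{stamp}{suffix}")
    if os.path.exists(backup_path) and is_healthy(backup_path):
        shutil.copyfile(backup_path, path)
        return "restored"
    return "empty"
```

Moving the journal and WAL side files along with the main file matters: a stale journal left next to a restored database could otherwise be replayed against it.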
Principle: A partially correct logbook is better than a wiped one.
Android 15 and the 16 KB Page‑Size Wall
In January 2026 Google Play rejected our builds. The reason had nothing to do with Flutter.
- Android 15 added support for devices with 16 KB memory pages, and Google Play now requires native libraries in apps targeting Android 15 or later to be 16 KB compatible.
- Our encryption stack (SQLCipher) was incompatible.
The Fix
- Forced upgrade of sqflite_sqlcipher.
- Migration safety nets to prevent existing users from being locked out.
- Careful handling of encrypted database headers.
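The post does not say how the incompatibility was diagnosed. One way to check a bundled native library, offered here as an assumption rather than the author's method, is to read the alignment of its ELF `LOAD` segments: for a 64-bit `.so`, each one needs an alignment that is a multiple of 16 KB. The sketch below handles only 64-bit little-endian ELF files (the arm64 and x86_64 case), and its function names are hypothetical.

```python
import struct

PT_LOAD = 1
PAGE_16K = 16 * 1024


def load_alignments(elf: bytes) -> list:
    """Return p_align for every LOAD segment of a 64-bit little-endian ELF."""
    if elf[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if elf[4] != 2 or elf[5] != 1:
        raise ValueError("this sketch handles only 64-bit little-endian ELF")
    # Fields after the 16-byte e_ident; only the program-header fields are used.
    (_type, _machine, _version, _entry, phoff, _shoff, _flags,
     _ehsize, phentsize, phnum, _shentsize, _shnum, _shstrndx) = struct.unpack_from(
        "<HHIQQQIHHHHHH", elf, 16
    )
    alignments = []
    for i in range(phnum):
        (p_type, _p_flags, _offset, _vaddr, _paddr,
         _filesz, _memsz, p_align) = struct.unpack_from(
            "<IIQQQQQQ", elf, phoff + i * phentsize
        )
        if p_type == PT_LOAD:
            alignments.append(p_align)
    return alignments


def is_16k_compatible(elf: bytes) -> bool:
    """True when every LOAD segment is aligned to a multiple of 16 KB."""
    alignments = load_alignments(elf)
    return bool(alignments) and all(
        a > 0 and a % PAGE_16K == 0 for a in alignments
    )
```

Running `is_16k_compatible(open(path, "rb").read())` over each `.so` in a build would point at the library that needs a newer release.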
This reinforced a painful truth: Mobile platforms are not stable targets; they are moving ground. If you build offline‑first, you inherit that responsibility.
The Samsung “Zombie Key” Incident (S23 / S24)
Symptom
Samsung users updated the app and saw:
“0 Projects”
No crash, no error—just an empty state.
Root Cause
Samsung’s hardware‑backed keystore (Knox) is sometimes not ready during cold start.
Our app:
- Requested the encryption key.
- Received a new key.
- Attempted to open the existing database.
- Failed silently.
We called this a Zombie Key:
- Valid (it exists)
- Real (it’s the correct type)
- Completely wrong (it doesn’t match the stored DB)
Fix: The Samsung Patience Protocol
Instead of assuming storage is instant, we implemented:
- Retry loops with exponential back‑off (up to ~7.5 seconds).
- Only after exhausting retries do we treat the app as a fresh install.
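A minimal sketch of such a retry loop, in Python rather than the app's Dart. The delay schedule is an assumption chosen so the waits add up to 7.5 seconds (0.5 + 1 + 2 + 4); the post gives only the total. `read_key` stands in for whatever call asks secure storage for the database key and is hypothetical.

```python
import time
from typing import Callable, Optional, Sequence

# Assumed schedule: each wait doubles, 7.5 seconds of waiting in total.
BACKOFF_SECONDS = (0.5, 1.0, 2.0, 4.0)


def read_key_patiently(
    read_key: Callable[[], Optional[bytes]],
    delays: Sequence[float] = BACKOFF_SECONDS,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[bytes]:
    """Return the stored key, or None only after every retry came back empty."""
    key = read_key()
    for delay in delays:
        if key is not None:
            break
        sleep(delay)  # give the keystore time to finish waking up
        key = read_key()
    return key
```

The caller generates a new key only when this returns `None`. Generating one on the first empty read is what produced the Zombie Key.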
Lesson learned: Never assume hardware security modules wake up on time.
The Split‑Brain Problem & the Unified Core Decision
Original Plan
- sqflite for free users (offline only)
- PowerSync for paid users (sync enabled)
It looked clever, but it became a maintenance nightmare.
Two engines meant:
- Double migrations
- Double testing
- Double failure modes
The Pivot
We chose a Unified Core:
- PowerSync everywhere (the sync engine).
- SQLite as the single source of truth.
- Sync toggled by capability, not by architecture.
- Free users run PowerSync in offline‑only mode.
- Paid users simply enable connect().
This eliminated an entire class of future migrations.
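To make "capability, not architecture" concrete, here is a small Python sketch. It is a hypothetical stand-in, not the PowerSync API: `UnifiedStore`, its `synced` column, and the `uploader` callback are all invented for illustration. Every user writes to the same local SQLite schema, and calling `connect()` only adds an upload step on top.

```python
import sqlite3
from typing import Callable, Optional


class UnifiedStore:
    """One local database for every user; sync is an optional capability."""

    def __init__(self, path: str) -> None:
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS logs ("
            "id INTEGER PRIMARY KEY, note TEXT NOT NULL, "
            "synced INTEGER NOT NULL DEFAULT 0)"
        )
        self._db.commit()
        self._uploader: Optional[Callable[[str], None]] = None

    def add_log(self, note: str) -> None:
        # The local write always happens first, connected or not.
        with self._db:
            self._db.execute("INSERT INTO logs (note) VALUES (?)", (note,))
        self._flush()

    def connect(self, uploader: Callable[[str], None]) -> None:
        """Enable sync; rows written while offline are uploaded now."""
        self._uploader = uploader
        self._flush()

    def pending(self) -> int:
        row = self._db.execute("SELECT COUNT(*) FROM logs WHERE synced = 0").fetchone()
        return row[0]

    def _flush(self) -> None:
        if self._uploader is None:
            return  # offline-only mode: nothing leaves the device
        rows = self._db.execute(
            "SELECT id, note FROM logs WHERE synced = 0 ORDER BY id"
        ).fetchall()
        for row_id, note in rows:
            self._uploader(note)
            with self._db:
                self._db.execute("UPDATE logs SET synced = 1 WHERE id = ?", (row_id,))
```

A free user who later upgrades keeps the same file and the same schema, so there is no engine-to-engine migration to write or test.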
Engineering Ethics: Trust Over Features
During compliance review we identified calculators whose results depended on:
- Subjective land values
- Inconsistent local rules
We removed them—not because we couldn’t implement them—but because shipping legally risky math is irresponsible.
We also added:
- Global disclaimers
- Consent‑gated analytics (DPDP Act)
- Hard kill switches where confidence was insufficient
Engineering responsibility doesn’t end at correctness. It includes consequences.
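Both safeguards reduce to a check that runs before any work is done. A minimal Python sketch, with hypothetical names throughout (`Analytics`, `DISABLED_CALCULATORS`, `run_calculator`, and the calculator identifiers are not from the app):

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Analytics:
    """Drops every event until the user has explicitly opted in."""

    consent_given: bool = False
    sent: list = field(default_factory=list)

    def track(self, event: str) -> bool:
        if not self.consent_given:
            return False  # dropped, not queued for later
        self.sent.append(event)
        return True


# Hypothetical kill-switch set: calculators listed here refuse to run.
DISABLED_CALCULATORS = {"land_valuation"}


def run_calculator(name: str, compute: Callable[[], float]) -> float:
    if name in DISABLED_CALCULATORS:
        raise PermissionError(f"{name} is disabled")
    return compute()
```

Dropping events rather than queuing them means nothing recorded before consent can be sent after it.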
The Final Stack (Early 2026)
Mobile
- Flutter (Dart)
- MVVM architecture
- SQLite + SQLCipher
- Argon2id key derivation
- Firebase (Crashlytics, Auth)
Web
- Astro (SSG, zero‑JS default)
- Tailwind CSS
- React islands (calculators only)
- Motion One (mechanical animations)
- Vercel (CI/CD)
- Consent‑first analytics loading
What This Journey Taught Me
- Offline‑first changes everything.
- Architecture must be the first feature, not an afterthought.
- Platform volatility demands defensive programming.
- Ethical engineering is non‑negotiable.
The road was rough, but the result is a resilient, trustworthy tool that engineers can rely on—even when the network can’t.
Core Principles
- Storage is not an optimization — it is the product
- Hardware is unpredictable, especially when security modules are involved
- Architecture debt compounds faster than feature debt
- Removing features can be a sign of maturity
- Trust is the most expensive thing to lose — and the hardest to earn back
Closing
This architecture now powers K‑First, an offline‑first engineering logbook built for real site conditions where reliability matters more than polish.
If you’re building tools for the physical world:
Assume failure first — and design so your users never pay for it.