AI memory is broken. We built one that forgets.

Published: 1 month ago (April 4, 2026 at 09:35 PM EDT)

3 min read

Source: Dev.to

What this actually looks like

Week 1: You tell the agent “we’re using React for the frontend.”
Week 2: You switch. “Moving to Svelte, React bundle is too big.”
Week 4: You ask “what’s our frontend stack?”

A normal retrieval system hands back both answers. React and Svelte sit side by side with equal weight. Nothing in the system knows one replaced the other, so the agent might reference React, Svelte, or some confused mix of both.

We kept running into this while building agent tooling, and it became clear the issue isn’t retrieval quality — it’s that these systems have no concept of time or obsolescence.
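
To make the failure concrete, here is a toy keyword retriever. It is a stand-in for "a normal retrieval system", not any particular product, and the `Memory` class and `naive_search` function are names made up for this illustration. Both statements match the query and nothing marks the first one as superseded.

```python
from dataclasses import dataclass


@dataclass
class Memory:
    week: int
    text: str


def naive_search(store, query):
    """Return every memory sharing a word with the query, in insertion order.

    There is no notion of time, correction, or obsolescence here.
    """
    terms = set(query.lower().split())
    return [m for m in store if terms & set(m.text.lower().split())]


store = [
    Memory(1, "frontend framework is React"),
    Memory(2, "frontend moving to Svelte because the React bundle is too big"),
]

hits = naive_search(store, "frontend stack")
for m in hits:
    print(f"week {m.week}: {m.text}")
```

Both entries come back, and the stale one is listed first simply because it was stored first.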

The numbers

We ran a 4‑week simulated project through two systems, a naive retrieval baseline and Sparsion, our memory runtime. The run had 24 events in total (decisions, corrections, errors, repeated observations) and two major direction changes mid‑project.

Metric	Naive	Sparsion
Top result correct	No	Yes
Pruned stale memories	0	2
Retrievable at week 4	24	22

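Naive retrieval puts a stale entry on top. Sparsion ranks the correction first, with salience 1.65 against 0.55 for the outdated original, and it prunes the two stale memories, which is why only 22 of the 24 events are still retrievable at week 4.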

What Sparsion actually does

It treats memory as a lifecycle instead of a log.

Events → Salience Scoring → Hot → Warm → Cold → Forgotten

Old memories weaken over time (exponential decay, configurable half‑life)
Repeated events get stronger (log‑frequency)
You can flag things as critical — those survive 4× longer
Corrections score 3× higher than observations by default
Anything below a salience floor gets dropped from retrieval entirely
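
The rules above can be sketched in a few lines of Python. This is not Sparsion's actual scorer (that lives in the Rust core and its formula isn't shown here). Only the 3× correction weight and the 4× critical multiplier come from the list above; the other type weights, the 7‑day half‑life, the floor value, and the sample memories are assumptions picked for illustration.

```python
import math

# Only correction=3.0 vs observation=1.0 is from the post; the rest are guesses
# that respect the ordering corrections > decisions > errors > actions > observations.
TYPE_WEIGHT = {
    "correction": 3.0,
    "decision": 2.0,
    "error": 1.5,
    "action": 1.2,
    "observation": 1.0,
}
CRITICAL_HALF_LIFE_MULTIPLIER = 4.0  # "critical survives 4x longer"
SALIENCE_FLOOR = 0.1                 # assumed value


def salience(event_type, importance="normal", count=1, age_days=0.0,
             half_life_days=7.0):
    """Type weight x log-frequency reinforcement x exponential decay."""
    base = TYPE_WEIGHT[event_type]
    reinforcement = 1.0 + math.log(count)  # count >= 1
    half_life = half_life_days
    if importance == "critical":
        half_life *= CRITICAL_HALF_LIFE_MULTIPLIER
    return base * reinforcement * 0.5 ** (age_days / half_life)


def rank(memories):
    """Score each memory, drop anything under the floor, highest first."""
    scored = [(salience(*fields), text) for text, *fields in memories]
    return sorted((s for s in scored if s[0] >= SALIENCE_FLOOR), reverse=True)


# (text, event_type, importance, count, age_days), all made-up sample data
memories = [
    ("Frontend framework: React", "decision", "high", 1, 21.0),
    ("Switching to Svelte", "correction", "critical", 1, 14.0),
    ("Build felt slow today", "observation", "normal", 1, 42.0),
]

ranked = rank(memories)
for score, text in ranked:
    print(f"{score:.2f}  {text}")
```

With these assumed numbers the correction ranks first, the older decision is still retrievable but far weaker, and the six‑week‑old observation falls under the floor and is dropped.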

A critical correction enters the system at salience 13.18. A throwaway observation enters at 0.77. After six weeks with no reinforcement, the observation is gone while the correction remains.
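
Here is that six‑week claim as arithmetic. The starting values 13.18 and 0.77 and the 4× critical multiplier are from this post; the one‑week half‑life and the tier thresholds are assumptions, since the real defaults aren't listed here.

```python
HALF_LIFE_WEEKS = 1.0      # assumed base half-life
CRITICAL_MULTIPLIER = 4.0  # from the post: critical memories survive 4x longer

# Assumed thresholds, checked from highest to lowest.
TIERS = [(1.0, "Hot"), (0.3, "Warm"), (0.05, "Cold")]


def decayed(initial, weeks, half_life_weeks):
    """Exponential decay: the value halves every half_life_weeks."""
    return initial * 0.5 ** (weeks / half_life_weeks)


def tier_for(salience):
    """Map a salience value onto a lifecycle tier."""
    for threshold, name in TIERS:
        if salience >= threshold:
            return name
    return "Forgotten"


for week in range(0, 7, 2):
    obs = decayed(0.77, week, HALF_LIFE_WEEKS)
    cor = decayed(13.18, week, HALF_LIFE_WEEKS * CRITICAL_MULTIPLIER)
    print(f"week {week}: observation {obs:.3f} [{tier_for(obs)}], "
          f"correction {cor:.2f} [{tier_for(cor)}]")
```

Under these assumptions the observation has halved six times by week 6 and sits below every threshold, while the critical correction has gone through only one and a half half‑lives and is still in the top tier.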

Try it

from sparsion import Runtime

rt = Runtime("agent_memory.db")

# Week 1
rt.record("user", "decision", "Frontend framework: React", importance="high")

# Week 2
rt.record(
    "user",
    "correction",
    "Switching to Svelte — React bundle too large",
    importance="critical"
)

# Query
memories = rt.query(text="frontend", limit=3)
for m in memories:
    print(f"[{m['tier']}] {m['content']} (salience: {m['salience']:.2f})")
# [Hot] Switching to Svelte — React bundle too large (salience: 13.18)
# [Hot] Frontend framework: React (salience: 4.39)

# Age everything
result = rt.sweep()
print(f"Forgot {result['forgotten']} stale memories")

Under the hood

Rust core, Python bindings via PyO3/maturin, SQLite for storage. No model dependency — salience scoring is heuristic for now.

Rust core
  ├── Event store (SQLite)
  ├── Salience scorer
  ├── Tier manager (hot/warm/cold)
  ├── Decay engine
  └── Ranked retrieval
       ↓
  PyO3 → Python SDK (pip install sparsion)

Tests: 12 Rust unit, 5 integration (deterministic time via MockClock), 4 Python end‑to‑end.
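
Deterministic time matters because decay depends on "now". Sparsion's MockClock is part of the Rust tests and its interface isn't shown in this post, so the Python sketch below only illustrates the general clock‑injection idea. `FakeClock` and `DecayingValue` are hypothetical names, not part of the SDK.

```python
import time


class SystemClock:
    """Real wall-clock time, for production use."""

    def now(self):
        return time.time()


class FakeClock:
    """Test clock that only moves when told to."""

    def __init__(self, start=0.0):
        self._now = start

    def now(self):
        return self._now

    def advance(self, seconds):
        self._now += seconds


class DecayingValue:
    """A value that halves every half_life_s seconds, per the injected clock."""

    def __init__(self, initial, half_life_s, clock):
        self.initial = initial
        self.half_life_s = half_life_s
        self.clock = clock
        self.created = clock.now()

    def current(self):
        age = self.clock.now() - self.created
        return self.initial * 0.5 ** (age / self.half_life_s)


clock = FakeClock()
value = DecayingValue(8.0, half_life_s=60.0, clock=clock)
clock.advance(120.0)  # jump two half-lives without sleeping
print(value.current())
```

Because the test controls the clock, "six weeks later" is a single `advance` call and the assertion never depends on how fast the test machine runs.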

What’s in v0.1

Temporal decay with configurable half‑life
Reinforcement through repetition
Importance hints (low/normal/high/critical)
Event‑type weighting — corrections > decisions > errors > actions > observations
Tier migration and a forgetting loop, both persisted in storage
Python SDK

What’s coming

Plugging into real agent workflows
Bigger benchmarks, longer time horizons
Contradiction‑aware updates
LangChain memory backend

If you’re building agents and keep hitting stale context problems, I’d like to hear about your use case.
