SuperLocalMemory V3: Mathematical Foundations for Production-Grade Agent Memory
Source: Dev.to
Overview
We applied information geometry, algebraic topology, and stochastic dynamics to AI‑agent memory.
- 74.8 % on LoCoMo with data staying local – the highest score reported without cloud dependency.
- 87.7 % in full‑power mode.
- 60.4 % with no LLM at any stage.
Open source under MIT.
The Memory Problem
Every AI coding assistant — Claude, Cursor, Copilot, ChatGPT — starts each session from scratch.
- Existing memory layers (e.g., Mem0, Zep, Letta) work well for individual developers and small teams.
- Production‑scale usage remains unsolved.
Symptoms at Scale
| Scale | Issue |
|---|---|
| 10 k memories | Cosine similarity stops discriminating between relevant and irrelevant results. |
| 100 k memories | Silent contradictions accumulate (e.g., “Alice moved to London” and “Alice lives in Paris”). |
| Enterprise | Hard‑coded lifecycle thresholds (“archive after 30 days”) break because usage patterns vary across teams, projects, and domains. |
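The first symptom can be seen in a few lines. This is our own toy illustration (not from the article): cosine similarities between a query and unrelated high-dimensional embeddings concentrate in a narrow band around zero, so at tens of thousands of memories the margin between relevant and irrelevant results becomes thin.

```python
import numpy as np

# Toy demonstration of similarity concentration: with random 768-dim
# embeddings (a typical embedding size), cosine scores of unrelated
# memories cluster tightly around 0, shrinking the ranking margin.
rng = np.random.default_rng(0)
dim, n_memories = 768, 10_000

query = rng.standard_normal(dim)
memories = rng.standard_normal((n_memories, dim))

sims = memories @ query / (np.linalg.norm(memories, axis=1) * np.linalg.norm(query))

# Irrelevant memories all land in a band of width ~1/sqrt(dim):
print(f"mean={sims.mean():.4f}  std={sims.std():.4f}  max={sims.max():.4f}")
```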
Regulatory Dimension
The EU AI Act takes full effect 2 Aug 2026.
Any memory system that sends data to cloud LLMs for core operations faces a compliance question that engineering alone cannot resolve – it requires an architectural answer.
Our Mathematical Approach
1. Confidence‑Weighted Retrieval
Standard: Cosine similarity treats every embedding as equally confident.
Our model:
- Each memory embedding → diagonal Gaussian (learned mean & variance).
- Similarity measured by Fisher‑Rao geodesic distance (natural metric on statistical manifolds).
Key properties
- Repeated access → variance shrinks (Bayesian conjugate updates).
- More‑used memories become more precise.
- Proven to improve retrieval as usage grows.
Ablation – Removing Fisher‑Rao drops multi‑hop accuracy by 12 pp.
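A minimal sketch of the idea, using the closed-form Fisher-Rao distance between univariate Gaussians (the Gaussian manifold is, up to a factor of √2, isometric to the hyperbolic half-plane), applied per dimension of a diagonal Gaussian, plus a conjugate-Gaussian update that shrinks variance on each access. This is our reconstruction under those assumptions, not the project's actual implementation; the observation noise `obs_sig2` is a hypothetical parameter.

```python
import numpy as np

def fisher_rao_1d(mu1, sig1, mu2, sig2):
    """Fisher-Rao distance between N(mu1, sig1^2) and N(mu2, sig2^2).

    Uses the half-plane isometry with coordinates (mu / sqrt(2), sigma).
    """
    num = (mu1 - mu2) ** 2 / 2.0 + (sig1 - sig2) ** 2
    return np.sqrt(2.0) * np.arccosh(1.0 + num / (2.0 * sig1 * sig2))

def fisher_rao_diag(mu1, sig1, mu2, sig2):
    """Diagonal Gaussians: L2 norm of per-dimension distances (product manifold)."""
    per_dim = fisher_rao_1d(np.asarray(mu1, float), np.asarray(sig1, float),
                            np.asarray(mu2, float), np.asarray(sig2, float))
    return float(np.sqrt(np.sum(per_dim ** 2)))

def conjugate_update(mu, sig2, obs, obs_sig2=0.1):
    """Bayesian conjugate update after one observation: precisions add,
    so the posterior variance strictly shrinks with every access."""
    precision = 1.0 / sig2 + 1.0 / obs_sig2
    new_sig2 = 1.0 / precision
    new_mu = new_sig2 * (mu / sig2 + obs / obs_sig2)
    return new_mu, new_sig2
```

Under this model a frequently accessed memory ends up with small variance, so its geodesic distance to nearby queries contracts relative to rarely used, high-uncertainty memories.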
2. Algebraic Consistency Checking
Standard: Pairwise contradiction checking is O(n²) and misses transitive contradictions.
Our model:
- Represent the knowledge graph as a cellular sheaf (vector spaces on nodes & edges).
- Compute the first sheaf cohomology group H¹(G, F):
| Result | Interpretation |
|---|---|
| H¹ = 0 | All memories are globally consistent |
| H¹ ≠ 0 | Contradictions exist, even if every local pair looks fine |
- Scales algebraically, not quadratically, catching contradictions that pairwise methods cannot.
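A toy sketch of the mechanism (our illustration, not the project's code): take a cellular sheaf on a graph with one-dimensional stalks and identity restriction maps, where each edge (u, v) asserts a constraint x_v − x_u = b_e. The edge data b is globally consistent exactly when it lies in the image of the coboundary map δ⁰, i.e. when its class in H¹ = C¹ / im(δ⁰) vanishes. A cycle whose constraints look fine pairwise but do not sum to zero is precisely a nonzero class in H¹.

```python
import numpy as np

def coboundary(n_nodes, edges):
    """delta^0 as an |E| x |V| matrix: (delta x)_e = x_v - x_u."""
    D = np.zeros((len(edges), n_nodes))
    for i, (u, v) in enumerate(edges):
        D[i, u], D[i, v] = -1.0, 1.0
    return D

def is_consistent(n_nodes, edges, b, tol=1e-9):
    """True iff some node assignment x satisfies every edge constraint,
    i.e. iff b lies in im(delta^0) (its class in H^1 is zero)."""
    D = coboundary(n_nodes, edges)
    x, *_ = np.linalg.lstsq(D, np.asarray(b, float), rcond=None)
    residual = np.linalg.norm(D @ x - b)  # distance of b from im(delta^0)
    return residual < tol

# Triangle 0-1-2: each edge constraint is locally satisfiable on its own,
# but global consistency requires the constraints to sum to zero around
# the cycle. [1, 1, 5] is a "silent" transitive contradiction.
edges = [(0, 1), (1, 2), (0, 2)]
print(is_consistent(3, edges, [1.0, 1.0, 2.0]))  # True: 1 + 1 = 2
print(is_consistent(3, edges, [1.0, 1.0, 5.0]))  # False: 1 + 1 != 5
```

The contradiction check here is one rank computation on a sparse matrix rather than a quadratic sweep over memory pairs.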
3. Self‑Organizing Lifecycle Management
Standard: Hard‑coded thresholds (“archive after 30 days”, “promote after 10 accesses”).
Our model:
- Stochastic gradient flow on the Poincaré ball.
- Potential function encodes access frequency, trust score, recency.
- Dynamics converge to a stationary distribution → a mathematically optimal allocation across the lifecycle states Active → Warm → Cold → Archived.
- No manual tuning; the system self‑organizes based on actual usage patterns.
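A hedged sketch of what such dynamics can look like; this is an assumed form, not the project's implementation. On the Poincaré ball the metric has conformal factor λ(x) = 2 / (1 − ‖x‖²), so the Riemannian gradient rescales the Euclidean gradient by 1/λ(x)². The quadratic potential and its attractor point are stand-ins for the real potential over access frequency, trust, and recency.

```python
import numpy as np

rng = np.random.default_rng(1)

def U_grad(x, attractor):
    """Euclidean gradient of a toy quadratic potential pulling x toward `attractor`."""
    return x - attractor

def langevin_step(x, attractor, eta=0.01, temperature=0.01):
    """One Riemannian Langevin step on the Poincare ball:
    metric-scaled drift plus metric-scaled Gaussian noise."""
    scale = ((1.0 - np.dot(x, x)) / 2.0) ** 2  # 1 / lambda(x)^2
    noise = np.sqrt(2.0 * eta * temperature) * rng.standard_normal(x.shape)
    x_new = x - eta * scale * U_grad(x, attractor) + np.sqrt(scale) * noise
    # Retract into the open unit ball if the discrete step overshoots.
    norm = np.linalg.norm(x_new)
    if norm >= 1.0:
        x_new = x_new / norm * 0.999
    return x_new

# A frequently accessed memory drifts toward the "Active" attractor region
# and fluctuates around it in the stationary distribution.
x = np.zeros(2)
active_attractor = np.array([0.5, 0.0])
for _ in range(2000):
    x = langevin_step(x, active_attractor)
```

The temperature controls how readily memories escape a lifecycle basin; lifecycle transitions then emerge from where the stationary distribution places mass rather than from fixed thresholds.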
Benchmark Results
LoCoMo (Long Conversation Memory)
| Configuration | Score | What It Means |
|---|---|---|
| Mode A Retrieval | 74.8 % | Data stays on your machine. Highest local‑first score. |
| Mode C (Full Power) | 87.7 % | Cloud LLM at every layer. Comparable to industry systems. |
| Mode A Raw | 60.4 % | No LLM at any stage. First reported zero‑LLM score. |
Competitive Landscape
| System | Score | Cloud LLM Required |
|---|---|---|
| EverMemOS | 92.3 % | Yes |
| MemMachine | 91.7 % | Yes |
| Hindsight | 89.6 % | Yes |
| SLM V3 Mode C | 87.7 % | Yes (every layer) |
| Zep | ~85 % | Yes |
| SLM V3 Mode A | 74.8 % | No |
| Mem0 | ~58‑66 % | Yes |
| SLM V3 Mode A Raw | 60.4 % | No (zero‑LLM) |
The gap between Mode A Raw (60.4 %) and Mode A Retrieval (74.8 %) shows that the four‑channel mathematical retrieval pipeline captures the vast majority of benchmark requirements without any cloud dependency. The remaining gap (74.8 % → 87.7 %) is due to answer synthesis quality, not knowledge retrieval.
Production‑Scale Benefits
| Problem | Traditional Approach | Our Solution | Impact |
|---|---|---|---|
| Retrieval quality at scale | Cosine similarity loses discriminative power | Fisher‑Rao distance | Keeps relevance when thousands of memories compete |
| Consistency at scale | Pairwise checks miss transitive contradictions | Sheaf cohomology (H¹) | Detects global inconsistencies algebraically |
| Lifecycle management | Fixed thresholds break under workload variation | Langevin dynamics on Poincaré ball | Self‑organizes memory allocation; no manual tuning |
These improvements are measurable on the benchmark and become more pronounced as memory count grows.
Privacy‑Accuracy Spectrum
| Mode | Description | Cloud Dependency | LoCoMo Score |
|---|---|---|---|
| Mode A – Local Guardian | All processing local. EU AI Act compliant by architecture. | No | 74.8 % |
| Mode B – Smart Local | Mode A + local LLM via Ollama. Still fully private. | No | (same as Mode A) |
| Mode C – Full Power | Cloud LLM at every layer. | Yes | 87.7 % |
Switch anytime – memories stay consistent across all modes.
Quick Start
```bash
npm install -g superlocalmemory  # install the CLI
slm setup                        # initial configuration
slm warmup                       # optional: pre‑download embedding model
slm dashboard                    # launch 17‑tab web dashboard at http://localhost:8765
```
Compatibility
Works with 17+ AI tools, including:
- Claude Code
- Cursor
- VS Code Copilot
- Windsurf
- ChatGPT Desktop
- Gemini CLI
- JetBrains IDEs
- Zed
- Continue
- Cody
- …and many more.
Final Note
Current memory systems are impressive engineering feats. Our mathematical foundations (V3) address the three core production‑scale challenges—retrieval, consistency, and lifecycle—with provable, measurable improvements. Choose the mode that fits your privacy and performance needs, and let the system handle the rest.
The systems in the competitive landscape above represent meaningful work solving real problems for real users.
Our contribution is mathematical. We believe the future of agent memory lies not in more heuristics, but in principled mathematics—techniques that provide guarantees, scale predictably, and can be adopted by any system.
Core Idea
The three techniques in V3 (Fisher‑Rao, sheaf cohomology, Langevin dynamics) are not specific to our product; they are general mathematical tools. We have open‑sourced everything under the MIT license because we believe the entire field benefits from solid mathematical foundations.
If these techniques make other memory systems better, we have succeeded.
Resources
- Paper: Zenodo DOI: 10.5281/zenodo.19038659
- Code:
- Website:
Author
Varun Pratap Bhardwaj — Independent Researcher
Part of Qualixar