Stop Treating All AI Memories the Same — Introducing Cortex, Who Forgot?
Source: Dev.to
The Problem with Uniform AI Memory
A quick fact (e.g., “PostgreSQL runs on port 5432”) is not the same as a learned pattern (e.g., “always use connection pooling for high‑traffic services”).
A deployment event is not the same as a user preference.
Most AI memory solutions—RAG, vector stores, simple key‑value caches—dump everything into the same bucket. A one‑time debug note sits next to a critical architectural decision with the same priority, the same retrieval weight, and the same lifespan.
Result: bloated context windows full of irrelevant noise. Your AI retrieves a bug fix from six months ago with the same confidence as a pattern you use daily.
Cortex: Multi‑Stage Classification
Titan Memory includes Cortex, a multi‑stage classifier that routes every incoming memory into one of five cognitive categories.
| Category | What It Stores | Decay Rate |
|---|---|---|
| Knowledge | Facts, definitions, technical info | Slow — facts persist |
| Profile | Preferences, settings, user context | Very slow — preferences stick |
| Event | Sessions, deployments, incidents | Fast — events age out |
| Behavior | Patterns, habits, workflows | Slow — patterns are valuable |
| Skill | Techniques, solutions, best practices | Very slow — skills are durable |
Each category decays at a different rate. An error you hit last Tuesday fades quickly, while a deployment pattern used across multiple projects persists.
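Category‑specific decay like this can be modeled as an exponential half‑life per category. A minimal sketch, assuming a per‑category half‑life table — the Skill (270 days) and Event (90 days) values match the usage examples below, while the other values are illustrative guesses, not Titan's actual configuration:

```python
# Hypothetical half-lives in days. Skill and Event match the examples
# in this post; the rest are illustrative guesses.
HALF_LIFE_DAYS = {
    "knowledge": 180,
    "profile": 365,
    "event": 90,
    "behavior": 180,
    "skill": 270,
}

def decayed_weight(category: str, age_days: float) -> float:
    """Exponential decay: a memory's weight halves once per half-life."""
    return 0.5 ** (age_days / HALF_LIFE_DAYS[category])

# An Event from ~3 months ago has lost half its retrieval weight...
print(round(decayed_weight("event", 90), 2))  # 0.5
# ...while a Skill of the same age is still strong.
print(round(decayed_weight("skill", 90), 2))  # 0.79
```

Under this model, last Tuesday's error and a long‑lived deployment pattern naturally diverge in retrieval weight without any explicit cleanup pass.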
Retrieval Pipeline
On recall, Cortex does more than return the top‑K vectors:
- Hybrid search (dense vectors + BM25) to retrieve top candidates.
- Sentence splitting of the retrieved documents.
- Semantic scoring of each sentence with a 0.6B‑parameter encoder.
- Pruning of sentences below a relevance threshold.
- Temporal conflict resolution (newer info wins).
- Category coverage check to ensure balanced recall, not just the highest‑scoring embeddings.
Outcome: 70–80% token compression on every recall; only the most relevant (“gold”) sentences reach the LLM.
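The splitting and pruning stages can be sketched as follows, with a toy word‑overlap scorer standing in for the 0.6B‑parameter encoder. Function names and the threshold are illustrative, not Titan's actual API:

```python
import re

def split_sentences(text: str) -> list[str]:
    # Naive splitter; a real system would use a proper sentence tokenizer.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def overlap_score(query: str, sentence: str) -> float:
    # Toy stand-in for the semantic encoder: fraction of query words covered.
    q = set(re.findall(r"\w+", query.lower()))
    s = set(re.findall(r"\w+", sentence.lower()))
    return len(q & s) / len(q) if q else 0.0

def prune(query: str, docs: list[str], threshold: float = 0.3) -> list[str]:
    """Keep only sentences whose relevance clears the threshold."""
    kept = []
    for doc in docs:
        for sent in split_sentences(doc):
            if overlap_score(query, sent) >= threshold:
                kept.append(sent)
    return kept

docs = ["Use connection pooling for Postgres. The office coffee machine is broken."]
print(prune("connection pooling for Postgres", docs))
# ['Use connection pooling for Postgres.']
```

The compression comes from discarding whole sentences, not truncating documents: only the fragments that actually answer the query survive into the context window.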
Installation
claude mcp add titan-memory -- node ~/.claude/titan-memory/bin/titan-mcp.js
Usage Examples
Storing a Skill
titan_add("Always use connection pooling for high‑traffic Postgres services")
# → Classified: Skill (confidence: 0.94)
# → Routed to Layer 4 (Semantic Memory)
# → Decay half‑life: 270 days
Storing an Event
titan_add("Deployed v2.3 to production, rolled back due to memory leak")
# → Classified: Event (confidence: 0.91)
# → Routed to Layer 5 (Episodic Memory)
# → Decay half‑life: 90 days
Recalling Information
titan_recall("Postgres performance best practices")
# → Returns the connection‑pooling skill (still strong after 6 months)
# → The deployment event has decayed unless explicitly requested
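For intuition about the routing shown above, here is a toy keyword‑based classifier. Cortex's real multi‑stage classifier is semantic, and every hint list below is invented purely for illustration:

```python
# Invented keyword hints, NOT Cortex's actual classification rules.
CATEGORY_HINTS = {
    "skill": ["always", "use", "best practice", "technique"],
    "event": ["deployed", "rolled back", "incident", "session"],
    "profile": ["prefer", "setting", "theme"],
    "knowledge": ["runs on", "means", "defined as"],
    "behavior": ["usually", "every time", "workflow"],
}

def classify(memory: str) -> str:
    """Pick the category with the most keyword hits (default: knowledge)."""
    text = memory.lower()
    best, best_hits = "knowledge", 0
    for category, hints in CATEGORY_HINTS.items():
        hits = sum(hint in text for hint in hints)
        if hits > best_hits:
            best, best_hits = category, hits
    return best

print(classify("Deployed v2.3 to production, rolled back due to memory leak"))
# event
```

A real classifier also emits a confidence score (0.94 and 0.91 above), which a production system can use to fall back to a default category when the signal is weak.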
Architecture Overview
Titan Memory is a 5‑layer cognitive memory system delivered as an MCP server:
| Layer | Description |
|---|---|
| Layer 1 – Working Memory | Your active context window |
| Layer 2 – Factual Memory | O(1) hash lookup, sub‑10 ms latency |
| Layer 3 – Long‑Term Memory | Surprise‑filtered, adaptive decay |
| Layer 4 – Semantic Memory | Patterns, reasoning chains |
| Layer 5 – Episodic Memory | Session logs, timestamps |
Cortex is one component; the system also includes:
- Semantic highlighting
- Surprise‑based storage filtering
- Hybrid search with RRF reranking
- Cross‑project pattern transfer
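Reciprocal Rank Fusion (RRF), the reranking step listed above, merges the dense and BM25 rankings by summing 1/(k + rank) for each document across both lists — a standard technique, sketched here independently of Titan's implementation:

```python
def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Reciprocal Rank Fusion: fuse several ranked lists into one."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # k=60 is the conventional constant; it damps the influence
            # of any single list's top ranks.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

dense = ["d2", "d1", "d3"]  # ranked by vector similarity
bm25 = ["d1", "d3", "d4"]   # ranked by keyword match
print(rrf([dense, bm25])[0])  # d1 — ranked highly by both retrievers
```

The appeal of RRF is that it needs only ranks, not scores, so the dense and BM25 results can be fused without calibrating their incompatible scoring scales.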
The test suite reports 914 passing tests, and the system works with Claude Code, Cursor, or any MCP‑compatible client.
License & Source
- GitHub:
- License: Apache 2.0