Stop Treating All AI Memories the Same — Introducing Cortex, Who Forgot?

Published: February 4, 2026 at 02:52 PM EST
3 min read
Source: Dev.to

The Problem with Uniform AI Memory

A quick fact (e.g., “PostgreSQL runs on port 5432”) is not the same as a learned pattern (e.g., “always use connection pooling for high‑traffic services”).
A deployment event is not the same as a user preference.

Most AI memory solutions—RAG, vector stores, simple key‑value caches—dump everything into the same bucket. A one‑time debug note sits next to a critical architectural decision with the same priority, the same retrieval weight, and the same lifespan.

Result: bloated context windows full of irrelevant noise. Your AI retrieves a bug fix from six months ago with the same confidence as a pattern you use daily.

Cortex: Multi‑Stage Classification

Titan Memory includes Cortex, a multi‑stage classifier that routes every incoming memory into one of five cognitive categories.

Category | What It Stores | Decay Rate
Knowledge | Facts, definitions, technical info | Slow — facts persist
Profile | Preferences, settings, user context | Very slow — preferences stick
Event | Sessions, deployments, incidents | Fast — events age out
Behavior | Patterns, habits, workflows | Slow — patterns are valuable
Skill | Techniques, solutions, best practices | Very slow — skills are durable

Each category decays at a different rate. An error you hit last Tuesday fades quickly, while a deployment pattern used across multiple projects persists.
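
As a rough mental model, per-category decay can be pictured as exponential half-life weighting. The TypeScript sketch below is an illustration, not Titan Memory's actual implementation: the half-life values are assumptions, except that the usage examples later in this post mention a 270-day half-life for a Skill and a 90-day half-life for an Event.

// Illustrative decay model. Category names come from the table above;
// half-life values are assumptions for demonstration only.

type Category = "Knowledge" | "Profile" | "Event" | "Behavior" | "Skill";

const HALF_LIFE_DAYS: Record<Category, number> = {
  Knowledge: 180, // slow: facts persist
  Profile: 365,   // very slow: preferences stick
  Event: 90,      // fast: events age out
  Behavior: 180,  // slow: patterns are valuable
  Skill: 270,     // very slow: skills are durable
};

/** Weight of a memory after `ageDays`, halving every category half-life. */
function decayWeight(category: Category, ageDays: number): number {
  return Math.pow(0.5, ageDays / HALF_LIFE_DAYS[category]);
}

console.log(decayWeight("Event", 7).toFixed(3)); // ≈ 0.948 (a week-old event is already fading)
console.log(decayWeight("Skill", 7).toFixed(3)); // ≈ 0.982 (a week-old skill barely decays)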

Retrieval Pipeline

On recall, Cortex does more than return the top‑K vectors:

  1. Hybrid search (dense vectors + BM25) to retrieve top candidates.
  2. Sentence splitting of the retrieved documents.
  3. Semantic scoring of each sentence with a 0.6 B‑parameter encoder.
  4. Pruning of sentences below a relevance threshold.
  5. Temporal conflict resolution (newer info wins).
  6. Category coverage check to ensure balanced recall, not just the highest‑scoring embeddings.

Outcome: 70–80 % token compression on every recall; only the most relevant (“gold”) sentences reach the LLM.
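
To make the six steps concrete, here is a minimal recall pipeline in that shape. Every name, type, and the 0.35 relevance threshold below are illustrative assumptions; Titan Memory's real interfaces are not shown in the post, and the semantic scorer stands in for the 0.6 B-parameter encoder from step 3.

// Sketch of the recall pipeline above. All names and thresholds are assumptions.

interface Candidate { text: string; category: string; timestamp: number; }
interface ScoredSentence { sentence: string; category: string; timestamp: number; score: number; }

async function recall(
  query: string,
  hybridSearch: (q: string, k: number) => Promise<Candidate[]>, // step 1: dense vectors + BM25
  scoreSentence: (q: string, s: string) => Promise<number>,     // step 3: semantic encoder
  threshold = 0.35,
): Promise<string[]> {
  // 1. Hybrid search for the top candidates.
  const candidates = await hybridSearch(query, 20);

  // 2. Split each retrieved document into sentences.
  const sentences: ScoredSentence[] = [];
  for (const c of candidates) {
    for (const sentence of c.text.split(/(?<=[.!?])\s+/)) {
      // 3. Score each sentence against the query.
      const score = await scoreSentence(query, sentence);
      sentences.push({ sentence, category: c.category, timestamp: c.timestamp, score });
    }
  }

  // 4. Prune sentences below the relevance threshold.
  let kept = sentences.filter(s => s.score >= threshold);

  // 5. Temporal conflict resolution: prefer newer sentences when they overlap.
  //    (Actual conflict detection is out of scope for this sketch.)
  kept.sort((a, b) => b.timestamp - a.timestamp);

  // 6. Category coverage: cap sentences per category so one high-scoring
  //    category cannot crowd out the rest.
  const perCategory = new Map<string, number>();
  kept = kept.filter(s => {
    const n = perCategory.get(s.category) ?? 0;
    perCategory.set(s.category, n + 1);
    return n < 3;
  });

  // Only the surviving "gold" sentences are returned to the LLM.
  return kept.map(s => s.sentence);
}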

Installation

claude mcp add titan-memory -- node ~/.claude/titan-memory/bin/titan-mcp.js

Usage Examples

Storing a Skill

titan_add("Always use connection pooling for high‑traffic Postgres services")
# → Classified: Skill (confidence: 0.94)
# → Routed to Layer 4 (Semantic Memory)
# → Decay half‑life: 270 days

Storing an Event

titan_add("Deployed v2.3 to production, rolled back due to memory leak")
# → Classified: Event (confidence: 0.91)
# → Routed to Layer 5 (Episodic Memory)
# → Decay half‑life: 90 days

Recalling Information

titan_recall("Postgres performance best practices")
# → Returns the connection‑pooling skill (still strong after 6 months)
# → The deployment event has decayed unless explicitly requested

Architecture Overview

Titan Memory is a 5‑layer cognitive memory system delivered as an MCP server:

Layer | Description
Layer 1 – Working Memory | Your active context window
Layer 2 – Factual Memory | O(1) hash lookup, sub‑10 ms latency
Layer 3 – Long‑Term Memory | Surprise‑filtered, adaptive decay
Layer 4 – Semantic Memory | Patterns, reasoning chains
Layer 5 – Episodic Memory | Session logs, timestamps
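
Layer 3's surprise filter can be approximated as a novelty check: a new memory is stored only if its embedding is not already close to something in the store. This is a sketch under that assumption; the cosine-similarity criterion and the 0.85 threshold are illustrative, not taken from the project.

// Sketch of surprise-based storage filtering: keep only memories that are not
// already well covered by what is stored. Threshold and metric are assumptions.

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

/** True if `candidate` is novel enough relative to the stored embeddings. */
function isSurprising(candidate: number[], stored: number[][], threshold = 0.85): boolean {
  const maxSim = stored.reduce((m, e) => Math.max(m, cosine(candidate, e)), -1);
  return maxSim < threshold; // near-duplicates of existing memories are skipped
}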

Cortex is one component; the system also includes:

  • Semantic highlighting
  • Surprise‑based storage filtering
  • Hybrid search with RRF reranking (sketched below)
  • Cross‑project pattern transfer
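
For the hybrid-search component, here is a minimal Reciprocal Rank Fusion (RRF) sketch that merges a dense-vector ranking and a BM25 ranking by summed reciprocal ranks. The k = 60 constant is the value commonly used for RRF in the literature, not a number taken from Titan Memory.

// Minimal RRF merge of two rankings. Items ranked high in several lists win.

function rrfMerge(rankings: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      // Each list contributes 1 / (k + rank) to the item's fused score.
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + index + 1));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

// Example: "doc-b" ranks well in both lists, so it comes out on top.
console.log(rrfMerge([
  ["doc-a", "doc-b", "doc-c"], // dense-vector ranking
  ["doc-b", "doc-d", "doc-a"], // BM25 ranking
]));
// → [ 'doc-b', 'doc-a', 'doc-d', 'doc-c' ]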

The test suite reports 914 passing tests, and the system works with Claude Code, Cursor, or any MCP‑compatible client.

License & Source

  • GitHub:
  • License: Apache 2.0