I built a cognitive layer for AI agents that learns without LLM calls

Published: March 17, 2026 at 08:19 AM EDT
3 min read
Source: Dev.to

The problem

Every time your agent starts a conversation, it starts from zero.
Sure, you can stuff a summary into the system prompt, use RAG, or call Mem0 or Zep.

But all of these share the same problem: they need LLM calls to learn. Extracting facts, building a user profile, or deciding what matters means you're paying per token, adding latency, and depending on a cloud service.

What if the learning happened locally, automatically, without any LLM involvement?

What AuraSDK does differently

AuraSDK is a cognitive layer that runs alongside any LLM. It observes interactions and—without any LLM calls—builds up a structured understanding of patterns, causes, and behavioral rules.

from aura import Aura, Level

brain = Aura("./agent_memory")
brain.enable_full_cognitive_stack()

# store what happens
brain.store(
    "User always deploys to staging first",
    level=Level.Domain,
    tags=["workflow"]
)
brain.store(
    "Staging deploy prevented 3 production incidents",
    level=Level.Domain,
    tags=["workflow"]
)

# sub-millisecond recall — inject into any LLM prompt
context = brain.recall("deployment decision")

# after enough interactions, the system derives this on its own:
hints = brain.get_surfaced_policy_hints()
# [{"action": "Prefer", "domain": "workflow", "description": "deploy to staging first"}]

Nobody wrote that policy rule; the system derived it from the pattern of stored observations.
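To make the `recall` step concrete, here is a hedged sketch of how the recalled context might be injected into an LLM prompt. The `recall_stub` function is a hypothetical stand-in for `brain.recall()` (the real call returns whatever AuraSDK surfaces for the query); the prompt shape is illustrative, not part of the SDK.

```python
# Hypothetical sketch: injecting recalled memory into a prompt.
# `recall_stub` stands in for brain.recall(); in real use you would
# pass its output to whatever LLM client you already have.

def recall_stub(query):
    # Stand-in for brain.recall(query); returns stored observations.
    return [
        "User always deploys to staging first",
        "Staging deploy prevented 3 production incidents",
    ]

def build_prompt(query, user_message):
    # Prepend recalled observations as bullet points above the user turn.
    context = "\n".join(f"- {line}" for line in recall_stub(query))
    return f"Relevant memory:\n{context}\n\nUser: {user_message}"

print(build_prompt("deployment decision", "Should I push straight to prod?"))
```

Because recall is sub-millisecond and local, this injection step adds effectively no overhead before the LLM call itself.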

The cognitive pipeline

AuraSDK processes every stored record through five deterministic layers:

Record → Belief → Concept → Causal → Policy
  • Belief – groups related observations, resolves contradictions
  • Concept – discovers stable topic clusters across beliefs
  • Causal – finds cause‑effect patterns from temporal and explicit links
  • Policy – derives behavioral hints (Prefer / Avoid / Warn) from causal patterns

The entire pipeline runs in milliseconds—no LLM, no cloud, no embeddings required.
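The idea of deriving a policy hint deterministically can be sketched in a few lines. This is not AuraSDK's actual internals; it is a toy illustration (all names and the `outcome` field are hypothetical) of how repeated tagged observations can yield a Prefer/Avoid hint with nothing but counting.

```python
from collections import Counter

# Toy illustration, NOT AuraSDK's implementation: derive a policy hint
# from repeated (tag, outcome) patterns, with no LLM involved.

def derive_policy_hints(records, min_support=2):
    """Emit a hint once a (tag, outcome) pattern repeats min_support times."""
    patterns = Counter()
    for rec in records:
        for tag in rec["tags"]:
            patterns[(tag, rec["outcome"])] += 1
    hints = []
    for (tag, outcome), count in patterns.items():
        if count >= min_support:
            action = "Prefer" if outcome == "positive" else "Avoid"
            hints.append({"action": action, "domain": tag, "support": count})
    return hints

records = [
    {"text": "User always deploys to staging first",
     "tags": ["workflow"], "outcome": "positive"},
    {"text": "Staging deploy prevented 3 production incidents",
     "tags": ["workflow"], "outcome": "positive"},
]
print(derive_policy_hints(records))
# [{'action': 'Prefer', 'domain': 'workflow', 'support': 2}]
```

Counting is obviously cruder than the belief/concept/causal layers described above, but it shows why the pipeline can stay in the millisecond range: every stage is ordinary data processing, not token generation.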

Try it in 60 seconds

pip install aura-memory
python examples/demo.py

Sample output

Phase 4 - Recall in action

  Query: "deployment decision"  [0.29ms]
    1. Staging deploy prevented database migration failure
    2. Direct prod deploy skipped staging -- caused data loss

  Query: "code review"  [0.18ms]
    1. Code review caught SQL injection before merge
    2. Code review found performance regression early

5 learning cycles completed in 16 ms. Recall at 0.29 ms.
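If you want to reproduce latency numbers like these yourself, `time.perf_counter` is the standard tool. The sketch below times a plain dict lookup as a stand-in for the real index; it is only meant to show the measurement pattern, not AuraSDK's data structures.

```python
import time

# Illustrative timing harness: a plain dict stands in for the recall index.
index = {
    "deployment decision": [
        "Staging deploy prevented database migration failure",
        "Direct prod deploy skipped staging -- caused data loss",
    ],
}

start = time.perf_counter()
results = index.get("deployment decision", [])
elapsed_ms = (time.perf_counter() - start) * 1000

print(f"{len(results)} result(s) in {elapsed_ms:.2f} ms")
```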

How it compares

| Feature | AuraSDK | Mem0 | Zep | Letta |
| --- | --- | --- | --- | --- |
| LLM required for learning | No | Yes | Yes | Yes |
| Works offline | Fully | Partial | No | With local LLM |
| Recall latency | Sub-millisecond | — | — | — |

  • Install: pip install aura-memory
  • Web: (link not provided in original)

If you’re building AI agents and want deterministic, explainable, offline‑capable memory—give it a try and let me know what you think.
