I got tired of my AI forgetting everything. So I built it a brain.
Source: Dev.to
Introduction
Hello 👋
First post here. I've been building in public for a while but never really sat down to write properly about what my team and I are working on. Figured it's time, and this seems like the right platform for it.
I’m one of the devs at TinyHumans and for a while now our whole team has been deep in AI tooling. The one thing that kept bugging us more than anything else was memory. Not the flashy stuff—models, inference speed, prompting tricks—just the boring, unglamorous, completely‑broken part of almost every AI app we touched.
The problem
Every time we built something with persistent context—a support bot, a personal assistant, an agent workflow—we hit the same wall. Either the AI remembered nothing (new session, clean slate) or it remembered everything so poorly that the context became noise: stale facts, outdated decisions, irrelevant history injected into every prompt.
Vector similarity search retrieves what’s similar, not what’s important or current. That distinction kept bothering us, so we went down a rabbit hole.
Turns out the brain solved this millions of years ago.
Hermann Ebbinghaus figured it out in 1885. Memory retention drops roughly 50% within an hour unless it's reinforced. He called it the Forgetting Curve—a feature, not a flaw. The brain compresses experiences into patterns, strengthens what gets recalled and acted on, and quietly drops the rest. You remember the architecture decision that shaped six months of work, but not the Slack message about lunch that day.
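The curve itself is simple enough to sketch in a few lines. This is just an illustration of the shape, not anything from Neocortex; the half-life framing (retention halves every `half_life` hours unless reinforced) is my choice of parameterization:

```python
def retention(hours_elapsed: float, half_life: float = 1.0) -> float:
    """Forgetting-curve sketch: retention halves every `half_life` hours.

    Reinforcement corresponds to a longer half-life, flattening the curve.
    """
    return 0.5 ** (hours_elapsed / half_life)

# Unreinforced memory: half gone after one hour.
print(retention(1.0))                  # 0.5
# Reinforced memory (longer half-life): still mostly intact after an hour.
print(round(retention(1.0, half_life=10.0), 2))  # 0.93
```

The interesting part is the second call: reinforcement doesn't add retention, it slows the decay.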
Forgetting is the feature. AI memory systems just… don’t do this.
That’s what we set out to fix with Neocortex.
What Neocortex actually does
At its core, Neocortex is a brain‑inspired memory layer for AI apps. You store knowledge, the system figures out what’s worth keeping, and everything else naturally fades.
- Time‑decay retention scores – every memory item has a score that decreases over time. Old, unaccessed memories fade on their own. No cron jobs, no manual cleanup.
- Interaction‑weighted importance – not all signals are equal. Something that gets referenced, updated, and built upon becomes more durable.
- Noise pruning – low‑value memories decay and are removed automatically, allowing Neocortex to handle 10M+ tokens without quality degradation.
- GraphRAG – instead of a flat list of embeddings, Neocortex builds a knowledge graph (entities, relationships, context). Queries traverse the graph to get structured, rich answers—not just “here are 5 similar chunks”.
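Neocortex's internals aren't public in this post, but the first three bullets can be sketched with a toy scorer. Everything here is mine, not the actual API: the class, the one-hour half-life, and the prune threshold are all illustrative assumptions.

```python
import time
from dataclasses import dataclass, field

HALF_LIFE_S = 3600.0     # assumed: score halves every hour if untouched
PRUNE_THRESHOLD = 0.05   # assumed: memories below this score get dropped


@dataclass
class Memory:
    content: str
    score: float = 1.0
    last_touch: float = field(default_factory=time.time)

    def current_score(self, now: float) -> float:
        # Time-decay retention: exponential decay since last access.
        elapsed = now - self.last_touch
        return self.score * 0.5 ** (elapsed / HALF_LIFE_S)

    def reinforce(self, weight: float = 1.0) -> None:
        # Interaction-weighted importance: referencing or updating a memory
        # snapshots its decayed score, then boosts it.
        now = time.time()
        self.score = self.current_score(now) + weight
        self.last_touch = now


def prune(memories: list[Memory]) -> list[Memory]:
    # Noise pruning: anything that has decayed below the threshold falls away,
    # no cron jobs or manual cleanup required.
    now = time.time()
    return [m for m in memories if m.current_score(now) >= PRUNE_THRESHOLD]
```

Memories that keep getting reinforced accumulate score faster than they decay; everything else drifts below the threshold and disappears on the next prune. That's the forgetting-curve idea from above, applied to stored context instead of neurons.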
Getting started
import tinyhumansai as api

client = api.TinyHumanMemoryClient("YOUR_APIKEY_HERE")

# Store a single memory
client.ingest_memory({
    "key": "user-preference-theme",
    "content": "User prefers dark mode",
    "namespace": "preferences",
    "metadata": {"source": "onboarding"},
})

# Ask an LLM something from memory
response = client.recall_with_llm(
    prompt="What is the user's preference for theme?",
    api_key="OPENAI_API_KEY",
)
print(response.text)  # The user prefers dark mode
Exciting use cases
- Support bots that actually learn – ingest ticket history, let outdated workarounds decay naturally, give agents per‑customer context without re‑reading entire conversation logs every time.
- Company knowledge agents – a graph‑based memory layer that understands who decided what and why is far more useful than semantic search over a pile of docs.
- Personal assistants that remember – not just within a session but across weeks and months. You told it you’re vegetarian in January; it filters restaurants in March without a reminder.
Get involved
If you want access or just want to follow along, reach out.
Drop a comment if you’ve run into this problem before—I’m curious how other devs are handling memory in their AI apps right now. Most people seem to be either ignoring it or duct‑taping something together.
— neocoder (dev @ tinyhumansai)