I Built a Persistent Memory API for AI Bots in One Day (Here's How)

Published: February 24, 2026 at 01:15 PM EST
4 min read
Source: Dev.to

The Problem: Bots Have Amnesia

Every AI bot conversation starts from zero. No context, no memory, no learning. Your customer‑service bot doesn’t remember the user it talked to yesterday. Your sales assistant forgets every preference the moment the session ends. Your coding agent has no idea what you built last week.

This isn’t a model problem. GPT‑4, Claude, Gemini — they’re all stateless by design. The memory problem is an infrastructure problem.

So I built EngramPort to fix it.

What Is EngramPort?

EngramPort is a Memory‑as‑a‑Service API. Any AI bot can connect to it and get persistent, semantic memory in three API calls. No vector database to manage, no embedding pipeline to build, no infrastructure to scale. Just HTTP and JSON.

The 3 Core Endpoints

1. Register your bot

curl https://engram.eideticlab.com/api/v1/portal/register \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{
    "bot_name": "my-support-bot",
    "bot_type": "support",
    "owner_email": "you@company.com"
  }'

Returns

{
  "namespace": "bot:my-support-bot:a8f3k2",
  "api_key": "ek_bot_...",
  "manifest": {
    "memory_active": true,
    "capabilities": [
      "I now remember our conversations across sessions",
      "I can surface patterns from our history",
      "I learn what matters to you over time"
    ]
  }
}

2. Store a memory

curl https://engram.eideticlab.com/api/v1/portal/remember \
  -X POST \
  -H "X-API-Key: ek_bot_..." \
  -H "Content-Type: application/json" \
  -d '{
    "content": "User prefers concise answers, works in roofing sales",
    "session_id": "session-001"
  }'

3. Recall relevant memories

curl https://engram.eideticlab.com/api/v1/portal/recall \
  -X POST \
  -H "X-API-Key: ek_bot_..." \
  -H "Content-Type: application/json" \
  -d '{
    "query": "what do I know about this user?",
    "limit": 5
  }'

Returns semantically relevant memories ranked by cosine similarity (meaning search, not keyword search).
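To make "ranked by cosine similarity" concrete, here is a toy sketch of the ranking idea (not EngramPort's actual implementation, and with 3-dimensional stand-ins for the real 3072-dimensional vectors):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: closer to 1.0 = closer in meaning."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Toy "embeddings" — real ones come from an embedding model
query = [0.9, 0.1, 0.0]
memories = {
    "prefers concise answers": [0.8, 0.2, 0.1],
    "works in roofing sales":  [0.1, 0.9, 0.3],
}

# Rank memories by how closely their vectors point along the query's direction
ranked = sorted(memories, key=lambda m: cosine_similarity(query, memories[m]),
                reverse=True)
print(ranked[0])
```

This is why recall finds "prefers concise answers" for a query like "how should I format replies?" even though they share no keywords: the vectors encode meaning, and the ranking compares directions, not words.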

How It Works Under The Hood

The Memory Stack

Every memory goes through three layers:

  1. Embedding – OpenAI text-embedding-3-large converts your content into a 3072‑dimensional vector.
  2. Storage – The vector is upserted to Pinecone with full metadata, scoped to your bot’s namespace.
  3. Provenance – AEGIS security layer mints a dual‑strand SHA‑256 + RSA signature on every memory, providing a cryptographic receipt.
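The three layers chain together roughly like this. This is a sketch, not the service's code: the OpenAI and Pinecone calls are stubbed out, and the provenance step shows only the SHA-256 half (the real AEGIS layer pairs it with an RSA signature):

```python
import hashlib

def embed(content: str) -> list[float]:
    # Layer 1 (stub): in production this calls text-embedding-3-large
    # and returns a 3072-dimensional vector.
    return [0.0] * 3072

def store(namespace: str, vector: list[float], metadata: dict) -> None:
    # Layer 2 (stub): in production this upserts to Pinecone,
    # scoped to the bot's namespace.
    pass

def provenance(content: str) -> str:
    # Layer 3 (partial): SHA-256 digest as a tamper-evident receipt.
    return hashlib.sha256(content.encode()).hexdigest()

def remember(namespace: str, content: str, session_id: str) -> dict:
    vector = embed(content)
    receipt = provenance(content)
    store(namespace, vector,
          {"content": content, "session_id": session_id, "sha256": receipt})
    return {"stored": True, "sha256": receipt}

result = remember("bot:my-support-bot:a8f3k2",
                  "User prefers concise answers", "session-001")
```

The digest gives you a cheap integrity check: re-hash the content later and compare; any mismatch means the memory was altered after it was stored.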

Namespace Isolation

Each bot lives in its own namespace, e.g.:

bot:my-support-bot:a8f3k2

Zero cross‑contamination: Bot A can never read Bot B’s memories, enforced at the vector‑database level.
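The guarantee is easy to picture: every read and write carries the namespace, so a bot's queries can only ever address its own partition. An in-memory toy model of the idea (the real enforcement happens inside the vector database, not in application code):

```python
from collections import defaultdict

class NamespacedStore:
    """Toy model of per-bot isolation: all access is keyed by namespace."""

    def __init__(self):
        self._partitions = defaultdict(list)

    def upsert(self, namespace: str, memory: str) -> None:
        self._partitions[namespace].append(memory)

    def query(self, namespace: str) -> list[str]:
        # A bot's API key resolves to exactly one namespace, so this is
        # the only partition its queries can ever touch.
        return list(self._partitions[namespace])

store = NamespacedStore()
store.upsert("bot:my-support-bot:a8f3k2", "prefers concise answers")
store.upsert("bot:other-bot:z9x1y4", "likes long essays")

print(store.query("bot:my-support-bot:a8f3k2"))
```

There is no query shape that crosses partitions: Bot A's key never yields Bot B's namespace, so Bot B's memories are unreachable by construction.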

The /reflect Endpoint

Call /reflect and the system:

  • Pulls your top 20 memories
  • Sends them to GPT‑4o‑mini
  • Synthesizes 3‑5 cross‑cutting insights
  • Stores them back as INSIGHT nodes

Run it on a nightly cron and your bot gets smarter automatically, without human intervention.

curl https://engram.eideticlab.com/api/v1/portal/reflect \
  -X POST \
  -H "X-API-Key: ek_bot_..." \
  -H "Content-Type: application/json" \
  -d '{"topic": "user preferences"}'

Returns

{
  "insights": [
    {
      "content": "User consistently prefers bullet points over paragraphs for technical explanations",
      "confidence": 0.91,
      "source_memory_count": 8
    }
  ],
  "synthesis_cost_usd": 0.000042
}
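Wiring this into a nightly cron can be as small as one script. A sketch using only the standard library (the endpoint and headers are the ones shown above; schedule it with something like `0 3 * * *` in crontab):

```python
import json
import urllib.request

ENGRAMPORT_URL = "https://engram.eideticlab.com/api/v1/portal"
API_KEY = "ek_bot_..."  # your key from /register

def build_reflect_request(topic: str) -> urllib.request.Request:
    """Assemble the nightly /reflect call (run this script from cron)."""
    return urllib.request.Request(
        f"{ENGRAMPORT_URL}/reflect",
        data=json.dumps({"topic": topic}).encode(),
        headers={"X-API-Key": API_KEY, "Content-Type": "application/json"},
        method="POST",
    )

if __name__ == "__main__":
    with urllib.request.urlopen(build_reflect_request("user preferences")) as resp:
        for insight in json.load(resp).get("insights", []):
            print(f'{insight["confidence"]:.2f}  {insight["content"]}')
```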

Integrating With LangChain

import requests

ENGRAMPORT_KEY = "ek_bot_..."
ENGRAMPORT_URL = "https://engram.eideticlab.com/api/v1/portal"

def remember(content: str, session_id: str):
    requests.post(f"{ENGRAMPORT_URL}/remember",
        headers={"X-API-Key": ENGRAMPORT_KEY},
        json={"content": content, "session_id": session_id}
    )

def recall(query: str) -> list:
    r = requests.post(f"{ENGRAMPORT_URL}/recall",
        headers={"X-API-Key": ENGRAMPORT_KEY},
        json={"query": query, "limit": 5}
    )
    r.raise_for_status()
    return r.json()["memories"]

# user_message and session_id come from your chat loop

# Before LLM call — inject relevant memories
memories = recall(user_message)
context = "\n".join(m["content"] for m in memories)

# After LLM response — store the exchange
remember(f"User asked: {user_message}", session_id)

That’s the entire integration—about 20 lines. Your LangChain bot now has persistent memory.
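One design choice worth calling out: inject the recalled context as a system preamble rather than appending it to the user's message, so the model treats it as background knowledge instead of part of the question. A minimal sketch (the message shape mirrors the common chat-completion format; the names here are illustrative, not part of EngramPort's API):

```python
def build_messages(context: str, user_message: str) -> list[dict]:
    """Prepend recalled memories as system context before the user's turn."""
    system = (
        "You are a support assistant. Relevant things you remember "
        "about this user:\n" + context
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user_message},
    ]

messages = build_messages(
    context="User prefers concise answers\nWorks in roofing sales",
    user_message="What warranty options do we offer?",
)
```

Pass `messages` to your chat model of choice; the recalled memories ride along on every turn without the user ever repeating themselves.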

The Infrastructure

  • Google Cloud Run – serverless, auto‑scales to zero
  • Pinecone – vector storage, namespaced per bot
  • Supabase – API keys, tenant records, audit logs
  • OpenAI – embeddings + GPT‑4o‑mini for synthesis
  • AEGIS – zero‑trust security, RSA provenance

Pricing

| Plan       | Price   | Memories / mo | Namespaces | Reflect |
|------------|---------|---------------|------------|---------|
| Free       | $0      | 100           | 1          | —       |
| Starter    | $29/mo  | 10,000        | 3          | Yes     |
| Pro        | $99/mo  | 100,000       | 10         | Yes     |
| Enterprise | Custom  | Unlimited     | Unlimited  | Yes     |

Try It Now

Free tier is live. Register your bot in 60 seconds:

https://engram.eideticlab.com/

Happy to answer any technical questions in the comments.
