Never Repeat Yourself: Give Your LLM Apps Persistent Memory with ContextMD

Published: March 1, 2026 at 05:36 PM EST

5 min read

Source: Dev.to

TL;DR: ContextMD is a Python middleware that adds persistent memory to OpenAI, Anthropic, and LiteLLM API calls. Store conversations in human‑readable Markdown files, automatically extract facts, and bootstrap them back into future requests.

The Problem: LLMs Have No Memory

If you’ve built anything with LLM APIs, you’ve hit this wall:

import openai

# Conversation 1
response = openai.chat.completions.create(
    model="gpt-4.1",
    messages=[{"role": "user", "content": "I prefer TypeScript over JavaScript"}]
)

# Conversation 2 (hours later)
response = openai.chat.completions.create(
    model="gpt-4.1",
    messages=[{"role": "user", "content": "Help me build a React component"}]
)
# Assistant suggests JavaScript... again! 😤

LLMs are stateless. Each request starts fresh. You have to manually pass conversation history, and even then, it’s temporary. What if you want your AI to remember:

User preferences across sessions?
Decisions made weeks ago?
Project context that was established months back?

Enter ContextMD: Persistent Memory for LLMs

ContextMD is a lightweight Python middleware that sits between your code and the LLM provider. It:

Automatically injects stored memory into every API request
Extracts memorable facts from responses and saves them
Stores everything in human‑readable Markdown files (no database required!)
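The three steps above can be pictured as a small loop around a Markdown file. What follows is a minimal sketch of that idea, not ContextMD's actual implementation: the function names (`inject_memory`, `extract_facts`, `save_facts`) are hypothetical, and the regex-based extraction is a deliberately naive stand-in for whatever extraction the library really performs.

```python
import re
import tempfile
from pathlib import Path


def inject_memory(messages, memory_path):
    """Return a new message list with stored memory prepended as a system message."""
    if not memory_path.exists():
        return list(messages)
    memory = memory_path.read_text(encoding="utf-8").strip()
    if not memory:
        return list(messages)
    system = {"role": "system", "content": "Known facts about the user:\n" + memory}
    return [system, *messages]


def extract_facts(text):
    """Naive stand-in for fact extraction: keep sentences starting with 'I prefer'."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [s.rstrip(".!?") for s in sentences if s.lower().startswith("i prefer")]


def save_facts(facts, memory_path):
    """Append each fact as a Markdown bullet, skipping facts already on file."""
    existing = memory_path.read_text(encoding="utf-8") if memory_path.exists() else ""
    with memory_path.open("a", encoding="utf-8") as fh:
        for fact in facts:
            line = f"- {fact}"
            if line not in existing:
                fh.write(line + "\n")


# Use a temporary directory so the sketch never touches a real project.
with tempfile.TemporaryDirectory() as tmp:
    memory_file = Path(tmp) / "MEMORY.md"
    save_facts(extract_facts("I prefer TypeScript over JavaScript."), memory_file)
    request = inject_memory(
        [{"role": "user", "content": "Help me build a React component"}], memory_file
    )
    print(request[0]["content"])
```

The second request now carries the stored preference in a system message, which is the behaviour the list above describes.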

Quick Start: 3 Lines to Add Memory

from openai import OpenAI
from contextmd import ContextMD

# Wrap your existing client
client = ContextMD(OpenAI(), memory_dir=".contextmd/")

# Use exactly like normal – memory is automatic!
response = client.chat.completions.create(
    model="gpt-4.1",
    messages=[{"role": "user", "content": "I prefer TypeScript over JavaScript"}]
)

# The fact is now saved and will be injected into future requests

That’s it! Your OpenAI client now has persistent memory. No database setup, no extra API keys—just local Markdown files.
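Keeping `client.chat.completions.create(...)` callable on a wrapped client suggests some form of attribute delegation. The sketch below shows one way to get that behaviour in plain Python. It illustrates the pattern only and is not taken from ContextMD's source: `MemoryProxy`, `fake_create`, and `fake_client` are hypothetical names, and the fake client stands in for a real SDK so the example runs offline.

```python
from types import SimpleNamespace


class MemoryProxy:
    """Delegate attribute access to a wrapped object, intercepting `create` calls."""

    def __init__(self, target, memory):
        self._target = target
        self._memory = memory  # list of remembered fact strings

    def __getattr__(self, name):
        # Only called for attributes that are not found on the proxy itself.
        attr = getattr(self._target, name)
        if name == "create" and callable(attr):
            return self._wrap_create(attr)
        if isinstance(attr, SimpleNamespace):
            # Keep proxying nested namespaces such as `.chat.completions`.
            # (A real SDK uses its own resource classes; this is a simplification.)
            return MemoryProxy(attr, self._memory)
        return attr

    def _wrap_create(self, create):
        def create_with_memory(*, messages, **kwargs):
            if self._memory:
                facts = "\n".join(f"- {fact}" for fact in self._memory)
                messages = [{"role": "system", "content": facts}, *messages]
            return create(messages=messages, **kwargs)

        return create_with_memory


def fake_create(*, model, messages):
    """Stand-in for a provider call: echoes back what it received."""
    return {"model": model, "messages": messages}


fake_client = SimpleNamespace(
    chat=SimpleNamespace(completions=SimpleNamespace(create=fake_create))
)

client = MemoryProxy(fake_client, memory=["Prefers TypeScript over JavaScript"])
result = client.chat.completions.create(
    model="gpt-4.1",
    messages=[{"role": "user", "content": "Help me build a React component"}],
)
print(result["messages"][0])
```

The calling code is unchanged; the only difference is the extra system message the proxy slips in before delegating.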

How It Works Under the Hood

ContextMD creates a .contextmd/ directory in your project:

.contextmd/
├── MEMORY.md              # Semantic facts (200‑line cap)
├── config.md              # Configuration
├── memory/
│   ├── 2024-03-01.md      # Daily episodic logs
│   └── 2024-03-02.md
└── sessions/
    └── 2024-03-01-auth.md # Session snapshots

Three Types of Memory

Semantic Memory – Permanent facts about the user/project

## User Preferences
- Prefers TypeScript over JavaScript
- Uses dark mode themes
- Follows atomic git commits

Episodic Memory – Time‑stamped events

## 2024-03-01 14:30
- Decided to use Next.js for the frontend
- Completed authentication feature

Procedural Memory – Learned workflows

## Workflows
- Always run tests before committing
- Use pnpm for package management
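Because each memory type is just headings and bullets, reading it back takes only a few lines. The following is a rough sketch of such a reader; `parse_memory` is a hypothetical helper written for this post, not a function exported by ContextMD.

```python
def parse_memory(markdown):
    """Group `- ` bullets under the most recent `## ` heading."""
    sections = {}
    current = None
    for raw in markdown.splitlines():
        line = raw.strip()
        if line.startswith("## "):
            current = line[3:].strip()
            sections.setdefault(current, [])
        elif line.startswith("- ") and current is not None:
            sections[current].append(line[2:].strip())
    return sections


sample = """\
## User Preferences
- Prefers TypeScript over JavaScript
- Uses dark mode themes

## Workflows
- Always run tests before committing
"""

memory = parse_memory(sample)
for heading, facts in memory.items():
    print(heading, "->", facts)
```

Bullets that appear before any heading are ignored here; a real parser would need to decide what to do with them.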

Real‑World Example: AI Coding Assistant

from openai import OpenAI
from contextmd import ContextMD

client = ContextMD(OpenAI())

# First conversation – establish context
response = client.chat.completions.create(
    model="gpt-4.1",
    messages=[{
        "role": "user",
        "content": "I'm building a React app with TypeScript, Tailwind CSS, and Next.js. I prefer functional components with hooks."
    }]
)

# ContextMD automatically saves these facts

# A week later – new conversation
response = client.chat.completions.create(
    model="gpt-4.1",
    messages=[{
        "role": "user",
        "content": "Help me create a user profile component"
    }]
)

# Response automatically uses TypeScript, Tailwind, functional components!

No need to repeat your tech stack or restate your preferences: ContextMD injects the saved facts into the new request.

Manual Memory Control

Sometimes you want to explicitly remember something:

# Remember a decision
client.remember(
    "Chose PostgreSQL over MongoDB for better ACID compliance",
    memory_type="semantic"
)

# Remember a completed task
client.remember(
    "Implemented OAuth2 authentication with refresh tokens",
    memory_type="episodic"
)

# Remember a workflow rule
client.remember(
    "Always write tests before refactoring",
    memory_type="procedural"
)
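To make the `memory_type` argument concrete, here is a small sketch of how a call like this could be routed to a file. The mapping is an assumption based on the directory layout shown earlier (episodic entries in a dated file under `memory/`, everything else in `MEMORY.md`); `target_file` and this standalone `remember` are hypothetical and do not reflect ContextMD's internals.

```python
import tempfile
from datetime import date
from pathlib import Path

VALID_TYPES = {"semantic", "episodic", "procedural"}


def target_file(root, memory_type, today):
    """Pick the Markdown file for a memory type (the mapping is an assumption)."""
    if memory_type not in VALID_TYPES:
        raise ValueError(f"unknown memory_type: {memory_type!r}")
    if memory_type == "episodic":
        return root / "memory" / f"{today.isoformat()}.md"
    return root / "MEMORY.md"  # semantic and procedural share one file here


def remember(root, text, memory_type, today=None):
    """Append `text` as a bullet to the file chosen by `target_file`."""
    path = target_file(root, memory_type, today or date.today())
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as fh:
        fh.write(f"- {text}\n")
    return path


# Write into a temporary directory so nothing lands in a real project.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / ".contextmd"
    written = remember(
        root,
        "Implemented OAuth2 authentication with refresh tokens",
        "episodic",
        today=date(2024, 3, 1),
    )
    relative = written.relative_to(root).as_posix()
    content = written.read_text(encoding="utf-8")
    print(relative)
```

Rejecting unknown types up front keeps a typo such as `"episodc"` from silently landing in the wrong file.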

CLI Tools for Memory Management

ContextMD comes with a handy CLI:

# Initialize in your project
contextmd init

# View what the AI remembers about you
contextmd show

# See recent activity
contextmd history --hours 24

# List all sessions
contextmd sessions

# Manually add a fact
contextmd add "User loves Vim keybindings" --memory_type semantic

# Get statistics
contextmd stats
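As an illustration of what a command like `contextmd history --hours 24` has to do, the sketch below filters episodic entries by their timestamp headings. It assumes the `## YYYY-MM-DD HH:MM` heading format from the episodic example above; `recent_entries` is a hypothetical helper, not the CLI's real implementation.

```python
from datetime import datetime, timedelta

HEADING_FORMAT = "## %Y-%m-%d %H:%M"


def recent_entries(markdown, now, hours):
    """Return bullets from timestamped sections that fall within the last `hours`."""
    cutoff = now - timedelta(hours=hours)
    keep = False
    entries = []
    for raw in markdown.splitlines():
        line = raw.strip()
        if line.startswith("## "):
            try:
                stamp = datetime.strptime(line, HEADING_FORMAT)
            except ValueError:
                keep = False  # not a timestamp heading; skip its bullets
                continue
            keep = cutoff <= stamp <= now
        elif keep and line.startswith("- "):
            entries.append(line[2:])
    return entries


log = """\
## 2024-02-28 09:00
- Sketched the data model

## 2024-03-01 14:30
- Decided to use Next.js for the frontend
- Completed authentication feature
"""

print(recent_entries(log, now=datetime(2024, 3, 1, 18, 0), hours=24))
```

Passing `now` explicitly keeps the function deterministic, which makes this kind of filter easy to test.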

Works with All Major Providers

ContextMD is provider‑agnostic:

from anthropic import Anthropic
from openai import OpenAI
from contextmd import ContextMD

# OpenAI
client = ContextMD(OpenAI())

# Anthropic
client = ContextMD(Anthropic())

# LiteLLM (100+ providers)
import litellm
client = ContextMD(litellm)

Advanced Features

Session Management

Group related conversations:

with client.session("project-setup"):
    response = client.chat.completions.create(
        model="gpt-4.1",
        messages=[{"role": "user", "content": "Set up the initial repo structure"}]
    )

(Further session‑related APIs are documented in the official README.)

with client.new_session("feature-auth") as session:
    # All conversations here are grouped
    response = client.chat.completions.create(...)
    # Session snapshot saved automatically
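A session that saves a snapshot when its block exits maps naturally onto a Python context manager. Here is a minimal sketch of that pattern under stated assumptions: `SessionRecorder` and its `record` method are hypothetical, and the `sessions/<date>-<name>.md` filename is inferred from the directory layout shown earlier rather than confirmed against the library.

```python
import tempfile
from contextlib import contextmanager
from datetime import date
from pathlib import Path


class SessionRecorder:
    """Collect notes while a session is open and write a snapshot when it closes."""

    def __init__(self, root):
        self.root = Path(root)
        self.notes = None  # None means no session is active

    def record(self, note):
        if self.notes is None:
            raise RuntimeError("no active session")
        self.notes.append(note)

    @contextmanager
    def session(self, name, today=None):
        self.notes = []
        try:
            yield self
        finally:
            # Runs even if the body raises, so the snapshot is not lost.
            day = (today or date.today()).isoformat()
            path = self.root / "sessions" / f"{day}-{name}.md"
            path.parent.mkdir(parents=True, exist_ok=True)
            lines = [f"## Session: {name}", *(f"- {n}" for n in self.notes)]
            path.write_text("\n".join(lines) + "\n", encoding="utf-8")
            self.notes = None


# Write into a temporary directory so nothing lands in a real project.
with tempfile.TemporaryDirectory() as tmp:
    recorder = SessionRecorder(Path(tmp) / ".contextmd")
    with recorder.session("auth", today=date(2024, 3, 1)) as session:
        session.record("Chose OAuth2 with refresh tokens")
    snapshot = (recorder.root / "sessions" / "2024-03-01-auth.md").read_text(
        encoding="utf-8"
    )
    print(snapshot)
```

Putting the write in a `finally` clause is the design choice that makes "saved automatically" hold even when a request inside the block fails.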

Custom Configuration

Fine‑tune how ContextMD works:

from openai import OpenAI
from contextmd import ContextMD, ContextMDConfig

config = ContextMDConfig(
    memory_line_cap=200,           # Max lines in MEMORY.md
    bootstrap_window_hours=48,     # Hours of episodic memory to load
    compaction_threshold=0.8,      # Token threshold for extraction
    extraction_frequency="session_end",  # When to extract facts
)

client = ContextMD(OpenAI(), config=config)
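To give a feel for what two of these settings might control, here is a small sketch. Both interpretations are assumptions, not documented behaviour: that the line cap drops the oldest lines first, and that `compaction_threshold` is a fraction of the model's context window. `enforce_line_cap` and `should_compact` are hypothetical helpers.

```python
def enforce_line_cap(lines, cap):
    """Keep at most `cap` lines, dropping the oldest (first) ones."""
    if cap <= 0:
        return []
    return lines[-cap:]


def should_compact(used_tokens, context_window, threshold):
    """True once usage reaches `threshold` as a fraction of the context window."""
    if context_window <= 0:
        raise ValueError("context_window must be positive")
    return used_tokens / context_window >= threshold


facts = [f"- fact {i}" for i in range(250)]
trimmed = enforce_line_cap(facts, cap=200)
print(len(trimmed), trimmed[0])
print(should_compact(used_tokens=110_000, context_window=128_000, threshold=0.8))
```

A real implementation would more likely summarise or merge old facts than discard them outright; the slice above is only the simplest policy that respects the cap.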

Why Markdown Files?

Human‑readable – You can actually read and edit the memory.
Git‑friendly – Version‑control your AI’s memory.
No vendor lock‑in – Your data stays local.
Debuggable – See exactly what the AI remembers.

What’s Next?

ContextMD is actively developed, with more features on the way.

Get Started

pip install contextmd

Check out the GitHub repo for full documentation and examples.

Build Something Amazing

With ContextMD, you can build:

AI assistants that remember user preferences
Code generators that know your project’s patterns
Chatbots that maintain context across days
Learning tools that track progress over time

The possibilities are endless when your LLM finally has a memory.

What will you build with persistent memory? Share in the comments below!

P.S. Star the repo on GitHub – it helps more people discover ContextMD!