Build a Self-Evolving Memory Agent in 150 Lines

Published: January 19, 2026, 02:00 (GMT+8)
6 min read

Source: Dev.to

Introduction

Runnable companion code for the Memory Architecture series – no external dependencies. Copy, paste, and run.

What the Skeleton Demonstrates

  • Inner loop – runtime behavior (encode → store → retrieve → manage); see the sketch right after this list
  • Outer loop – architecture evolution (the configuration adapts based on performance)
  • Four rooms – encode, store, retrieve, and manage treated as separate concerns
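
At a glance, the two loops nest like this – a minimal sketch, assuming the skeleton below is saved as self_evolving_agent.py and reusing two of the demo queries:

```python
# Minimal sketch of the two nested loops (see the full skeleton below).
from self_evolving_agent import Agent, DummyModel, Memory

agent = Agent(Memory(), DummyModel())
tasks = [
    ("How do I process a refund?", "refund"),
    ("User cannot sign in, what now?", "login"),
]

for i, (query, label) in enumerate(tasks, start=1):
    agent.handle_task(query, label)           # inner loop: encode → store → retrieve → manage
    if i % 2 == 0:                            # outer loop: every few tasks...
        agent.evolve_memory_architecture()    # ...re-tune top_k, sim_threshold, decay_prob
```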
```
python self_evolving_agent.py
```

```python
"""
self_evolving_agent.py

A minimal, runnable skeleton of a self‑evolving memory agent.
No external dependencies. Uses fake embeddings so you can see
the loop behavior end‑to‑end before swapping in real components.
"""

import json
import math
import random
from typing import List, Dict, Any, Tuple

# ----------------------------------------------------------------------
# Utility: fake embedding + similarity
# ----------------------------------------------------------------------

def fake_embed(text: str) -> List[float]:
    """Naïve embedding: character‑frequency vector. Replace with a real model."""
    counts = [0.0] * 26
    for ch in text.lower():
        if "a"  float:
    """Cosine similarity for two normalized vectors."""
    return sum(x * y for x, y in zip(a, b))

# ----------------------------------------------------------------------
# Memory architecture (The Four Rooms)
# ----------------------------------------------------------------------

class MemoryItem:
    def __init__(self, text: str, vector: List[float], label: str = ""):
        self.text = text
        self.vector = vector
        self.label = label

class Memory:
    """Container that holds items and provides the four‑room API."""

    def __init__(self):
        # Config knobs — these are what the outer loop evolves
        self.top_k = 3
        self.sim_threshold = 0.2
        self.decay_prob = 0.0
        self.items: List[MemoryItem] = []

        # Stats for drift detection
        self.total_retrievals = 0
        self.successful_retrievals = 0

    # ------------------------------------------------------------------
    # ROOM 1: ENCODE
    # ------------------------------------------------------------------
    def encode(self, text: str) -> List[float]:
        return fake_embed(text)

    # ------------------------------------------------------------------
    # ROOM 2: STORE
    # ------------------------------------------------------------------
    def store(self, text: str, label: str = "") -> None:
        vec = self.encode(text)
        self.items.append(MemoryItem(text, vec, label))

    # ------------------------------------------------------------------
    # ROOM 3: RETRIEVE
    # ------------------------------------------------------------------
    def retrieve(self, query: str) -> List[MemoryItem]:
        if not self.items:
            return []

        q_vec = self.encode(query)
        scored: List[Tuple[float, MemoryItem]] = []
        for item in self.items:
            sim = cosine_sim(q_vec, item.vector)
            if sim >= self.sim_threshold:
                scored.append((sim, item))

        scored.sort(key=lambda x: x[0], reverse=True)
        results = [it for _, it in scored[: self.top_k]]

        # Update diagnostics
        self.total_retrievals += 1
        if results:
            self.successful_retrievals += 1

        return results

    # ------------------------------------------------------------------
    # ROOM 4: MANAGE
    # ------------------------------------------------------------------
    def manage(self) -> None:
        """Randomly decay items according to `decay_prob`."""
        if self.decay_prob <= 0 or not self.items:
            return
        self.items = [
            item for item in self.items if random.random() > self.decay_prob
        ]

    # ------------------------------------------------------------------
    # DIAGNOSTICS
    # ------------------------------------------------------------------
    def retrieval_success_rate(self) -> float:
        if self.total_retrievals == 0:
            return 1.0
        return self.successful_retrievals / self.total_retrievals

    def size(self) -> int:
        return len(self.items)

    def to_config(self) -> Dict[str, Any]:
        return {
            "top_k": self.top_k,
            "sim_threshold": round(self.sim_threshold, 3),
            "decay_prob": round(self.decay_prob, 3),
            "size": self.size(),
            "retrieval_success_rate": round(self.retrieval_success_rate(), 3),
        }

# ----------------------------------------------------------------------
# Model stub
# ----------------------------------------------------------------------

class DummyModel:
    """Stub LLM: echoes query + context. Replace with a real model."""

    def run(self, query: str, context: List[MemoryItem]) -> str:
        ctx_texts = [f"  [{i.label}] {i.text}" for i in context]
        if ctx_texts:
            return f"Q: {query}\nContext:\n" + "\n".join(ctx_texts)
        return f"Q: {query}\nContext: (none)"

# ----------------------------------------------------------------------
# Agent: Inner Loop + Outer Loop
# ----------------------------------------------------------------------

class Agent:
    def __init__(self, memory: Memory, model: DummyModel):
        self.memory = memory
        self.model = model
        self.history: List[Dict[str, Any]] = []

    # ------------------------------------------------------------------
    # INNER LOOP (runtime)
    # ------------------------------------------------------------------
    def handle_task(self, query: str, label: str) -> str:
        """Process a single query: store → retrieve → run model → manage."""
        self.memory.store(query, label=label)
        context = self.memory.retrieve(query)
        output = self.model.run(query, context)
        self.memory.manage()

        success = any(item.label == label for item in context)
        self.history.append({"query": query, "label": label, "success": success})
        return output

    # ------------------------------------------------------------------
    # OUTER LOOP (architecture evolution)
    # ------------------------------------------------------------------
    def evolve_memory_architecture(self) -> None:
        """Adapt the memory configuration based on recent performance."""
        success_rate = self.memory.retrieval_success_rate()
        size = self.memory.size()

        print("\n>>> OUTER LOOP: Evaluating memory architecture")
        print(f"    Before: {self.memory.to_config()}")

        # Adapt retrieval aggressiveness
        if success_rate < 0.5:            # low hit rate → search wider
            self.memory.top_k = min(self.memory.top_k + 1, 10)
            self.memory.sim_threshold = max(self.memory.sim_threshold - 0.02, 0.0)
        elif success_rate > 0.9:          # high hit rate → tighten
            self.memory.top_k = max(self.memory.top_k - 1, 1)
            self.memory.sim_threshold = min(self.memory.sim_threshold + 0.02, 0.8)

        # Adapt decay based on size
        if size > 100:
            self.memory.decay_prob = min(self.memory.decay_prob + 0.05, 0.5)
        elif size < 20:                   # small memory → ease off decay
            self.memory.decay_prob = max(self.memory.decay_prob - 0.05, 0.0)

        print(f"    After:  {self.memory.to_config()}")

    def dump_history(self, path: str = "agent_history.jsonl") -> None:
        """Write the agent's query history to a JSON‑Lines file."""
        with open(path, "w", encoding="utf-8") as f:
            for record in self.history:
                f.write(json.dumps(record) + "\n")

# ----------------------------------------------------------------------
# Demo / entry point
# ----------------------------------------------------------------------
if __name__ == "__main__":
    mem = Memory()
    model = DummyModel()
    agent = Agent(mem, model)

    # Simple demo: a few labelled queries
    demo_tasks = [
        ("What is the capital of France?", "geography"),
        ("Explain Newton's second law.", "physics"),
        ("Who wrote 'Pride and Prejudice'?", "literature"),
        ("What is the capital of France?", "geography"),  # repeat to test retrieval
    ]

    for q, lbl in demo_tasks:
        print("\n---")
        print(agent.handle_task(q, lbl))

        # Periodically evolve the architecture (e.g., every 2 tasks)
        if len(agent.history) % 2 == 0:
            agent.evolve_memory_architecture()

    # Persist the interaction log
    agent.dump_history()

```

Demo

```python

def main():
    memory = Memory()
    model = DummyModel()
    agent = Agent(memory, model)

    # Toy dataset: queries with category labels
    tasks = [
        ("How do I process a refund?", "refund"),
        ("Steps to issue a refund via card", "refund"),
        ("How to troubleshoot a login error?", "login"),
        ("User cannot sign in, what now?", "login"),
        ("How to update user email address?", "account"),
        ("Change account email for a customer", "account"),
    ] * 3

    random.shuffle(tasks)

    for i, (query, label) in enumerate(tasks, start=1):
        print(f"\n--- Task {i} ---")
        output = agent.handle_task(query, label)
        print(output)

        # Run outer loop every 5 tasks
        if i % 5 == 0:
            agent.evolve_memory_architecture()

    agent.dump_history()
    print("\n✓ Done. History written to agent_history.jsonl")

if __name__ == "__main__":
    main()

```

What to Expect When You Run It

  • Tasks 1–5 – the inner loop runs, memory fills up, and retrieval improves.
  • Outer loop fires – the configuration is adjusted based on the retrieval success rate.
  • Tasks 6–10 – behavior shifts because the architecture has changed.
  • Repeat – the agent keeps evolving its memory strategy on its own.

The output of to_config() shows exactly what changed and why.
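
For example, a freshly constructed Memory reports the default configuration shown below; after a few outer-loop passes the knobs drift, and the exact values depend on your task mix and the random decay:

```python
# Quick check of the starting configuration (defaults from the skeleton above),
# assuming the skeleton is saved as self_evolving_agent.py.
from self_evolving_agent import Memory

mem = Memory()
print(mem.to_config())
# {'top_k': 3, 'sim_threshold': 0.2, 'decay_prob': 0.0, 'size': 0,
#  'retrieval_success_rate': 1.0}
```

Comparing consecutive outer-loop printouts of to_config() is the fastest way to see which knobs moved on a given pass.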

Component Swap Guide

| Component | Options |
| --- | --- |
| `fake_embed()` | OpenAI, Cohere, or a local embedding model |
| `self.items` | Pinecone, Weaviate, Chroma, pgvector |
| `DummyModel` | Any LLM, via API or run locally |
| `evolve_memory_architecture()` | Your own adaptation logic |

The architecture stays the same; the components scale.
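
A minimal sketch of what a swap looks like in practice – only the ENCODE room changes, the rest of the four-room API is untouched. real_embed is a placeholder name for whichever embedding provider you pick:

```python
# Sketch: swap the fake embedding for a real one by overriding encode().
# `real_embed` is a placeholder; wire it to OpenAI, Cohere, sentence-transformers, etc.
from typing import List

from self_evolving_agent import Memory  # the skeleton file from this post


def real_embed(text: str) -> List[float]:
    """Placeholder for a real embedding call; return a normalized vector."""
    raise NotImplementedError("plug in your embedding provider here")


class RealEmbeddingMemory(Memory):
    """Same four-room API as Memory; only ENCODE is replaced."""

    def encode(self, text: str) -> List[float]:
        return real_embed(text)
```

DummyModel and the in-memory self.items list can be replaced the same way: keep the method signatures, change what happens inside.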

More in this series:

  • Why Memory Architecture Matters More Than Your Model – the concepts
  • How To Detect Memory Drift In Production Agents – metrics + alerts
  • Build a Self-Evolving Memory Agent in 150 Lines – you are here
  • The Two Loops – the conceptual framework, on Substack