Building a Self-Evolving Memory Agent in 150 Lines
Published: January 19, 2026, 03:00 GMT+9
6 min read
Source: Dev.to
## Introduction
A runnable companion to the Memory Architecture series – no external dependencies. Copy, paste, and run.

The skeleton demonstrates:
- Inner loop – runtime behavior (encode → store → retrieve → manage)
- Outer loop – architecture evolution (the configuration adapts based on performance)
- Four rooms – encode, store, retrieve, manage kept as separate concerns
```bash
python self_evolving_agent.py
```

```python
"""
self_evolving_agent.py
A minimal, runnable skeleton of a self‑evolving memory agent.
No external dependencies. Uses fake embeddings so you can see
the loop behavior end‑to‑end before swapping in real components.
"""
import json
import math
import random
from typing import List, Dict, Any, Tuple
# ----------------------------------------------------------------------
# Utility: fake embedding + similarity
# ----------------------------------------------------------------------
def fake_embed(text: str) -> List[float]:
    """Naïve embedding: character‑frequency vector. Replace with a real model."""
    counts = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            counts[ord(ch) - ord("a")] += 1.0
    # Normalize so cosine similarity reduces to a dot product
    norm = math.sqrt(sum(c * c for c in counts)) or 1.0
    return [c / norm for c in counts]

def cosine_sim(a: List[float], b: List[float]) -> float:
    """Cosine similarity for two normalized vectors."""
    return sum(x * y for x, y in zip(a, b))
# ----------------------------------------------------------------------
# Memory architecture (The Four Rooms)
# ----------------------------------------------------------------------
class MemoryItem:
    def __init__(self, text: str, vector: List[float], label: str = ""):
        self.text = text
        self.vector = vector
        self.label = label

class Memory:
    """Container that holds items and provides the four‑room API."""

    def __init__(self):
        # Config knobs — these are what the outer loop evolves
        self.top_k = 3
        self.sim_threshold = 0.2
        self.decay_prob = 0.0
        self.items: List[MemoryItem] = []
        # Stats for drift detection
        self.total_retrievals = 0
        self.successful_retrievals = 0
    # ------------------------------------------------------------------
    # ROOM 1: ENCODE
    # ------------------------------------------------------------------
    def encode(self, text: str) -> List[float]:
        return fake_embed(text)

    # ------------------------------------------------------------------
    # ROOM 2: STORE
    # ------------------------------------------------------------------
    def store(self, text: str, label: str = "") -> None:
        vec = self.encode(text)
        self.items.append(MemoryItem(text, vec, label))

    # ------------------------------------------------------------------
    # ROOM 3: RETRIEVE
    # ------------------------------------------------------------------
    def retrieve(self, query: str) -> List[MemoryItem]:
        if not self.items:
            return []
        q_vec = self.encode(query)
        scored: List[Tuple[float, MemoryItem]] = []
        for item in self.items:
            sim = cosine_sim(q_vec, item.vector)
            if sim >= self.sim_threshold:
                scored.append((sim, item))
        scored.sort(key=lambda x: x[0], reverse=True)
        results = [it for _, it in scored[: self.top_k]]
        # Update diagnostics
        self.total_retrievals += 1
        if results:
            self.successful_retrievals += 1
        return results
    # ------------------------------------------------------------------
    # ROOM 4: MANAGE
    # ------------------------------------------------------------------
    def manage(self) -> None:
        """Randomly decay items according to `decay_prob`."""
        if self.decay_prob <= 0:
            return
        self.items = [
            item for item in self.items if random.random() > self.decay_prob
        ]

    # ------------------------------------------------------------------
    # DIAGNOSTICS
    # ------------------------------------------------------------------
    def retrieval_success_rate(self) -> float:
        if self.total_retrievals == 0:
            return 1.0
        return self.successful_retrievals / self.total_retrievals
    def size(self) -> int:
        return len(self.items)

    def to_config(self) -> Dict[str, Any]:
        return {
            "top_k": self.top_k,
            "sim_threshold": round(self.sim_threshold, 3),
            "decay_prob": round(self.decay_prob, 3),
            "size": self.size(),
            "retrieval_success_rate": round(self.retrieval_success_rate(), 3),
        }
# ----------------------------------------------------------------------
# Model stub
# ----------------------------------------------------------------------
class DummyModel:
    """Stub LLM: echoes query + context. Replace with a real model."""

    def run(self, query: str, context: List[MemoryItem]) -> str:
        ctx_texts = [f" [{i.label}] {i.text}" for i in context]
        if ctx_texts:
            return f"Q: {query}\nContext:\n" + "\n".join(ctx_texts)
        return f"Q: {query}\nContext: (none)"
# ----------------------------------------------------------------------
# Agent: Inner Loop + Outer Loop
# ----------------------------------------------------------------------
class Agent:
    def __init__(self, memory: Memory, model: DummyModel):
        self.memory = memory
        self.model = model
        self.history: List[Dict[str, Any]] = []

    # ------------------------------------------------------------------
    # INNER LOOP (runtime)
    # ------------------------------------------------------------------
    def handle_task(self, query: str, label: str) -> str:
        """Process a single query: store → retrieve → run model → manage."""
        self.memory.store(query, label=label)
        context = self.memory.retrieve(query)
        output = self.model.run(query, context)
        self.memory.manage()
        success = any(item.label == label for item in context)
        self.history.append({"query": query, "label": label, "success": success})
        return output
    # ------------------------------------------------------------------
    # OUTER LOOP (architecture evolution)
    # ------------------------------------------------------------------
    def evolve_memory_architecture(self) -> None:
        """Adapt the memory configuration based on recent performance."""
        success_rate = self.memory.retrieval_success_rate()
        size = self.memory.size()
        print("\n>>> OUTER LOOP: Evaluating memory architecture")
        print(f"    Before: {self.memory.to_config()}")
        # Adapt retrieval aggressiveness
        if success_rate < 0.5:
            # Retrieval misses too often: widen the net
            self.memory.top_k = min(self.memory.top_k + 1, 10)
            self.memory.sim_threshold = max(self.memory.sim_threshold - 0.02, 0.0)
        elif success_rate > 0.9:
            # Retrieval is easy: tighten to keep context lean
            self.memory.top_k = max(self.memory.top_k - 1, 1)
            self.memory.sim_threshold = min(self.memory.sim_threshold + 0.02, 0.8)
        # Adapt decay based on size
        if size > 100:
            self.memory.decay_prob = min(self.memory.decay_prob + 0.05, 0.5)
        elif size < 20:
            self.memory.decay_prob = max(self.memory.decay_prob - 0.05, 0.0)
        print(f"    After:  {self.memory.to_config()}")

    def dump_history(self, path: str = "agent_history.jsonl") -> None:
        """Write the agent's query history to a JSON‑Lines file."""
        with open(path, "w", encoding="utf-8") as f:
            for record in self.history:
                f.write(json.dumps(record) + "\n")
# ----------------------------------------------------------------------
# Demo / entry point
# ----------------------------------------------------------------------
if __name__ == "__main__":
    mem = Memory()
    model = DummyModel()
    agent = Agent(mem, model)
    # Simple demo: a few labelled queries
    demo_tasks = [
        ("What is the capital of France?", "geography"),
        ("Explain Newton's second law.", "physics"),
        ("Who wrote 'Pride and Prejudice'?", "literature"),
        ("What is the capital of France?", "geography"),  # repeat to test retrieval
    ]
    for q, lbl in demo_tasks:
        print("\n---")
        print(agent.handle_task(q, lbl))
        # Periodically evolve the architecture (e.g., every 2 tasks)
        if len(agent.history) % 2 == 0:
            agent.evolve_memory_architecture()
    # Persist the interaction log
    agent.dump_history()
```
## Demo
```python
def main():
    memory = Memory()
    model = DummyModel()
    agent = Agent(memory, model)

    # Toy dataset: queries with category labels
    tasks = [
        ("How do I process a refund?", "refund"),
        ("Steps to issue a refund via card", "refund"),
        ("How to troubleshoot a login error?", "login"),
        ("User cannot sign in, what now?", "login"),
        ("How to update user email address?", "account"),
        ("Change account email for a customer", "account"),
    ] * 3
    random.shuffle(tasks)

    for i, (query, label) in enumerate(tasks, start=1):
        print(f"\n--- Task {i} ---")
        output = agent.handle_task(query, label)
        print(output)

        # Run outer loop every 5 tasks
        if i % 5 == 0:
            agent.evolve_memory_architecture()

    agent.dump_history()
    print("\n✓ Done. History written to agent_history.jsonl")

if __name__ == "__main__":
    main()
```
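Each line that dump_history() writes to agent_history.jsonl is one JSON record per task, with the query, label, and success fields set in Agent.handle_task. A minimal sketch for loading the log back in for analysis (assuming the default file path above):

```python
import json

# Load the interaction log written by Agent.dump_history()
with open("agent_history.jsonl", encoding="utf-8") as f:
    records = [json.loads(line) for line in f]

# Each record has the shape {"query": ..., "label": ..., "success": ...}
failures = [r for r in records if not r["success"]]
print(f"{len(failures)} of {len(records)} tasks retrieved no same-label context")
```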
## What to expect when you run it
- Tasks 1–5 – the inner loop runs, memory fills up, and retrieval improves.
- The outer loop kicks in – settings adjust based on the retrieval success rate.
- Tasks 6–10 – the architecture has changed, so behavior shifts.
- Repeat – the agent keeps evolving its own memory strategy.

The to_config() output shows exactly what changed and how (illustrated below).
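With the toy dataset above, the first outer-loop pass typically tightens retrieval because the success rate is high. The numbers below are illustrative, not captured output; they simply follow the adaptation rules in evolve_memory_architecture():

```
>>> OUTER LOOP: Evaluating memory architecture
    Before: {'top_k': 3, 'sim_threshold': 0.2, 'decay_prob': 0.0, 'size': 5, 'retrieval_success_rate': 1.0}
    After:  {'top_k': 2, 'sim_threshold': 0.22, 'decay_prob': 0.0, 'size': 5, 'retrieval_success_rate': 1.0}
```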
## Component swap guide
| Component | Options |
|---|---|
| fake_embed() | OpenAI, Cohere, or a local embedding model |
| self.items | Pinecone, Weaviate, Chroma, pgvector |
| DummyModel | Any LLM via an API, or a local model |
| evolve_memory_architecture() | Your own adaptation logic |
The architecture stays the same; the components scale up.
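As a minimal sketch of the first swap, here is fake_embed() replaced with a hosted embedding model. This assumes the OpenAI Python SDK (openai>=1.0) and an OPENAI_API_KEY in the environment; the model name is just an example, and nothing else in the skeleton needs to change.

```python
# Assumption: `pip install openai` and OPENAI_API_KEY set in the environment.
from typing import List

from openai import OpenAI

client = OpenAI()

def real_embed(text: str) -> List[float]:
    """Drop-in replacement for fake_embed(): returns an embedding vector."""
    resp = client.embeddings.create(model="text-embedding-3-small", input=text)
    return resp.data[0].embedding

# Wire it in by pointing Memory.encode at the new function:
#     def encode(self, text: str) -> List[float]:
#         return real_embed(text)
```

Since OpenAI embeddings come back unit-normalized, cosine_sim() can stay a plain dot product; only the store (self.items) needs to change if you also move to Pinecone, Weaviate, Chroma, or pgvector.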
## Related Reading
- Why memory architecture matters more than the model – the concepts
- How to detect memory drift in production agents – metrics + alerts
- Building a self-evolving memory agent in 150 lines – you are reading it
- The Two Loops – the conceptual framework, on Substack