Self-Evolving Agents: How to Build a Closed-Loop AI System That Writes and Optimizes Its Own Code

Published: 5 days ago (June 6, 2026, 05:00 AM GMT+9)

7 min read

Source: Dev.to

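We have all been there. You spend hours carefully crafting the perfect system prompt or tool description for an AI agent, and it works beautifully in initial testing. A week later, real production data throws something unexpected at it. The team's coding standards change, an edge case appears, or the underlying LLM is updated, and the agent's performance drops sharply.

To fix it, you have to dig through the logs yourself, diagnose the failure pattern, rewrite the prompt, and run manual tests.

This is an open-loop system: the job of closing the loop between performance feedback and behavioral adjustment is left entirely to an external controller, namely the human engineer.

But what if the agent could close this loop itself? What if it could measure its own performance, reflect on its failures, and autonomously rewrite its instructions, tool descriptions, and code to fit a new environment?

This is not science fiction; it is autonomous evolution. This article unpacks the engineering principles behind self-improving agents and builds a complete, production-grade Python library that uses DSPy and genetic algorithms to let an agent optimize its own skills.

(The concepts and code presented here are excerpted from my ebook, Hermes Agent, The Self-Evolving AI Workforce.)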

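The Thermodynamics of Software: The Closed Learning Loop

To understand why autonomous evolution is necessary, let's borrow an analogy from classical physics: the steam engine.

Early steam engines needed a human operator to keep adjusting the valve whenever the load changed, in order to hold pressure and speed steady. That is an open-loop system. The invention that truly opened up the Industrial Revolution was James Watt's centrifugal governor. This simple mechanical device relied on feedback: when the engine sped up, centrifugal force flung the flyballs outward, which mechanically throttled the valve and slowed the engine; when the engine slowed down, the balls dropped and the valve opened again.

The engine did not need a human to do its thinking. It had an internal feedback mechanism that regulated its own input according to the current load.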
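The difference between the two regimes can be made concrete with a toy simulation. This is only a sketch under stated assumptions: the function `simulate_engine`, the first-order engine model, and all constants are invented for illustration, and the controller accumulates its correction over time for simplicity, whereas a real flyball governor behaves closer to proportional control.

```python
from typing import List


def simulate_engine(
    loads: List[float],
    target_speed: float = 100.0,
    gain: float = 0.005,
    steps_per_load: int = 200,
    closed_loop: bool = True,
) -> List[float]:
    """Return the settled engine speed after each load level (toy model)."""
    speed = target_speed
    valve = 0.5  # valve opening in [0, 1]; 0.5 holds target speed at load 1.0
    settled = []
    for load in loads:
        for _ in range(steps_per_load):
            if closed_loop:
                # Feedback: too slow opens the valve, too fast throttles it.
                error = target_speed - speed
                valve = min(1.0, max(0.0, valve + gain * error))
            # Assumed dynamics: speed relaxes toward a level set by valve and load.
            equilibrium = 400.0 * valve - 100.0 * load
            speed += 0.1 * (equilibrium - speed)
        settled.append(round(speed, 1))
    return settled


# The load doubles halfway through the run.
closed = simulate_engine([1.0, 2.0], closed_loop=True)
open_loop = simulate_engine([1.0, 2.0], closed_loop=False)
print("closed loop:", closed)
print("open loop:  ", open_loop)
```

With the valve fixed, the engine settles far below the target once the load doubles; with feedback, it returns to roughly the target speed without anyone touching the valve.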

+-------------------------------------------------------------+
|                    Closed Learning Loop                     |
|                                                             |
|   +------------------+           +----------------------+   |
|   |  Current skill   | --------> |  Fitness evaluation  |   |
|   |  (prompt/code)   |           |  (heuristic / LLM)   |   |
|   +------------------+           +----------------------+   |
|            ^                                |               |
|            |                                v               |
|   +------------------+           +----------------------+   |
|   |  Validated       |           |  Persistent memory   |   |
|   |  mutation        |           |  (feedback / scores) |   |
|   +------------------+           +----------------------+   |
|            ^                                |               |
|            |                                v               |
|   +------------------+           +----------------------+   |
|   |  Constraints     |           |  [...]               |   |
+-------------------------------------------------------------+
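The cycle in the diagram (evaluate the current skill, record the score, propose a mutation, validate it, keep it only if it improves) can be sketched as a small generic loop. This is a minimal illustration, not the library built in this article: `evolve` and the toy `toy_mutate`, `toy_fitness`, and `toy_validate` helpers are hypothetical names, and a real system would use an LLM to propose mutations instead of a fixed rule list.

```python
from typing import Callable, List, Tuple


def evolve(
    skill: str,
    mutate: Callable[[str, int], str],
    fitness: Callable[[str], float],
    validate: Callable[[str, str], bool],
    iterations: int = 3,
) -> Tuple[str, float, List[Tuple[str, float]]]:
    """Keep the best-scoring skill text seen so far (simple hill climbing)."""
    best, best_score = skill, fitness(skill)
    memory = [(best, best_score)]  # persistent record of candidates and scores
    for step in range(iterations):
        candidate = mutate(best, step)
        if not validate(best, candidate):
            continue  # constraint violated: discard the mutation
        score = fitness(candidate)
        memory.append((candidate, score))
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score, memory


# Toy stand-ins (assumptions): a fixed rule list instead of an LLM mutator.
RULES = ["Check type hints.", "Check docstrings.", "x" * 500]


def toy_mutate(skill: str, step: int) -> str:
    return skill + " " + RULES[step % len(RULES)]


def toy_fitness(skill: str) -> float:
    # Fraction of required topics the instruction mentions.
    return sum(kw in skill for kw in ("type hints", "docstrings")) / 2


def toy_validate(original: str, evolved: str) -> bool:
    return len(evolved) <= 200  # length constraint only


best, score, memory = evolve("Review the code.", toy_mutate, toy_fitness, toy_validate)
print(best, score, len(memory))
```

The third mutation is far too long, so the validator rejects it before it is ever scored. The loop stays closed without a human deciding which candidates are acceptable.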

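# Imports required by the rest of this listing. The console.print calls use
# [bold blue]-style markup, so the rich library is assumed here.
from typing import Any, Dict, List, Tuple

import dspy
from rich.console import Console

console = Console()

# [...] the start of the listing, including the class that owns forward(), is missing

def forward(self, task: str) -> dspy.Prediction:
    # Inject the instruction dynamically into the predictor's context
    with dspy.settings.context(instruction=self.instruction.get()):
        return self.predictor(task=task)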

class ConstraintValidator:
    """Ensures evolved skills do not break safety, structural, or length constraints."""
    def __init__(self, max_chars: int = 1500):
        self.max_chars = max_chars

    def validate(self, original_skill: str, evolved_skill: str) -> Tuple[bool, str]:
        if len(evolved_skill) > self.max_chars:
            return False, f"Evolved skill length ({len(evolved_skill)}) exceeds limit of {self.max_chars} characters."

        # Prevent wiping out core functional hooks
        if "DO NOT" in original_skill and "DO NOT" not in evolved_skill:
            return False, "Evolved skill stripped out critical safety constraints ('DO NOT' clauses)."

        return True, "Passed all structural constraints."

class SyntheticDatasetBuilder:
    """Generates synthetic test cases based on the skill's description to evaluate performance."""
    def __init__(self, model_name: str):
        self.model_name = model_name

    def generate(self, skill_text: str, num_examples: int = 5) -> List[Dict[str, str]]:
        # "\\[" escapes the literal bracket for rich; the closing tag must stay unescaped
        console.print(f"[bold blue]\\[Dataset][/bold blue] Generating {num_examples} synthetic test cases using {self.model_name}...")
        # In practice, this calls an LLM to generate diverse inputs and expected outputs
        # We return a structured mock dataset representing a code-review task
        return [
            {
                "task": "def add(a,b):\n    return a+b",
                "expected": "Error: Missing spaces around operators, missing docstring, missing type hints."
            },
            {
                "task": "import os\ndef run_sys(cmd):\n    os.system(cmd)", 
                "expected": "Error: Security vulnerability: os.system call detected. Use subprocess with safety checks."
            },
            {
                "task": "class user:\n    def __init__(self, name):\n        self.name=name", 
                "expected": "Error: Class name 'user' should follow CamelCase naming conventions."
            },
            {
                "task": "def calculate_area(radius):\n    return 3.14 * radius ** 2",
                "expected": "Error: Missing type hints and docstrings. Consider using math.pi instead of a hardcoded float."
            },
            {
                "task": "def get_data(timeout=10):\n    pass",
                "expected": "Error: Missing docstring, missing return type hint."
            }
        ][:num_examples]

# --- Main SkillEvolver Implementation ---

class SkillEvolver:
    """
    Orchestrates the autonomous evolution of an agent's skill.
    Loads a skill -> Generates a test suite -> Iteratively mutates instruction -> Validates -> Saves.
    """
    def __init__(
        self,
        skill_name: str,
        initial_instruction: str,
        iterations: int = 3,
        eval_model: str = "gpt-4o-mini",
        max_instruction_length: int = 1000,
    ):
        self.skill_name = skill_name
        self.instruction = initial_instruction
        self.iterations = iterations
        self.eval_model = eval_model

        self.validator = ConstraintValidator(max_chars=max_instruction_length)
        self.dataset_builder = SyntheticDatasetBuilder(model_name=eval_model)

        self.history: List[Dict[str, Any]] = []
        self.best_instruction = initial_instruction
        self.best_score = 0.0

    def heuristic_fitness(self, expectation: str, actual_output: str) -> float:
        """
        Fast, cheap evaluation metric.
        Measures lexical (word-level) overlap with the expected answer and applies
        a length penalty to score agent responses. It does not capture meaning.
        """
        words_expected = set(expectation.lower().split())
        words_actual = set(actual_output.lower().split())

        if not words_actual:
            return 0.0

        intersection = words_expected.intersection(words_actual)
        overlap_score = len(intersection) / max(len(words_expected), 1)

        # Length penalty: discourage overly verbose or completely empty answers
        length_ratio = len(actual_output)
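The scoring idea described in `heuristic_fitness`, word overlap tempered by a length penalty, can be illustrated in isolation. The sketch below is a hedged stand-in and not the author's formula: the function name `overlap_fitness_sketch`, the `max_ratio` parameter, and the penalty rule are all assumptions made for illustration.

```python
def overlap_fitness_sketch(
    expectation: str, actual_output: str, max_ratio: float = 3.0
) -> float:
    """Score in [0, 1]: word overlap, scaled down for overly long answers."""
    words_expected = set(expectation.lower().split())
    words_actual = set(actual_output.lower().split())
    if not words_expected or not words_actual:
        return 0.0  # nothing to compare, or an empty answer

    # Fraction of expected words that appear in the answer.
    overlap = len(words_expected & words_actual) / len(words_expected)

    # Assumed penalty: answers more than max_ratio times longer than the
    # expectation are scaled down in proportion to the excess.
    length_ratio = len(actual_output) / max(len(expectation), 1)
    penalty = 1.0 if length_ratio <= max_ratio else max_ratio / length_ratio
    return overlap * penalty


exact = overlap_fitness_sketch("missing docstring", "missing docstring")
partial = overlap_fitness_sketch("missing docstring", "missing type hints")
verbose = overlap_fitness_sketch("missing docstring", "missing docstring " + "blah " * 20)
empty = overlap_fitness_sketch("missing docstring", "")
print(exact, partial, verbose, empty)
```

A metric like this is cheap enough to run on every candidate, but it rewards matching vocabulary, not correct reasoning, so it is best treated as a first filter before any more expensive LLM-based evaluation.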
