Cord: Coordinating Trees of AI Agents
Source: Hacker News
The Challenge of Multi‑Agent Coordination
AI agents excel at single‑task execution: give Claude a focused instruction, and it delivers.
However, real‑world work is rarely a solitary task. It resembles a tree of interdependent tasks, featuring:
- Dependencies – later steps rely on the outcomes of earlier ones.
- Parallelism – multiple subtasks can run simultaneously.
- Context flow – information must be shared seamlessly across the entire workflow.
Why Current Multi‑Agent Frameworks Miss the Mark
- Over‑specialization – they often target isolated problems rather than the broader orchestration challenge.
- Fragmented communication – limited mechanisms for passing context between agents.
- Scalability gaps – difficulty handling complex dependency graphs and parallel execution at scale.
Bottom line: To unlock the true potential of AI agents, we need frameworks that manage task trees, dependency resolution, parallel execution, and context propagation—not just isolated, single‑task solutions.
What’s Out There
| Framework | Coordination Model | Strengths | Limitations |
|---|---|---|---|
| LangGraph | State‑machine graph (nodes + edges defined in Python) | • Precise, deterministic workflows • Good for fixed pipelines | • Graph is static – agents cannot re‑route mid‑task • Developer must anticipate every possible decomposition |
| CrewAI | Role‑based crew (e.g., researcher, analyst, writer) | • Intuitive, human‑like team metaphor • Easy to assign high‑level responsibilities | • Roles are fixed at design time • Crew cannot discover it needs more agents or split a role dynamically |
| AutoGen | Group‑chat where agents converse to coordinate | • Very flexible, emergent behavior • No need to pre‑declare dependencies | • No explicit structure or dependency tracking • Authority, scoping, and typed results are missing • Hard to inspect or debug |
| OpenAI Swarm | Minimal hand‑offs (Agent A → Agent B) | • Lightweight, simple to implement | • Linear only – no parallelism or tree‑like task spawning |
| Claude’s Tool‑Use Loops (Anthropic) | Single agent loops with tool calls | • Handles sequential complexity well • Good for tool‑driven pipelines | • Suffers from context‑window limits on large tasks • No parallel execution – one agent, one thread |
Common Thread
All of these frameworks require the developer to predefine the coordination structure:
- The workflow graph, agent roles, and hand‑off patterns are decided up‑front.
- Agents operate strictly within those boundaries.
Why Does This Matter Now?
Historically, hard‑coding decomposition made sense because early LLMs were unreliable at planning. Today’s models:
- Plan effectively and can break problems into sub‑problems on their own.
- Understand dependencies between subtasks.
- Recognize when a task is too large for a single pass.
Given these capabilities, the question arises:
Why are we still hard‑coding the decomposition?
Let the agent build the tree
I built Cord. You give it a goal:
cord run "Should we migrate our API from REST to GraphQL? Evaluate and recommend."
One agent launches, reads the goal, decides it needs research before it can answer, and creates the following subtasks:
● #1 [active] GOAL Should we migrate our API from REST to GraphQL?
● #2 [active] SPAWN Audit current REST API surface
● #3 [active] SPAWN Research GraphQL trade‑offs for our stack
○ #4 [pending] ASK How many concurrent users do you serve?
blocked‑by: #2
○ #5 [pending] FORK Comparative analysis
blocked‑by: #3, #4
○ #6 [pending] SPAWN Write migration recommendation
blocked‑by: #5
- No workflow was hard‑coded.
- The agent decided this structure at runtime.
- It parallelised the API audit (#2) and the GraphQL research (#3).
- It created an ask node (#4) – a question for the human – because the recommendation depends on scale, which it can’t discover on its own.
- It blocked #4 on #2 so the question makes more sense with the audit results as context.
- It made #5 a fork so the analysis inherits everything learned so far.
- It sequenced the final recommendation (#6) after the analysis.
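The runtime-decided tree above can be modeled as plain data. A minimal sketch, assuming hypothetical field names (`id`, `kind`, `status`, `blocked_by`) rather than Cord's actual schema, showing why completing #2 unblocks the human question #4 while #5 keeps waiting:

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    id: int
    kind: str                 # GOAL | SPAWN | FORK | ASK (assumed labels)
    goal: str
    status: str = "pending"   # pending | active | complete
    blocked_by: list[int] = field(default_factory=list)

def ready(node: Node, nodes: dict[int, Node]) -> bool:
    """A pending node is ready once every dependency is complete."""
    return node.status == "pending" and all(
        nodes[d].status == "complete" for d in node.blocked_by
    )

tree = {n.id: n for n in [
    Node(1, "GOAL",  "Should we migrate our API from REST to GraphQL?", "active"),
    Node(2, "SPAWN", "Audit current REST API surface", "active"),
    Node(3, "SPAWN", "Research GraphQL trade-offs for our stack", "active"),
    Node(4, "ASK",   "How many concurrent users do you serve?", blocked_by=[2]),
    Node(5, "FORK",  "Comparative analysis", blocked_by=[3, 4]),
    Node(6, "SPAWN", "Write migration recommendation", blocked_by=[5]),
]}

tree[2].status = "complete"      # the audit finishes first...
assert ready(tree[4], tree)      # ...which unblocks the human question
assert not ready(tree[5], tree)  # the analysis still waits on #3 and #4
```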
Execution trace
✓ #2 [complete] SPAWN Audit current REST API surface
result: 47 endpoints. 12 heavily nested resources...
✓ #3 [complete] SPAWN Research GraphQL trade‑offs
result: Key advantages: reduced over‑fetching...
? How many concurrent users do you serve?
Options: <10K | 10K–100K | >100K
> 10K‑100K
● #5 [active] FORK Comparative analysis
blocked‑by: #3, #4
- The research runs in parallel.
- When both finish and you answer the question, the analysis launches with all three results in its context.
- It then produces a recommendation tailored to your actual scale and API surface — not a generic blog post about GraphQL.
Spawn vs. Fork
Key idea: Treat spawn and fork as distinct context‑flow primitives.
| Aspect | Spawn | Fork |
|---|---|---|
| Context | Starts with a clean slate: only the prompt and the results of nodes it explicitly depends on. | Inherits all completed sibling results, giving the child full knowledge of prior work. |
| Analogy | Hiring a contractor: “Here’s the spec, go.” | Briefing a team member: “You know everything the team has learned so far.” |
| Cost | Cheap to restart; easy to reason about. | More expensive, but essential when later analysis builds on earlier results. |
| Concurrency | Orthogonal – spawned children can run in parallel or sequentially. | Same – the distinction is about what the child knows, not about execution order. |
How it works in practice
- Independent research tasks → use `spawn`. The agent gets only the minimal required context, making each task isolated and easy to rerun.
- Analysis that depends on prior work → use `fork`. The agent receives the full set of completed sibling results, allowing it to synthesize everything learned so far.
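The context-flow difference boils down to how a child's inputs are assembled. A hedged sketch, where `build_context` and its data shapes are hypothetical rather than Cord's real API:

```python
def build_context(child_kind: str, deps: dict, siblings: dict) -> dict:
    """deps / siblings map node id -> result (None if not yet complete)."""
    if child_kind == "spawn":
        # Clean slate: only the explicitly declared dependencies.
        return {i: r for i, r in deps.items() if r is not None}
    if child_kind == "fork":
        # Inherit every completed sibling result so far.
        return {i: r for i, r in siblings.items() if r is not None}
    raise ValueError(f"unknown kind: {child_kind}")

results = {2: "47 endpoints...", 3: "Key advantages...", 4: "10K-100K users"}

# A spawned child sees only what it blocks on:
assert build_context("spawn", {2: results[2]}, results) == {2: results[2]}
# A forked child sees everything completed so far:
assert build_context("fork", {}, results) == results
```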
The model can decide which primitive to use on its own, because it intuitively understands the difference between “starting fresh” and “building on what’s already known.”
Under the Hood
Each agent runs as a Claude Code CLI process equipped with MCP tools and backed by a shared SQLite database.
Core MCP Operations
| Operation | Signature | Description |
|---|---|---|
| spawn | spawn(goal, prompt, blocked_by) | Create a new child task. |
| fork | fork(goal, prompt, blocked_by) | Create a child that inherits the parent’s context. |
| ask | ask(question, options) | Prompt the human for input. |
| complete | complete(result) | Mark the current task as finished. |
| read_tree | read_tree() | Retrieve the full coordination tree. |
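To make the signatures concrete, here is a toy in-process version of the five operations. The names mirror the table, but the bodies are my own sketch; in particular, the real `complete` acts on the calling agent's own task, while this toy passes an id explicitly:

```python
class Cord:
    """Toy in-memory stand-in for the MCP tool surface (illustrative only)."""

    def __init__(self):
        self.nodes, self.next_id = {}, 1

    def _add(self, kind, goal, prompt, blocked_by):
        nid, self.next_id = self.next_id, self.next_id + 1
        self.nodes[nid] = {"kind": kind, "goal": goal, "prompt": prompt,
                           "blocked_by": blocked_by or [], "result": None}
        return nid

    def spawn(self, goal, prompt, blocked_by=None):  # clean-slate child
        return self._add("SPAWN", goal, prompt, blocked_by)

    def fork(self, goal, prompt, blocked_by=None):   # context-inheriting child
        return self._add("FORK", goal, prompt, blocked_by)

    def ask(self, question, options):                # human-input node
        return self._add("ASK", question, " | ".join(options), None)

    def complete(self, node_id, result):             # mark a task finished
        self.nodes[node_id]["result"] = result

    def read_tree(self):                             # full coordination tree
        return self.nodes
```

In the real system these functions are exposed over MCP, so the agent never touches the store directly; the server sits between the model and the shared state.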
How It Works
- Agents are agnostic about the coordination tree; they simply see the tools above and invoke them as needed.
- The MCP server enforces the protocol: dependency resolution, authority scoping, and result injection.
- When an `ask` node becomes ready, the engine pauses and displays a prompt in the terminal.
- The human’s answer is stored as a result, unblocking downstream nodes. In this model, the human is an active participant in the tree rather than a passive observer.
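The dependency-resolution cycle described above can be sketched as a small loop: find nodes whose dependencies are all complete, route `ASK` nodes to the human, store results, repeat. This is a deliberate simplification; the real engine drives Claude Code processes, and `run_agent` / `ask_human` here are placeholders:

```python
def run(nodes: dict, ask_human, run_agent) -> dict:
    """Complete every node in dependency order (simplified, sequential)."""
    done = {i for i, n in nodes.items() if n["status"] == "complete"}
    while len(done) < len(nodes):
        ready = [i for i, n in nodes.items()
                 if n["status"] == "pending" and set(n["blocked_by"]) <= done]
        if not ready:
            break  # nothing runnable: either finished or deadlocked
        for i in ready:
            n = nodes[i]
            # ASK nodes pause for the human; everything else goes to an agent.
            n["result"] = ask_human(n["goal"]) if n["kind"] == "ASK" else run_agent(n)
            n["status"] = "complete"
            done.add(i)
    return nodes

nodes = {
    1: {"kind": "SPAWN", "goal": "research", "blocked_by": [], "status": "pending"},
    2: {"kind": "ASK", "goal": "scale?", "blocked_by": [1], "status": "pending"},
    3: {"kind": "FORK", "goal": "analysis", "blocked_by": [1, 2], "status": "pending"},
}
out = run(nodes, ask_human=lambda q: "10K-100K",
          run_agent=lambda n: "done: " + n["goal"])
```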
Implementation Details
- Roughly 500 lines of Python.
- Uses SQLite for persistence and MCP for inter‑process communication.
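A sketch of how the shared SQLite state might look. The schema is hypothetical (the write-up only states that SQLite is used for persistence), but it shows how dependency resolution reduces to a single query:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE tasks (id INTEGER PRIMARY KEY, kind TEXT, goal TEXT,
                    status TEXT DEFAULT 'pending', result TEXT);
CREATE TABLE deps  (task_id INTEGER, blocked_by INTEGER);
""")
db.executemany("INSERT INTO tasks (id, kind, goal, status) VALUES (?,?,?,?)", [
    (2, "SPAWN", "Audit REST API", "complete"),
    (3, "SPAWN", "Research GraphQL", "pending"),
    (5, "FORK",  "Comparative analysis", "pending"),
])
db.executemany("INSERT INTO deps VALUES (?,?)", [(5, 2), (5, 3)])

# A task is ready when it is pending and no dependency is incomplete.
ready = db.execute("""
    SELECT t.id FROM tasks t
    WHERE t.status = 'pending' AND NOT EXISTS (
        SELECT 1 FROM deps d JOIN tasks b ON b.id = d.blocked_by
        WHERE d.task_id = t.id AND b.status != 'complete')
""").fetchall()
assert ready == [(3,)]   # #3 has no deps; #5 still waits on #3
```

Because every agent process talks to the same database through the MCP server, no agent needs a private copy of the tree.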
The Moment It Clicked
Before writing the runtime, I had to answer one question: Would Claude actually get this?
The whole design hinges on agents understanding coordination tools they’ve never seen before. If they couldn’t differentiate spawn vs. fork, or ignored blocked_by, the protocol would be dead on arrival.
The Experiment
- Set up a throw‑away MCP server with the five tools.
- Pointed Claude Code at the server.
- Ran 15 tests—no runtime, no engine, just Claude, the tools, and a task.
Test 1 – Decompose a Project
- Prompt: “Decompose this project into subtasks.”
- What Claude did (unprompted):
  - Called `read_tree()` to inspect the current state.
  - Created five child nodes with the correct dependency ordering.
  - Wrote detailed prompts for each child.
  - Called `read_tree()` again to verify the nodes were created.
I hadn’t instructed any of that.
Test 2 – Fork vs. Spawn
- Scenario: independent research tasks and a synthesis step.
- Claude’s choice: `spawn` for the research (contractor‑style work), `fork` for the analysis (briefing a team member).
When asked why, Claude gave the exact metaphor from my spec—even though it had never seen the spec.
Results
- 15/15 tests passed.
- Tasks were broken into 3–6 subtasks with correct dependencies.
- Claude asked well‑scoped human questions instead of guessing.
- When it tried to stop a sibling node and was rejected for lack of authority, it didn’t retry or hack around it; it escalated to its parent via `ask`—the intended pattern.
Takeaway
That moment proved the runtime was worth building. The model already understood the protocol; I only needed to construct the surrounding infrastructure.
What this is not
This implementation uses Claude Code CLI and SQLite, but the protocol — five primitives, dependency resolution, authority scoping, two‑phase lifecycle — is independent of all that.
- You could implement Cord over Postgres for multi‑machine coordination.
- Over the Claude API directly, without the CLI overhead.
- With multiple LLM providers — GPT for cheap tasks, Claude for complex ones.
- With human workers for some nodes.
The protocol is the contribution. This repo is a proof of concept.
Try it
git clone https://github.com/kimjune01/cord.git
cd cord
uv sync
cord run "your goal here" --budget 2.0
You can also point it at a planning document:
cord run plan.md --budget 5.0
The root agent reads the markdown and decomposes it into a coordination tree. Write your plan however you like—bullet points, sections, prose—and the agent will infer the task structure, dependencies, and parallelism.
Note: Requires the Claude Code CLI and a subscription that includes it.