Cord: Coordinating Trees of AI Agents
Source: Hacker News
The Challenge of Multi‑Agent Coordination
AI agents excel at single‑task execution: give Claude a focused instruction, and it delivers.
However, real‑world work is rarely a solitary task. It resembles a tree of interdependent tasks, featuring:
- Dependencies – later steps rely on the outcomes of earlier ones.
- Parallelism – multiple subtasks can run simultaneously.
- Context flow – information must be shared seamlessly across the entire workflow.
Why Current Multi‑Agent Frameworks Miss the Mark
- Over‑specialization – they often target isolated problems rather than the broader orchestration challenge.
- Fragmented communication – limited mechanisms for passing context between agents.
- Scalability gaps – difficulty handling complex dependency graphs and parallel execution at scale.
Bottom line: To unlock the true potential of AI agents, we need frameworks that manage task trees, dependency resolution, parallel execution, and context propagation—not just isolated, single‑task solutions.
What’s Out There
| Framework | Coordination Model | Strengths | Limitations |
|---|---|---|---|
| LangGraph | State‑machine graph (nodes + edges defined in Python) | • Precise, deterministic workflows • Good for fixed pipelines | • Graph is static – agents cannot re‑route mid‑task • Developer must anticipate every possible decomposition |
| CrewAI | Role‑based crew (e.g., researcher, analyst, writer) | • Intuitive, human‑like team metaphor • Easy to assign high‑level responsibilities | • Roles are fixed at design time • Crew cannot discover it needs more agents or split a role dynamically |
| AutoGen | Group‑chat where agents converse to coordinate | • Very flexible, emergent behavior • No need to pre‑declare dependencies | • No explicit structure or dependency tracking • Authority, scoping, and typed results are missing • Hard to inspect or debug |
| OpenAI Swarm | Minimal hand‑offs (Agent A → Agent B) | • Lightweight, simple to implement | • Linear only – no parallelism or tree‑like task spawning |
| Claude’s Tool‑Use Loops (Anthropic) | Single agent loops with tool calls | • Handles sequential complexity well • Good for tool‑driven pipelines | • Suffers from context‑window limits on large tasks • No parallel execution – one agent, one thread |
Common Thread
All of these frameworks require the developer to predefine the coordination structure:
- The workflow graph, agent roles, and hand‑off patterns are decided up‑front.
- Agents operate strictly within those boundaries.
Why Does This Matter Now?
Historically, hard‑coding decomposition made sense because early LLMs were unreliable at planning. Today’s models:
- Plan effectively and can break problems into sub‑problems on their own.
- Understand dependencies between subtasks.
- Recognize when a task is too large for a single pass.
Given these capabilities, the question arises:
Why are we still hard‑coding the decomposition?
Let the agent build the tree
I built Cord. You give it a goal:
cord run "Should we migrate our API from REST to GraphQL? Evaluate and recommend."
One agent launches, reads the goal, decides it needs research before it can answer, and creates the following subtasks:
● #1 [active] GOAL Should we migrate our API from REST to GraphQL?
● #2 [active] SPAWN Audit current REST API surface
● #3 [active] SPAWN Research GraphQL trade‑offs for our stack
○ #4 [pending] ASK How many concurrent users do you serve?
blocked‑by: #2
○ #5 [pending] FORK Comparative analysis
blocked‑by: #3, #4
○ #6 [pending] SPAWN Write migration recommendation
blocked‑by: #5
- No workflow was hard‑coded.
- The agent decided this structure at runtime.
- It parallelised the API audit (#2) and the GraphQL research (#3).
- It created an ask node (#4) – a question for the human – because the recommendation depends on scale, which it can’t discover on its own.
- It blocked #4 on #2 so the question makes more sense with the audit results as context.
- It made #5 a fork so the analysis inherits everything learned so far.
- It sequenced the final recommendation (#6) after the analysis.
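The runtime-decided tree above can be modeled as plain data. A minimal sketch, assuming hypothetical field names (`id`, `kind`, `status`, `blocked_by`) rather than Cord's actual schema, showing why completing #2 unblocks the human question #4 while #5 keeps waiting:

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    id: int
    kind: str                 # GOAL | SPAWN | FORK | ASK (assumed labels)
    goal: str
    status: str = "pending"   # pending | active | complete
    blocked_by: list[int] = field(default_factory=list)

def ready(node: Node, nodes: dict[int, Node]) -> bool:
    """A pending node is ready once every dependency is complete."""
    return node.status == "pending" and all(
        nodes[d].status == "complete" for d in node.blocked_by
    )

tree = {n.id: n for n in [
    Node(1, "GOAL",  "Should we migrate our API from REST to GraphQL?", "active"),
    Node(2, "SPAWN", "Audit current REST API surface", "active"),
    Node(3, "SPAWN", "Research GraphQL trade-offs for our stack", "active"),
    Node(4, "ASK",   "How many concurrent users do you serve?", blocked_by=[2]),
    Node(5, "FORK",  "Comparative analysis", blocked_by=[3, 4]),
    Node(6, "SPAWN", "Write migration recommendation", blocked_by=[5]),
]}

tree[2].status = "complete"      # the audit finishes first...
assert ready(tree[4], tree)      # ...which unblocks the human question
assert not ready(tree[5], tree)  # the analysis still waits on #3 and #4
```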
Execution trace
✓ #2 [complete] SPAWN Audit current REST API surface
result: 47 endpoints. 12 heavily nested resources...
✓ #3 [complete] SPAWN Research GraphQL trade‑offs
result: Key advantages: reduced over‑fetching...
? How many concurrent users do you serve?
Options: <10K | 10K–100K | >100K
> 10K‑100K
● #5 [active] FORK Comparative analysis
blocked‑by: #3, #4
- The research runs in parallel.
- When both finish and you answer the question, the analysis launches with all three results in its context.
- It then produces a recommendation tailored to your actual scale and API surface — not a generic blog post about GraphQL.
Spawn vs. Fork
Key idea: Treat spawn and fork as distinct context‑flow primitives.
| Aspect | Spawn | Fork |
|---|---|---|
| Context | Starts with a clean slate: only the prompt and the results of nodes it explicitly depends on. | Inherits all completed sibling results, giving the child full knowledge of prior work. |
| Analogy | Hiring a contractor: “Here’s the spec, go.” | Briefing a team member: “You know everything the team has learned so far.” |
| Cost | Cheap to restart; easy to reason about. | More expensive, but essential when later analysis builds on earlier results. |
| Concurrency | Orthogonal – spawned children can run in parallel or sequentially. | Same – the distinction is about what the child knows, not about execution order. |
How it works in practice
- Independent research tasks → use `spawn`. The agent gets only the minimal required context, making each task isolated and easy to rerun.
- Analysis that depends on prior work → use `fork`. The agent receives the full set of completed sibling results, allowing it to synthesize everything learned so far.
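The context-flow difference boils down to how a child's inputs are assembled. A hedged sketch, where `build_context` and its data shapes are hypothetical rather than Cord's real API:

```python
def build_context(child_kind: str, deps: dict, siblings: dict) -> dict:
    """deps / siblings map node id -> result (None if not yet complete)."""
    if child_kind == "spawn":
        # Clean slate: only the explicitly declared dependencies.
        return {i: r for i, r in deps.items() if r is not None}
    if child_kind == "fork":
        # Inherit every completed sibling result so far.
        return {i: r for i, r in siblings.items() if r is not None}
    raise ValueError(f"unknown kind: {child_kind}")

results = {2: "47 endpoints...", 3: "Key advantages...", 4: "10K-100K users"}

# A spawned child sees only what it blocks on:
assert build_context("spawn", {2: results[2]}, results) == {2: results[2]}
# A forked child sees everything completed so far:
assert build_context("fork", {}, results) == results
```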
The model can decide which primitive to use on its own, because it intuitively understands the difference between “starting fresh” and “building on what’s already known.”
Under the Hood
Each agent runs as a Claude Code CLI process equipped with MCP tools and backed by a shared SQLite database.
Core MCP Operations
| Operation | Signature | Description |
|---|---|---|
| spawn | spawn(goal, prompt, blocked_by) | Create a new child task. |
| fork | fork(goal, prompt, blocked_by) | Create a child that inherits the parent’s context. |
| ask | ask(question, options) | Prompt the human for input. |
| complete | complete(result) | Mark the current task as finished. |
| read_tree | read_tree() | Retrieve the full coordination tree. |
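To make the signatures concrete, here is a toy in-process version of the five operations. The names mirror the table, but the bodies are my own sketch; in particular, the real `complete` acts on the calling agent's own task, while this toy passes an id explicitly:

```python
class Cord:
    """Toy in-memory stand-in for the MCP tool surface (illustrative only)."""

    def __init__(self):
        self.nodes, self.next_id = {}, 1

    def _add(self, kind, goal, prompt, blocked_by):
        nid, self.next_id = self.next_id, self.next_id + 1
        self.nodes[nid] = {"kind": kind, "goal": goal, "prompt": prompt,
                           "blocked_by": blocked_by or [], "result": None}
        return nid

    def spawn(self, goal, prompt, blocked_by=None):  # clean-slate child
        return self._add("SPAWN", goal, prompt, blocked_by)

    def fork(self, goal, prompt, blocked_by=None):   # context-inheriting child
        return self._add("FORK", goal, prompt, blocked_by)

    def ask(self, question, options):                # human-input node
        return self._add("ASK", question, " | ".join(options), None)

    def complete(self, node_id, result):             # mark a task finished
        self.nodes[node_id]["result"] = result

    def read_tree(self):                             # full coordination tree
        return self.nodes
```

In the real system these functions are exposed over MCP, so the agent never touches the store directly; the server sits between the model and the shared state.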
How It Works
- Agents are agnostic about the coordination tree; they simply see the tools above and invoke them as needed.
- The MCP server enforces the protocol: dependency resolution, authority scoping, and result injection.
- When an `ask` node becomes ready, the engine pauses and displays a prompt in the terminal.
- The human’s answer is stored as a result, unblocking downstream nodes. In this model, the human is an active participant in the tree rather than a passive observer.
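The dependency-resolution cycle described above can be sketched as a small loop: find nodes whose dependencies are all complete, route `ASK` nodes to the human, store results, repeat. This is a deliberate simplification; the real engine drives Claude Code processes, and `run_agent` / `ask_human` here are placeholders:

```python
def run(nodes: dict, ask_human, run_agent) -> dict:
    """Complete every node in dependency order (simplified, sequential)."""
    done = {i for i, n in nodes.items() if n["status"] == "complete"}
    while len(done) < len(nodes):
        ready = [i for i, n in nodes.items()
                 if n["status"] == "pending" and set(n["blocked_by"]) <= done]
        if not ready:
            break  # nothing runnable: either finished or deadlocked
        for i in ready:
            n = nodes[i]
            # ASK nodes pause for the human; everything else goes to an agent.
            n["result"] = ask_human(n["goal"]) if n["kind"] == "ASK" else run_agent(n)
            n["status"] = "complete"
            done.add(i)
    return nodes

nodes = {
    1: {"kind": "SPAWN", "goal": "research", "blocked_by": [], "status": "pending"},
    2: {"kind": "ASK", "goal": "scale?", "blocked_by": [1], "status": "pending"},
    3: {"kind": "FORK", "goal": "analysis", "blocked_by": [1, 2], "status": "pending"},
}
out = run(nodes, ask_human=lambda q: "10K-100K",
          run_agent=lambda n: "done: " + n["goal"])
```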
Implementation Details
- Roughly 500 lines of Python.
- Uses SQLite for persistence and MCP for inter‑process communication.
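A sketch of how the shared SQLite state might look. The schema is hypothetical (the write-up only states that SQLite is used for persistence), but it shows how dependency resolution reduces to a single query:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE tasks (id INTEGER PRIMARY KEY, kind TEXT, goal TEXT,
                    status TEXT DEFAULT 'pending', result TEXT);
CREATE TABLE deps  (task_id INTEGER, blocked_by INTEGER);
""")
db.executemany("INSERT INTO tasks (id, kind, goal, status) VALUES (?,?,?,?)", [
    (2, "SPAWN", "Audit REST API", "complete"),
    (3, "SPAWN", "Research GraphQL", "pending"),
    (5, "FORK",  "Comparative analysis", "pending"),
])
db.executemany("INSERT INTO deps VALUES (?,?)", [(5, 2), (5, 3)])

# A task is ready when it is pending and no dependency is incomplete.
ready = db.execute("""
    SELECT t.id FROM tasks t
    WHERE t.status = 'pending' AND NOT EXISTS (
        SELECT 1 FROM deps d JOIN tasks b ON b.id = d.blocked_by
        WHERE d.task_id = t.id AND b.status != 'complete')
""").fetchall()
assert ready == [(3,)]   # #3 has no deps; #5 still waits on #3
```

Because every agent process talks to the same database through the MCP server, no agent needs a private copy of the tree.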
The Moment It Clicked
Before writing the runtime, I had to answer one question: Would Claude actually get this?
The whole design hinges on agents understanding coordination tools they’ve never seen before. If they couldn’t differentiate spawn vs. fork, or ignored blocked_by, the protocol would be dead on arrival.
The Experiment
- Set up a throw‑away MCP server with the five tools.
- Pointed Claude Code at the server.
- Ran 15 tests—no runtime, no engine, just Claude, the tools, and a task.
Test 1 – Decompose a Project
- Prompt: “Decompose this project into subtasks.”
- What Claude did (unprompted):
  - Called `read_tree()` to inspect the current state.
  - Created five child nodes with the correct dependency ordering.
  - Wrote detailed prompts for each child.
  - Called `read_tree()` again to verify the nodes were created.
I hadn’t instructed any of that.
Test 2 – Fork vs. Spawn
- Scenario: independent research tasks and a synthesis step.
- Claude’s choice: `spawn` for the research (contractor‑style work), `fork` for the analysis (briefing a team member).
When asked why, Claude gave the exact metaphor from my spec—even though it had never seen the spec.
Results
- 15/15 tests passed.
- Tasks were broken into 3–6 subtasks with correct dependencies.
- Claude asked well‑scoped human questions instead of guessing.
- When it tried to stop a sibling node and was rejected for lack of authority, it didn’t retry or hack around it; it escalated to its parent via `ask`—the intended pattern.
Takeaway
That moment proved the runtime was worth building. The model already understood the protocol; I only needed to construct the surrounding infrastructure.
What this is not
This implementation uses Claude Code CLI and SQLite, but the protocol — five primitives, dependency resolution, authority scoping, two‑phase lifecycle — is independent of all that.
- You could implement Cord over Postgres for multi‑machine coordination.
- Over the Claude API directly, without the CLI overhead.
- With multiple LLM providers — GPT for cheap tasks, Claude for complex ones.
- With human workers for some nodes.
The protocol is the contribution. This repo is a proof of concept.
Try it
git clone https://github.com/kimjune01/cord.git
cd cord
uv sync
cord run "your goal here" --budget 2.0
You can also point it at a planning document:
cord run plan.md --budget 5.0
The root agent reads the markdown and decomposes it into a coordination tree. Write your plan however you like—bullet points, sections, prose—and the agent will infer the task structure, dependencies, and parallelism.
Note: Requires the Claude Code CLI and a subscription that includes it.