How to Design Two Practical Orchestration Loops for LLM Agents
The three layers you should always separate
Execution layer
Agents and responders live here.
Agent means any unit that does work: a model call, a tool, a heuristic function, a router.
Responder is the agent that produces the final user‑facing output for a turn or a session.
Communication layer
How agents talk to each other and to the orchestrator.
Examples: queues, events, internal RPC calls, function callbacks.
You rarely want agents to call each other directly. Route everything through this layer so you can trace and control it.
Memory layer
Where you store and retrieve state across time.
Can be a vector store, a key‑value store, a database, or a log.
It should not be “hidden in the prompt”. Treat memory as its own component.
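To make the separation concrete, here is a minimal sketch of the three layers as distinct components. The class names (Agent, MessageBus, MemoryStore) are illustrative, not a prescribed API:

```python
from dataclasses import dataclass, field
from typing import Callable

# Execution layer: any unit of work, from a model call to a heuristic.
@dataclass
class Agent:
    name: str
    run: Callable[[str, dict], dict]  # (raw_input, context) -> result

# Communication layer: agents never call each other directly;
# routing everything here makes it traceable and controllable.
@dataclass
class MessageBus:
    log: list = field(default_factory=list)

    def send(self, sender: str, receiver: str, payload: dict) -> None:
        self.log.append({"from": sender, "to": receiver, "payload": payload})

# Memory layer: explicit state, not context hidden inside prompts.
@dataclass
class MemoryStore:
    state: dict = field(default_factory=dict)

    def read(self, keys: list[str]) -> dict:
        return {k: self.state[k] for k in keys if k in self.state}

    def write(self, updates: dict) -> None:
        self.state.update(updates)
```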
Time as a first‑class dimension
Both loops treat time explicitly:
- Linear loop – discrete steps: T0, T1, T2, T3.
- Circular loop – a continuous stream while the conversation is active.
Once you have these pieces, you can design the two orchestration patterns.
Loop 1: Linear orchestrator for context extraction and analysis

When to use the linear loop
Use it when you have a fixed input (text, transcript, document, set of logs) and want to run several analytic passes over it. Latency matters but sub‑second interactivity is not required. Typical outputs are summaries, reports, classifications, or structured data.
Good examples
- Conversation analysis after a call has ended.
- Extracting entities and topics from chat logs.
- Multi‑stage document processing (OCR → cleaning → classification → summarization).
- Offline quality checks for previous sessions.
Mental model
Picture a horizontal diagram:
- INPUT → (time steps T0 … Tn) → Responder (final structured output)
- Between input and responder:
  - Execution layer agents per time slice.
  - Communication band in the middle.
  - Memory band at the top.
At each step agents may retrieve from memory and may store new facts or summaries back into memory. The orchestrator walks through these steps one by one.
Step‑by‑step design
Step 1: Define the final output
Decide what the responder will produce, e.g.:
```json
{
  "intent": "...",
  "sentiment": "...",
  "entities": {...},
  "summary": "..."
}
```
or a human‑readable report, or labels/scores for another system. All other agents exist to help this responder succeed.
Step 2: Split the job into stages
Identify dependencies and independent work. Example for conversation analysis:
- Normalization & language detection.
- Entity extraction (names, account IDs, products).
- Topic & intent detection.
- Sentiment & escalation risk.
- Final summary & suggestions.
Each stage becomes a time slice with one or more agents.
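As a sketch, each stage can be modeled as a named time slice that holds its agents (the Stage type is hypothetical; fill in real agents where the ellipses are):

```python
from dataclasses import dataclass

@dataclass
class Stage:
    name: str
    agents: list  # one or more agents that run in this time slice

stages = [
    Stage("T0: normalization & language detection", agents=[...]),
    Stage("T1: entity extraction", agents=[...]),
    Stage("T2: topic & intent detection", agents=[...]),
    Stage("T3: sentiment & escalation risk", agents=[...]),
    Stage("T4: final summary & suggestions", agents=[...]),
]
```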
Step 3: Design the memory schema
For each stage, list what its agents read and write. A simple schema:
```json
{
  "language": "en",
  "entities": {...},
  "topics": [...],
  "sentiment": {...},
  "summary": "..."
}
```
You can also scope memory by session_id, user_id, or time_window (for rolling analysis).
Key rule: agents receive a clean input and a structured slice of memory; no hidden context inside prompts.
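If you want this schema checked statically, a typed sketch might look like the following. The field names mirror the example above; the scoping wrapper is an assumption, not part of the schema itself:

```python
from typing import TypedDict

class AnalysisMemory(TypedDict, total=False):
    language: str
    entities: dict
    topics: list[str]
    sentiment: dict
    summary: str

class ScopedMemory(TypedDict):
    session_id: str
    user_id: str
    time_window: str       # e.g. "last_15_min" for rolling analysis
    data: AnalysisMemory
```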
Step 4: Wire store and retrieve
For each agent define two tiny functions:
```python
def read(memory: dict) -> dict:
    ...  # select the slice of memory this agent needs

def write(memory: dict, result: dict) -> dict:
    ...  # merge the agent's result back into memory
```
A minimal orchestration loop looks like:
```python
for step in steps:
    # 1. Load what this step needs
    ctx = step.read(memory)

    # 2. Run the agent with input and context
    result = step.agent.run(raw_input, ctx)

    # 3. Write new facts back to memory
    memory = step.write(memory, result)
```
Some steps only write, some only read.
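To make that concrete, a Step can bundle an agent with its read and write functions. This sketch reuses the Agent dataclass from the layer sketch above; the entity-extractor stub is purely illustrative:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Step:
    agent: Agent                         # the unit of work for this time slice
    read: Callable[[dict], dict]         # memory -> context
    write: Callable[[dict, dict], dict]  # (memory, result) -> new memory

# Illustrative stub: a real agent would call a model or tool here.
entity_extractor = Agent(
    name="EntityExtractorAgent",
    run=lambda raw_input, ctx: {"names": [], "account_ids": []},
)

entity_step = Step(
    agent=entity_extractor,
    read=lambda memory: {"language": memory.get("language", "en")},
    write=lambda memory, result: {**memory, "entities": result},
)
```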
Step 5: Implement the responder as the last step
The responder is just another agent with a special role:
- Reads everything it needs from memory.
- Produces the final answer (often a single chat‑completion call that combines the original input, outputs of previous analytic agents, and any long‑term user/session memory).
- May log additional metadata back to memory.
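A responder step might look like the sketch below. The prompt wording and the llm_complete call are placeholders for whatever completion API you use, not a specific library:

```python
def respond(raw_input: str, ctx: dict) -> dict:
    # Single completion call that combines the original input with
    # everything the analytic agents wrote to memory.
    prompt = (
        "Summarize this conversation for a support supervisor.\n"
        f"Transcript:\n{raw_input}\n"
        f"Entities: {ctx.get('entities')}\n"
        f"Topics: {ctx.get('topics')}\n"
        f"Sentiment: {ctx.get('sentiment')}\n"
    )
    summary = llm_complete(prompt)  # placeholder, not a real library call
    return {"summary": summary}

responder_step = Step(
    agent=Agent(name="SummaryResponder", run=respond),
    read=lambda m: {k: m.get(k) for k in ("entities", "topics", "sentiment")},
    write=lambda m, result: {**m, **result},
)
```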
Example: conversation analysis pipeline
| Agent | Reads | Writes |
|---|---|---|
| LanguageDetectorAgent | raw transcript | memory["language"] |
| EntityExtractorAgent | transcript, language | memory["entities"] |
| TopicClassifierAgent | transcript, entities | memory["topics"] |
| SentimentAgent | transcript | memory["sentiment"] |
| SummaryResponder | transcript, entities, topics, sentiment | final human‑readable summary & JSON record |
This maps directly to the linear diagram and is easy to debug step by step.
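Assembled as code (building on the Step and Agent sketches above, with stub agents standing in for real model calls), the same pipeline is just an ordered list walked by the loop from Step 4:

```python
# Stub agents so the pipeline runs end to end; real implementations
# would call models or tools. Names mirror the table above.
def stub(name: str, writes: dict) -> Step:
    return Step(
        agent=Agent(name=name, run=lambda raw, ctx, w=writes: w),
        read=lambda m: dict(m),
        write=lambda m, r: {**m, **r},
    )

pipeline = [
    stub("LanguageDetectorAgent", {"language": "en"}),
    stub("EntityExtractorAgent", {"entities": {"account_id": "A-123"}}),
    stub("TopicClassifierAgent", {"topics": ["billing"]}),
    stub("SentimentAgent", {"sentiment": {"label": "negative"}}),
    stub("SummaryResponder", {"summary": "Customer disputes a charge."}),
]

memory: dict = {}
for step in pipeline:
    ctx = step.read(memory)
    result = step.agent.run("<transcript>", ctx)
    memory = step.write(memory, result)
print(memory["summary"])
```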
Loop 2: Circular streaming orchestrator for live chat and voice

The second pattern appears when you move from offline analysis to live interaction (voice or interactive chat). You need to:
- React quickly while the user is still speaking or typing.
- Run several background analyses in parallel.
- Avoid sending the full transcript to every agent on every turn.
The circular loop pattern is built for that.
When to use the circular loop
Use it when you:
- Stream audio or tokens in and out.
- Have a central “assistant” that talks to the user.
- Want background agents that detect sentiment shifts, safety/compliance issues, intent changes, entity updates for a CRM, or interesting moments to bookmark.
Think of a voice assistant or a real‑time meeting transcription system where the main assistant answers the user while auxiliary agents enrich context continuously.
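As a rough sketch of how such a loop can be wired with asyncio (the respond and watch_sentiment coroutines are stand-ins, not a real assistant), one queue feeds both the main responder and the background agents:

```python
import asyncio

async def respond(chunk: str, memory: dict) -> str:
    return f"ack: {chunk}"  # stand-in for the streaming assistant

async def watch_sentiment(chunk: str, memory: dict) -> dict:
    return {"sentiment": "neutral"}  # stand-in background analysis

async def circular_loop(events: asyncio.Queue) -> None:
    memory: dict = {}
    while True:
        chunk = await events.get()   # a transcript chunk or token batch
        if chunk is None:            # conversation ended
            break
        # Main path: answer the user quickly.
        reply_task = asyncio.create_task(respond(chunk, memory))
        # Background path: enrich memory in parallel, and only with the
        # new chunk, never the full transcript.
        updates = await asyncio.gather(watch_sentiment(chunk, memory))
        for u in updates:
            memory.update(u)
        print(await reply_task)

async def main() -> None:
    q: asyncio.Queue = asyncio.Queue()
    for chunk in ["hello", "my bill looks wrong", None]:
        q.put_nowait(chunk)
    await circular_loop(q)

asyncio.run(main())
```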