Orchestration Patterns
12 proven patterns for coordinating multiple AI agents
Quick Reference
| Pattern | Complexity | Agents |
|---|---|---|
| Swarm | Low | 3-50+ |
| Pipeline | Low | 2-6 |
| Supervisor | Medium | 3-8 |
| Map-Reduce | Medium | 3-20+ |
| Debate | Medium | 3-5 |
| Hierarchical | High | 7-50+ |
| Router | Low | 3-10 |
| Reflection | Low | 1-2 |
| Consensus | Medium | 3-7 |
| Blackboard | High | 3-10 |
| Guardrail | Low | 2 |
| Tool Augmented | Low | 1 |
Swarm
Complexity: Low · Agents: 3-50+
Fan out identical tasks to N worker agents running in parallel, then merge all results into a single output.
How It Works
A coordinator splits a task into independent subtasks and dispatches them to worker agents simultaneously. Each worker processes its subtask independently. When all workers complete, results are collected and merged into a unified response.
Diagram
┌─ Worker 1 ─┐
Task ──► │ Worker 2 │ ──► Merge
└─ Worker N ─┘
Best For
Parallel independent tasks like batch classification, multi-file analysis, bulk content generation, or processing large datasets.
Tradeoffs
High throughput but no inter-agent communication during execution. Workers cannot share context or build on each other's results. Resource usage scales linearly with worker count.
Example
Analyze 20 GitHub repos in parallel — each worker clones and reviews one repo, then results merge into a comparison report.
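A minimal sketch of the fan-out/merge flow, with a stubbed `worker` function standing in for a real LLM call (the function names and task strings here are illustrative, not from any framework):

```python
from concurrent.futures import ThreadPoolExecutor

def worker(task: str) -> str:
    # Hypothetical stand-in for a real agent/LLM call.
    return f"analysis of {task}"

def swarm(tasks: list[str], max_workers: int = 8) -> str:
    # Fan out: each worker processes one subtask in parallel.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(worker, tasks))
    # Merge: combine all partial results into a single report.
    return "\n".join(results)

report = swarm(["repo-a", "repo-b", "repo-c"])
```

`pool.map` preserves input order, so the merged report lists results in the same order the tasks were dispatched.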
Pipeline
Complexity: Low · Agents: 2-6
Sequential handoff between specialized agents where each stage transforms or enriches the output before passing it forward.
How It Works
Agents are arranged in a chain. Agent 1 produces output that becomes Agent 2's input, and so on. Each agent specializes in one transformation step. The final agent produces the end result.
Diagram
Task ──► Agent 1 ──► Agent 2 ──► Agent 3 ──► Result
           research     draft       edit
Best For
Multi-stage processing like research → draft → edit → publish, or extract → transform → load data pipelines.
Tradeoffs
Clear separation of concerns and predictable flow, but total latency is the sum of all stages. A failure at any stage blocks downstream agents.
Example
Content pipeline: Researcher agent gathers sources, Writer agent drafts the article, Editor agent polishes prose and checks facts.
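The chain can be sketched as a simple fold over stage functions; the three stage stubs below are placeholders for real agent calls:

```python
def researcher(topic: str) -> str:
    # Hypothetical research agent.
    return f"notes on {topic}"

def writer(notes: str) -> str:
    # Hypothetical drafting agent.
    return f"draft from {notes}"

def editor(draft: str) -> str:
    # Hypothetical editing agent.
    return f"polished {draft}"

def pipeline(task: str, stages) -> str:
    # Each stage's output becomes the next stage's input.
    out = task
    for stage in stages:
        out = stage(out)
    return out

article = pipeline("agent patterns", [researcher, writer, editor])
```

Because each stage only sees its predecessor's output, a failed stage can be retried in isolation without rerunning the whole chain.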
Supervisor
Complexity: Medium · Agents: 3-8
One orchestrator agent delegates tasks to specialist agents, reviews their work, and synthesizes a final result.
How It Works
The supervisor receives a task, breaks it down, and routes subtasks to the most appropriate specialist agents. It reviews each specialist's output, may request revisions, and compiles the final deliverable.
Diagram
┌─ Coder ──────┐
Task ──► Supervisor ─┼─ Researcher ─┼──► Result
└─ Tester ─────┘
Best For
Complex projects requiring multiple skills — like building a feature that needs research, coding, testing, and documentation.
Tradeoffs
Flexible and adaptive, but the supervisor is a single point of failure. Quality depends heavily on the supervisor's ability to decompose tasks and evaluate results.
Example
Build a feature: Supervisor assigns research to Researcher, implementation to Coder, and validation to Tester, then reviews and merges all outputs.
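A toy version of the delegate-review-synthesize loop, with lambdas standing in for specialist agents (all names here are illustrative):

```python
# Hypothetical specialists; in practice each is an agent with its
# own prompt, tools, and model.
SPECIALISTS = {
    "research": lambda t: f"findings: {t}",
    "code": lambda t: f"patch: {t}",
    "test": lambda t: f"report: {t}",
}

def supervisor(task: str) -> str:
    outputs = {}
    for role, agent in SPECIALISTS.items():
        result = agent(task)
        # Review step: a real supervisor would critique the result
        # and request revisions; this sketch accepts the first draft.
        outputs[role] = result
    # Synthesize all specialist outputs into one deliverable.
    return " | ".join(f"{role}: {out}" for role, out in outputs.items())

deliverable = supervisor("add login feature")
```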
Map-Reduce
Complexity: Medium · Agents: 3-20+
Parallel map step processes chunks independently, then a reduce step aggregates and synthesizes all partial results.
How It Works
Input is split into chunks. The map phase sends each chunk to a worker agent for parallel processing. The reduce phase takes all mapped outputs and synthesizes them into a single coherent result, resolving conflicts and extracting patterns.
Diagram
┌─ Map 1 ─┐
Input ──►│ Map 2 │──► Reduce ──► Result
└─ Map N ─┘
Best For
Large-scale data processing, document summarization, multi-source analysis, and any task where divide-and-conquer applies.
Tradeoffs
Excellent for large inputs but the reduce step can become a bottleneck. Requires that the task be decomposable into independent chunks.
Example
Summarize a 200-page document: Map agents summarize 10-page chunks in parallel, then a Reduce agent synthesizes chunk summaries into a coherent executive summary.
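The split/map/reduce steps can be sketched as below, with stub agents in place of real LLM summarizers (chunking by character count is a simplification; real systems chunk by pages or tokens):

```python
def map_agent(chunk: str) -> str:
    # Hypothetical per-chunk summarizer.
    return f"summary({chunk})"

def reduce_agent(partials: list[str]) -> str:
    # Hypothetical synthesizer that merges partial summaries.
    return "synthesis: " + "; ".join(partials)

def map_reduce(document: str, chunk_size: int) -> str:
    # Split the input into independent chunks.
    chunks = [document[i:i + chunk_size]
              for i in range(0, len(document), chunk_size)]
    # Map phase: each chunk is processed independently (and could
    # run in parallel, as in the Swarm pattern).
    partials = [map_agent(c) for c in chunks]
    # Reduce phase: aggregate all partial results.
    return reduce_agent(partials)

result = map_reduce("abcdefghijklmnop", chunk_size=8)
```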
Debate
Complexity: Medium · Agents: 3-5
Multiple agents propose competing solutions, then a critic agent evaluates arguments and selects or synthesizes the best approach.
How It Works
Proposer agents independently generate solutions to the same problem. A critic agent reviews all proposals, identifies strengths and weaknesses, and either selects the best one or synthesizes a superior solution from the best elements of each.
Diagram
┌─ Proposer A ─┐
│  Proposer B  │──► Critic ──► Best Solution
└─ Proposer C ─┘
Best For
Decision-making, strategy selection, code architecture choices, and any task where exploring multiple approaches improves quality.
Tradeoffs
Produces higher-quality outputs through adversarial evaluation, but uses 3-5x more tokens than a single agent. Not suitable for straightforward tasks.
Example
Architecture decision: Three agents propose different database designs (SQL, NoSQL, graph), then a Critic evaluates each against requirements and picks the winner.
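A minimal sketch of propose-then-critique; the proposers are stub lambdas and the critic scores by a toy heuristic (a real critic is an LLM judging each proposal against the requirements):

```python
# Hypothetical proposer agents, each prompted with the same problem.
PROPOSERS = {
    "sql": lambda p: f"SQL design for {p}",
    "nosql": lambda p: f"NoSQL design for {p}",
    "graph": lambda p: f"graph DB design for {p}",
}

def critic(proposals: dict[str, str]) -> str:
    # Toy scoring by length, purely for illustration; a real critic
    # evaluates strengths and weaknesses against the requirements.
    return max(proposals.values(), key=len)

def debate(problem: str) -> str:
    # Proposers generate competing solutions independently.
    proposals = {name: agent(problem) for name, agent in PROPOSERS.items()}
    return critic(proposals)

winner = debate("user store")
```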
Hierarchical
Complexity: High · Agents: 7-50+
Tree of supervisors and workers where top-level managers delegate to mid-level coordinators who manage specialized worker teams.
How It Works
A top-level supervisor breaks a large objective into domains. Each domain is assigned to a mid-level supervisor who further decomposes it into tasks for worker agents. Results flow back up the tree, with each level aggregating and quality-checking.
Diagram
Director
┌───┴───┐
Manager A Manager B
┌──┴──┐ ┌──┴──┐
W1    W2   W3    W4
Best For
Large-scale orchestration with many agents, enterprise workflows, and projects that naturally decompose into teams and sub-teams.
Tradeoffs
Scales to many agents and complex projects, but introduces coordination overhead and latency from multiple management layers. Harder to debug when things go wrong.
Example
Build a full-stack app: Director delegates frontend to Manager A (with UI and CSS workers) and backend to Manager B (with API and DB workers).
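The two-level tree can be sketched with plain functions; the domains and subtasks below mirror the example and are purely illustrative:

```python
def worker(task: str) -> str:
    # Hypothetical leaf agent doing one concrete task.
    return f"done:{task}"

def manager(domain: str, subtasks: list[str]) -> str:
    # Mid level: delegate to workers, then aggregate and
    # quality-check their results before reporting up.
    results = [worker(t) for t in subtasks]
    return f"{domain}[{', '.join(results)}]"

def director(objective: str) -> str:
    # Top level: split the objective into domains, one per manager.
    frontend = manager("frontend", ["ui", "css"])
    backend = manager("backend", ["api", "db"])
    return f"{objective}: {frontend} + {backend}"

app = director("todo app")
```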
Router
Complexity: Low · Agents: 3-10
A classifier agent analyzes incoming tasks and routes each one to the most appropriate specialist agent for handling.
How It Works
The router agent receives a task, classifies its type or domain, and forwards it to the matching specialist. Each specialist handles only the task types it excels at. This ensures optimal agent selection without wasting tokens on mismatched capabilities.
Diagram
┌─ Code Agent
Task ──► Router ─┼─ Writing Agent
└─ Research Agent
Best For
Heterogeneous workloads where different tasks require different models, tools, or expertise — like a support system handling billing, technical, and account queries.
Tradeoffs
Efficient routing reduces cost and improves quality, but accuracy depends on the classifier. Misrouted tasks can produce poor results. Adding new task types requires updating the router.
Example
Customer support: Router classifies tickets as billing, technical, or account issues, then routes each to the specialist agent with the right tools and knowledge.
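A sketch of the classify-then-dispatch flow; the keyword classifier below is a stand-in for an LLM-based classifier, and all specialist names are hypothetical:

```python
# Hypothetical specialists keyed by ticket category.
SPECIALISTS = {
    "billing": lambda t: f"billing agent handled: {t}",
    "technical": lambda t: f"technical agent handled: {t}",
    "account": lambda t: f"account agent handled: {t}",
}

def classify(ticket: str) -> str:
    # Toy keyword classifier; in practice this is an LLM call that
    # returns one of the known categories.
    if "refund" in ticket or "invoice" in ticket:
        return "billing"
    if "crash" in ticket or "error" in ticket:
        return "technical"
    return "account"

def route(ticket: str) -> str:
    # Dispatch the ticket to the matching specialist.
    return SPECIALISTS[classify(ticket)](ticket)

reply = route("the app shows an error on login")
```

Note the router's failure mode: any ticket the classifier cannot place falls through to the default category, which is where misrouting shows up in practice.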
Reflection
Complexity: Low · Agents: 1-2
An agent generates output, then a critic (or the same agent) reviews and improves it iteratively until quality criteria are met.
How It Works
The generator agent produces an initial output. A critic agent (or the generator in self-critique mode) evaluates the output against quality criteria and provides specific feedback. The generator revises based on feedback. This loop repeats until the output meets the quality bar or a max iteration count is reached.
Diagram
Task ──► Generator ──► Critic ──┐
▲ │
└── Revise ◄───────┘
Best For
Quality-sensitive outputs like production code, published writing, complex analysis, and any task where iterative refinement significantly improves results.
Tradeoffs
Produces noticeably higher quality output, but each iteration adds latency and token cost. Requires well-defined quality criteria to avoid infinite loops.
Example
Code generation: Generator writes a function, Critic reviews for bugs and edge cases, Generator revises until all tests pass and code review criteria are met.
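The generate-critique-revise loop, with a max-iteration cap to prevent infinite loops; generator and critic are toy stand-ins for real agents:

```python
def generator(task: str, feedback: str = "") -> str:
    # Hypothetical generator: appends the critic's feedback to
    # simulate a revision.
    return task + feedback

def critic(output: str):
    # Hypothetical critic: returns feedback until the output meets
    # the quality bar (here, two revisions), then None.
    return " +fix" if output.count("+fix") < 2 else None

def reflect(task: str, max_iters: int = 5) -> str:
    output = generator(task)
    for _ in range(max_iters):
        feedback = critic(output)
        if feedback is None:  # quality bar met
            break
        output = generator(output, feedback)
    return output

result = reflect("draft")
```

The `max_iters` cap is the guard the Tradeoffs section calls for: without it, a critic that never approves would loop forever.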
Consensus
Complexity: Medium · Agents: 3-7
Multiple agents independently solve the same problem, then a mediator picks the best solution or synthesizes a combined answer. Like best-of-N sampling at the agent level.
How It Works
N agents receive the same task and work on it independently with no communication. Each produces a complete solution. A mediator agent then reviews all solutions, scores them against criteria, and either selects the strongest one or synthesizes a superior answer by combining the best elements from each.
Diagram
┌─ Agent 1 ─┐
Task ──► │ Agent 2 │ ──► Mediator ──► Best Solution
└─ Agent N ─┘
Best For
High-stakes decisions, code review, content quality gates, and any task where independent perspectives reduce bias and catch errors.
Tradeoffs
Dramatically improves output quality through redundancy, but multiplies token cost by the number of agents. The mediator must be capable enough to judge solution quality accurately.
Example
Code review: Three agents independently review a pull request for bugs, security issues, and performance. A mediator synthesizes findings into a single review with deduplicated, prioritized feedback.
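A best-of-N sketch: N stub agents answer independently and a toy mediator scores them (a real mediator is an LLM applying a rubric; length-based scoring here is only for illustration):

```python
# Hypothetical independent solvers for the same task.
AGENTS = [
    lambda t: f"fix {t}",
    lambda t: f"fix {t} and add a test",
    lambda t: f"fix {t} and add a test and docs",
]

def mediator(solutions: list[str]) -> str:
    # Toy judge: pick the most thorough-looking answer by length.
    return max(solutions, key=len)

def consensus(task: str) -> str:
    # Each agent works with no communication with the others.
    solutions = [agent(task) for agent in AGENTS]
    return mediator(solutions)

best = consensus("the bug")
```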
Blackboard
Complexity: High · Agents: 3-10
Agents share a common knowledge base (the blackboard) and contribute partial solutions. Each agent watches the blackboard and acts when it can contribute.
How It Works
A shared data store (the blackboard) holds the current problem state and partial solutions. Specialist agents monitor the blackboard and activate when they detect something they can contribute to. Each agent reads the current state, adds its partial solution, and writes back. A controller checks completion criteria after each update.
Diagram
Agent A ──►┌─────────────┐◄── Agent C
│ Blackboard │
Agent B ──►│ (shared │◄── Agent D
│ state) │
└──────┬──────┘
▼
Result
Best For
Complex problem solving, collaborative analysis, research synthesis, and tasks where the solution emerges from incremental contributions by different specialists.
Tradeoffs
Highly flexible and supports emergent problem solving, but requires careful state management. Agents may conflict or overwrite each other's contributions. Harder to predict execution order and debug.
Example
Research synthesis: A blackboard holds a research question. An evidence-gathering agent adds sources, an analysis agent identifies patterns, a critique agent flags contradictions, and a writing agent drafts conclusions - all reading and writing to the shared board.
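A minimal blackboard, using a dict as the shared store. Each stub agent returns True only when it can contribute, and the controller loops until no agent fires; all keys and agent names are illustrative:

```python
def evidence(board: dict) -> bool:
    # Activates once: adds sources to an empty board.
    if "sources" not in board:
        board["sources"] = ["paper A", "paper B"]
        return True
    return False

def analysis(board: dict) -> bool:
    # Activates only after evidence is on the board.
    if "sources" in board and "patterns" not in board:
        board["patterns"] = f"patterns from {len(board['sources'])} sources"
        return True
    return False

def writer(board: dict) -> bool:
    # Activates only after the analysis exists.
    if "patterns" in board and "conclusion" not in board:
        board["conclusion"] = "conclusion: " + board["patterns"]
        return True
    return False

def run_blackboard(question: str) -> dict:
    board = {"question": question}
    agents = [evidence, analysis, writer]
    # Controller: keep polling until no agent can contribute.
    while any(agent(board) for agent in agents):
        pass
    return board

board = run_blackboard("do agents help?")
```

Because activation depends only on board state, execution order emerges from the data rather than a fixed schedule, which is exactly what makes this pattern flexible and hard to debug.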
Guardrail
Complexity: Low · Agents: 2
A primary agent does the work while a secondary agent monitors and validates output in real time. The guardrail agent can block, modify, or flag unsafe or incorrect outputs.
How It Works
The primary agent generates output as normal. Before the output is delivered, a guardrail agent intercepts and evaluates it against safety rules, quality standards, or business constraints. If the output passes, it goes through. If it fails, the guardrail either blocks it, requests a revision, or modifies it directly.
Diagram
Task ──► Primary Agent ──► Guardrail ──► Output
│
[block/revise]
│
▼
Rejected
Best For
Safety-critical applications, content moderation, code validation, regulatory compliance, and any task where unchecked output poses risk.
Tradeoffs
Adds a reliable safety layer with minimal complexity, but introduces latency for every output. The guardrail agent must be fast and accurate - false positives block good output, false negatives defeat the purpose.
Example
Code generation: A coding agent writes functions while a guardrail agent checks every output for SQL injection, hardcoded secrets, and unsafe dependencies before the code reaches the user.
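An intercept-and-check sketch: the guardrail scans the primary agent's output against banned patterns before it is delivered. The pattern list and both agents are toy stand-ins (a real guardrail might be a fast classifier model or static analyzer):

```python
# Hypothetical banned patterns; a real guardrail would use proper
# static analysis or a safety-tuned model.
BANNED = ("password =", "DROP TABLE")

def primary(task: str) -> str:
    # Hypothetical code-writing agent.
    return f'print("{task}")'

def guardrail(output: str) -> tuple[bool, str]:
    # Intercept: block anything matching a banned pattern,
    # pass everything else through unchanged.
    for pat in BANNED:
        if pat in output:
            return False, f"blocked: contains {pat!r}"
    return True, output

def run(task: str) -> str:
    ok, result = guardrail(primary(task))
    # A fuller version would request a revision on failure instead
    # of just returning the rejection message.
    return result

safe = run("hello")
blocked = guardrail('password = "hunter2"')
```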
Tool Augmented
Complexity: Low · Agents: 1
A single agent with access to a rich tool ecosystem - MCP servers, APIs, file system, databases. Not multi-agent per se, but the tools act as specialized capabilities that extend the agent's reach.
How It Works
One agent receives a task and has access to a registry of tools (via MCP, function calling, or plugin systems). It plans a sequence of tool calls, executes them, observes results, and iterates until the task is complete. The agent decides which tools to use, in what order, and how to combine their outputs.
Diagram
┌─ MCP Server
├─ API
Agent ──────┼─ File System
├─ Database
└─ Browser
▼
Result
Best For
General-purpose automation, developer workflows, Claude Code-style usage, and tasks where a single smart agent with the right tools outperforms a team of agents.
Tradeoffs
Simple architecture with no coordination overhead, but limited by the agent's ability to plan and use tools correctly. Performance ceiling is bounded by a single model's reasoning capacity.
Example
Developer workflow: Claude Code reads files, runs tests, searches codebases, executes shell commands, and calls APIs - all through tool use from a single agent managing the entire task.
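The plan-execute-observe loop can be sketched with a toy tool registry; the tool names, the fixed plan, and the stub implementations are all hypothetical (a real agent would choose tools dynamically from model output):

```python
# Hypothetical tool registry; in practice these are MCP servers,
# function-calling tools, or plugins.
TOOLS = {
    "read_file": lambda path: f"contents of {path}",
    "run_tests": lambda target: f"tests passed for {target}",
}

def agent(task: str) -> list[str]:
    # The agent plans a sequence of tool calls. This sketch uses a
    # fixed plan; a real agent derives it from the task and revises
    # it as observations come in.
    plan = [("read_file", "app.py"), ("run_tests", "app.py")]
    observations = []
    for tool, arg in plan:
        # Execute each call and record the observation for the
        # next planning step.
        observations.append(TOOLS[tool](arg))
    return observations

trace = agent("check app.py")
```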