Learn how to design multi-agent systems — supervisor patterns, agent handoff, parallel execution, CrewAI and AutoGen examples, and when to use multi-agent over single-agent.
A single AI agent is powerful. A team of specialized agents working in parallel can solve problems that would overwhelm one. This guide covers the patterns and tradeoffs of multi-agent systems.
Single agents break down when:
- the context window fills up with tool outputs and intermediate results,
- one prompt has to juggle too many tools or conflicting responsibilities,
- subtasks could run in parallel but a single agent loop forces them to run serially.
1. SUPERVISOR / ORCHESTRATOR

   ┌──────────────┐
   │  Supervisor  │ ← receives user goal
   └──────┬───────┘
          │ delegates
    ┌─────┼─────┐
    ▼     ▼     ▼
  Agent Agent Agent ← specialized workers
    │     │     │
    └─────┼─────┘
          │ results
          ▼
   ┌──────────────┐
   │  Supervisor  │ ← synthesizes final answer
   └──────────────┘
2. PIPELINE (Sequential)

   Agent A    →    Agent B    →    Agent C    →    Output
  (researcher)     (writer)        (editor)
3. DEBATE (Adversarial)
Agent A ←→ Agent B ← argue until consensus
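The debate loop is simple to sketch in plain Python; here `ask` is a placeholder for your LLM call, and "consensus" is naively detected as an explicit concession:

```python
def debate(ask, question, max_rounds=4):
    """Two agents argue until one concedes or rounds run out.

    `ask(role, transcript) -> str` is a placeholder for an LLM call
    that returns the named agent's next reply.
    """
    transcript = [f"Question: {question}"]
    for _ in range(max_rounds):
        for role in ("proponent", "critic"):
            reply = ask(role, "\n".join(transcript))
            transcript.append(f"{role}: {reply}")
            # Treat an explicit concession as consensus
            if "I agree" in reply:
                return transcript
    return transcript
```

In practice you would detect consensus with a judge model or a structured "concede" field rather than string matching.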
4. PARALLEL WORKERS

   Task → [Agent 1, Agent 2, Agent 3] ← simultaneous
                      ↓
                Merge results
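The fan-out/merge step maps directly onto `asyncio.gather`; this sketch assumes each worker agent is an async callable, and the merge is a simple concatenation (it could just as well be another LLM call):

```python
import asyncio

async def fan_out(task, workers):
    """Run every worker agent on the same task concurrently, then merge.

    Each worker is an async callable `worker(task) -> str`.
    """
    results = await asyncio.gather(*(w(task) for w in workers))
    # Merge step: concatenate partial results for a downstream synthesizer
    return "\n".join(results)
```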
Here is the supervisor pattern as a LangGraph state machine (this assumes an `llm` client, a `SUPERVISOR_PROMPT`, and a `parse_next_agent` helper are defined elsewhere):

from langgraph.graph import StateGraph, END
from typing import TypedDict

class AgentState(TypedDict):
    messages: list
    next_agent: str

# Define specialized agents
def researcher_agent(state: AgentState) -> AgentState:
    """Searches the web and gathers information."""
    # ... use search tools to produce `result`
    return {"messages": state["messages"] + [result], "next_agent": "supervisor"}

def writer_agent(state: AgentState) -> AgentState:
    """Drafts content based on research."""
    # ... generate `draft`
    return {"messages": state["messages"] + [draft], "next_agent": "supervisor"}

def supervisor(state: AgentState) -> AgentState:
    """Decides which agent to call next, or finishes."""
    response = llm.invoke(SUPERVISOR_PROMPT + str(state["messages"]))
    next_agent = parse_next_agent(response)  # "researcher" | "writer" | "END"
    return {"messages": state["messages"], "next_agent": next_agent}

# Build the graph
graph = StateGraph(AgentState)
graph.add_node("supervisor", supervisor)
graph.add_node("researcher", researcher_agent)
graph.add_node("writer", writer_agent)

# Supervisor decides who goes next
graph.add_conditional_edges(
    "supervisor",
    lambda state: state["next_agent"],
    {"researcher": "researcher", "writer": "writer", "END": END},
)

# Workers always return to supervisor
graph.add_edge("researcher", "supervisor")
graph.add_edge("writer", "supervisor")

graph.set_entry_point("supervisor")
app = graph.compile()

# Run
result = app.invoke({"messages": [user_message], "next_agent": "supervisor"})
CrewAI has one of the most intuitive APIs for defining agent teams with roles and goals:
from crewai import Agent, Task, Crew, Process

# Define agents with roles
researcher = Agent(
    role="Senior Research Analyst",
    goal="Uncover cutting-edge developments in AI and data science",
    backstory="You are an expert researcher with a talent for finding"
              " accurate, up-to-date information from the web.",
    tools=[search_tool, scraper_tool],
    verbose=True,
)

writer = Agent(
    role="Tech Content Writer",
    goal="Write engaging, accurate technical blog posts",
    backstory="You craft compelling narratives from complex data,"
              " tailored for a developer audience.",
    tools=[],
)

editor = Agent(
    role="Editor",
    goal="Polish content for clarity, accuracy, and SEO",
    backstory="You have a sharp eye for improving technical writing.",
    tools=[],
)

# Define tasks
research_task = Task(
    description="Research the latest developments in {topic}.",  # {topic} is filled from kickoff(inputs=...)
    expected_output="A detailed report with key findings, stats, and source URLs.",
    agent=researcher,
)

writing_task = Task(
    description="Write a 1500-word blog post based on the research report.",
    expected_output="A complete blog post in Markdown format.",
    agent=writer,
    context=[research_task],  # depends on research output
)

edit_task = Task(
    description="Edit and improve the draft blog post.",
    expected_output="The final, polished blog post ready to publish.",
    agent=editor,
    context=[writing_task],
)

# Assemble and run the crew
crew = Crew(
    agents=[researcher, writer, editor],
    tasks=[research_task, writing_task, edit_task],
    process=Process.sequential,  # or Process.hierarchical
    verbose=True,
)

result = crew.kickoff(inputs={"topic": "AI agents in 2025"})
print(result.raw)
Microsoft AutoGen models agents as autonomous conversation partners. Agents chat back and forth until the task is done:
import autogen

config = {"config_list": [{"model": "gpt-4o", "api_key": OPENAI_KEY}]}

# UserProxyAgent acts on behalf of the user (can execute code)
user_proxy = autogen.UserProxyAgent(
    name="user_proxy",
    human_input_mode="NEVER",  # fully automated
    max_consecutive_auto_reply=10,
    code_execution_config={"work_dir": "output", "use_docker": True},
)

# AssistantAgent writes and reviews code
assistant = autogen.AssistantAgent(
    name="assistant",
    llm_config=config,
    system_message="You are a Python expert. Write clean, tested code.",
)

# Start the conversation
user_proxy.initiate_chat(
    assistant,
    message="Write a Python script that fetches the top 10 HN stories and saves to CSV",
)

# AutoGen loop:
# 1. assistant writes code
# 2. user_proxy executes code
# 3. user_proxy sends output back to assistant
# 4. assistant reviews / fixes
# 5. repeat until TERMINATE
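Stripped of the framework, that loop reduces to a few lines; in this sketch `generate` stands in for the assistant's LLM call and `execute` for the user proxy's sandboxed code runner:

```python
def code_loop(generate, execute, task, max_turns=10):
    """AutoGen-style write/execute/review loop, reduced to its core.

    `generate(prompt) -> str` is a placeholder for the assistant's LLM
    call; `execute(code) -> str` for the sandboxed runner. Stops when
    the assistant replies with TERMINATE or the turn budget runs out.
    """
    prompt = task
    for _ in range(max_turns):
        reply = generate(prompt)
        if "TERMINATE" in reply:
            return reply
        output = execute(reply)                 # run the proposed code
        prompt = f"{task}\nOutput:\n{output}"   # feed results back for review
    return reply
```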
When agents call each other, use a structured message format:
{
  "from": "supervisor",
  "to": "researcher",
  "task_id": "task_01J4...",
  "type": "task_request",
  "payload": {
    "instruction": "Find the latest pricing for Pinecone, Weaviate, and Qdrant vector databases",
    "output_format": "JSON with keys: provider, plan_name, price_per_month, free_tier",
    "deadline": "2025-06-01T12:00:00Z"
  },
  "context": {
    "parent_goal": "Create a vector DB comparison guide",
    "prior_findings": []
  }
}
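A lightweight validator for this envelope can be written with the stdlib alone (field names match the example above; the required set is an assumption you would tune to your protocol):

```python
REQUIRED_FIELDS = {"from", "to", "task_id", "type", "payload"}

def validate_message(msg: dict) -> dict:
    """Reject malformed inter-agent messages before they reach an agent."""
    missing = REQUIRED_FIELDS - msg.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    if not isinstance(msg["payload"], dict):
        raise ValueError("payload must be an object")
    msg.setdefault("context", {})  # context is optional
    return msg
```

Rejecting bad messages at the boundary keeps one agent's malformed output from silently corrupting the next agent's context.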
Multi-agent systems amplify both capability and risk. Each agent call costs money and time, and errors compound:
Critical safeguards:

1. Max iterations per agent (prevent loops)

   agent = Agent(max_iter=5)

2. Budget limits

   crew = Crew(max_rpm=10)  # cap request rate; track token spend in a callback

3. Timeouts

   async with asyncio.timeout(120):
       result = await agent.run(task)

4. Human-in-the-loop checkpoints

   # Pause for human approval before irreversible actions
   if action.type in IRREVERSIBLE_ACTIONS:
       await request_human_approval(action)

5. Structured output validation

   # Validate each agent's output before passing to the next
   output = AgentOutputSchema.model_validate(raw_output)
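The iteration and timeout safeguards compose into one generic wrapper. This is a sketch: `agent_step` is a placeholder for a single agent call that returns `(new_state, done)`:

```python
import asyncio

async def run_guarded(agent_step, task, max_iters=5, timeout_s=120):
    """Bound an agent loop by both iteration count and wall-clock time."""
    async def loop():
        state = task
        for _ in range(max_iters):
            state, done = await agent_step(state)
            if done:
                return state
        raise RuntimeError("max iterations exceeded")
    # wait_for works on Python < 3.11, unlike asyncio.timeout
    return await asyncio.wait_for(loop(), timeout=timeout_s)
```

Wrapping every agent invocation this way means a single runaway worker fails fast instead of burning the whole run's budget.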
Decision tree:

Do you need agents to talk to each other freely?
├── YES → AutoGen (conversational, flexible)
└── NO
     │
     Do you need role-based teams with tasks?
     ├── YES → CrewAI (simple, intuitive API)
     └── NO
          │
          Do you need fine-grained control of graph flow?
          ├── YES → LangGraph (complex state machines)
          └── NO
               │
               Are you building for TypeScript/full-stack web?
               ├── YES → Vercel AI SDK
               └── NO → LangChain agents (most integrations)
Multi-agent systems are still maturing fast. Favor simplicity: start with a single well-prompted agent, add a second only when you hit a clear wall.