Learn how to design multi-agent systems — supervisor patterns, agent handoff, parallel execution, CrewAI and AutoGen examples, and when to use multi-agent over single-agent.
A single AI agent is powerful. A team of specialized agents working in parallel can solve problems that would overwhelm one. This guide covers the patterns and tradeoffs of multi-agent systems.
Single agents break down when:
- the context window fills up with tool outputs and intermediate results,
- one prompt has to juggle too many tools or conflicting responsibilities,
- subtasks could run in parallel but a single agent loop forces them to run serially.
1. SUPERVISOR / ORCHESTRATOR

   ┌──────────────┐
   │  Supervisor  │ ← receives user goal
   └──────┬───────┘
          │ delegates
    ┌─────┼─────┐
    ▼     ▼     ▼
  Agent Agent Agent ← specialized workers
    │     │     │
    └─────┼─────┘
          │ results
          ▼
   ┌──────────────┐
   │  Supervisor  │ ← synthesizes final answer
   └──────────────┘
2. PIPELINE (Sequential)

   Agent A    →    Agent B    →    Agent C    →    Output
  (researcher)     (writer)        (editor)
3. DEBATE (Adversarial)
Agent A ←→ Agent B ← argue until consensus
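The debate loop is simple to sketch in plain Python; here `ask` is a placeholder for your LLM call, and "consensus" is naively detected as an explicit concession:

```python
def debate(ask, question, max_rounds=4):
    """Two agents argue until one concedes or rounds run out.

    `ask(role, transcript) -> str` is a placeholder for an LLM call
    that returns the named agent's next reply.
    """
    transcript = [f"Question: {question}"]
    for _ in range(max_rounds):
        for role in ("proponent", "critic"):
            reply = ask(role, "\n".join(transcript))
            transcript.append(f"{role}: {reply}")
            # Treat an explicit concession as consensus
            if "I agree" in reply:
                return transcript
    return transcript
```

In practice you would detect consensus with a judge model or a structured "concede" field rather than string matching.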
4. PARALLEL WORKERS

   Task → [Agent 1, Agent 2, Agent 3] ← simultaneous
                      ↓
                Merge results
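The fan-out/merge step maps directly onto `asyncio.gather`; this sketch assumes each worker agent is an async callable, and the merge is a simple concatenation (it could just as well be another LLM call):

```python
import asyncio

async def fan_out(task, workers):
    """Run every worker agent on the same task concurrently, then merge.

    Each worker is an async callable `worker(task) -> str`.
    """
    results = await asyncio.gather(*(w(task) for w in workers))
    # Merge step: concatenate partial results for a downstream synthesizer
    return "\n".join(results)
```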
Here is the supervisor pattern as a LangGraph state machine (this assumes an `llm` client, a `SUPERVISOR_PROMPT`, and a `parse_next_agent` helper are defined elsewhere):

from langgraph.graph import StateGraph, END
from typing import TypedDict

class AgentState(TypedDict):
    messages: list
    next_agent: str

# Define specialized agents
def researcher_agent(state: AgentState) -> AgentState:
    """Searches the web and gathers information."""
    # ... use search tools to produce `result`
    return {"messages": state["messages"] + [result], "next_agent": "supervisor"}

def writer_agent(state: AgentState) -> AgentState:
    """Drafts content based on research."""
    # ... generate `draft`
    return {"messages": state["messages"] + [draft], "next_agent": "supervisor"}

def supervisor(state: AgentState) -> AgentState:
    """Decides which agent to call next, or finishes."""
    response = llm.invoke(SUPERVISOR_PROMPT + str(state["messages"]))
    next_agent = parse_next_agent(response)  # "researcher" | "writer" | "END"
    return {"messages": state["messages"], "next_agent": next_agent}

# Build the graph
graph = StateGraph(AgentState)
graph.add_node("supervisor", supervisor)
graph.add_node("researcher", researcher_agent)
graph.add_node("writer", writer_agent)

# Supervisor decides who goes next
graph.add_conditional_edges(
    "supervisor",
    lambda state: state["next_agent"],
    {"researcher": "researcher", "writer": "writer", "END": END},
)

# Workers always return to supervisor
graph.add_edge("researcher", "supervisor")
graph.add_edge("writer", "supervisor")

graph.set_entry_point("supervisor")
app = graph.compile()

# Run
result = app.invoke({"messages": [user_message], "next_agent": "supervisor"})
CrewAI has one of the most intuitive APIs for defining agent teams with roles and goals:
from crewai import Agent, Task, Crew, Process

# Define agents with roles
researcher = Agent(
    role="Senior Research Analyst",
    goal="Uncover cutting-edge developments in AI and data science",
    backstory="You are an expert researcher with a talent for finding"
              " accurate, up-to-date information from the web.",
    tools=[search_tool, scraper_tool],
    verbose=True,
)

writer = Agent(
    role="Tech Content Writer",
    goal="Write engaging, accurate technical blog posts",
    backstory="You craft compelling narratives from complex data,"
              " tailored for a developer audience.",
    tools=[],
)

editor = Agent(
    role="Editor",
    goal="Polish content for clarity, accuracy, and SEO",
    backstory="You have a sharp eye for improving technical writing.",
    tools=[],
)

# Define tasks
research_task = Task(
    description="Research the latest developments in {topic}.",  # {topic} is filled from kickoff(inputs=...)
    expected_output="A detailed report with key findings, stats, and source URLs.",
    agent=researcher,
)

writing_task = Task(
    description="Write a 1500-word blog post based on the research report.",
    expected_output="A complete blog post in Markdown format.",
    agent=writer,
    context=[research_task],  # depends on research output
)

edit_task = Task(
    description="Edit and improve the draft blog post.",
    expected_output="The final, polished blog post ready to publish.",
    agent=editor,
    context=[writing_task],
)

# Assemble and run the crew
crew = Crew(
    agents=[researcher, writer, editor],
    tasks=[research_task, writing_task, edit_task],
    process=Process.sequential,  # or Process.hierarchical
    verbose=True,
)

result = crew.kickoff(inputs={"topic": "AI agents in 2025"})
print(result.raw)
Microsoft AutoGen models agents as autonomous conversation partners. Agents chat back and forth until the task is done:
import autogen

config = {"config_list": [{"model": "gpt-4o", "api_key": OPENAI_KEY}]}

# UserProxyAgent acts on behalf of the user (can execute code)
user_proxy = autogen.UserProxyAgent(
    name="user_proxy",
    human_input_mode="NEVER",  # fully automated
    max_consecutive_auto_reply=10,
    code_execution_config={"work_dir": "output", "use_docker": True},
)

# AssistantAgent writes and reviews code
assistant = autogen.AssistantAgent(
    name="assistant",
    llm_config=config,
    system_message="You are a Python expert. Write clean, tested code.",
)

# Start the conversation
user_proxy.initiate_chat(
    assistant,
    message="Write a Python script that fetches the top 10 HN stories and saves to CSV",
)

# AutoGen loop:
# 1. assistant writes code
# 2. user_proxy executes code
# 3. user_proxy sends output back to assistant
# 4. assistant reviews / fixes
# 5. repeat until TERMINATE
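Stripped of the framework, that loop reduces to a few lines; in this sketch `generate` stands in for the assistant's LLM call and `execute` for the user proxy's sandboxed code runner:

```python
def code_loop(generate, execute, task, max_turns=10):
    """AutoGen-style write/execute/review loop, reduced to its core.

    `generate(prompt) -> str` is a placeholder for the assistant's LLM
    call; `execute(code) -> str` for the sandboxed runner. Stops when
    the assistant replies with TERMINATE or the turn budget runs out.
    """
    prompt = task
    for _ in range(max_turns):
        reply = generate(prompt)
        if "TERMINATE" in reply:
            return reply
        output = execute(reply)                 # run the proposed code
        prompt = f"{task}\nOutput:\n{output}"   # feed results back for review
    return reply
```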
When agents call each other, use a structured message format:
{
  "from": "supervisor",
  "to": "researcher",
  "task_id": "task_01J4...",
  "type": "task_request",
  "payload": {
    "instruction": "Find the latest pricing for Pinecone, Weaviate, and Qdrant vector databases",
    "output_format": "JSON with keys: provider, plan_name, price_per_month, free_tier",
    "deadline": "2025-06-01T12:00:00Z"
  },
  "context": {
    "parent_goal": "Create a vector DB comparison guide",
    "prior_findings": []
  }
}
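A lightweight validator for this envelope can be written with the stdlib alone (field names match the example above; the required set is an assumption you would tune to your protocol):

```python
REQUIRED_FIELDS = {"from", "to", "task_id", "type", "payload"}

def validate_message(msg: dict) -> dict:
    """Reject malformed inter-agent messages before they reach an agent."""
    missing = REQUIRED_FIELDS - msg.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    if not isinstance(msg["payload"], dict):
        raise ValueError("payload must be an object")
    msg.setdefault("context", {})  # context is optional
    return msg
```

Rejecting bad messages at the boundary keeps one agent's malformed output from silently corrupting the next agent's context.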
Multi-agent systems amplify both capability and risk. Each agent call costs money and time, and errors compound:
Critical safeguards:

1. Max iterations per agent (prevent loops)

   agent = Agent(max_iter=5)

2. Budget limits

   crew = Crew(max_rpm=10)  # cap request rate; track token spend in a callback

3. Timeouts

   async with asyncio.timeout(120):
       result = await agent.run(task)

4. Human-in-the-loop checkpoints

   # Pause for human approval before irreversible actions
   if action.type in IRREVERSIBLE_ACTIONS:
       await request_human_approval(action)

5. Structured output validation

   # Validate each agent's output before passing to the next
   output = AgentOutputSchema.model_validate(raw_output)
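The iteration and timeout safeguards compose into one generic wrapper. This is a sketch: `agent_step` is a placeholder for a single agent call that returns `(new_state, done)`:

```python
import asyncio

async def run_guarded(agent_step, task, max_iters=5, timeout_s=120):
    """Bound an agent loop by both iteration count and wall-clock time."""
    async def loop():
        state = task
        for _ in range(max_iters):
            state, done = await agent_step(state)
            if done:
                return state
        raise RuntimeError("max iterations exceeded")
    # wait_for works on Python < 3.11, unlike asyncio.timeout
    return await asyncio.wait_for(loop(), timeout=timeout_s)
```

Wrapping every agent invocation this way means a single runaway worker fails fast instead of burning the whole run's budget.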
Decision tree:

Do you need agents to talk to each other freely?
├── YES → AutoGen (conversational, flexible)
└── NO
     │
     Do you need role-based teams with tasks?
     ├── YES → CrewAI (simple, intuitive API)
     └── NO
          │
          Do you need fine-grained control of graph flow?
          ├── YES → LangGraph (complex state machines)
          └── NO
               │
               Are you building for TypeScript/full-stack web?
               ├── YES → Vercel AI SDK
               └── NO → LangChain agents (most integrations)
Multi-agent systems are still maturing fast. Favor simplicity: start with a single well-prompted agent, add a second only when you hit a clear wall.