Description
Overview
The Swarm pattern is a decentralized multi-agent architecture where agents hand off control to each other peer-to-peer, without a central supervisor. It is reported to be roughly 40% faster than supervisor-based orchestration because it removes the separate routing step: each agent decides directly which peer should handle the next step.
Originally introduced by OpenAI as the "Swarm" framework (now evolved into the OpenAI Agents SDK), this pattern emphasizes ergonomic, lightweight multi-agent orchestration through explicit handoff functions.
How It Works
- Peer-to-Peer Control: Agents directly transfer control to other agents using handoff functions
- No Central Router: Each agent independently decides when to hand off and to whom
- Context Preservation: Chat history and context flow through handoffs
- Explicit Handoffs: Handoff functions are defined as tools that return an Agent object
Control Flow:
User Request → Agent A (Triage)
                   ↓ handoff_to_researcher()
               Agent B (Researcher)
                   ↓ handoff_to_analyst()
               Agent C (Analyst)
                   ↓ handoff_to_writer()
               Agent D (Writer)
                   ↓
               Final Result
Unlike supervisor patterns, there's no central decision-maker - each agent autonomously decides the next peer.
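The control flow above can be sketched without any framework. This is a minimal illustration, not the Swarm or PyWorkflow API: the `Agent` dataclass, the scripted `step` callables (stand-ins for an LLM turn), and `run_swarm` are all hypothetical names. It only shows the core idea that a handoff is a function returning the next agent, and that the loop follows whatever each agent picks.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Agent:
    name: str
    # Scripted stand-in for an LLM turn (assumption for this sketch): returns
    # the agent's text output and, optionally, the handoff function it chose.
    step: Callable[[str], Tuple[str, Optional[Callable[[], "Agent"]]]]


# Agents are defined last-to-first so each handoff function can see its target.
writer = Agent("writer", lambda req: (f"Report on: {req}", None))


def handoff_to_writer() -> Agent:
    return writer


analyst = Agent("analyst", lambda req: ("analysis complete", handoff_to_writer))


def handoff_to_analyst() -> Agent:
    return analyst


researcher = Agent("researcher", lambda req: ("sources collected", handoff_to_analyst))


def handoff_to_researcher() -> Agent:
    return researcher


triage = Agent("triage", lambda req: ("needs research", handoff_to_researcher))


def run_swarm(start: Agent, request: str, max_turns: int = 10) -> Tuple[str, List[str]]:
    """Follow handoffs peer to peer until an agent finishes without handing off."""
    current = start
    path = [start.name]
    for _ in range(max_turns):
        output, handoff_fn = current.step(request)
        if handoff_fn is None:
            return output, path  # no handoff: this agent produced the final result
        current = handoff_fn()  # the agent itself picked the next peer
        path.append(current.name)
    raise RuntimeError(f"Exceeded max turns ({max_turns})")


final_output, handoff_path = run_swarm(triage, "Find latest AI research")
print(handoff_path)
```

Note that `run_swarm` contains no routing logic of its own; it only executes the decision each agent returns, which is the difference from a supervisor loop.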
Reference Implementations
- OpenAI Swarm (archived) - Original educational framework for lightweight multi-agent orchestration
- OpenAI Agents SDK Cookbook - Orchestrating Agents: Routines and Handoffs
- OpenAI Swarm Framework Guide - Comprehensive guide to reliable multi-agent systems
- Swarm Explained with Code - Detailed breakdown of routines and handoffs
- LangGraph Multi-Agent Swarm - LangGraph implementation of swarm patterns
Proposed PyWorkflow Implementation
from pyworkflow_agents import Agent, handoff
from pyworkflow_agents.providers import AnthropicProvider
from pyworkflow import workflow, start_child_workflow

# Define agents with explicit handoff tools.
# web_search_tool, calculator_tool and file_writer_tool are assumed to be
# defined elsewhere; they are not part of this proposal.
triage_agent = Agent(
    name="triage",
    provider=AnthropicProvider(model="claude-sonnet-4-5-20250929"),
    instructions="You triage customer requests and route to specialists.",
    tools=[
        handoff("researcher", "Hand off to research specialist"),
        handoff("support", "Hand off to customer support"),
    ],
)

researcher_agent = Agent(
    name="researcher",
    provider=AnthropicProvider(model="claude-sonnet-4-5-20250929"),
    instructions="You research topics and find information.",
    tools=[
        web_search_tool,
        handoff("analyst", "Hand off to analyst for data processing"),
    ],
)

analyst_agent = Agent(
    name="analyst",
    provider=AnthropicProvider(model="claude-sonnet-4-5-20250929"),
    instructions="You analyze data and create insights.",
    tools=[
        calculator_tool,
        handoff("writer", "Hand off to writer for documentation"),
    ],
)

writer_agent = Agent(
    name="writer",
    provider=AnthropicProvider(model="claude-sonnet-4-5-20250929"),
    instructions="You write clear, concise documentation.",
    tools=[file_writer_tool],
)
# Registry used to resolve a handoff target name to its Agent
AGENTS = {
    agent.name: agent
    for agent in (triage_agent, researcher_agent, analyst_agent, writer_agent)
}

def get_agent_by_name(name: str) -> Agent:
    return AGENTS[name]

# Swarm execution via pyworkflow
@workflow(durable=True)
async def swarm_workflow(user_request: str):
    """
    Execute swarm-style agent handoffs using child workflows.
    Each handoff is a new workflow execution, creating a chain.
    """
    current_agent = triage_agent
    context = {"request": user_request, "history": []}
    max_handoffs = 10  # Prevent infinite loops
    for _ in range(max_handoffs):
        # Execute current agent as child workflow
        result = await start_child_workflow(
            agent_execution_workflow,
            agent=current_agent,
            context=context,
            wait_for_completion=True,
        )
        # Check if agent handed off to another agent
        if result.get("handoff_to"):
            context["history"].append({
                "agent": current_agent.name,
                "output": result.get("output"),
            })
            current_agent = get_agent_by_name(result["handoff_to"])
        else:
            # No handoff - task complete
            return result
    raise RuntimeError(f"Exceeded max handoffs ({max_handoffs})")
@workflow(durable=True)
async def agent_execution_workflow(agent: Agent, context: dict):
    """
    Execute a single agent. If it calls a handoff tool, return the target agent.
    """
    response = await agent.run(
        messages=context.get("history", []),
        input=context.get("request"),
    )
    # Check if response includes a handoff tool call
    if response.tool_calls:
        for tool_call in response.tool_calls:
            if tool_call.name.startswith("handoff_to_"):
                # Strip only the leading prefix (str.removeprefix, Python 3.9+)
                target_agent = tool_call.name.removeprefix("handoff_to_")
                return {
                    "handoff_to": target_agent,
                    "output": response.content,
                }
    # No handoff - return final result
    return {"output": response.content}

Key Mapping to PyWorkflow Primitives:
- Agent handoff = start_child_workflow() with wait_for_completion=True
- Handoff chain = Sequence of child workflow executions
- Handoff history = Recorded as AGENT_HANDOFF events
- Context passing = Workflow context passed to each child
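The proposal declares handoffs as handoff("researcher", ...) but detects them by a handoff_to_ prefix on the tool name, so the factory has to produce that name. A minimal sketch of how the two ends could be kept in sync follows. `HandoffTool`, `HANDOFF_PREFIX`, and `parse_handoff` are hypothetical names for illustration; the real `handoff()` in pyworkflow_agents/handoffs.py does not exist yet and may look different.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed naming convention: the tool exposed to the model is "handoff_to_<agent>".
HANDOFF_PREFIX = "handoff_to_"


@dataclass(frozen=True)
class HandoffTool:
    name: str         # tool name the model sees, e.g. "handoff_to_researcher"
    target: str       # name of the agent that receives control
    description: str  # shown to the model so it knows when to hand off


def handoff(target: str, description: str) -> HandoffTool:
    """Build a handoff tool whose name encodes the target agent."""
    return HandoffTool(
        name=f"{HANDOFF_PREFIX}{target}",
        target=target,
        description=description,
    )


def parse_handoff(tool_name: str) -> Optional[str]:
    """Return the target agent name if tool_name is a handoff tool, else None."""
    if tool_name.startswith(HANDOFF_PREFIX):
        # Slice off only the leading prefix; str.replace would also remove
        # any later occurrence of the prefix inside the agent name.
        return tool_name[len(HANDOFF_PREFIX):]
    return None


tool = handoff("researcher", "Hand off to research specialist")
print(tool.name, parse_handoff(tool.name))
```

Keeping the prefix in one constant shared by the factory and the parser avoids the two drifting apart.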
Event Types
New events for swarm pattern:
from enum import Enum

class EventType(str, Enum):
    # Existing events...
    AGENT_HANDOFF = "agent_handoff"              # Agent hands off to another agent
    HANDOFF_CHAIN_START = "handoff_chain_start"  # Start of handoff chain
    HANDOFF_CHAIN_END = "handoff_chain_end"      # End of handoff chain

Event Data Schema:
# AGENT_HANDOFF
{
    "from_agent": "triage",
    "to_agent": "researcher",
    "reason": "User request requires web research",
    "context_size": 1024,
    "child_run_id": "run_xyz789"
}
# HANDOFF_CHAIN_START
{
    "initial_agent": "triage",
    "user_request": "Find latest AI research",
    "max_handoffs": 10
}
# HANDOFF_CHAIN_END
{
    "final_agent": "writer",
    "total_handoffs": 3,
    "handoff_path": ["triage", "researcher", "analyst", "writer"],
    "execution_time_ms": 12450
}

Trade-offs
Pros
- Performance: reportedly ~40% faster than the supervisor pattern, since there is no central routing step between agents
- Simplicity: No complex orchestration logic, agents make local decisions
- Autonomy: Each agent independently determines next steps
- Natural flow: Handoffs mirror human task delegation
- Event sourcing: Full handoff chain visible in event log
- Fault tolerance: PyWorkflow's child workflows provide isolation and recovery
Cons
- No global coordination: Hard to implement complex routing strategies
- Potential cycles: Agents could hand off in circles (requires max_handoffs limit)
- Context explosion: Long handoff chains accumulate large context
- Debugging complexity: Decentralized decisions harder to trace than supervisor routing
- Child workflow overhead: Each handoff creates a new workflow run_id
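Two of the risks listed above, cycles and context growth, could be guarded with small pure helpers alongside the max_handoffs cap. The sketch below is one possible approach under stated assumptions: `is_cycling` and `trim_history` are hypothetical helpers, the visit threshold and character budget are arbitrary, and history entries are assumed to be the {"agent", "output"} dicts used in the proposed workflow.

```python
from typing import Dict, List, Optional


def is_cycling(path: List[str], max_visits: int = 2) -> bool:
    """True if the most recently entered agent has been visited more than max_visits times."""
    return bool(path) and path.count(path[-1]) > max_visits


def trim_history(history: List[Dict[str, Optional[str]]], max_chars: int) -> List[Dict[str, Optional[str]]]:
    """Keep the most recent entries whose combined output length fits in max_chars."""
    kept: List[Dict[str, Optional[str]]] = []
    used = 0
    # Walk from newest to oldest so the latest context survives trimming.
    for entry in reversed(history):
        size = len(entry.get("output") or "")
        if used + size > max_chars:
            break
        kept.append(entry)
        used += size
    kept.reverse()  # restore chronological order
    return kept


path = ["triage", "researcher", "triage", "researcher", "triage"]
print(is_cycling(path))
```

Character count is only a rough proxy for context size; a token count from the provider would be a closer measure if one is available.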
When to Use
- Linear or tree-shaped task delegation (not complex graphs)
- Performance-critical applications (minimize orchestration overhead)
- Agents have clear specialization and handoff rules
- Task flow is relatively predictable
When to Avoid
- Complex routing logic requiring global view (use supervisor)
- Need centralized audit trail of all decisions
- Agents need to consult multiple peers before deciding
- Risk of circular handoffs is high
Performance Comparison
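Figures attributed to OpenAI Swarm research (not yet benchmarked in PyWorkflow):
- Swarm handoffs: ~40% faster than supervisor routing
- Reason: the handoff is decided inside the agent's own LLM call, so no separate LLM call is spent on a routing decision
- Trade-off: Less control over orchestration logic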
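The reasoning behind the speedup can be made concrete by counting LLM calls under a deliberately simple model. The assumptions are mine, not measurements: in a swarm each agent makes one call that also carries its handoff decision, while a supervisor makes one routing call before each worker and one more to finish. The function names are hypothetical, and the sketch counts calls only; it says nothing about per-call latency and is not meant to reproduce the ~40% figure.

```python
def llm_calls_swarm(num_agents: int) -> int:
    """One call per agent; the handoff decision is part of that same call."""
    return num_agents


def llm_calls_supervisor(num_workers: int) -> int:
    """One call per worker, plus a supervisor call before each worker and one to finish."""
    worker_calls = num_workers
    routing_calls = num_workers + 1
    return worker_calls + routing_calls


# Four-agent swarm (triage, researcher, analyst, writer) versus a supervisor
# coordinating the same three specialists.
print(llm_calls_swarm(4), llm_calls_supervisor(3))
```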
Implementation Checklist
- Create handoff() tool factory in pyworkflow_agents/handoffs.py
- Implement SwarmAgent class with handoff support
- Add AGENT_HANDOFF, HANDOFF_CHAIN_START/END event types
- Integrate with PyWorkflow's start_child_workflow() primitive
- Add max_handoffs protection against infinite loops
- Implement handoff chain visualization in event log
- Add context size tracking to prevent explosion
- Create examples in examples/agents/swarm_pattern.py
- Add tests for handoff chains and cycle detection
- Document handoff best practices and anti-patterns
Related Issues
- #154 - Supervisor Agent (Manager + Workers) - Centralized alternative to swarm
- #TBD - Hierarchical Multi-Agent - Can combine swarm within teams
- #TBD - Collaborative Agent - Shared state vs handoff context
References
- OpenAI Swarm GitHub - Original framework (now archived)
- Orchestrating Agents: Routines and Handoffs - Official OpenAI Cookbook
- OpenAI Swarm Framework Guide - Comprehensive implementation guide
- Swarm Explained with Code
- OpenAI Swarm Introduction
- Arize AI: Swarm Experimental Approach