
Agent Pattern: Swarm (Peer Handoffs) #157

@yasha-dev1


Overview

The Swarm pattern is a decentralized multi-agent architecture in which agents hand off control to each other peer-to-peer, without a central supervisor. This proposal cites it as roughly 40% faster than supervisor-based orchestration, because removing the supervisor removes the separate routing step: each agent decides directly which peer should handle the next step.

Originally introduced by OpenAI as the "Swarm" framework (now evolved into the OpenAI Agents SDK), this pattern emphasizes ergonomic, lightweight multi-agent orchestration through explicit handoff functions.

How It Works

  1. Peer-to-Peer Control: Agents directly transfer control to other agents using handoff functions
  2. No Central Router: Each agent independently decides when to hand off and to whom
  3. Context Preservation: Chat history and context flow through handoffs
  4. Explicit Handoffs: Handoff functions are defined as tools that return an Agent object

Control Flow:

User Request → Agent A (Triage)
                    ↓ handoff_to_researcher()
              Agent B (Researcher)
                    ↓ handoff_to_analyst()
              Agent C (Analyst)
                    ↓ handoff_to_writer()
              Agent D (Writer)
                    ↓
              Final Result

Unlike supervisor patterns, there's no central decision-maker - each agent autonomously decides the next peer.
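
The control flow above can be sketched without any LLM or PyWorkflow dependency. This is a minimal illustration, not the proposed API: the `Agent` dataclass, its `step` callable, and `run_swarm` are hypothetical stand-ins, and each "agent" is a plain function that returns its output plus the peer it hands off to (or `None` when it is done).

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Agent:
    """Hypothetical stand-in for an LLM-backed agent."""
    name: str
    # Returns (output, next_agent); next_agent is None when the task is done.
    step: Callable[[str], tuple[str, Optional["Agent"]]]


def run_swarm(start: Agent, request: str, max_handoffs: int = 10) -> tuple[str, list[str]]:
    """Follow peer handoffs until an agent finishes or the limit is hit."""
    agent, payload, path = start, request, [start.name]
    handoffs = 0
    while True:
        payload, next_agent = agent.step(payload)
        if next_agent is None:
            return payload, path  # no handoff: this agent produced the final result
        if handoffs == max_handoffs:
            raise RuntimeError(f"Exceeded max handoffs ({max_handoffs})")
        handoffs += 1
        agent = next_agent
        path.append(agent.name)


# Each agent decides its own successor; there is no central router.
writer = Agent("writer", lambda text: (f"report({text})", None))
analyst = Agent("analyst", lambda text: (f"analysis({text})", writer))
researcher = Agent("researcher", lambda text: (f"notes({text})", analyst))
triage = Agent("triage", lambda text: (text, researcher))

result, path = run_swarm(triage, "Find latest AI research")
print(path)    # ['triage', 'researcher', 'analyst', 'writer']
print(result)  # report(analysis(notes(Find latest AI research)))
```

The loop only follows whatever peer the current agent returns, which is the key difference from a supervisor that picks the next agent itself.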

Reference Implementations

Proposed PyWorkflow Implementation

from pyworkflow_agents import Agent, handoff
from pyworkflow_agents.providers import AnthropicProvider
from pyworkflow import workflow, start_child_workflow

# Define agents with explicit handoff tools.
# web_search_tool, calculator_tool, and file_writer_tool are assumed to be defined elsewhere.
triage_agent = Agent(
    name="triage",
    provider=AnthropicProvider(model="claude-sonnet-4-5-20250929"),
    instructions="You triage customer requests and route to specialists.",
    tools=[
        handoff("researcher", "Hand off to research specialist"),
        handoff("support", "Hand off to customer support"),
    ]
)

researcher_agent = Agent(
    name="researcher",
    provider=AnthropicProvider(model="claude-sonnet-4-5-20250929"),
    instructions="You research topics and find information.",
    tools=[
        web_search_tool,
        handoff("analyst", "Hand off to analyst for data processing"),
    ]
)

analyst_agent = Agent(
    name="analyst",
    provider=AnthropicProvider(model="claude-sonnet-4-5-20250929"),
    instructions="You analyze data and create insights.",
    tools=[
        calculator_tool,
        handoff("writer", "Hand off to writer for documentation"),
    ]
)

writer_agent = Agent(
    name="writer",
    provider=AnthropicProvider(model="claude-sonnet-4-5-20250929"),
    instructions="You write clear, concise documentation.",
    tools=[file_writer_tool]
)

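# Registry used to resolve a handoff target name to its Agent.
# Note: triage can hand off to "support", which is not defined in this example.
AGENTS = {
    a.name: a
    for a in (triage_agent, researcher_agent, analyst_agent, writer_agent)
}

def get_agent_by_name(name: str) -> Agent:
    """Look up a handoff target; raises KeyError for unknown agent names."""
    return AGENTS[name]

# Swarm execution via pyworkflow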
@workflow(durable=True)
async def swarm_workflow(user_request: str):
    """
    Execute swarm-style agent handoffs using child workflows.
    Each handoff is a new workflow execution, creating a chain.
    """
    current_agent = triage_agent
    context = {"request": user_request, "history": []}
    
    max_handoffs = 10  # Upper bound on agent executions; prevents infinite loops
    for _ in range(max_handoffs):
        # Execute current agent as child workflow
        result = await start_child_workflow(
            agent_execution_workflow,
            agent=current_agent,
            context=context,
            wait_for_completion=True
        )
        
        # Check if agent handed off to another agent
        if result.get("handoff_to"):
            context["history"].append({
                "agent": current_agent.name,
                "output": result.get("output")
            })
            current_agent = get_agent_by_name(result["handoff_to"])
        else:
            # No handoff - task complete
            return result
    
    raise RuntimeError(f"Exceeded max handoffs ({max_handoffs})")

@workflow(durable=True)
async def agent_execution_workflow(agent: Agent, context: dict):
    """
    Execute a single agent. If it calls a handoff tool, return the target agent.
    """
    response = await agent.run(
        messages=context.get("history", []),
        input=context.get("request")
    )
    
    # Check if response includes a handoff tool call
    if response.tool_calls:
        for tool_call in response.tool_calls:
            if tool_call.name.startswith("handoff_to_"):
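                # removeprefix() (Python 3.9+) strips only the leading prefix;
                # replace() would also strip it from inside the agent name.
                target_agent = tool_call.name.removeprefix("handoff_to_")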
                return {
                    "handoff_to": target_agent,
                    "output": response.content
                }
    
    # No handoff - return final result
    return {"output": response.content}

Key Mapping to PyWorkflow Primitives:

  • Agent handoff = start_child_workflow() with wait_for_completion=True
  • Handoff chain = Sequence of child workflow executions
  • Handoff history = Recorded as AGENT_HANDOFF events (event emission is not shown in the code above)
  • Context passing = Workflow context passed to each child
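
The workflow code above recognizes a handoff by the `handoff_to_` prefix on the tool name, so the `handoff()` factory has to produce names that follow that convention. A minimal sketch of such a factory, assuming a simple descriptor type: `HandoffTool`, `HANDOFF_PREFIX`, and `handoff_target` are hypothetical names, and the real `pyworkflow_agents` tool type may look different.

```python
from dataclasses import dataclass
from typing import Optional

HANDOFF_PREFIX = "handoff_to_"  # assumed naming convention, taken from the prefix check above


@dataclass(frozen=True)
class HandoffTool:
    """Hypothetical tool descriptor exposed to the model."""
    name: str
    description: str
    target: str


def handoff(target: str, description: str) -> HandoffTool:
    """Build a handoff tool whose name encodes the target agent."""
    return HandoffTool(name=f"{HANDOFF_PREFIX}{target}", description=description, target=target)


def handoff_target(tool_name: str) -> Optional[str]:
    """Return the target agent name for a handoff tool call, else None."""
    if tool_name.startswith(HANDOFF_PREFIX):
        # Slice off only the leading prefix so agent names containing it survive.
        return tool_name[len(HANDOFF_PREFIX):]
    return None


tool = handoff("researcher", "Hand off to research specialist")
print(tool.name)                         # handoff_to_researcher
print(handoff_target(tool.name))         # researcher
print(handoff_target("web_search"))      # None
```

Keeping the prefix in one constant means the factory and the parser cannot drift apart.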

Event Types

New events for swarm pattern:

class EventType(str, Enum):
    # Existing events...
    AGENT_HANDOFF = "agent_handoff"          # Agent hands off to another agent
    HANDOFF_CHAIN_START = "handoff_chain_start"  # Start of handoff chain
    HANDOFF_CHAIN_END = "handoff_chain_end"      # End of handoff chain

Event Data Schema:

# AGENT_HANDOFF
{
    "from_agent": "triage",
    "to_agent": "researcher",
    "reason": "User request requires web research",
    "context_size": 1024,
    "child_run_id": "run_xyz789"
}

# HANDOFF_CHAIN_START
{
    "initial_agent": "triage",
    "user_request": "Find latest AI research",
    "max_handoffs": 10
}

# HANDOFF_CHAIN_END
{
    "final_agent": "writer",
    "total_handoffs": 3,
    "handoff_path": ["triage", "researcher", "analyst", "writer"],
    "execution_time_ms": 12450
}
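
As a rough illustration of how these payloads relate, the sketch below derives the full event sequence from a completed handoff path. It is an assumption-laden example: `chain_events` is a hypothetical helper, events are plain `(type, data)` tuples rather than PyWorkflow event objects, and the `reason`, `context_size`, and `child_run_id` fields are omitted because they cannot be derived from the path alone.

```python
from enum import Enum
from typing import Any


class EventType(str, Enum):
    AGENT_HANDOFF = "agent_handoff"
    HANDOFF_CHAIN_START = "handoff_chain_start"
    HANDOFF_CHAIN_END = "handoff_chain_end"


Event = tuple[EventType, dict[str, Any]]


def chain_events(path: list[str], user_request: str,
                 max_handoffs: int, execution_time_ms: int) -> list[Event]:
    """Build START, one AGENT_HANDOFF per hop, and END for a finished chain."""
    if not path:
        raise ValueError("path must contain at least the initial agent")
    events: list[Event] = [(EventType.HANDOFF_CHAIN_START, {
        "initial_agent": path[0],
        "user_request": user_request,
        "max_handoffs": max_handoffs,
    })]
    # Consecutive pairs in the path are the individual handoffs.
    for src, dst in zip(path, path[1:]):
        events.append((EventType.AGENT_HANDOFF, {"from_agent": src, "to_agent": dst}))
    events.append((EventType.HANDOFF_CHAIN_END, {
        "final_agent": path[-1],
        "total_handoffs": len(path) - 1,
        "handoff_path": list(path),
        "execution_time_ms": execution_time_ms,
    }))
    return events


events = chain_events(["triage", "researcher", "analyst", "writer"],
                      "Find latest AI research", 10, 12450)
for event_type, data in events:
    print(event_type.value, data)
```

Note that `total_handoffs` is always one less than the length of `handoff_path`, which matches the example payload above (4 agents, 3 handoffs).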

Trade-offs

Pros

  • Performance: ~40% faster than supervisor pattern (reported figure, see Performance Comparison) - no central routing overhead
  • Simplicity: No complex orchestration logic, agents make local decisions
  • Autonomy: Each agent independently determines next steps
  • Natural flow: Handoffs mirror human task delegation
  • Event sourcing: Full handoff chain visible in event log
  • Fault tolerance: PyWorkflow's child workflows provide isolation and recovery

Cons

  • No global coordination: Hard to implement complex routing strategies
  • Potential cycles: Agents could hand off in circles (requires max_handoffs limit)
  • Context explosion: Long handoff chains accumulate large context
  • Debugging complexity: Decentralized decisions harder to trace than supervisor routing
  • Child workflow overhead: Each handoff starts a new child workflow run with its own run_id
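
Two of these risks, cycles and context growth, can be mitigated with small guards in the parent workflow. The following is a sketch under stated assumptions: `would_cycle` and `trim_history` are hypothetical helpers, history entries use the `{"agent", "output"}` shape from the workflow code above, and character count is only a crude proxy for token count.

```python
def would_cycle(path: list[str], target: str, max_visits: int = 1) -> bool:
    """True if handing off to `target` would exceed its allowed visit count."""
    return path.count(target) >= max_visits


def trim_history(history: list[dict], max_chars: int) -> list[dict]:
    """Keep the most recent entries whose combined output fits in max_chars."""
    kept: list[dict] = []
    used = 0
    for entry in reversed(history):  # walk from newest to oldest
        size = len(str(entry.get("output", "")))
        if used + size > max_chars:
            break
        kept.append(entry)
        used += size
    kept.reverse()  # restore chronological order
    return kept


path = ["triage", "researcher"]
print(would_cycle(path, "triage"))   # True: triage already ran
print(would_cycle(path, "analyst"))  # False

history = [
    {"agent": "triage", "output": "a" * 5},
    {"agent": "researcher", "output": "b" * 5},
    {"agent": "analyst", "output": "c" * 5},
]
print([e["agent"] for e in trim_history(history, max_chars=10)])  # ['researcher', 'analyst']
```

Setting `max_visits` above 1 would allow legitimate revisits (for example, an analyst asking the researcher for more data) while still bounding loops; dropping old entries outright is the simplest trimming policy, and summarizing them is an alternative.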

When to Use

  • Linear or tree-shaped task delegation (not complex graphs)
  • Performance-critical applications (minimize orchestration overhead)
  • Agents have clear specialization and handoff rules
  • Task flow is relatively predictable

When to Avoid

  • Complex routing logic requiring global view (use supervisor)
  • Need centralized audit trail of all decisions
  • Agents need to consult multiple peers before deciding
  • Risk of circular handoffs is high

Performance Comparison

Based on OpenAI Swarm research:

  • Swarm handoffs: ~40% faster than supervisor routing (reported figure; not yet benchmarked on PyWorkflow)
  • Reason: Eliminates the extra supervisor LLM call otherwise needed for each routing decision
  • Trade-off: Less control over orchestration logic

Implementation Checklist

  • Create handoff() tool factory in pyworkflow_agents/handoffs.py
  • Implement SwarmAgent class with handoff support
  • Add AGENT_HANDOFF, HANDOFF_CHAIN_START/END event types
  • Integrate with PyWorkflow's start_child_workflow() primitive
  • Add max_handoffs protection against infinite loops
  • Implement handoff chain visualization in event log
  • Add context size tracking to prevent explosion
  • Create examples in examples/agents/swarm_pattern.py
  • Add tests for handoff chains and cycle detection
  • Document handoff best practices and anti-patterns

Related Issues

References
