Rework multi-agent orchestrator: intercept SDK task tool to surface sub-agents as PolyPilot sessions #229

@PureWeen

Problem

The OrchestratorReflect mode relies on the orchestrator producing @worker:name text blocks, which PolyPilot's reflect loop parses via regex and dispatches to worker sessions. However, the orchestrator is a full Copilot SDK session with built-in tools (task, grep, edit, etc.), and the model strongly prefers using the SDK's task sub-agent tool over producing text-based @worker: blocks.

This causes:

  1. The orchestrator does all work itself via SDK sub-agents instead of delegating to PolyPilot worker sessions
  2. Worker sessions sit idle with zero events
  3. The reflect loop's ParseTaskAssignments finds no @worker: blocks and stalls
  4. Sub-agent work is invisible in the PolyPilot UI — no progress tracking, no session history
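To make the stall concrete, here is a minimal sketch of the kind of text parsing the reflect loop depends on. The block format (a line starting with `@worker:<name>`, then the task text up to the next `@worker:` line) and the `WorkerBlockParser` name are assumptions for illustration. This is not the actual ParseTaskAssignments implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class WorkerBlockParser
{
    // Assumed format: a line "@worker:<name>", followed by the task text,
    // running until the next "@worker:" line or the end of the response.
    private static readonly Regex BlockPattern = new Regex(
        @"^@worker:(?<name>[\w-]+)[ \t]*\r?\n(?<task>.*?)(?=^@worker:|\z)",
        RegexOptions.Multiline | RegexOptions.Singleline);

    // Returns one (worker, task) pair per block found in the response text.
    public static List<(string Worker, string Task)> Parse(string response)
    {
        var result = new List<(string Worker, string Task)>();
        foreach (Match m in BlockPattern.Matches(response))
        {
            result.Add((m.Groups["name"].Value, m.Groups["task"].Value.Trim()));
        }
        return result;
    }
}
```

When the orchestrator answers by calling the SDK task tool instead of writing such blocks, a parser of this shape returns an empty list and the loop has nothing to dispatch.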

Current workaround (PR #225)

PR #225 rewrote the orchestrator's planning prompt to state forcefully "You are a DISPATCHER ONLY, you do NOT have tools" and removed tool-like language from the RoutingContext. This works, but it fights the model's natural behavior.

Proposed Approach: Intercept SDK task Tool

Instead of fighting the model, embrace its preference for the task tool and intercept it to route through PolyPilot's worker sessions.

Architecture

  1. Disable the SDK's built-in task tool via ExcludedTools = new List<string> { "task" } on the orchestrator's SessionConfig

  2. Register a custom AIFunction (e.g., delegate_to_worker) that:

    • Receives the prompt and agent_type from the model
    • Maps to the appropriate PolyPilot worker session (or creates one dynamically)
    • Calls ExecuteWorkerAsync(workerName, task, originalPrompt, ct) — the same code path the reflect loop already uses
    • Returns the worker's response back to the SDK as the tool result
  3. Workers become visible sessions — each sub-agent delegation shows up as a real PolyPilot session with:

    • Full event stream (tool calls, edits, file changes)
    • Progress indicators in the UI
    • Worktree isolation support
    • Persistent history
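The handler behind a delegate_to_worker tool could look roughly like the sketch below. It is self-contained and uses stand-ins rather than the real SDK types: the `ExecuteWorker` delegate mirrors the ExecuteWorkerAsync(workerName, task, originalPrompt, ct) call shape, while `WorkerDelegationTool`, the agent-type-to-worker map, and the fallback worker are hypothetical names introduced for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Stand-in for ExecuteWorkerAsync(workerName, task, originalPrompt, ct).
public delegate Task<string> ExecuteWorker(
    string workerName, string task, string originalPrompt, CancellationToken ct);

public sealed class WorkerDelegationTool
{
    private readonly ExecuteWorker _executeWorker;
    private readonly IReadOnlyDictionary<string, string> _workersByAgentType;
    private readonly string _fallbackWorker;
    private readonly string _originalPrompt;

    public WorkerDelegationTool(
        ExecuteWorker executeWorker,
        IReadOnlyDictionary<string, string> workersByAgentType,
        string fallbackWorker,
        string originalPrompt)
    {
        _executeWorker = executeWorker;
        _workersByAgentType = workersByAgentType;
        _fallbackWorker = fallbackWorker;
        _originalPrompt = originalPrompt;
    }

    // Hypothetical handler for a "delegate_to_worker" tool call: routes the
    // prompt to a worker and returns the worker's response as the tool result.
    public async Task<string> DelegateAsync(
        string prompt, string agentType, CancellationToken ct = default)
    {
        if (string.IsNullOrWhiteSpace(prompt))
        {
            return "Error: prompt is required.";
        }

        string workerName = ResolveWorker(agentType);
        return await _executeWorker(workerName, prompt, _originalPrompt, ct);
    }

    // Maps an agent type to a worker name; unknown types use the fallback.
    public string ResolveWorker(string agentType)
    {
        if (agentType != null
            && _workersByAgentType.TryGetValue(agentType, out var name))
        {
            return name;
        }
        return _fallbackWorker;
    }
}
```

In PolyPilot, `DelegateAsync` would presumably be wrapped as an AIFunction in the same way ShowImageTool is, added to SessionConfig.Tools, and paired with ExcludedTools containing "task". Whether that combination behaves as hoped is one of the open questions below.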

Key Research Findings

  • SessionConfig.Tools accepts List<AIFunction>; custom tools are added alongside the built-in ones (see the ShowImageTool pattern)
  • SessionConfig.ExcludedTools and SessionConfig.AvailableTools exist in SDK 0.1.26 but aren't used by PolyPilot yet
  • subagent.started events contain toolCallId, agentName, agentDisplayName
  • tool.execution_start for task contains the full prompt, agent_type, and description — all the data needed to route to a worker
  • The model naturally names its sub-agent calls descriptively (e.g., "Implement steering messages", "Challenge steering implementation") — these could become session display names
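As a rough illustration of how the task tool's arguments could drive routing and session naming, the sketch below extracts the three fields listed above. It assumes the arguments arrive as a JSON object with `prompt`, `agent_type`, and `description` string properties. That payload shape, and the `TaskToolCall` and `TaskToolArguments` names, are assumptions, not confirmed SDK details.

```csharp
using System;
using System.Text.Json;

// Hypothetical carrier for the data needed to route one task call.
public sealed record TaskToolCall(string Prompt, string AgentType, string DisplayName);

public static class TaskToolArguments
{
    // Assumed payload shape:
    // {"prompt": "...", "agent_type": "...", "description": "..."}
    public static TaskToolCall Parse(string argumentsJson)
    {
        using var doc = JsonDocument.Parse(argumentsJson);
        JsonElement root = doc.RootElement;

        string prompt = GetString(root, "prompt");
        string agentType = GetString(root, "agent_type");
        string description = GetString(root, "description");

        // Use the model's description as the session display name,
        // falling back to the agent type when no description was given.
        string displayName = description.Length > 0 ? description : agentType;
        return new TaskToolCall(prompt, agentType, displayName);
    }

    // Returns the named string property, or "" if it is missing or not a string.
    private static string GetString(JsonElement obj, string name)
    {
        if (obj.ValueKind == JsonValueKind.Object
            && obj.TryGetProperty(name, out var value)
            && value.ValueKind == JsonValueKind.String)
        {
            return value.GetString() ?? "";
        }
        return "";
    }
}
```

With this shape, a call described as "Implement steering messages" would produce a session with that display name, and the agent type would remain available for choosing a worker.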

Open Questions

  • Does ExcludedTools work for the built-in task tool?
  • Can a custom AIFunction be named task to transparently replace the built-in, or does it need a different name?
  • Should dynamic worker sessions be created on-the-fly, or pre-allocated from the preset?
  • How to handle the explore agent type — map to a lightweight read-only worker?
  • What happens to general-purpose sub-agents that the orchestrator spawns for complex multi-step work?

Benefits

  • Stop fighting the model — let it use tools naturally
  • Full visibility — every sub-agent becomes a trackable session
  • Worktree isolation — workers can have their own git worktrees
  • Persistence — sub-agent work survives app restarts
  • Simplify the reflect loop — no more @worker: text parsing, ParseTaskAssignments regex, or prompt engineering to force text dispatch

Files Involved

  • PolyPilot/Services/CopilotService.cs: SessionConfig.Tools, SessionConfig.ExcludedTools
  • PolyPilot/Services/CopilotService.Organization.cs: ExecuteWorkerAsync, SendViaOrchestratorReflectAsync
  • PolyPilot/Services/ShowImageTool.cs: existing custom AIFunction pattern to follow
  • PolyPilot/Models/ModelCapabilities.cs: preset definitions, RoutingContext
