fix(agent): synthesise delegate_<toolkit> tools after live integrations fetch#1670
Conversation
…ns fetch
The orchestrator's `subagents = [..., { skills = "*" }]` declaration expands
into one `delegate_<toolkit>` tool per connected Composio integration via
`collect_orchestrator_tools(def, reg, &connected_integrations)`. That call
lives inside the synchronous `AgentBuilder`, which has no async runtime
handle to query Composio, so it always passed `&[]`. Result: `delegate_gmail`
(and every other skill-wildcard tool) was never created and the orchestrator
could not delegate to `integrations_agent` for Gmail / Notion / etc.
`fetch_connected_integrations()` runs on turn 1 before the system prompt is
built, but it only updated the prompt-side integration block — it never went
back and synthesised the missing delegation tools.
Adds `Agent::refresh_delegation_tools()` which re-runs the synthesis with the
live `connected_integrations` slice and appends any newly-produced tools to
`self.tools` / `self.tool_specs` / `self.visible_tool_names` /
`self.visible_tool_specs`. Called immediately after
`fetch_connected_integrations` in both the live turn loop and the
`debug-agent-prompts.sh` dumper so both surfaces reflect the real toolkit
set.
📝 WalkthroughWalkthroughAdds Agent::refresh_delegation_tools() and calls it after connected integration discovery (first-turn and debug rendering). Also records a stable agent_definition_id at build time to resolve subagent delegation tools. ChangesAgent Delegation Tool Refresh
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related issues
Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
src/openhuman/agent/harness/session/turn.rs (1)
1287-1314: 💤 Low valueConsider adding trace-level diagnostics for silent early-return branches.
Per coding guidelines, new/changed flows should include logs at branch decisions for development-oriented diagnostics. Several early-return paths (no registry, no subagents, empty synthesis, no new tools) exit silently, making it harder to diagnose why delegation tools aren't appearing.
💡 Suggested trace logs for early exits
let Some(reg) = AgentDefinitionRegistry::global() else { + log::trace!("[agent] refresh_delegation_tools: global registry not available — skipping"); return; }; let Some(def) = reg.get(&self.agent_definition_name) else { log::debug!( "[agent] refresh_delegation_tools: definition '{}' not in registry — skipping", self.agent_definition_name ); return; }; if def.subagents.is_empty() { + log::trace!( + "[agent] refresh_delegation_tools: no subagents declared for '{}' — skipping", + self.agent_definition_name + ); return; } let synthed = collect_orchestrator_tools(def, reg, &self.connected_integrations); if synthed.is_empty() { + log::trace!( + "[agent] refresh_delegation_tools: collect_orchestrator_tools returned empty for '{}'", + self.agent_definition_name + ); return; } let existing: std::collections::HashSet<String> = self.tools.iter().map(|t| t.name().to_string()).collect(); let new_tools: Vec<Box<dyn Tool>> = synthed .into_iter() .filter(|t| !existing.contains(t.name())) .collect(); if new_tools.is_empty() { + log::trace!( + "[agent] refresh_delegation_tools: all synthesised tools already exist for '{}'", + self.agent_definition_name + ); return; }As per coding guidelines: "use
log/tracingatdebugortracelevel for development-oriented diagnostics on new/changed flows, including logs at entry/exit points, branch decisions".🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/openhuman/agent/harness/session/turn.rs` around lines 1287 - 1314, Add trace/debug logging on each early-return branch in refresh_delegation_tools: when AgentDefinitionRegistry::global() returns None, when reg.get(&self.agent_definition_name) returns None (include self.agent_definition_name), when def.subagents.is_empty() is true, when collect_orchestrator_tools(...) yields an empty synthed, and when new_tools is empty; place messages near those checks in the refresh_delegation_tools function so developers can see which branch caused the silent exit (include brief context like registry miss, definition missing, no subagents, no synthesized tools, or no new tools).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@src/openhuman/agent/harness/session/turn.rs`:
- Around line 1287-1314: Add trace/debug logging on each early-return branch in
refresh_delegation_tools: when AgentDefinitionRegistry::global() returns None,
when reg.get(&self.agent_definition_name) returns None (include
self.agent_definition_name), when def.subagents.is_empty() is true, when
collect_orchestrator_tools(...) yields an empty synthed, and when new_tools is
empty; place messages near those checks in the refresh_delegation_tools function
so developers can see which branch caused the silent exit (include brief context
like registry miss, definition missing, no subagents, no synthesized tools, or
no new tools).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: cfe11137-b9d9-43c5-91c6-037f7245d98c
📒 Files selected for processing (2)
src/openhuman/agent/debug/mod.rssrc/openhuman/agent/harness/session/turn.rs
The web channel renames each session's agent_definition_name to
`{target_agent_id}_{short_thread}` for per-thread transcript scoping
(channels/providers/web.rs:1223). That happens after build and before
turn(), so by the time refresh_delegation_tools fires, looking up the
agent by `self.agent_definition_name` misses the registry (which only
has `orchestrator`, not `orchestrator_abc123def456`), the refresh
silently bails, and `delegate_gmail` / `delegate_<toolkit>` tools are
never injected for any real chat session.
Captures the canonical id in a new `agent_definition_id` field at build
time, leaves it untouched by `set_agent_definition_name`, and points
the registry lookup at the canonical id so the rename can no longer
break delegation-tool synthesis.
PR tinyhumansai#1670 wired refresh_delegation_tools into the turn-1 path so the orchestrator finally sees delegate_<toolkit> tools when the agent is first built. But that gate (`if self.history.is_empty()`) only fires once per Agent instance — every subsequent turn falls into the `else` branch that deliberately leaves the tool surface alone for KV-cache prefix preservation. If a user connects (or revokes) a Composio toolkit mid-conversation, the orchestrator's function-calling schema stays frozen on the turn-1 snapshot until the session is restarted and the user is stuck telling the model "i just connected slack, can you check it" and watching it deflect. This patch adds a schema-only refresh that fires on every subsequent turn, gated on a stable hash of the connected-toolkit set. When the set has actually changed since the last reconcile, self.tools and self.tool_specs are rebuilt to match — but the system prompt is explicitly left untouched, so the provider's KV cache prefix stays valid for the rest of the conversation. The schema is sent fresh on every API call anyway, so the model picks up the new tool surface on the very next dispatch. Three layers of change: 1. composio/ops.rs gains cached_active_integrations() — a read-only probe over the existing INTEGRATIONS_CACHE that returns the current connected set without triggering a backend fetch. try_read so a writer (UI poll / event subscriber) never blocks a turn. Plus connected_set_hash() that returns a stable u64 derived from the sorted connected-toolkit slug set; order-independent and description-insensitive so cosmetic catalogue edits don't trigger spurious refreshes. 2. composio/bus.rs decouples cache invalidate + eager warm from the provider-routing hook. Previously the subscriber returned early when get_provider(toolkit) returned None — that meant toolkits without a registered provider (most of the 119 Composio toolkits, e.g. googlecalendar, github) bypassed the eager warm entirely and had to wait for the desktop UI's 5s composio_list_connections diff-poll to invalidate the stale cache. Now wait_for_connection_active + invalidate + fetch_connected_integrations runs for every Created event, and the provider hook is an optional downstream step gated on its own get_provider lookup. Same pattern mirrored in the revoke path (ops.rs::composio_delete_connection) so disconnects propagate equally fast. 3. agent/harness/session/turn.rs adds the per-turn schema-only refresh, plus rewrites refresh_delegation_tools() from PR tinyhumansai#1670's append-only shape into a full reconcile of the synthesised subset. The Agent struct now tracks synthesized_tool_names: HashSet<String> as an explicit mask of what came out of the last collect_orchestrator_tools call, so the reconciler can drop every previously-synthesised entry wholesale and append the fresh set. Stale delegate_<toolkit> tools after a revoke now drop cleanly instead of leaking as zombies. Direct tools (query_memory, cron_add, …) are guaranteed untouched because their names are never in synthesized_tool_names. On Arc::get_mut failure (sub-agent captured a snapshot before turn boundary — shouldn't happen but we defend) the previous synthesized_tool_names is restored so the next refresh retries the full transition cleanly. Validated end-to-end on a desktop dev build (staging-api): * Turn-1 baseline sync (PR tinyhumansai#1670 path, unchanged) * Mid-session connect with provider-having toolkit (slack) * Mid-session connect with no-provider toolkit (googlecalendar, github) — the case the bus decouple unlocks * Mid-session revoke (notion) — schema correctly reflected on next fresh-Agent observation * Three incremental mid-session connects on the same thread without Agent reconstruction — each turn picks up the new toolkit, KV cache stays warm, system prompt never re-rendered Cost in the steady state (no integration change): one INTEGRATIONS_CACHE try_read + one u64 compare. KV cache fully preserved. Backend `list_connections` round-trip from the eager warm fires once per OAuth completion in a background task — same data the UI's 5s poll was going to fetch anyway, just moved ~5s earlier. Follow-up issue to file separately: Composio's action-execution gateway has a 30-60s lag after `connection.status == ACTIVE` before the new token is usable for action calls. The first delegate_<toolkit> dispatch immediately after OAuth often gets `Connection error, try to authenticate` and succeeds on retry. Not addressable here; needs a retry / readiness probe inside integrations_agent. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PR tinyhumansai#1670 wired refresh_delegation_tools into the turn-1 path so the orchestrator finally sees delegate_<toolkit> tools when the agent is first built. But that gate (`if self.history.is_empty()`) only fires once per Agent instance — every subsequent turn falls into the `else` branch that deliberately leaves the tool surface alone for KV-cache prefix preservation. If a user connects (or revokes) a Composio toolkit mid-conversation, the orchestrator's function-calling schema stays frozen on the turn-1 snapshot until the session is restarted and the user is stuck telling the model "i just connected slack, can you check it" and watching it deflect. This patch adds a schema-only refresh that fires on every subsequent turn, gated on a stable hash of the connected-toolkit set. When the set has actually changed since the last reconcile, self.tools and self.tool_specs are rebuilt to match — but the system prompt is explicitly left untouched, so the provider's KV cache prefix stays valid for the rest of the conversation. The schema is sent fresh on every API call anyway, so the model picks up the new tool surface on the very next dispatch. Three layers of change: 1. composio/ops.rs gains cached_active_integrations() — a read-only probe over the existing INTEGRATIONS_CACHE that returns the current connected set without triggering a backend fetch. try_read so a writer (UI poll / event subscriber) never blocks a turn. Plus connected_set_hash() that returns a stable u64 derived from the sorted connected-toolkit slug set; order-independent and description-insensitive so cosmetic catalogue edits don't trigger spurious refreshes. 2. composio/bus.rs decouples cache invalidate + eager warm from the provider-routing hook. Previously the subscriber returned early when get_provider(toolkit) returned None — that meant toolkits without a registered provider (most of the 119 Composio toolkits, e.g. googlecalendar, github) bypassed the eager warm entirely and had to wait for the desktop UI's 5s composio_list_connections diff-poll to invalidate the stale cache. Now wait_for_connection_active + invalidate + fetch_connected_integrations runs for every Created event, and the provider hook is an optional downstream step gated on its own get_provider lookup. Same pattern mirrored in the revoke path (ops.rs::composio_delete_connection) so disconnects propagate equally fast. 3. agent/harness/session/turn.rs adds the per-turn schema-only refresh, plus rewrites refresh_delegation_tools() from PR tinyhumansai#1670's append-only shape into a full reconcile of the synthesised subset. The Agent struct now tracks synthesized_tool_names: HashSet<String> as an explicit mask of what came out of the last collect_orchestrator_tools call, so the reconciler can drop every previously-synthesised entry wholesale and append the fresh set. Stale delegate_<toolkit> tools after a revoke now drop cleanly instead of leaking as zombies. Direct tools (query_memory, cron_add, …) are guaranteed untouched because their names are never in synthesized_tool_names. On Arc::get_mut failure (sub-agent captured a snapshot before turn boundary — shouldn't happen but we defend) the previous synthesized_tool_names is restored so the next refresh retries the full transition cleanly. Validated end-to-end on a desktop dev build (staging-api): * Turn-1 baseline sync (PR tinyhumansai#1670 path, unchanged) * Mid-session connect with provider-having toolkit (slack) * Mid-session connect with no-provider toolkit (googlecalendar, github) — the case the bus decouple unlocks * Mid-session revoke (notion) — schema correctly reflected on next fresh-Agent observation * Three incremental mid-session connects on the same thread without Agent reconstruction — each turn picks up the new toolkit, KV cache stays warm, system prompt never re-rendered Cost in the steady state (no integration change): one INTEGRATIONS_CACHE try_read + one u64 compare. KV cache fully preserved. Backend `list_connections` round-trip from the eager warm fires once per OAuth completion in a background task — same data the UI's 5s poll was going to fetch anyway, just moved ~5s earlier. Follow-up issue to file separately: Composio's action-execution gateway has a 30-60s lag after `connection.status == ACTIVE` before the new token is usable for action calls. The first delegate_<toolkit> dispatch immediately after OAuth often gets `Connection error, try to authenticate` and succeeds on retry. Not addressable here; needs a retry / readiness probe inside integrations_agent. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
integrations_agentfor Gmail / Notion / etc. — previously thedelegate_<toolkit>tools were never created.Agent::refresh_delegation_tools()that re-runscollect_orchestrator_toolswith the liveconnected_integrationsslice afterfetch_connected_integrationsreturns.session::turnand into the debug prompt dumper soscripts/debug-agent-prompts.shreflects the same surface.Problem
The orchestrator's
subagents = [..., { skills = "*" }]declaration expands into onedelegate_<toolkit>tool per connected Composio toolkit viacollect_orchestrator_tools(def, reg, &connected_integrations). That call lives inside the synchronousAgentBuilder(src/openhuman/agent/harness/session/builder.rs:883), which has no async runtime handle to query Composio — so it always passed&[]. Result:delegate_gmail(and every other skill-wildcard tool) was never created and the orchestrator could not delegate Gmail work tointegrations_agent.fetch_connected_integrations()runs on turn 1 before the system prompt is built, but it only updated the prompt-side integration block — it never went back and synthesised the missing delegation tools.Solution
New
Agent::refresh_delegation_tools()insrc/openhuman/agent/harness/session/turn.rs:AgentDefinitionRegistry.collect_orchestrator_toolswith the liveself.connected_integrations.self.toolsand appends the rest, mutating theArc<Vec<...>>viaArc::get_mut(safe on turn 1 — no sub-agent has captured a snapshot yet).self.visible_tool_nameswhen the agent usesToolScope::Named, and rebuildsself.visible_tool_specs.Called immediately after
fetch_connected_integrations()in:Agent::turn(live runtime, turn 1 only).render_via_sessioninside the debug dumper so prompt snapshots match runtime behaviour.Existing
orchestrator_toolsunit tests (6) still pass.Submission Checklist
Arc::get_mutharness that doesn't exist yet — happy to add if reviewers wantImpact
collect_orchestrator_toolscall on turn 1 (cheap — pure data shuffling over the in-memory registry), zero on subsequent turns.subagentsentries and signed-out users hit the no-op branch.Related
tools_arc()before turn 1, theArc::get_mutguard logs and bails — that path can become a hard error once the timing is invariant.AI Authored PR Metadata (required for Codex/Linear PRs)
Linear Issue
Commit & Branch
fix/orchestrator-delegate-toolkit-tools601d4282f8d3ab2e265714d8ec8df28ff2e9e488Validation Run
pnpm --filter openhuman-app format:check— Rust-only changepnpm typecheck— Rust-only changecargo test -p openhuman -- orchestrator_tools(6/6 pass)cargo build --bin openhuman-corecleanValidation Blocked
command:N/Aerror:N/Aimpact:N/ABehavior Changes
delegate_gmail/delegate_notion/ … as first-class function-calling tools when the user has the matching Composio toolkit connected.integrations_agent; previously it would either answer directly with "head to Settings → Connections" or fail to find a tool to call.Parity Contract
subagentsor no connected toolkits hit the no-op branch and produce byte-identical tool lists.Arc::get_mutguard logs and skips if the tools/specs Arcs are already shared, so a sub-agent snapshot taken before refresh time can never see a half-mutated state.Duplicate / Superseded PR Handling
Summary by CodeRabbit