perf(agent): prewarm session integrations before first turn #2454

Merged
M3gA-Mind merged 8 commits into tinyhumansai:main from srikaanthh:feat/agent-spawn-depth-gate on May 22, 2026
Conversation

@srikaanthh

@srikaanthh srikaanthh commented May 21, 2026

Summary

  • prewarm session connected_integrations from the shared Composio cache during from_config_* agent construction
  • synthesize delegation tools against the prewarmed integration view so fresh sessions start with the correct delegate_<toolkit> surface
  • skip the turn-1 integration fetch and delegation-surface rebuild when the builder already had an authoritative cache snapshot
  • carry the runtime Config snapshot on the session agent so mid-session integration-cache probes stop reloading config on the hot path
  • add a regression test for the initialized/hash bookkeeping when integrations are injected onto an agent

Problem

  • Fresh agent sessions were doing avoidable cold-start work inside Agent::turn() before the first provider call.
  • On a new session, the turn path loaded transcript state, fetched connected integrations, rebuilt delegation tools, fetched learned context, and only then froze the system prompt.
  • The integration fetch itself reloaded Config inside the hot path, and the session builder always synthesized delegation tools against an empty integration set, guaranteeing a repair pass on turn 1.
  • That inflated first-token latency for orchestrator-style sessions even when the Composio cache already had a valid integration snapshot.

Solution

  • Reuse composio::cached_active_integrations(config) during session construction to prewarm connected_integrations when the shared cache is already warm.
  • Build delegation tools against that cached integration slice instead of hardcoding &[], then persist the synthesized-tool name set onto the built Agent.
  • Track whether a session's integration view is authoritative with connected_integrations_initialized; turn 1 now only fetches integrations and refreshes delegation tools when the builder could not prewarm the cache.
  • Store the full runtime Config snapshot on the session agent so mid-session cache reads and fallback integration fetches do not call Config::load_or_init() on the hot path.
  • Keep the existing fallback behavior for cold-cache sessions and shared-Arc reconciliation failures so correctness stays unchanged when prewarming is unavailable.
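
The warm-path and cold-path behaviour described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the PR's code: `Integration`, `SessionAgent`, `build_session_agent`, `prepare_first_turn`, and the `fetches` counter are hypothetical stand-ins, and the cache lookup is modelled as a plain `Option` where `None` means a cold cache. Only the field names `connected_integrations` and `connected_integrations_initialized` are taken from the description.

```rust
/// Hypothetical stand-in for one connected integration.
#[derive(Clone, Debug, PartialEq)]
pub struct Integration {
    pub toolkit: String,
}

/// Hypothetical, trimmed-down session agent.
pub struct SessionAgent {
    pub connected_integrations: Vec<Integration>,
    /// True once the integration view is authoritative (prewarmed or fetched).
    pub connected_integrations_initialized: bool,
    /// Names of the synthesized `delegate_<toolkit>` tools.
    pub delegation_tools: Vec<String>,
    /// Counts slow-path fetches so the example can observe them.
    pub fetches: usize,
}

fn synthesize_delegation_tools(integrations: &[Integration]) -> Vec<String> {
    integrations
        .iter()
        .map(|i| format!("delegate_{}", i.toolkit))
        .collect()
}

/// Builder step. `cached` models the shared-cache lookup; `None` is a cold cache.
pub fn build_session_agent(cached: Option<Vec<Integration>>) -> SessionAgent {
    // Record whether a snapshot exists before the Option is consumed.
    let initialized = cached.is_some();
    let integrations = cached.unwrap_or_default();
    // Build the delegation surface against the snapshot, not an empty slice.
    let delegation_tools = synthesize_delegation_tools(&integrations);
    SessionAgent {
        connected_integrations: integrations,
        connected_integrations_initialized: initialized,
        delegation_tools,
        fetches: 0,
    }
}

impl SessionAgent {
    /// Turn-1 preparation: fetch and rebuild only when the builder could not prewarm.
    pub fn prepare_first_turn(&mut self, fetch: impl FnOnce() -> Vec<Integration>) {
        if self.connected_integrations_initialized {
            return; // warm path: nothing to repair
        }
        self.connected_integrations = fetch();
        self.delegation_tools = synthesize_delegation_tools(&self.connected_integrations);
        self.connected_integrations_initialized = true;
        self.fetches += 1;
    }
}
```

In this sketch a warm session never invokes `fetch`, while a cold session performs one fetch and one rebuild on turn 1, which mirrors the fallback the PR says it preserves.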

Submission Checklist

If a section does not apply to this change, mark the item as N/A with a one-line reason. Do not delete items.

  • Tests added or updated (happy path + at least one failure / edge case) per Testing Strategy
  • N/A: diff coverage is enforced by CI; local coverage commands were blocked in this environment (pnpm unavailable on PATH, focused Rust tests blocked by missing cmake).
  • Coverage matrix updated — N/A: behaviour-only change
  • All affected feature IDs from the matrix are listed in the PR description under ## Related
  • No new external network dependencies introduced (mock backend used per Testing Strategy)
  • Manual smoke checklist updated if this touches release-cut surfaces (docs/RELEASE-MANUAL-SMOKE.md)
  • Linked issue closed via Closes #NNN in the ## Related section

Impact

  • Runtime/platform impact: desktop/in-process core agent sessions.
  • Performance: reduces first-turn latency when the Composio cache is already warm by avoiding a redundant integration fetch, avoiding a redundant delegation-tool rebuild, and avoiding Config::load_or_init() on subsequent cache probes.
  • Compatibility: cold-cache sessions preserve the old fallback behavior and still fetch integrations on turn 1 when no prewarmed snapshot exists.
  • Security: no change in privilege or network surface; this only changes when cached integration metadata is reused.

Related

  • Closes:
  • Follow-up PR(s)/TODOs:

AI Authored PR Metadata (required for Codex/Linear PRs)

Keep this section for AI-authored PRs. For human-only PRs, mark each field N/A.

Linear Issue

  • Key: N/A
  • URL: N/A

Commit & Branch

  • Branch: feat/agent-spawn-depth-gate
  • Commit SHA: 44ca700

Validation Run

  • N/A: local environment does not have pnpm on PATH, so this command could not be run here.
  • N/A: local environment does not have pnpm on PATH, so this command could not be run here.
  • N/A: focused Rust tests were attempted, but the build is blocked locally because whisper-rs-sys requires cmake, which is not installed in this environment.
  • Rust fmt/check (if changed): cargo fmt --manifest-path Cargo.toml passed; git diff --check origin/main...HEAD clean.
  • N/A: Tauri shell files were not changed in this PR; a local cargo check --manifest-path app/src-tauri/Cargo.toml attempt was also blocked because the vendored tauri-cef dependency tree is missing in this environment.

Validation Blocked

  • command: GGML_NATIVE=OFF cargo test --manifest-path Cargo.toml set_connected_integrations_marks_session_initialized_and_updates_hash -- --nocapture and GGML_NATIVE=OFF cargo test --manifest-path Cargo.toml turn_without_tools_returns_text -- --nocapture
  • error: whisper-rs-sys build script failed because cmake is not installed in the local environment
  • impact: focused Rust tests did not complete locally; correctness is based on source review plus the added regression coverage

Behavior Changes

  • Intended behavior change: sessions built from a warm Composio cache now start with prewarmed integrations and delegation tools instead of repairing that state inside the first turn
  • User-visible effect: lower first-token latency for fresh orchestrator-style sessions when integration metadata is already cached

Parity Contract

  • Legacy behavior preserved: when the Composio cache is cold or unavailable, turn 1 still fetches integrations and rebuilds the delegation surface before freezing the prompt
  • Guard/fallback/dispatch parity checks: shared-Arc reconciliation fallback, mid-session cache-driven refresh, and config-load fallback behavior remain intact

Duplicate / Superseded PR Handling

  • Duplicate PR(s): none
  • Canonical PR: this PR
  • Resolution (closed/superseded/updated): N/A

Summary by CodeRabbit

  • New Features

    • Enforced sub-agent spawn-depth limit (max 3) with surfaced error on overflow.
    • Sessions now preload and track connected integrations and their runtime config.
    • Connected integrations now include a gated-tools catalogue describing hidden toolkit actions.
  • Tests

    • Added tests for spawn-depth enforcement and reset behavior.
    • Added tests validating integration-initialization state and hash updates.
  • Documentation

    • Marked spawn-depth runtime limiter as implemented in architecture docs.

Review Change Stack

srikaanthh and others added 4 commits May 19, 2026 13:41
Add a task-local depth guard around run_subagent so runtime delegation cannot recurse beyond the documented harness limit.

Co-authored-by: Cursor <cursoragent@cursor.com>

Prevents usize wrap-around that could bypass MAX_SPAWN_DEPTH guard
in release builds. Addresses CodeRabbit review on PR tinyhumansai#2234.

The new with_spawn_depth task-local scope adds a second stacked
task_local wrapper around run_typed_mode. Under cargo-llvm-cov
instrumentation, the combined future state machine overflows tokio's
default 2 MiB per-thread test stack in
turn_dispatches_spawn_subagent_through_full_path. Box::pin the inner
future so the large state machine lives on the heap.
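
The depth guard and the wrap-around fix from these commits can be illustrated with a small synchronous sketch. It is an approximation under stated assumptions: the PR uses a Tokio task-local, whereas this example substitutes `std::thread_local!` so it runs with the standard library alone, and the fields on `SpawnDepthExceeded` plus the closure-based `run_subagent` signature are hypothetical. The names `MAX_SPAWN_DEPTH`, `current_spawn_depth`, `with_spawn_depth`, and `SubagentRunError::SpawnDepthExceeded` come from the PR text.

```rust
use std::cell::Cell;

/// Depth limit named in the PR.
pub const MAX_SPAWN_DEPTH: usize = 3;

thread_local! {
    // Stand-in for the Tokio task-local; 0 means the top-level agent.
    static CURRENT_SPAWN_DEPTH: Cell<usize> = Cell::new(0);
}

pub fn current_spawn_depth() -> usize {
    CURRENT_SPAWN_DEPTH.with(|d| d.get())
}

/// Runs `f` with the depth set to `depth`, restoring the previous value
/// afterwards, even if `f` panics.
pub fn with_spawn_depth<T>(depth: usize, f: impl FnOnce() -> T) -> T {
    struct Restore(usize);
    impl Drop for Restore {
        fn drop(&mut self) {
            CURRENT_SPAWN_DEPTH.with(|d| d.set(self.0));
        }
    }
    let _restore = Restore(CURRENT_SPAWN_DEPTH.with(|d| d.replace(depth)));
    f()
}

#[derive(Debug, PartialEq)]
pub enum SubagentRunError {
    /// Field names are hypothetical; the PR only names the variant.
    SpawnDepthExceeded { attempted: usize, max: usize },
}

pub fn run_subagent<T>(body: impl FnOnce() -> T) -> Result<T, SubagentRunError> {
    // saturating_add: a plain `+ 1` wraps to 0 at usize::MAX in release
    // builds, which would slip past the comparison below.
    let attempted = current_spawn_depth().saturating_add(1);
    if attempted > MAX_SPAWN_DEPTH {
        return Err(SubagentRunError::SpawnDepthExceeded {
            attempted,
            max: MAX_SPAWN_DEPTH,
        });
    }
    Ok(with_spawn_depth(attempted, body))
}
```

Three nested `run_subagent` calls succeed and a fourth is rejected; once the outermost call returns, the depth reads 0 again because the guard restores the previous value on drop.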
@srikaanthh srikaanthh requested a review from a team May 21, 2026 15:04
@coderabbitai

coderabbitai Bot commented May 21, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: ddaf54ab-6776-4bb1-b5c2-3ca6fc882ae6

📥 Commits

Reviewing files that changed from the base of the PR and between 203ade4 and 299db43.

📒 Files selected for processing (10)
  • src/openhuman/agent/agents/integrations_agent/prompt.rs
  • src/openhuman/agent/agents/orchestrator/prompt.rs
  • src/openhuman/agent/agents/welcome/prompt.rs
  • src/openhuman/agent/harness/session/tests.rs
  • src/openhuman/agent/harness/subagent_runner/ops.rs
  • src/openhuman/agent/harness/test_support_test.rs
  • src/openhuman/agent/prompts/types.rs
  • src/openhuman/composio/ops.rs
  • src/openhuman/composio/ops_test.rs
  • src/openhuman/tools/orchestrator_tools.rs
✅ Files skipped from review due to trivial changes (2)
  • src/openhuman/composio/ops_test.rs
  • src/openhuman/agent/harness/test_support_test.rs

📝 Walkthrough

Adds a Tokio task-local spawn-depth budget (MAX_SPAWN_DEPTH = 3) and enforcement that returns SubagentRunError::SpawnDepthExceeded when exceeded; threads a best-effort Composio integration prewarm through AgentBuilder, adds session flags/config to ensure integrations are initialized once per session, and updates related types and tests.

Changes

Spawn-depth limiting and integration prewarming

| Layer | File(s) | Summary |
| --- | --- | --- |
| Spawn-depth context infrastructure | src/openhuman/agent/harness/spawn_depth_context.rs, src/openhuman/agent/harness/mod.rs | Introduces MAX_SPAWN_DEPTH, a Tokio task-local CURRENT_SPAWN_DEPTH, current_spawn_depth() reader, and with_spawn_depth() scoper. Adds unit tests and re-exports from harness mod. |
| Spawn-depth error and enforcement | src/openhuman/agent/harness/subagent_runner/types.rs, src/openhuman/agent/harness/subagent_runner/ops.rs, src/openhuman/agent/harness/subagent_runner/ops_tests.rs | Adds SubagentRunError::SpawnDepthExceeded. run_subagent computes attempted depth, rejects when > MAX_SPAWN_DEPTH, scopes inner run with with_spawn_depth(...) (and sandbox scoping), box-pins the inner future, and logs spawn_depth. Tests cover allowed and rejected depths and task-local reset. |
| Integration prewarm and builder wiring | src/openhuman/agent/harness/session/builder.rs, src/openhuman/agent/harness/mod.rs | Performs a best-effort synchronous Composio prewarm in build_session_agent_inner, passes the prewarmed slice into delegation-tool synthesis, captures synthesized tool names, and stamps prewarmed integrations, initialized flag, runtime config, and integration hash onto the built Agent. |
| Integration runtime usage and setters/tests | src/openhuman/agent/harness/session/turn.rs, src/openhuman/agent/harness/session/runtime.rs, src/openhuman/agent/harness/session/types.rs, src/openhuman/agent/harness/session/tests.rs | Gates turn-1 fetch/refresh by connected_integrations_initialized, prefers cached integration_runtime_config for schema-only refreshes, sets initialized flag after reconcile, fetch_connected_integrations uses cached config when present and marks initialized, set_connected_integrations marks initialized and recomputes hash. Adds unit test validating setter behavior and hash update. |
| Prompts, types, and test fixtures for gated tools | src/openhuman/agent/prompts/types.rs, src/openhuman/composio/ops.rs, src/openhuman/agent/agents/*/prompt.rs, src/openhuman/tools/orchestrator_tools.rs, src/openhuman/composio/ops_test.rs, src/openhuman/agent/harness/test_support_test.rs | Adds GatedIntegrationTool and gated_tools: Vec<GatedIntegrationTool> on ConnectedIntegration, initializes gated_tools in composio and test fixtures, and clones gated_tools into refreshed ConnectedIntegration during runner refresh. Test fixtures updated to set empty vectors. |
| Documentation and coverage notes | gitbooks/developing/architecture/agent-harness.md, src/openhuman/agent/harness/harness_gap_tests.rs | Docs updated to mark runtime spawn-depth limiter as implemented (MAX_SPAWN_DEPTH=3, SpawnDepthExceeded live). Gap-test docs updated to point SpawnDepthExceeded coverage to run_subagent tests. |
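
The setter bookkeeping summarised in the table (mark the session initialized and recompute the integration hash) might look roughly like the sketch below. It is illustrative only: the `connected_integrations_hash` field name, the `integrations_changed` helper, the trimmed `ConnectedIntegration` fields, and the use of `DefaultHasher` are assumptions, since the PR text does not show how the hash is stored or computed.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Trimmed, hypothetical version of the integration record.
#[derive(Clone, Debug, Hash, PartialEq)]
pub struct ConnectedIntegration {
    pub toolkit: String,
    pub connected: bool,
}

#[derive(Default)]
pub struct Agent {
    pub connected_integrations: Vec<ConnectedIntegration>,
    pub connected_integrations_initialized: bool,
    /// Hypothetical field name for the stored integration hash.
    pub connected_integrations_hash: u64,
}

/// Hashes the whole integration list; the hasher choice is an assumption.
pub fn hash_integrations(items: &[ConnectedIntegration]) -> u64 {
    let mut hasher = DefaultHasher::new();
    items.hash(&mut hasher);
    hasher.finish()
}

impl Agent {
    /// Injecting integrations makes the view authoritative and refreshes the
    /// hash, so later cache probes compare against the injected state.
    pub fn set_connected_integrations(&mut self, items: Vec<ConnectedIntegration>) {
        self.connected_integrations_hash = hash_integrations(&items);
        self.connected_integrations = items;
        self.connected_integrations_initialized = true;
    }

    /// Hypothetical mid-session probe: has the cached view drifted?
    pub fn integrations_changed(&self, fresh: &[ConnectedIntegration]) -> bool {
        hash_integrations(fresh) != self.connected_integrations_hash
    }
}
```

One consequence of this design is that injecting an empty list still marks the session initialized, so an explicitly empty integration view would not trigger the turn-1 fetch.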

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Suggested reviewers

  • senamakel

Poem

🐰 I scoped the depth with careful paws,
Three hops high, those are the laws.
Prewarmed tools hum soft and bright,
One session wakes to delegation’s light.
Task-locals hold the stack — a tidy cause.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The PR title 'perf(agent): prewarm session integrations before first turn' accurately summarizes the main change: performance optimization by prewarming session integrations before the first turn to reduce first-token latency. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


@coderabbitai coderabbitai Bot added the agent (Built-in agents, prompts, orchestration, and agent runtime in src/openhuman/agent/) and working (A PR that is being worked on by the team) labels May 21, 2026

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/openhuman/agent/harness/session/builder.rs (1)

1305-1315: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Track only actually inserted synthesized tool names.

synthesized_tool_names is currently derived before collision filtering. If a synthesized name collides with an existing direct tool, it won’t be inserted now, but it will still be marked synthesized and can be removed on the next refresh.

Suggested fix
-        let synthesized_tool_names: std::collections::HashSet<String> = delegation_tools
-            .iter()
-            .map(|t| t.name().to_string())
-            .collect();
         let existing_names: std::collections::HashSet<String> =
             tools.iter().map(|t| t.name().to_string()).collect();
-        tools.extend(
-            delegation_tools
-                .into_iter()
-                .filter(|t| !existing_names.contains(t.name())),
-        );
+        let filtered_delegation_tools: Vec<Box<dyn Tool>> = delegation_tools
+            .into_iter()
+            .filter(|t| !existing_names.contains(t.name()))
+            .collect();
+        let synthesized_tool_names: std::collections::HashSet<String> = filtered_delegation_tools
+            .iter()
+            .map(|t| t.name().to_string())
+            .collect();
+        tools.extend(filtered_delegation_tools);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/openhuman/agent/harness/session/builder.rs` around lines 1305 - 1315,
synthesized_tool_names is computed from delegation_tools before filtering
collisions, so names that were not actually inserted (because they collided with
existing tools) are still marked synthesized; update the logic to compute
synthesized_tool_names only from the delegation_tools that are actually inserted
into tools — for example, apply the same filter used in tools.extend (the
closure that checks !existing_names.contains(t.name())) and collect the names
from that filtered set (or collect inserted tools into a temporary vec and
derive synthesized_tool_names from it) so synthesized_tool_names reflects only
truly added tools.
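
The ordering issue in this finding can be reproduced with plain strings. The following is a minimal sketch, assuming tools are identified by name only; `merge_delegation_tools` and `drop_synthesized` are hypothetical helpers, not functions from the PR. It filters collisions first and derives the synthesized-name mask from what was actually inserted, as the suggested fix does.

```rust
use std::collections::HashSet;

/// Merges delegation tools into the direct tool list and returns
/// (final tool names, mask of names that were actually synthesized).
pub fn merge_delegation_tools(
    direct: &[&str],
    delegation: &[&str],
) -> (Vec<String>, HashSet<String>) {
    let existing: HashSet<&str> = direct.iter().copied().collect();

    // Filter first, so delegation tools that collide with direct tools are dropped.
    let inserted: Vec<String> = delegation
        .iter()
        .copied()
        .filter(|name| !existing.contains(name))
        .map(|name| name.to_string())
        .collect();

    // Derive the mask from the filtered set, not from the full delegation list.
    let synthesized: HashSet<String> = inserted.iter().cloned().collect();

    let mut tools: Vec<String> = direct.iter().map(|name| name.to_string()).collect();
    tools.extend(inserted);
    (tools, synthesized)
}

/// Models the first half of a refresh: remove every tool named in the mask.
pub fn drop_synthesized(tools: &mut Vec<String>, synthesized: &HashSet<String>) {
    tools.retain(|name| !synthesized.contains(name));
}
```

With a direct tool that happens to be named `delegate_gmail`, the mask excludes that name, so a later refresh removes only the tools the builder really added. Had the mask been computed before filtering, the same refresh would have dropped the direct tool as well.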
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/openhuman/agent/harness/session/builder.rs`:
- Around line 1523-1524: The first issue is a use-after-move on
prewarmed_integrations: check prewarmed_integrations.is_some() first (assign
connected_integrations_initialized = prewarmed_integrations.is_some()), then
assign agent.connected_integrations = prewarmed_integrations.unwrap_or_default()
(or clone prior to unwrap) so you don't read after move; update references to
prewarmed_integrations, connected_integrations, and
connected_integrations_initialized accordingly. The second issue is that
self.synthesized_tool_names is populated from delegation_tools before removing
name collisions, causing refresh_delegation_tools to drop direct tools; fix by
first filtering delegation_tools to remove any whose names collide with existing
direct tools, extend tools with that filtered list, and then set
self.synthesized_tool_names to the names of the actually inserted delegation
tools (use the filtered collection) so refresh_delegation_tools uses the correct
drop mask (affecting synthesized_tool_names, delegation_tools, tools, and
refresh_delegation_tools).
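
The use-after-move part of this comment comes down to statement order on an `Option`. A minimal sketch, assuming a trimmed hypothetical `Agent` with only the two fields involved and a hypothetical `stamp_prewarmed` helper:

```rust
/// Hypothetical, trimmed-down agent with only the fields this finding touches.
#[derive(Default)]
pub struct Agent {
    pub connected_integrations: Vec<String>,
    pub connected_integrations_initialized: bool,
}

pub fn stamp_prewarmed(agent: &mut Agent, prewarmed_integrations: Option<Vec<String>>) {
    // Read the flag first: `is_some` only borrows the Option.
    agent.connected_integrations_initialized = prewarmed_integrations.is_some();
    // `unwrap_or_default` consumes the Option, so it has to come last.
    // Swapping these two statements is rejected by the compiler (E0382,
    // borrow of moved value).
    agent.connected_integrations = prewarmed_integrations.unwrap_or_default();
}
```

A cold cache (`None`) leaves the flag false and the list empty, which is the state that triggers the turn-1 fallback fetch.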


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: dedc5c1f-db40-4392-b7d7-c1d181373745

📥 Commits

Reviewing files that changed from the base of the PR and between ec9708a and 44ca700.

📒 Files selected for processing (12)
  • gitbooks/developing/architecture/agent-harness.md
  • src/openhuman/agent/harness/harness_gap_tests.rs
  • src/openhuman/agent/harness/mod.rs
  • src/openhuman/agent/harness/session/builder.rs
  • src/openhuman/agent/harness/session/runtime.rs
  • src/openhuman/agent/harness/session/tests.rs
  • src/openhuman/agent/harness/session/turn.rs
  • src/openhuman/agent/harness/session/types.rs
  • src/openhuman/agent/harness/spawn_depth_context.rs
  • src/openhuman/agent/harness/subagent_runner/ops.rs
  • src/openhuman/agent/harness/subagent_runner/ops_tests.rs
  • src/openhuman/agent/harness/subagent_runner/types.rs

Comment thread src/openhuman/agent/harness/session/builder.rs Outdated
coderabbitai Bot previously approved these changes May 21, 2026

@graycyrus graycyrus left a comment


LGTM — clean, well-structured performance optimisation.

Walkthrough: this PR prewarms session connected_integrations from the shared Composio cache during from_config_* agent construction, synthesises delegation tools against the prewarmed snapshot, and carries the runtime Config on the session agent to avoid Config::load_or_init() on the hot path. Cold-cache sessions preserve fallback behaviour.

| Area | Files | Key changes |
| --- | --- | --- |
| Rust core / session | builder.rs | Prewarm integrations from cache, build delegation tools against cached slice, track synthesized_tool_names post-collision-filter |
| Rust core / session | turn.rs | Gate turn-1 fetch behind connected_integrations_initialized, use stored config for mid-session cache probes |
| Rust core / session | types.rs | New fields: connected_integrations_initialized, integration_runtime_config |
| Rust core / session | runtime.rs | set_connected_integrations now sets initialized flag + updates hash |
| Tests | tests.rs | Regression test for initialized/hash bookkeeping |

CodeRabbit findings (already addressed): use-after-move on prewarmed_integrations ✅, synthesized tool mask alignment ✅ — both confirmed fixed in current diff.

Additional observations:

  • Fallback paths are well-preserved: cold-cache → empty slice → turn-1 fetch → delegation rebuild. No correctness regression.
  • config.clone() for the snapshot is a one-time construction cost, acceptable for the hot-path savings.
  • Good test coverage for the bookkeeping; the prewarm path itself is integration-level (requires warm cache) which is reasonable.

No new findings beyond what CodeRabbit already covered. Moving to approval queue.

@M3gA-Mind M3gA-Mind self-assigned this May 22, 2026
@M3gA-Mind M3gA-Mind merged commit b2ae12a into tinyhumansai:main May 22, 2026
26 checks passed
@M3gA-Mind

nice work @srikaanthh, prewarming the session integrations so turn-1 skips the fetch and delegation rebuild is a really clean win 🔥 always love seeing perf improvements that cut latency on the hot path like this. glad to have it merged, keep it coming!
