
fix: cap session-start/subagent-start hook latency (#221)#271

Merged: rohitg00 merged 1 commit into main from fix/221-hooks-startup-latency on May 10, 2026

Conversation

rohitg00 (Owner) commented May 10, 2026

Summary

  • session-start.ts awaited a POST with a 5000ms timeout and then discarded the response whenever AGENTMEMORY_INJECT_CONTEXT=false (the default since 0.8.10): pure blocking latency on every Claude Code session start.
  • subagent-start.ts carried a // fire and forget comment, but the code awaited a POST with a 2000ms timeout.
  • Under concurrent fan-out (Slack-bot orchestrators, multi-agent harnesses, fanned claude -p jobs) the awaited timeouts stack and create the positive feedback loop the reporter described, which ended with iii-engine being OOM-killed.

Fix

  • session-start: fire-and-forget on the telemetry path (AGENTMEMORY_INJECT_CONTEXT=false). Cap the inject path at 1500ms (down from 5000ms) so a slow server can block the agent for at most 1.5s when stdout is consumed.
  • subagent-start: actually fire-and-forget, matching the existing comment. Cap the request at 800ms (down from 2000ms).
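
For readers who want the shape of the fire-and-forget change, here is a minimal sketch. It is not the code from this PR: `postAndForget`, the `REST_URL` placeholder, and the header are illustrative assumptions. Only the `/agentmemory/observe` path, the `subagent_start` hook type, the 800ms cap, and the `.catch(() => {})` plus `AbortSignal.timeout(...)` combination come from the PR itself.

```typescript
// Hypothetical helper, not the actual hook source.
const TIMEOUT_MS = 800; // cap used for subagent-start in this PR
const REST_URL = "http://127.0.0.1:3111"; // placeholder address (assumption)

/** Start a POST and return at once; timeouts and network errors are swallowed. */
function postAndForget(url: string, body: unknown, timeoutMs: number = TIMEOUT_MS): void {
  fetch(url, {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify(body),
    // Aborts the request once timeoutMs has elapsed.
    signal: AbortSignal.timeout(timeoutMs),
  }).catch(() => {
    // Telemetry is best-effort: a dead or slow server must not fail the hook.
  });
}

const started = Date.now();
postAndForget(`${REST_URL}/agentmemory/observe`, { hookType: "subagent_start" });
// The call itself does not wait for the server.
console.log(`postAndForget returned after ${Date.now() - started}ms`);
```

The call returns immediately, but the process can still stay alive until the request settles or the cap aborts it, which would be consistent with the roughly 0.85s measurements below rather than a near-zero exit.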

Live e2e (black-hole TCP listener: accepts, never replies)

| hook | before | after |
| --- | --- | --- |
| session-start (no inject) | 5.05s | 0.85s |
| session-start (inject=true) | 5.05s | 1.55s |
| subagent-start | 2.05s | 0.87s |
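
The PR does not include the listener used for these numbers. A black-hole listener of this kind could be sketched as follows, assuming Node's built-in `node:net` module. `startBlackHole`, `BlackHole`, and `timeCappedPost` are hypothetical names for illustration only.

```typescript
import { createServer } from "node:net";
import type { AddressInfo, Socket } from "node:net";

/** Handle for a running black-hole listener (hypothetical shape). */
interface BlackHole {
  url: string;
  stop(): Promise<void>;
}

/** Listen on a free local port, accept every connection, and never reply. */
function startBlackHole(): Promise<BlackHole> {
  return new Promise((resolve, reject) => {
    const sockets = new Set<Socket>();
    const server = createServer((socket) => {
      sockets.add(socket);
      socket.on("data", () => {}); // read and discard the request bytes
      socket.on("error", () => {}); // the client aborting is expected
      socket.on("close", () => sockets.delete(socket));
    });
    server.once("error", reject);
    server.listen(0, "127.0.0.1", () => {
      const { port } = server.address() as AddressInfo;
      resolve({
        url: `http://127.0.0.1:${port}`,
        stop: () =>
          new Promise<void>((done) => {
            for (const s of sockets) s.destroy(); // drop any open connections
            server.close(() => done());
          }),
      });
    });
  });
}

/** Time one awaited POST that gives up after capMs. */
async function timeCappedPost(url: string, capMs: number): Promise<number> {
  const started = Date.now();
  await fetch(url, {
    method: "POST",
    body: "{}",
    signal: AbortSignal.timeout(capMs),
  }).catch(() => {}); // the abort surfaces as a rejection; swallow it
  return Date.now() - started;
}
```

Against such a listener, an awaited request should take roughly as long as its cap, since the server never answers. That matches the pattern in the table: each "after" value sits just above the corresponding timeout.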

Notes

  • Bundled artifacts in plugin/scripts/ regenerated via npx tsdown.
  • The shared isSdkChildContext guard is intentionally inlined in each hook source (matches the established pattern: every tsdown hook entry compiles to a single self-contained .mjs).
  • Full suite: 856 passing.

Closes #221.

Summary by CodeRabbit

  • Performance
    • Optimized timeout handling for startup operations with separate thresholds for telemetry versus response-dependent requests.
    • Improved startup responsiveness by making telemetry requests non-blocking.


Two hook scripts blocked Claude Code's startup waiting on REST responses
they didn't actually need:

- `session-start` awaited a 5000ms POST and discarded the response when
  `AGENTMEMORY_INJECT_CONTEXT=false` (the default). Pure latency.
- `subagent-start` had a `// fire and forget` comment but the code
  awaited a 2000ms POST. Pure latency.

Under fan-out (Slack-bot orchestrators, multi-agent harnesses, fanned
`claude -p` jobs) the awaited timeouts stack and feed back into the
engine; the reporter hit a positive feedback loop that OOM-killed
iii-engine.

Fix:

- `session-start` — fire-and-forget when `INJECT_CONTEXT=false`. Cap the
  inject path at 1500ms (down from 5000ms) so a slow server can't block
  the agent indefinitely when stdout is actually consumed.
- `subagent-start` — actually fire-and-forget, matching the existing
  comment. Cap at 800ms.

Verified live against a black-hole TCP listener (accepts, never replies):
- session-start (no inject): 5.05s → 0.85s
- session-start (inject):    5.05s → 1.55s
- subagent-start:            2.05s → 0.87s

Built artifacts in `plugin/scripts/` regenerated via `npx tsdown`.

Closes #221.
vercel Bot commented May 10, 2026

The latest updates on your projects.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| agentmemory | Ready | Preview, Comment | May 10, 2026 6:04pm |

coderabbitai Bot (Contributor) commented May 10, 2026

Walkthrough

The PR refactors session-start and subagent-start hooks to eliminate blocking waits on slow/unreachable REST endpoints when responses are discarded. Separate timeout constants are introduced, request setup is consolidated into reusable variables, and control flow splits into non-blocking telemetry paths (fire-and-forget with 800ms timeout) and awaited context-injection paths (1500ms timeout for session-start).

Changes

Hook Timeout and Fire-and-Forget Refactoring

| Layer / File(s) | Summary |
| --- | --- |
| Timeout Constants (plugin/scripts/session-start.mjs, plugin/scripts/subagent-start.mjs, src/hooks/session-start.ts, src/hooks/subagent-start.ts) | New timeout constants: REGISTER_TIMEOUT_MS (800ms) and INJECT_TIMEOUT_MS (1500ms) for session-start; TIMEOUT_MS (800ms) for subagent-start. These cap request duration and prevent indefinite blocking on slow endpoints. |
| Request Setup and Control Flow Splitting (plugin/scripts/session-start.mjs, plugin/scripts/subagent-start.mjs, src/hooks/session-start.ts, src/hooks/subagent-start.ts) | HTTP request URL and init configuration are extracted into reusable variables. Control flow branches based on INJECT_CONTEXT: fire-and-forget with .catch(() => {}) and AbortSignal.timeout(...) when response is unused, or awaited when context is required. |
| Context Injection Response Handling (plugin/scripts/session-start.mjs) | When INJECT_CONTEXT=true, the session-start request is awaited with the injection timeout. On res.ok, JSON is parsed and result.context is conditionally written to stdout, preserving context propagation. |
| Documentation (src/hooks/session-start.ts, src/hooks/subagent-start.ts) | Header comments clarify that TypeScript hook files are inlined into self-contained .mjs bundles, explaining intentional code duplication and file relationships. |
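
The control-flow split summarized in the table can be sketched roughly as below. This is an illustration under stated assumptions, not the merged source: `sessionStart`, `PostFn`, and `HookResponse` are hypothetical, and the request function and stdout writer are passed in as parameters only so the sketch can be exercised without a server. The constant names and values, the `res.ok` check, and the `result.context` field are taken from the walkthrough.

```typescript
const REGISTER_TIMEOUT_MS = 800; // telemetry path (value from the walkthrough)
const INJECT_TIMEOUT_MS = 1500; // context-injection path (value from the walkthrough)

// Minimal structural types so the sketch does not depend on a specific fetch typing.
interface HookResponse {
  ok: boolean;
  json(): Promise<unknown>;
}
type PostFn = (
  url: string,
  init: { method: string; body: string; signal: AbortSignal },
) => Promise<HookResponse>;

/** Hypothetical session-start flow: fire-and-forget unless context is injected. */
async function sessionStart(
  injectContext: boolean,
  url: string,
  payload: unknown,
  post: PostFn,
  write: (text: string) => void,
): Promise<void> {
  const base = { method: "POST", body: JSON.stringify(payload) };

  if (!injectContext) {
    // Response is unused: start the request, swallow failures, do not await.
    post(url, { ...base, signal: AbortSignal.timeout(REGISTER_TIMEOUT_MS) }).catch(() => {});
    return;
  }

  try {
    // Response is needed: await it, but only up to the injection cap.
    const res = await post(url, { ...base, signal: AbortSignal.timeout(INJECT_TIMEOUT_MS) });
    if (!res.ok) return;
    const result = (await res.json()) as { context?: string };
    if (result.context) write(result.context);
  } catch {
    // Timeout or network failure: continue without injected context.
  }
}
```

In a real hook, `post` would presumably be the global `fetch` and `write` something like `(text) => process.stdout.write(text)`. How the hook actually parses AGENTMEMORY_INJECT_CONTEXT is not shown in this PR page, so that part is left out.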

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes


🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 27.27% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly summarizes the primary change: capping hook latency for session-start and subagent-start, directly matching the main objectives. |
| Linked Issues check | ✅ Passed | All code changes directly address #221's requirements: fire-and-forget telemetry paths, timeout reductions (session-start inject 5000→1500ms, subagent-start 2000→800ms), and bundled script regeneration. |
| Out of Scope Changes check | ✅ Passed | All changes are scoped to the linked issue: refactoring timeout handling and control flow in session-start and subagent-start hooks, plus regenerated plugin/scripts artifacts. |


coderabbitai Bot (Contributor) left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/hooks/subagent-start.ts`:
- Line 51: The code constructs timestamps inline for the request body; capture a
single timestamp once at the start of the subagent-start handler (declare const
timestamp = new Date().toISOString()) and reuse that variable wherever a
timestamp is needed (e.g., the timestamp property in the request body) instead
of calling new Date().toISOString() multiple times.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 6b9d3051-01f9-48bb-bb20-d24a840ba625

📥 Commits

Reviewing files that changed from the base of the PR and between 0322da8 and 77a1d19.

📒 Files selected for processing (4)
  • plugin/scripts/session-start.mjs
  • plugin/scripts/subagent-start.mjs
  • src/hooks/session-start.ts
  • src/hooks/subagent-start.ts

sessionId,
project: data.cwd || process.cwd(),
cwd: data.cwd || process.cwd(),
timestamp: new Date().toISOString(),

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Capture timestamp once at the top of the function.

The timestamp is created inline within the request body. As per coding guidelines, capture timestamps once with new Date().toISOString() and reuse instead of calling Date multiple times.

📝 Proposed fix
   const sessionId = (data.session_id as string) || "unknown";
+  const timestamp = new Date().toISOString();

   fetch(`${REST_URL}/agentmemory/observe`, {
     method: "POST",
     headers: authHeaders(),
     body: JSON.stringify({
       hookType: "subagent_start",
       sessionId,
       project: data.cwd || process.cwd(),
       cwd: data.cwd || process.cwd(),
-      timestamp: new Date().toISOString(),
+      timestamp,
       data: {
         agent_id: data.agent_id,
         agent_type: data.agent_type,
       },
     }),

As per coding guidelines, "Capture timestamps once with new Date().toISOString() and reuse instead of calling Date multiple times".


rohitg00 merged commit 1ff5849 into main on May 10, 2026 (5 checks passed)
rohitg00 mentioned this pull request on May 10, 2026
rohitg00 added a commit that referenced this pull request on May 10, 2026:
Three reliability fixes from #269/#270/#271:

- search/recall surfaces saved memories (closes #265)
- MCP shim proxies full server tool set (closes #234)
- session/subagent hooks no longer block startup (closes #221)

Also fixes packages/mcp version drift — was stuck at 0.9.4 through v0.9.5,
now lockstepped with main.
bunke commented May 12, 2026

Confirming this works in production after upgrading our plugin install to v0.9.9. Verified locally against a black-hole TCP listener (accepts, never replies):

session-start (INJECT_CONTEXT=false):  5.073s → 0.874s
subagent-start:                        2.075s → 0.874s

Matches the PR description. Thanks for the rapid turnaround — this closes the OOM-loop we hit on 2026-05-01 (originally filed as #221).

Linked issue closed by this pull request: session-start / subagent-start hooks block agent startup for up to 5s on slow/unreachable REST