PRD: Copilot SDK Hooks Integration — Migration Avenues & Per-Hook Improvement Plan #151

@diberry

PRD: Copilot SDK Hooks Integration for Squad

Executive Summary

Squad already wraps @github/copilot-sdk (v0.1.32) and has a HookPipeline with PreToolUse/PostToolUse hooks plus 5 built-in governance policies. However, the Copilot SDK exposes 6 lifecycle hooks — Squad only leverages 2 of them for tool governance. The remaining 4 hooks (onSessionStart, onUserPromptSubmitted, onSessionEnd, onErrorOccurred) plus deeper use of the existing 2 represent a significant opportunity to move coordinator logic from fragile prompt templates into reliable, programmatic hooks.

This PRD maps each Copilot SDK hook to concrete Squad improvements, categorizes what's already in the product vs. what requires new work, and provides business value analysis for each.


Migration Avenues: What's In vs. Not In Squad Today

Already In Squad (Leverage Existing)

| Capability | Where It Lives | Status |
|---|---|---|
| HookPipeline class | packages/squad-sdk/src/hooks/index.ts | Production (384 lines) |
| PreToolUseHook interface | hooks/index.ts | Production |
| PostToolUseHook interface | hooks/index.ts | Production |
| File-write guard policy | Built-in policy | Production |
| Shell restriction policy | Built-in policy | Production |
| Ask-user rate limiting | Built-in policy | Production |
| PII scrubbing (post-tool) | Built-in policy | Production |
| Reviewer lockout hook | ReviewerLockoutHook | Production |
| Copilot SDK adapter | packages/squad-sdk/src/adapter/ | Production |
| SquadSessionHooks type | adapter/types.ts | Production |
| Hook governance sample | samples/hook-governance/ | Production |
| Hook docs | docs/reference/tools-and-hooks.md | Production |
| Coordinator accepts hookPipeline | coordinator/index.ts via CoordinatorDeps | Production |

Not In Squad (Requires New Work)

| Capability | Copilot SDK Hook | Gap |
|---|---|---|
| Session start context injection | onSessionStart | Squad passes hooks config but doesn't implement lifecycle logic |
| Prompt interception / rewriting | onUserPromptSubmitted | Not wired; all prompt processing is in coordinator prose |
| Session teardown / cleanup | onSessionEnd | Not wired; cleanup is manual or Scribe-driven |
| Error-driven model fallback | onErrorOccurred | Not wired; fallback chains are prose in squad.agent.md |
| Dynamic policy from decisions.md | onPreToolUse (extended) | Static policies only; no decision-driven enforcement |
| Auto-Scribe on file writes | onPostToolUse (extended) | Scribe is manually spawned by coordinator after batches |
| Agent spawn lifecycle hooks | N/A (Squad-specific) | LifecycleManager.spawnAgent() has no hook points |
| Routing middleware | N/A (Squad-specific) | Router is deterministic, not pluggable |

Partially In Squad (Needs Extension)

| Capability | Current State | What's Missing |
|---|---|---|
| onPreToolUse | 5 static policies | Dynamic policies from team decisions; per-agent workspace boundaries |
| onPostToolUse | PII scrubbing only | Auto-detect .squad/ writes to queue Scribe; audit logging |
| Adapter hook passthrough | Types defined, plumbing exists | No default Squad-level implementations registered |

Hook-by-Hook Improvement Plan

Hook 1: onSessionStart — Auto-Context Loading

What it does: Fires when a Copilot session begins (new or resumed). Can inject additional context.

Current Squad behavior: Every spawn prompt manually instructs agents to "read history.md, read decisions.md, check skills/". This is 200-400 tokens of boilerplate per spawn, and agents sometimes skip reads.

Proposed improvement:

  • Register an onSessionStart hook that automatically resolves and injects:
    • Agent's history.md (last N entries)
    • Current decisions.md snapshot
    • Relevant skills from .squad/skills/ matching the agent's domain
    • identity/now.md (current focus) and identity/wisdom.md (team wisdom)
  • Context is injected programmatically — not as prompt instructions that depend on the agent reading files
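
A minimal sketch of how the injected context could be assembled, assuming the relevant files have already been read into strings. The names `SessionContextSources`, `lastEntries`, and `buildSessionContext` are hypothetical, as is the convention that each history entry starts with a `## ` heading; the actual hook signature is whatever `SquadSessionHooks` and the Copilot SDK define.

```typescript
// Hypothetical input: file contents already read by the caller.
interface SessionContextSources {
  history: string;    // agent's history.md
  decisions: string;  // decisions.md snapshot
  skills: string[];   // matching files from .squad/skills/
  now?: string;       // identity/now.md
  wisdom?: string;    // identity/wisdom.md
}

// Assumption: each history entry begins with a "## " heading line.
function lastEntries(history: string, n: number): string {
  const entries = history
    .split(/^(?=## )/m)
    .map((entry) => entry.trim())
    .filter((entry) => entry.startsWith("## "));
  return n > 0 ? entries.slice(-n).join("\n\n") : "";
}

// Builds one context string; empty sources are skipped entirely.
function buildSessionContext(src: SessionContextSources, historyEntries = 3): string {
  const sections: Array<[string, string]> = [
    ["Recent history", lastEntries(src.history, historyEntries)],
    ["Team decisions", src.decisions],
    ["Skills", src.skills.join("\n\n")],
    ["Current focus", src.now ?? ""],
    ["Team wisdom", src.wisdom ?? ""],
  ];
  return sections
    .filter(([, body]) => body.trim() !== "")
    .map(([title, body]) => `# ${title}\n${body.trim()}`)
    .join("\n\n");
}
```

The hook would return this string as the session's additional context, so the spawn prompt no longer needs the "read history.md, read decisions.md" boilerplate.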

Implementation scope:

  • Extend SquadSessionHooks in adapter/types.ts
  • Create packages/squad-sdk/src/hooks/session-start.ts
  • Wire into LifecycleManager.spawnAgent() session creation
  • Make context sources configurable via squad.config.ts

Executive Summary — Business Value:

For Squad product: Reduces coordinator prompt template complexity by ~30%. Makes context loading reliable instead of advisory. Eliminates a class of bugs where agents "forget" to read decisions or history.

For Squad customers: Agents start faster and with better context. Users no longer need to debug why an agent missed a team decision. Token savings of 200-400 tokens per spawn (compounding across multi-agent fan-outs).


Hook 2: onUserPromptSubmitted — Directive Detection and Prompt Enhancement

What it does: Fires every time a user sends a message. Can rewrite, filter, or augment the prompt.

Current Squad behavior: The coordinator pattern-matches directive signals ("always", "never", "from now on") in prose logic within squad.agent.md. Directives are captured to decisions/inbox/ manually. No prompt rewriting occurs.

Proposed improvement:

  • Register an onUserPromptSubmitted hook that:
    1. Scans for directive signals and auto-writes to decisions/inbox/ before LLM processing
    2. Injects relevant team context (recent decisions, active ceremonies) as additional context
    3. Applies team-specific prompt conventions (e.g., "always include error handling" from decisions.md)
    4. Detects agent-addressed messages ("Ripley, fix X") and adds routing hints
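
Steps 1 and 4 above could be sketched roughly as follows. The signal words come from this PRD, but the regular expressions, the `PromptAnalysis` shape, and `analyzePrompt` are illustrative assumptions, not existing Squad code.

```typescript
// Signal phrases quoted in this PRD; the patterns themselves are illustrative.
const DIRECTIVE_SIGNALS: RegExp[] = [/\balways\b/i, /\bnever\b/i, /\bfrom now on\b/i];

interface PromptAnalysis {
  isDirective: boolean;
  addressedTo?: string; // agent name when the prompt starts with "Name, ..."
}

function analyzePrompt(prompt: string, agentNames: string[]): PromptAnalysis {
  const isDirective = DIRECTIVE_SIGNALS.some((re) => re.test(prompt));
  // A leading word followed directly by a comma is treated as an address.
  const match = /^\s*([A-Za-z][\w-]*),/.exec(prompt);
  const candidate = match?.[1]?.toLowerCase();
  const addressedTo = agentNames.find((name) => name.toLowerCase() === candidate);
  return addressedTo !== undefined ? { isDirective, addressedTo } : { isDirective };
}
```

A hook built on this would write the prompt to decisions/inbox/ when `isDirective` is true and attach a routing hint when `addressedTo` is set. Plain keyword matching will produce false positives (for example "I never said that"), which is why the implementation scope pairs the regex with heuristics.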

Implementation scope:

  • Create packages/squad-sdk/src/hooks/prompt-submitted.ts
  • Directive pattern matcher (regex + heuristics)
  • Context injector reading from decisions.md
  • Wire into session creation

Executive Summary — Business Value:

For Squad product: Moves directive capture from fragile coordinator prose into deterministic code. Enables prompt-level governance (e.g., enterprise teams can enforce "all prompts must include security context"). Opens the door to prompt analytics.

For Squad customers: Directives are never missed — every "always do X" is captured automatically. Team conventions are enforced at the prompt level, not just documented. Reduces cognitive load on the coordinator, freeing context for actual work.


Hook 3: onPreToolUse (Extended) — Dynamic Policy Enforcement

What it does: Fires before every tool execution. Can allow, deny, or modify the call.

Current Squad behavior: 5 static policies in the HookPipeline (file-write guard, shell restriction, ask-user rate limiting, PII scrubbing, reviewer lockout), of which PII scrubbing runs post-tool. Policies are defined in code, not configurable from team state.

Proposed improvement:

  • Add decision-driven policies: parse decisions.md for enforceable rules (e.g., "never use jQuery" blocks tool calls that write jQuery imports)
  • Add per-agent workspace boundaries: each agent can only write to files matching their charter scope
  • Add ceremony gates: block certain tool categories during active ceremonies (e.g., no file writes during a design review)
  • Make policy configuration declarative via squad.config.ts or .squad/policies/
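
As a rough illustration of the decision-driven case, the sketch below extracts "never use X" rules and checks a pending write against them. It assumes decisions are phrased that way in decisions.md and uses naive substring matching; `BannedTermRule`, `parseBannedTerms`, and `checkWrite` are hypothetical names.

```typescript
interface BannedTermRule {
  term: string;
  source: string; // the decision line the rule came from
}

type PolicyResult = { allow: true } | { allow: false; reason: string };

// Assumption: enforceable rules are written as "never use <term>".
function parseBannedTerms(decisionsMd: string): BannedTermRule[] {
  const rules: BannedTermRule[] = [];
  for (const line of decisionsMd.split("\n")) {
    const match = /\bnever use\s+([\w.@/-]+)/i.exec(line);
    const term = (match?.[1] ?? "").replace(/\.$/, ""); // drop sentence-ending period
    if (term !== "") rules.push({ term, source: line.trim() });
  }
  return rules;
}

// Denies a write whose content mentions a banned term (case-insensitive).
function checkWrite(content: string, rules: BannedTermRule[]): PolicyResult {
  const lower = content.toLowerCase();
  const hit = rules.find((rule) => lower.includes(rule.term.toLowerCase()));
  return hit !== undefined
    ? { allow: false, reason: `Blocked by decision: "${hit.source}"` }
    : { allow: true };
}
```

Substring matching is the simplest thing that works for the jQuery example; it would also flag comments or unrelated identifiers containing the term, so a production parser would likely need something more precise.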

Implementation scope:

  • Create packages/squad-sdk/src/hooks/dynamic-policies.ts
  • Decision parser that extracts enforceable rules from decisions.md
  • Per-agent scope resolver using charter metadata
  • Policy config schema in squad.config.ts
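
One possible shape for the per-agent scope resolver, assuming charter scope can be expressed as a list of glob patterns per agent. That representation, along with `globToRegExp` and `canWrite`, is an assumption for illustration; the glob support here is deliberately minimal (`*` within a path segment, `**` across segments).

```typescript
// Converts a simple glob to an anchored RegExp.
function globToRegExp(glob: string): RegExp {
  const source = glob
    .split("**")
    .map((part) =>
      part
        .replace(/[.+?^${}()|[\]\\]/g, "\\$&") // escape regex metacharacters
        .replace(/\*/g, "[^/]*"),              // "*" stays inside one segment
    )
    .join(".*");                               // "**" may cross segments
  return new RegExp(`^${source}$`);
}

// Hypothetical scope map: agent name -> globs taken from charter metadata.
function canWrite(agent: string, path: string, scopes: Record<string, string[]>): boolean {
  const globs = scopes[agent] ?? [];
  return globs.some((glob) => globToRegExp(glob).test(path));
}
```

An agent with no entry in the scope map is denied by default, which seems the safer choice for a governance policy.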

Executive Summary — Business Value:

For Squad product: Transforms decisions.md from a passive document into an active enforcement layer. This is a unique differentiator — no other AI team framework auto-enforces team decisions at the tool level.

For Squad customers: Team decisions actually stick. When the team decides "use Tailwind, not Bootstrap", agents physically cannot introduce Bootstrap. Enterprise teams get auditable policy enforcement without custom tooling.


Hook 4: onPostToolUse (Extended) — Auto-Scribe and Audit Trail

What it does: Fires after every tool execution. Can transform results, redact data, or trigger side effects.

Current Squad behavior: PII scrubbing only. Scribe is manually spawned by the coordinator after work batches complete. No automatic audit trail of tool usage.

Proposed improvement:

  • Auto-Scribe trigger: detect when agents write to .squad/ files and automatically queue Scribe merge/log work
  • Audit logging: log every tool call (name, args summary, success/fail, duration) to .squad/orchestration-log/
  • Cross-agent notification: when Agent A writes a file that Agent B's charter covers, flag it
  • Result caching: cache expensive tool results (e.g., large file reads) for reuse within the session
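
The Auto-Scribe trigger might look something like the sketch below: a post-tool handler records successful writes under .squad/ and a later step drains the queue to hand Scribe its work. `ToolEvent`, `WRITE_TOOLS` (including the tool names in it), and `ScribeQueue` are hypothetical.

```typescript
interface ToolEvent {
  toolName: string;
  filePath?: string;
  success: boolean;
}

// Hypothetical tool names; substitute whatever the runtime actually reports.
const WRITE_TOOLS: ReadonlySet<string> = new Set(["edit", "create"]);

// Returns the normalized path when the event is a successful write under .squad/.
function squadStatePath(event: ToolEvent): string | undefined {
  if (!event.success || event.filePath === undefined) return undefined;
  if (!WRITE_TOOLS.has(event.toolName)) return undefined;
  const path = event.filePath.replace(/\\/g, "/");
  return path.startsWith(".squad/") || path.includes("/.squad/") ? path : undefined;
}

class ScribeQueue {
  private readonly pending = new Set<string>();

  record(event: ToolEvent): void {
    const path = squadStatePath(event);
    if (path !== undefined) this.pending.add(path);
  }

  // Returns the queued paths once and resets the queue.
  drain(): string[] {
    const paths = Array.from(this.pending);
    this.pending.clear();
    return paths;
  }
}
```

Using a set means repeated writes to the same file queue only one Scribe task.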

Implementation scope:

  • Create packages/squad-sdk/src/hooks/post-tool-audit.ts
  • Scribe auto-trigger detection (file path pattern matching)
  • Audit log writer (append-only, structured JSON)
  • Cross-agent scope overlap detector
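
For the audit log writer, one option is JSON Lines: one JSON object per line, appended to a file under .squad/orchestration-log/. The sketch below only formats the line and leaves the file append to the caller. The `AuditEntry` fields mirror the ones listed above (name, args summary, success/fail, duration); the field names and the 80-character summary limit are assumptions.

```typescript
interface AuditEntry {
  timestamp: string; // ISO 8601, time the call finished
  agent: string;
  tool: string;
  argsSummary: string;
  success: boolean;
  durationMs: number;
}

// Truncates serialized args so large payloads do not bloat the log.
function summarizeArgs(args: Record<string, unknown>, maxLen = 80): string {
  const text = JSON.stringify(args);
  return text.length <= maxLen ? text : `${text.slice(0, maxLen - 3)}...`;
}

// Produces one newline-terminated JSON line, ready to append to a log file.
function toAuditLine(
  agent: string,
  tool: string,
  args: Record<string, unknown>,
  success: boolean,
  startedAtMs: number,
  finishedAtMs: number,
): string {
  const entry: AuditEntry = {
    timestamp: new Date(finishedAtMs).toISOString(),
    agent,
    tool,
    argsSummary: summarizeArgs(args),
    success,
    durationMs: finishedAtMs - startedAtMs,
  };
  return `${JSON.stringify(entry)}\n`;
}
```

Because each line is self-contained, an interrupted write damages at most the final entry, which fits the append-only requirement.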

Executive Summary — Business Value:

For Squad product: Eliminates the coordinator's manual Scribe spawning — one of the most error-prone parts of the post-work flow. Makes "silent success detection" obsolete because every tool call is logged.

For Squad customers: Complete audit trail of everything agents did, without any configuration. Enterprise compliance teams get the logging they need. Auto-Scribe means decisions and logs are always current, even if the coordinator context overflows.


Hook 5: onSessionEnd — Automatic Cleanup and Metrics

What it does: Fires when a session ends.

Current Squad behavior: No automatic cleanup. History summarization, log archival, and metrics are ad-hoc (Scribe does it when told, or it doesn't happen).

Proposed improvement:

  • Auto-archive: if history.md exceeds threshold, summarize and archive on session end
  • Metrics capture: tokens used, tools called, files modified, decisions made — write to .squad/metrics/
  • Ralph trigger: if work items remain, notify Ralph or write to .squad/identity/now.md
  • Session summary: auto-generate a brief session recap for .squad/log/
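
A sketch of the metrics capture and the archive check, assuming session state is available as a flat list of events. `SessionEvent`, `SessionMetrics`, `collectMetrics`, and `needsArchive` are hypothetical, and the threshold is expressed in characters only because this PRD does not specify a unit.

```typescript
interface SessionEvent {
  kind: "tool" | "decision";
  tokens: number;
  modifiedFile?: string;
}

interface SessionMetrics {
  tokensUsed: number;
  toolsCalled: number;
  filesModified: number; // distinct files, not write count
  decisionsMade: number;
}

function collectMetrics(events: readonly SessionEvent[]): SessionMetrics {
  const files = new Set<string>();
  const totals = { tokensUsed: 0, toolsCalled: 0, decisionsMade: 0 };
  for (const event of events) {
    totals.tokensUsed += event.tokens;
    if (event.kind === "tool") totals.toolsCalled += 1;
    if (event.kind === "decision") totals.decisionsMade += 1;
    if (event.modifiedFile !== undefined) files.add(event.modifiedFile);
  }
  return { ...totals, filesModified: files.size };
}

// True when history.md has grown past the configured size.
function needsArchive(historyMd: string, maxChars: number): boolean {
  return historyMd.length > maxChars;
}
```

The session-end hook would serialize the `SessionMetrics` object into .squad/metrics/ and invoke the summarizer only when `needsArchive` returns true.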

Implementation scope:

  • Create packages/squad-sdk/src/hooks/session-end.ts
  • History size checker + summarizer
  • Metrics collector (reads from session state)
  • Ralph integration (writes to identity/now.md)

Executive Summary — Business Value:

For Squad product: Solves the "history.md grows forever" problem permanently. Makes metrics collection automatic instead of opt-in. Ensures every session leaves a clean state for the next one.

For Squad customers: No more manual cleanup. Session metrics enable cost tracking and optimization. Teams can see exactly what happened in every session without digging through logs.


Hook 6: onErrorOccurred — Programmatic Model Fallback

What it does: Fires when any error occurs during the session.

Current Squad behavior: Model fallback chains are documented in squad.agent.md as prose instructions. The coordinator must interpret and execute them manually. Rate limits, model unavailability, and context overflow are handled ad-hoc.

Proposed improvement:

  • Model fallback as code: define fallback chains (Premium to Standard to Fast to nuclear) in squad.config.ts and execute them automatically
  • Rate limit backoff: detect rate limit errors and apply automatic exponential backoff + retry
  • Context overflow recovery: detect context overflow and automatically summarize and retry with compressed context
  • Error alerting: notify the user or log when errors exceed thresholds
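
A fallback chain resolver can be quite small. The sketch below assumes the chain is an ordered list of names read from squad.config.ts; the tier names are taken from the bullet above, and what each one maps to (including "nuclear") is left to configuration. `nextModel`, `onModelError`, and `FallbackDecision` are hypothetical.

```typescript
type FallbackDecision =
  | { action: "retry"; model: string }
  | { action: "abort"; reason: string };

// Returns the entry after `current`, or the head of the chain when `current` is unknown.
function nextModel(chain: readonly string[], current: string): string | undefined {
  const index = chain.indexOf(current);
  return index === -1 ? chain[0] : chain[index + 1];
}

// Decides what an error handler should do after `current` fails.
function onModelError(chain: readonly string[], current: string): FallbackDecision {
  const model = nextModel(chain, current);
  return model !== undefined
    ? { action: "retry", model }
    : { action: "abort", reason: `No fallback left after "${current}"` };
}
```

With a chain of `["premium", "standard", "fast", "nuclear"]`, a failure on "premium" yields a retry on "standard", and a failure on "nuclear" yields an abort because nothing follows it.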

Implementation scope:

  • Create packages/squad-sdk/src/hooks/error-recovery.ts
  • Model fallback chain resolver (reads from config)
  • Rate limit detector + backoff logic
  • Context overflow detector + summarization trigger
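
For the rate limit piece, a hedged sketch of detection plus exponential backoff with jitter. It assumes the error surfaces either an HTTP status (429 is the standard "Too Many Requests" code) or a message mentioning a rate limit; how the Copilot SDK actually reports these errors is not confirmed here. `ErrorLike`, `isRateLimitError`, and `backoffDelayMs` are hypothetical, as are the default delays.

```typescript
interface ErrorLike {
  status?: number;
  message?: string;
}

function isRateLimitError(err: ErrorLike): boolean {
  return err.status === 429 || /rate limit/i.test(err.message ?? "");
}

// Exponential backoff with "equal jitter": half the capped delay is fixed,
// the other half is random, so concurrent agents do not retry in lockstep.
function backoffDelayMs(
  attempt: number, // 0 for the first retry
  baseMs = 500,
  maxMs = 30000,
  rand: () => number = Math.random,
): number {
  const capped = Math.min(maxMs, baseMs * Math.pow(2, attempt));
  return Math.floor(capped / 2 + rand() * (capped / 2));
}
```

Injecting `rand` keeps the function testable; callers would normally leave it at the default.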

Executive Summary — Business Value:

For Squad product: Moves the most failure-prone coordinator logic (model fallback, rate limit handling) from prose instructions into deterministic code. Eliminates ~10-15% of session failures caused by the coordinator mishandling fallbacks.

For Squad customers: Sessions are more resilient. Model unavailability is invisible — the system automatically falls back. Rate limits don't crash sessions. Enterprise teams with model restrictions get automatic routing without coordinator confusion.


Hook 7: Agent Spawn Lifecycle (Squad-Specific — Not in Copilot SDK)

What it does: Adds new hook points to LifecycleManager.spawnAgent(): onBeforeSpawn, onAfterSpawn, and onSpawnError.

Current Squad behavior: LifecycleManager.spawnAgent() is opaque — no extension points. All spawn customization must happen in the coordinator's prompt template.

Proposed improvement:

  • onBeforeSpawn: validate agent config, check resource limits, apply team policies before creating a session
  • onAfterSpawn: register the agent in a session registry, notify other agents, start monitoring
  • onSpawnError: retry with different model, alert user, log failure details
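
The three hook points could be wired roughly as follows. This is a synchronous sketch for brevity (a real spawn path is presumably asynchronous), and `SpawnRequest`, the `SpawnLifecycleHooks` shape, and `spawnWithHooks` are assumptions, not the existing LifecycleManager API.

```typescript
interface SpawnRequest {
  agentName: string;
  model: string;
}

interface SpawnLifecycleHooks<S> {
  // Throw to veto the spawn (e.g. a resource limit was hit).
  onBeforeSpawn?: (req: SpawnRequest) => void;
  onAfterSpawn?: (req: SpawnRequest, session: S) => void;
  // Return a modified request to retry once, or undefined to give up.
  onSpawnError?: (req: SpawnRequest, error: unknown) => SpawnRequest | undefined;
}

function spawnWithHooks<S>(
  req: SpawnRequest,
  create: (req: SpawnRequest) => S,
  hooks: SpawnLifecycleHooks<S> = {},
): S {
  hooks.onBeforeSpawn?.(req);

  // Only failures from `create` reach onSpawnError; hook errors propagate as-is.
  const attempt = (): [SpawnRequest, S] => {
    try {
      return [req, create(req)];
    } catch (error) {
      const retry = hooks.onSpawnError?.(req, error);
      if (retry === undefined) throw error;
      return [retry, create(retry)];
    }
  };

  const [used, session] = attempt();
  hooks.onAfterSpawn?.(used, session);
  return session;
}
```

A policy such as "max 5 concurrent agents" would live in `onBeforeSpawn`, while "retry with different model" is an `onSpawnError` that returns the same request with another `model`.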

Implementation scope:

  • Extend LifecycleManager in packages/squad-sdk/src/agents/lifecycle.ts
  • Define SpawnLifecycleHooks interface
  • Wire into coordinator's dispatch flow

Executive Summary — Business Value:

For Squad product: Makes agent spawning extensible for the first time. Plugin authors can customize spawn behavior without forking Squad. Enables features like agent resource quotas, spawn rate limiting, and A/B testing of agent configurations.

For Squad customers: Enterprises can enforce spawn policies (e.g., max 5 concurrent agents, only premium models for security reviews). Plugin marketplace can offer spawn-time customizations.


Implementation Priority

| Priority | Hook | Effort | Impact | Rationale |
|---|---|---|---|---|
| P0 | onErrorOccurred (model fallback) | Medium | High | Directly reduces session failures; most impactful reliability win |
| P0 | onSessionStart (auto-context) | Medium | High | Token savings compound across every spawn; eliminates context-loading bugs |
| P1 | onPostToolUse (auto-Scribe) | Medium | High | Removes manual Scribe spawning; simplifies coordinator significantly |
| P1 | onPreToolUse (dynamic policies) | High | High | Unique differentiator but complex to parse decisions into enforceable rules |
| P2 | onSessionEnd (cleanup/metrics) | Low | Medium | Quality-of-life; prevents state rot over time |
| P2 | onUserPromptSubmitted (directives) | Medium | Medium | Nice-to-have; current prose-based detection works adequately |
| P3 | Spawn lifecycle hooks | High | Medium | Foundational for plugin ecosystem but not blocking current features |

Success Metrics

  • Session failure rate drops from ~10% to less than 3% (model fallback + error recovery)
  • Tokens per spawn reduced by 200-400 (auto-context replaces prompt boilerplate)
  • Scribe coverage reaches 100% of sessions (auto-trigger vs manual)
  • Decision enforcement goes from 0% (advisory) to 100% (programmatic)
  • Customer satisfaction: zero "my agent forgot the team decision" reports

This PRD was generated from an investigation of Squad's current hook system, Copilot SDK documentation, and gap analysis. See the Copilot SDK hooks docs for SDK reference.

Labels: go:needs-research (Needs investigation); squad (Squad triage inbox, Lead will assign to a member); squad:fido (Assigned to FIDO, Quality Owner)