PRD: Copilot SDK Hooks Lifecycle Plugin for Squad
Executive Summary
This PRD proposes a Squad plugin (not a core product change) that implements Copilot SDK lifecycle hooks as a composable, installable package. Unlike PRD #151, which embeds hooks into Squad's core, this approach treats the hooks lifecycle as an opt-in plugin distributed via the Squad marketplace.
The plugin registers onSessionStart, onUserPromptSubmitted, onPreToolUse, onPostToolUse, onSessionEnd, and onErrorOccurred handlers using Squad's existing HookPipeline and SquadSessionHooks APIs. Users install it, configure what they want, and Squad's plugin system activates the hooks at runtime.
Why a plugin instead of core?
- Not every team needs every hook — a plugin lets teams opt into exactly what they want
- Faster iteration — plugin ships independently of Squad's release cycle
- Marketplace showcase — demonstrates that Squad's extension system can handle real governance
- Lower risk — doesn't touch Squad's critical path; users can uninstall if hooks misbehave
- Composable — teams can fork the plugin and customize hooks for their domain
How This Fits Squad's Extension Architecture
Squad has a three-layer architecture for extensibility:
| Layer | What Lives Here | This Plugin's Relationship |
|---|---|---|
| Squad Core | HookPipeline, SquadSessionHooks, adapter layer | Plugin consumes these APIs; does NOT modify them |
| Squad Extensions (plugins) | Skills, ceremonies, hook packages | This plugin lives here; installed via marketplace |
| Team Config | squad.config.ts, .squad/policies/ | Plugin reads config to know which hooks to activate |
The plugin uses publicly exported APIs:
- HookPipeline from @bradygaster/squad-sdk/hooks (addPreToolHook, addPostToolHook)
- SquadSessionHooks from adapter/types.ts (onSessionStart, onSessionEnd, onErrorOccurred, onUserPromptSubmitted)
- SquadSessionConfig for session-level hook registration
No core changes required — the plugin works with Squad as it ships today.
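To make the "consume, don't modify" relationship concrete, here is a minimal sketch of a hook package attaching a guard to a pre-tool pipeline. MiniPipeline is a local stand-in written for this example, not the real HookPipeline; only the addPreToolHook method name comes from this PRD, and its signature, the context shape, and the 'shell' tool name are assumptions.

```typescript
// Local stand-in for illustration only; the real HookPipeline signatures may differ.
type PreToolDecision = { action: 'allow' } | { action: 'block'; reason: string };

interface PreToolContext {
  agentName: string;
  toolName: string;
  arguments: Record<string, unknown>;
}

type PreToolHook = (ctx: PreToolContext) => PreToolDecision;

class MiniPipeline {
  private preHooks: PreToolHook[] = [];

  addPreToolHook(hook: PreToolHook): void {
    this.preHooks.push(hook);
  }

  // Run hooks in registration order; the first 'block' short-circuits.
  runPreTool(ctx: PreToolContext): PreToolDecision {
    for (let i = 0; i < this.preHooks.length; i++) {
      const decision = this.preHooks[i](ctx);
      if (decision.action === 'block') return decision;
    }
    return { action: 'allow' };
  }
}

// Hypothetical plugin-side wiring: register a hook only when config enables it.
function attachToolGuard(pipeline: MiniPipeline, enabled: boolean): void {
  if (!enabled) return;
  pipeline.addPreToolHook((ctx) =>
    ctx.toolName === 'shell'
      ? { action: 'block', reason: 'shell disabled by tool guard' }
      : { action: 'allow' });
}
```

The plugin only calls the registration method; it never reaches into the pipeline's internals, which is what keeps it uninstallable without side effects.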
Plugin Structure
squad-hooks-lifecycle/
├── package.json
├── README.md
├── src/
│   ├── index.ts                   # Plugin entry: exports registerHooks()
│   ├── hooks/
│   │   ├── session-start.ts       # onSessionStart: auto-context loading
│   │   ├── prompt-guard.ts        # onUserPromptSubmitted: directive capture, prompt enhancement
│   │   ├── tool-guard.ts          # onPreToolUse: dynamic policies from decisions.md
│   │   ├── tool-audit.ts          # onPostToolUse: auto-Scribe, audit trail
│   │   ├── session-end.ts         # onSessionEnd: cleanup, metrics, Ralph trigger
│   │   └── error-recovery.ts      # onErrorOccurred: model fallback, rate limit backoff
│   ├── parsers/
│   │   ├── decision-parser.ts     # Extracts enforceable rules from decisions.md
│   │   └── directive-detector.ts  # Regex + heuristics for "always/never" signals
│   └── config.ts                  # Plugin configuration schema
├── skill/
│   └── SKILL.md                   # Squad skill teaching agents about the hooks
└── test/
    └── *.test.ts                  # Tests for each hook
Registration API
// squad.config.ts (user's project)
import { defineSquad } from '@bradygaster/squad-sdk';
import { registerHooks } from 'squad-hooks-lifecycle';

export default defineSquad({
  plugins: [
    registerHooks({
      // Toggle individual hooks on/off
      sessionStart: {
        enabled: true,
        autoLoadHistory: true,       // inject history.md at session start
        autoLoadDecisions: true,     // inject decisions.md snapshot
        autoLoadSkills: true,        // inject matching skills
        maxHistoryEntries: 20,       // limit context size
      },
      promptGuard: {
        enabled: true,
        captureDirectives: true,     // auto-detect "always/never" patterns
        injectDecisions: true,       // add relevant decisions to prompt context
      },
      toolGuard: {
        enabled: true,
        enforceDecisions: true,      // parse decisions.md into tool policies
        agentBoundaries: true,       // restrict writes to charter scope
        decisionFormat: 'structured', // 'structured' | 'prose-inferred'
      },
      toolAudit: {
        enabled: true,
        autoScribe: true,            // trigger Scribe on .squad/ writes
        auditLog: true,              // log all tool calls
        logPath: '.squad/audit/',    // where audit logs go
      },
      sessionEnd: {
        enabled: true,
        autoArchiveHistory: true,    // summarize history.md if over threshold
        captureMetrics: true,        // write session metrics
        ralphTrigger: true,          // notify Ralph of remaining work
      },
      errorRecovery: {
        enabled: true,
        modelFallback: true,         // automatic model fallback chains
        rateLimitBackoff: true,      // exponential backoff on rate limits
        contextOverflowRecovery: true, // summarize + retry on overflow
        fallbackChain: {
          premium: ['claude-opus-4.6', 'claude-opus-4.5', 'claude-sonnet-4.6'],
          standard: ['claude-sonnet-4.6', 'claude-sonnet-4.5', 'gpt-5.4'],
          fast: ['claude-haiku-4.5', 'gpt-5.4-mini', 'gpt-4.1'],
        },
      },
    }),
  ],
});
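One way config.ts could support "install with defaults" is to merge whatever the user passes over a full default object. The sketch below covers only two of the six sections for brevity. The type names, resolveConfig, and the default values (copied from the example values in this PRD) are assumptions, not a confirmed schema.

```typescript
// Hypothetical slice of config.ts: two of the six hook sections.
interface SessionStartOptions {
  enabled: boolean;
  autoLoadDecisions: boolean;
  maxHistoryEntries: number;
}

interface ToolAuditOptions {
  enabled: boolean;
  auditLog: boolean;
  logPath: string;
}

interface LifecycleConfig {
  sessionStart: SessionStartOptions;
  toolAudit: ToolAuditOptions;
}

// Every section, and every field inside it, is optional for the user.
type UserConfig = { [K in keyof LifecycleConfig]?: Partial<LifecycleConfig[K]> };

// Assumed defaults, mirroring the example values in this PRD.
const DEFAULTS: LifecycleConfig = {
  sessionStart: { enabled: true, autoLoadDecisions: true, maxHistoryEntries: 20 },
  toolAudit: { enabled: true, auditLog: true, logPath: '.squad/audit/' },
};

// Shallow-merge each section so a partial override keeps the other defaults.
function resolveConfig(user: UserConfig = {}): LifecycleConfig {
  return {
    sessionStart: { ...DEFAULTS.sessionStart, ...user.sessionStart },
    toolAudit: { ...DEFAULTS.toolAudit, ...user.toolAudit },
  };
}
```

Calling resolveConfig({ sessionStart: { maxHistoryEntries: 5 } }) changes only that one field; everything else stays at its default.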
Real-World Use Cases
Use Cases in Manual Copilot CLI Chat Mode
These are scenarios where a human is sitting at the terminal, chatting with Squad via Copilot CLI.
UC-1: "My agents keep forgetting team decisions"
Persona: Solo developer using Squad on a side project. Has 10+ decisions in decisions.md but agents routinely ignore them.
Hook: onSessionStart (auto-context loading)
Before plugin: User starts a session. Agent spawns. Coordinator prompt says "read decisions.md." Agent reads 3 of 12 decisions (context budget), misses the one about "always use Tailwind". Agent writes Bootstrap CSS. User is frustrated.
With plugin: onSessionStart hook fires, reads decisions.md, and injects the full decision set as structured context. Agent sees all 12 decisions before it writes a single line. Bootstrap is never introduced.
Business value:
Product Squad: Proves that Squad's governance model works in practice, not just on paper. Turns decisions.md from a suggestion box into an enforcement layer — a key differentiator in the AI team framework space.
Customer: Decisions actually stick. The team's institutional memory works every session, not just when agents happen to read the right file.
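A rough sketch of the context-building step in UC-1, assuming decisions and history entries have already been read and split into strings. The function name, the input shape, and the output headings are illustrative; only the maxHistoryEntries option comes from the registration example.

```typescript
interface ContextSources {
  decisions: string[];      // one string per decision
  historyEntries: string[]; // oldest first
}

// Include every decision, but cap history at the newest N entries.
function buildSessionContext(src: ContextSources, maxHistoryEntries: number): string {
  // slice(-0) would return the whole array, so handle 0 explicitly.
  const recent = maxHistoryEntries > 0 ? src.historyEntries.slice(-maxHistoryEntries) : [];
  const lines: string[] = ['## Team decisions (all apply)'];
  src.decisions.forEach((d, i) => lines.push(`${i + 1}. ${d}`));
  lines.push('', `## Recent history (last ${recent.length})`);
  recent.forEach((h) => lines.push(`- ${h}`));
  return lines.join('\n');
}
```

Decisions are never truncated here; only history is bounded, which matches the intent that all 12 decisions reach the agent.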
UC-2: "I said 'always use async/await' last week and the agent used callbacks today"
Persona: Tech lead using Squad for a Node.js project. Gave a directive in a previous session that wasn't carried forward.
Hook: onUserPromptSubmitted (directive capture)
Before plugin: User says "always use async/await, never callbacks." Coordinator captures it in decisions/inbox/ (maybe). Next session, different coordinator context, directive is lost. Agent writes callback-style code.
With plugin: onUserPromptSubmitted hook fires on every message. Detects the "always/never" pattern. Auto-writes to decisions/inbox/. Once the inbox entry is merged into decisions.md, onSessionStart loads it at the start of every later session. The directive persists across sessions.
Business value:
Product Squad: Demonstrates that Squad has memory across sessions — not just within a session. This is a top-3 user complaint ("my agents forget things") and the plugin solves it mechanically.
Customer: Say it once, it sticks forever. No more repeating yourself across sessions.
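The "always/never" detection in UC-2 could start as a single regular expression. This is a minimal sketch, not the plugin's actual detector: the pattern, the Directive shape, and the function name are assumptions, and a real implementation would need extra heuristics to skip conversational uses such as "I never said that".

```typescript
interface Directive {
  kind: 'always' | 'never';
  text: string;
}

// Matches a clause starting with "always" or "never", up to the next punctuation mark.
const DIRECTIVE_PATTERN = /\b(always|never)\s+([^,.;!?\n]+)/gi;

function detectDirectives(prompt: string): Directive[] {
  const found: Directive[] = [];
  DIRECTIVE_PATTERN.lastIndex = 0; // global regexes keep state between calls
  let m: RegExpExecArray | null;
  while ((m = DIRECTIVE_PATTERN.exec(prompt)) !== null) {
    found.push({ kind: m[1].toLowerCase() as Directive['kind'], text: m[2].trim() });
  }
  return found;
}
```

For the prompt in this use case, "always use async/await, never callbacks.", the sketch yields two directives: always "use async/await" and never "callbacks".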
UC-3: "An agent edited a file it shouldn't have touched"
Persona: Team of 3 using Squad with role-based agents. Backend agent modified a frontend component.
Hook: onPreToolUse (dynamic policies + agent boundaries)
Before plugin: Backend agent "Fenster" is told to fix an API bug. While exploring, it edits src/components/LoginForm.tsx — a frontend file outside its scope. PR review catches it, but time is wasted.
With plugin: onPreToolUse hook fires before every file write. Checks agent's charter scope (backend: src/api/**, src/services/**). LoginForm.tsx is outside scope — write is blocked with reason: "File outside your charter scope. Route to the Frontend agent." Agent self-corrects or coordinator re-routes.
Business value:
Product Squad: Enforces the agent boundary model that Squad promises but currently only advises. Moves from "agents should stay in scope" to "agents cannot leave scope."
Customer: Clean PRs with no scope creep. Code review effort drops because agents physically can't touch files outside their domain.
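The charter-scope check in UC-3 comes down to matching a file path against the agent's allowed globs. A minimal sketch follows, assuming charter scopes are available as glob strings and that paths are already normalized and repo-relative. The glob support is deliberately tiny ("**" and "*" only) and the function names are illustrative.

```typescript
type ScopeDecision = { action: 'allow' } | { action: 'block'; reason: string };

// Convert a simple glob ("**" = any depth, "*" = within one path segment) to a RegExp.
function globToRegExp(glob: string): RegExp {
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, '\\$&');
  const pattern = escaped
    .replace(/\*\*/g, '\u0000') // placeholder so the next step leaves "**" alone
    .replace(/\*/g, '[^/]*')
    .replace(/\u0000/g, '.*');
  return new RegExp('^' + pattern + '$');
}

// Allow the write only if the path matches at least one charter glob.
function checkCharterScope(filePath: string, charterGlobs: string[]): ScopeDecision {
  const inScope = charterGlobs.some((g) => globToRegExp(g).test(filePath));
  return inScope
    ? { action: 'allow' }
    : { action: 'block', reason: 'File outside your charter scope.' };
}
```

With the backend scope from this use case (src/api/**, src/services/**), src/components/LoginForm.tsx is blocked. Paths containing ".." segments must be resolved before the check, or a path like src/api/../components/x would slip through.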
UC-4: "I want to know what my agents actually did"
Persona: Engineering manager who runs Squad sessions but wants a trail of what happened for compliance.
Hook: onPostToolUse (audit trail)
Before plugin: Session ends. Manager asks "what files did agents touch?" Answer: check orchestration logs (if Scribe ran), check git diff (if agents committed), or re-read the session transcript. Tedious.
With plugin: Every tool call is logged to .squad/audit/ with timestamp, agent name, tool name, arguments summary, and result status. Manager runs cat .squad/audit/2026-04-14*.jsonl | jq '.toolName' | sort | uniq -c and gets a complete picture in seconds.
Business value:
Product Squad: Opens Squad to enterprise customers who require audit trails. Compliance-ready logging is a gate for regulated industries (fintech, healthcare, government).
Customer: Complete visibility without effort. "What happened?" is answered by a JSONL file, not a detective mission through logs.
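A sketch of what an audit record and the per-tool count from UC-4 might look like in code. Only the toolName field is implied by the jq command in this PRD; the other field names and both function names are assumptions.

```typescript
interface AuditEntry {
  timestamp: string;  // ISO 8601
  agentName: string;
  toolName: string;
  argsSummary: string;
  status: 'ok' | 'error';
}

// Serialize entries as JSON Lines: one JSON object per line.
function toJsonl(entries: AuditEntry[]): string {
  return entries.map((e) => JSON.stringify(e)).join('\n') + '\n';
}

// In-process equivalent of: jq '.toolName' | sort | uniq -c
function countByTool(jsonl: string): Record<string, number> {
  const counts: Record<string, number> = {};
  jsonl.split('\n').forEach((line) => {
    if (line.trim() === '') return; // skip the trailing blank line
    const entry = JSON.parse(line) as AuditEntry;
    counts[entry.toolName] = (counts[entry.toolName] || 0) + 1;
  });
  return counts;
}
```

JSONL keeps the log append-only: each tool call adds one line, so a crash mid-session loses at most the entry being written.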
UC-5: "My session crashed because the model hit a rate limit"
Persona: Developer running a large fan-out (5 agents in parallel). Third agent hits a rate limit and the session dies.
Hook: onErrorOccurred (model fallback + rate limit backoff)
Before plugin: Agent 3 of 5 hits rate limit. Error surfaces to coordinator. Coordinator's prose instructions say "retry with next model" but the context is already pressured from 2 completed agents. Coordinator fumbles the fallback. Session stalls.
With plugin: onErrorOccurred hook fires immediately. Detects rate limit error code. Applies exponential backoff (1s, 2s, 4s). If still rate-limited, falls back to next model in chain (claude-sonnet-4.6 -> claude-sonnet-4.5 -> gpt-5.4). Agent 3 continues on the fallback model. User never notices.
Business value:
Product Squad: Directly addresses the #1 reliability complaint — sessions dying mid-work. Makes Squad viable for heavy workloads (5+ parallel agents) that stress model rate limits.
Customer: Sessions don't crash. Rate limits are invisible. The work gets done even when infrastructure hiccups.
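The recovery sequence in UC-5 (back off 1s, 2s, 4s, then move down the chain) can be expressed as a small pure function. A sketch under stated assumptions: the RecoveryStep shape, the function names, and the default of three retries are illustrative, and the caller is responsible for actually sleeping and re-issuing the request.

```typescript
// Delay before retry number `attempt` (0-based): 1000, 2000, 4000, ...
function backoffDelayMs(attempt: number, baseMs: number = 1000): number {
  return baseMs * Math.pow(2, attempt);
}

type RecoveryStep =
  | { kind: 'retry'; model: string; delayMs: number }
  | { kind: 'fallback'; model: string }
  | { kind: 'give-up' };

// Decide what to do after a rate-limit error on `currentModel`.
function nextRecoveryStep(
  chain: string[],
  currentModel: string,
  retriesSoFar: number,
  maxRetries: number = 3,
): RecoveryStep {
  if (retriesSoFar < maxRetries) {
    return { kind: 'retry', model: currentModel, delayMs: backoffDelayMs(retriesSoFar) };
  }
  // Retries exhausted: move to the next model in the chain, if there is one.
  const idx = chain.indexOf(currentModel);
  const next = idx >= 0 ? chain[idx + 1] : undefined;
  return next !== undefined ? { kind: 'fallback', model: next } : { kind: 'give-up' };
}
```

With the standard chain from the registration example, three failed retries on claude-sonnet-4.6 produce a fallback to claude-sonnet-4.5; exhausting gpt-5.4 produces give-up, at which point the error should surface to the coordinator.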
UC-6: "History.md is 50KB and my agents are slow"
Persona: Long-running project (3 months). History files have grown to 30-50KB per agent. Session starts are sluggish.
Hook: onSessionEnd (auto-archive + metrics)
Before plugin: User must remember to ask Scribe to summarize. If they forget, history grows unbounded. Eventually agents burn 2000+ tokens just reading their own history. Session startup takes 45+ seconds.
With plugin: onSessionEnd hook fires every time a session ends. Checks each agent's history.md size. If over 15KB, triggers summarization — moves old entries to history-archive.md, keeps a compressed summary. Next session starts with a lean 3KB history. Metrics written to .squad/metrics/session-2026-04-14.json.
Business value:
Product Squad: Solves "Squad gets slower over time" — a retention killer. Makes Squad viable for long-running projects without manual maintenance.
Customer: Zero-maintenance history hygiene. Sessions stay fast month after month. Metrics show token trends so teams can optimize.
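The archive decision in UC-6 can be sketched as a split over history entries: if the file is over the threshold, keep the newest entries that fit a target size and archive the rest. Assumptions here: entries are already parsed into strings (oldest first), size is measured in characters as a stand-in for bytes, and the names are illustrative. Producing the compressed summary of archived entries is out of scope for this sketch.

```typescript
interface ArchiveSplit {
  keep: string[];    // stays in history.md
  archive: string[]; // moves to history-archive.md
}

function splitHistory(entries: string[], thresholdChars: number, targetChars: number): ArchiveSplit {
  const total = entries.reduce((sum, e) => sum + e.length, 0);
  if (total <= thresholdChars) return { keep: entries.slice(), archive: [] };

  // Walk backwards from the newest entry, keeping entries while they fit the target.
  let kept = 0;
  let firstKept = entries.length;
  for (let i = entries.length - 1; i >= 0; i--) {
    if (kept + entries[i].length > targetChars) break;
    kept += entries[i].length;
    firstKept = i;
  }
  return { keep: entries.slice(firstKept), archive: entries.slice(0, firstKept) };
}
```

Using a threshold (15KB in this PRD) that is well above the target (3KB) avoids re-archiving on every session end.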
Use Cases in Agentic SDK Mode
These are scenarios where Squad is embedded in an application via @bradygaster/squad-sdk, running autonomously without a human at the keyboard.
UC-7: "CI pipeline spawns agents to review PRs — needs guardrails"
Persona: Platform team running Squad in CI. GitHub Action triggers Squad SDK to review every PR with 3 agents (security, architecture, tests).
Hook: onPreToolUse (tool guard) + onSessionStart (auto-context) + onErrorOccurred (fallback)
Scenario: PR bradygaster#247 is opened. CI triggers Squad SDK. Three review agents spawn. Security agent needs to read .env.example but must never read .env. Architecture agent should only read, never write. Test agent can write test files only.
import { CopilotClient } from '@github/copilot-sdk';
import { registerHooks } from 'squad-hooks-lifecycle';

const hooks = registerHooks({
  sessionStart: { enabled: true, autoLoadDecisions: true },
  toolGuard: {
    enabled: true,
    agentBoundaries: true,
    // Security agent: read-only, no .env
    // Architecture agent: read-only
    // Test agent: write to test/** only
  },
  errorRecovery: { enabled: true, modelFallback: true },
});

const client = new CopilotClient();
const session = await client.createSession({ hooks: hooks.asSessionHooks() });
Business value:
Product Squad: Proves Squad SDK works in headless/CI environments — not just interactive chat. This is the enterprise adoption vector: automated code review pipelines powered by Squad agents with governance.
Customer: Automated PR review with real guardrails. Security agent can't leak secrets. Architecture agent can't introduce changes. All enforced by hooks, not prompts.
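The three per-agent rules named in the comments of the UC-7 example could be expressed as one policy function. This is a hedged sketch: the agent identifiers, the tool names in WRITE_TOOLS, and the ToolCall shape are placeholders, not Squad or Copilot SDK definitions.

```typescript
type ReviewAgent = 'security' | 'architecture' | 'tests';
type GuardDecision = { action: 'allow' } | { action: 'block'; reason: string };

interface ToolCall {
  toolName: string;
  path: string; // repo-relative, normalized
}

// Placeholder tool names; substitute whatever write tools the runtime exposes.
const WRITE_TOOLS = ['edit', 'create', 'write'];

function reviewPolicy(agent: ReviewAgent, call: ToolCall): GuardDecision {
  const isWrite = WRITE_TOOLS.indexOf(call.toolName) >= 0;
  const fileName = call.path.split('/').pop() || '';

  // Security agent: .env is off limits, .env.example is fine.
  if (agent === 'security' && fileName === '.env') {
    return { action: 'block', reason: 'Security agent must never read .env' };
  }
  // Security and architecture agents are read-only.
  if (isWrite && agent !== 'tests') {
    return { action: 'block', reason: `${agent} agent is read-only` };
  }
  // Test agent may write, but only under test/.
  if (isWrite && !/^test\//.test(call.path)) {
    return { action: 'block', reason: 'Test agent may write under test/ only' };
  }
  return { action: 'allow' };
}
```

The exact-match check on the file name is what lets .env.example through while still blocking .env in any directory.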
UC-8: "Nightly batch processes 50 issues — needs cost control"
Persona: Startup using Squad SDK to auto-triage and fix GitHub issues overnight. Batch of 50 issues. Budget: $20/night.
Hook: onSessionEnd (metrics) + onErrorOccurred (fallback for cost) + onPreToolUse (budget cap)
Scenario: Ralph-like loop processes issues. After 30 issues, token spend approaches budget. Need to downgrade models or pause.
import { registerHooks } from 'squad-hooks-lifecycle';

// readMetrics() is an application-provided helper (not part of the plugin)
// that returns cumulative spend in USD for the current batch.
const hooks = registerHooks({
  sessionEnd: {
    enabled: true,
    captureMetrics: true,
    // Metrics include: tokens_used, model, duration, tools_called
  },
  errorRecovery: {
    enabled: true,
    modelFallback: true,
    fallbackChain: {
      // Start cheap, only upgrade if needed
      standard: ['claude-haiku-4.5', 'gpt-5.4-mini'],
    },
  },
  toolGuard: {
    enabled: true,
    // Custom pre-tool hook: check cumulative cost
    customPreHooks: [
      async (ctx) => {
        const spent = await readMetrics();
        if (spent > 18.00) {
          return { action: 'block', reason: 'Budget limit reached ($18/$20)' };
        }
        return { action: 'allow' };
      },
    ],
  },
});
Business value:
Product Squad: Enables cost-controlled autonomous operation, the missing piece for Squad in production. To our knowledge, other AI team frameworks do not offer per-session budget enforcement at the tool level.
Customer: Run Squad overnight without fear of runaway costs. Budget caps are enforced mechanically, not by hoping the coordinator remembers.
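The cumulative-spend check in UC-8 needs something to accumulate cost across sessions. A minimal in-memory sketch follows, using integer cents to avoid floating-point drift. BudgetTracker and its methods are hypothetical, and how per-session cost is derived from token counts is left to the application.

```typescript
class BudgetTracker {
  private spentCents = 0;
  private stopAtCents: number;

  // stopAtCents is the soft cap, e.g. 1800 for "$18 of a $20 budget".
  constructor(stopAtCents: number) {
    this.stopAtCents = stopAtCents;
  }

  // Call from onSessionEnd with the cost of the session that just finished.
  record(sessionCostCents: number): void {
    this.spentCents += sessionCostCents;
  }

  spent(): number {
    return this.spentCents;
  }

  // Call from the pre-tool hook; true means further tool calls should be blocked.
  shouldBlock(): boolean {
    return this.spentCents >= this.stopAtCents;
  }
}
```

A batch runner that spans processes would need to persist the running total (for example in the metrics file) instead of holding it in memory.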
UC-9: "Multi-tenant SaaS — each customer gets different agent permissions"
Persona: SaaS platform embedding Squad SDK. Each customer tenant has different permissions (free tier: read-only agents, pro tier: full agents, enterprise: custom policies).
Hook: onSessionStart (inject tenant config) + onPreToolUse (tenant-scoped permissions)
Scenario: Customer A (free) starts a session. Customer B (enterprise) starts a session. Same Squad SDK, different permissions.
import { registerHooks } from 'squad-hooks-lifecycle';

// Tenant, isWriteTool() and isDangerousTool() are application-provided
// (not part of the plugin).
function createTenantHooks(tenant: Tenant) {
  return registerHooks({
    sessionStart: {
      enabled: true,
      // Inject tenant-specific context
      customContext: `Tenant: ${tenant.name}, Plan: ${tenant.plan}`,
    },
    toolGuard: {
      enabled: true,
      customPreHooks: [
        async (ctx) => {
          if (tenant.plan === 'free' && isWriteTool(ctx.toolName)) {
            return { action: 'block', reason: 'Upgrade to Pro for write access' };
          }
          if (tenant.plan !== 'enterprise' && isDangerousTool(ctx.toolName)) {
            return { action: 'block', reason: 'Enterprise only' };
          }
          return { action: 'allow' };
        },
      ],
    },
  });
}
Business value:
Product Squad: Positions Squad SDK as embeddable in SaaS products — a new market segment. Multi-tenant governance via hooks is a compelling SDK selling point.
Customer: Build AI-powered features with Squad SDK and offer tiered access to customers. Permission boundaries are enforced by hooks, not application logic.
UC-10: "Autonomous agent loop with human-in-the-loop approval gates"
Persona: Enterprise deploying Squad SDK for code generation. Policy: all file writes must be approved by a human reviewer before committing.
Hook: onPreToolUse (approval gate) + onPostToolUse (audit) + onSessionEnd (compliance report)
Scenario: Agent generates code. Before any file write, a webhook fires to the approval system. Human approves or rejects. Agent proceeds or stops.
import { registerHooks } from 'squad-hooks-lifecycle';

// isWriteTool() and requestHumanApproval() are application-provided
// (not part of the plugin).
const hooks = registerHooks({
  toolGuard: {
    enabled: true,
    customPreHooks: [
      async (ctx) => {
        if (isWriteTool(ctx.toolName)) {
          const approval = await requestHumanApproval({
            agent: ctx.agentName,
            tool: ctx.toolName,
            file: ctx.arguments.path,
            preview: ctx.arguments.content?.substring(0, 500),
          });
          if (approval.status === 'rejected') {
            return { action: 'block', reason: `Rejected by ${approval.reviewer}: ${approval.reason}` };
          }
        }
        return { action: 'allow' };
      },
    ],
  },
  toolAudit: {
    enabled: true,
    auditLog: true,
    // Every approval/rejection is logged for compliance
  },
  sessionEnd: {
    enabled: true,
    captureMetrics: true,
    // Generate compliance report: N writes approved, M rejected, by whom
  },
});
Business value:
Product Squad: Unlocks regulated industries (finance, healthcare, defense) where autonomous AI agents require human oversight. This is the enterprise gate — without it, Squad can't enter these markets.
Customer: Full human-in-the-loop control without losing the speed of AI agents. Every action is approved, logged, and reportable for auditors.
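The compliance report described in UC-10 ("N writes approved, M rejected, by whom") is a simple aggregation over approval records. A sketch, assuming each approval decision is captured as a record with a reviewer and a status; the record and summary shapes are illustrative.

```typescript
interface ApprovalRecord {
  file: string;
  reviewer: string;
  status: 'approved' | 'rejected';
}

interface ComplianceSummary {
  approved: number;
  rejected: number;
  byReviewer: Record<string, { approved: number; rejected: number }>;
}

// Tally totals and per-reviewer counts in a single pass.
function summarizeApprovals(records: ApprovalRecord[]): ComplianceSummary {
  const summary: ComplianceSummary = { approved: 0, rejected: 0, byReviewer: {} };
  records.forEach((r) => {
    const who = summary.byReviewer[r.reviewer] || { approved: 0, rejected: 0 };
    who[r.status] += 1;
    summary[r.status] += 1;
    summary.byReviewer[r.reviewer] = who;
  });
  return summary;
}
```

Running this from onSessionEnd over the session's audit entries would give auditors a per-reviewer breakdown without re-reading the raw log.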
UC-11: "Agent fleet with centralized observability"
Persona: DevOps team running 20+ Squad instances across microservices. Needs centralized monitoring of all agent activity.
Hook: onPostToolUse (telemetry) + onErrorOccurred (alerting) + onSessionEnd (aggregation)
Scenario: Each microservice repo has its own Squad. DevOps needs a single dashboard showing: which agents ran, what they did, error rates, token spend.
import { registerHooks } from 'squad-hooks-lifecycle';

const hooks = registerHooks({
  toolAudit: {
    enabled: true,
    customPostHooks: [
      async (ctx) => {
        // Ship telemetry to centralized collector
        await fetch('https://telemetry.internal/ingest', {
          method: 'POST',
          headers: { 'Content-Type': 'application/json' },
          body: JSON.stringify({
            repo: process.env.REPO_NAME,
            agent: ctx.agentName,
            tool: ctx.toolName,
            duration: ctx.duration,
            success: !ctx.error,
            timestamp: new Date().toISOString(),
          }),
        });
        return { result: ctx.result }; // pass through unchanged
      },
    ],
  },
  errorRecovery: {
    enabled: true,
    customErrorHooks: [
      async (error) => {
        // Alert on-call. As written this fires on every error;
        // spike detection needs an error-rate check before this call.
        await fetch('https://pagerduty.internal/alert', {
          method: 'POST',
          headers: { 'Content-Type': 'application/json' },
          body: JSON.stringify({ source: 'squad', error: error.message }),
        });
      },
    ],
  },
});
Business value:
Product Squad: Enables fleet-scale Squad deployments. Observability is non-negotiable for platform teams managing multiple AI agent instances. This makes Squad enterprise-infrastructure-grade.
Customer: One dashboard for all Squad activity across all repos. Error spikes trigger PagerDuty. Token spend is tracked per-repo. DevOps has full visibility.
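"Error spikes trigger PagerDuty" implies an error-rate check rather than an alert per error. One way to sketch that is a fixed-size sliding window over recent tool-call outcomes. ErrorRateWindow, the window size, and the threshold are assumptions for illustration.

```typescript
class ErrorRateWindow {
  private outcomes: boolean[] = []; // true = the call failed
  private size: number;

  constructor(size: number) {
    this.size = size;
  }

  // Record one tool-call outcome, dropping the oldest once the window is full.
  record(isError: boolean): void {
    this.outcomes.push(isError);
    if (this.outcomes.length > this.size) this.outcomes.shift();
  }

  rate(): number {
    if (this.outcomes.length === 0) return 0;
    return this.outcomes.filter((e) => e).length / this.outcomes.length;
  }

  // Require a full window so a single early failure does not page anyone.
  isSpiking(threshold: number): boolean {
    return this.outcomes.length === this.size && this.rate() >= threshold;
  }
}
```

The post-tool hook would call record() on every tool call, and the error hook would alert only when isSpiking() returns true.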
UC-12: "Auto-fix failing CI — agents respond to webhook, hooks enforce safety"
Persona: Platform team. When CI fails, a webhook triggers Squad SDK to diagnose and fix the failure autonomously.
Hook: onSessionStart (inject CI context) + onPreToolUse (safety limits) + onPostToolUse (validate fix) + onSessionEnd (report)
Scenario: CI fails on PR bradygaster#312. Webhook fires. Squad SDK spins up. Agent reads CI logs, identifies the issue, applies a fix, runs tests locally, pushes if green.
import { registerHooks } from 'squad-hooks-lifecycle';

// fetchCILogs(), getSessionEditCount() and isWriteTool() are
// application-provided (not part of the plugin).
const hooks = registerHooks({
  sessionStart: {
    enabled: true,
    customContext: async () => {
      const ciLogs = await fetchCILogs(process.env.CI_RUN_ID);
      return `CI Failure Context:\n${ciLogs.substring(0, 5000)}`;
    },
  },
  toolGuard: {
    enabled: true,
    customPreHooks: [
      async (ctx) => {
        // Safety: max 3 file edits per CI-fix session
        const editCount = getSessionEditCount();
        if (isWriteTool(ctx.toolName) && editCount >= 3) {
          return { action: 'block', reason: 'Max 3 edits per CI-fix session. Escalate to human.' };
        }
        // Safety: never edit CI config files
        if (ctx.arguments?.path?.includes('.github/workflows/')) {
          return { action: 'block', reason: 'Cannot modify CI workflows autonomously.' };
        }
        return { action: 'allow' };
      },
    ],
  },
  sessionEnd: {
    enabled: true,
    captureMetrics: true,
    // Report: what was fixed, how many attempts, was it pushed
  },
});
Business value:
Product Squad: Demonstrates Squad SDK as an autonomous CI remediation engine — a high-value, high-visibility use case that sells itself. Every engineering team wants "CI fixes itself."
Customer: CI failures are auto-diagnosed and fixed with guardrails. Agent can't go rogue (max 3 edits, can't modify workflows). If it can't fix it in 3 edits, it escalates to a human.
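The edit cap in UC-12 only works if something increments the count after each successful write. A sketch of a small limiter that pairs a pre-tool check with a post-tool record step; EditLimiter is hypothetical, and wiring recordEdit() into onPostToolUse is an assumption about how the count would be maintained.

```typescript
type EditDecision = { action: 'allow' } | { action: 'block'; reason: string };

class EditLimiter {
  private edits = 0;
  private maxEdits: number;

  constructor(maxEdits: number) {
    this.maxEdits = maxEdits;
  }

  // Call from the pre-tool hook for write tools.
  check(path: string): EditDecision {
    if (path.indexOf('.github/workflows/') >= 0) {
      return { action: 'block', reason: 'Cannot modify CI workflows autonomously.' };
    }
    if (this.edits >= this.maxEdits) {
      return { action: 'block', reason: `Max ${this.maxEdits} edits per CI-fix session. Escalate to human.` };
    }
    return { action: 'allow' };
  }

  // Call from the post-tool hook after a write succeeds.
  recordEdit(): void {
    this.edits += 1;
  }

  count(): number {
    return this.edits;
  }
}
```

Counting in the post-tool step means blocked or failed writes do not consume the budget of three edits.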
Plugin vs. Core: Decision Matrix
| Factor | Core (PRD #151) | Plugin (This PRD) |
|---|---|---|
| Availability | All Squad users get it | Opt-in install |
| Release cycle | Tied to Squad releases | Independent; ship anytime |
| Customization | Config only | Fork + modify freely |
| Risk to core | High (touches critical path) | Zero (external package) |
| Enterprise appeal | "Built-in governance" | "Composable governance" |
| Cost to implement | High (needs core review) | Medium (uses public APIs) |
| Backward compat | Must not break existing users | Only affects users who install |
| Marketplace signal | N/A | Proves marketplace works for real governance plugins |
Recommendation: Ship the plugin FIRST. If adoption proves demand, promote the most-used hooks into core in a future release. This is the lower-risk path that still delivers all the value.
Implementation Priority
| Priority | Hook | CLI Chat Value | SDK Agentic Value |
|---|---|---|---|
| P0 | onErrorOccurred (model fallback) | Sessions stop crashing | Autonomous loops stay alive |
| P0 | onSessionStart (auto-context) | Agents remember decisions | Headless agents get full context |
| P1 | onPreToolUse (dynamic policies) | Agents stay in scope | Multi-tenant permissions |
| P1 | onPostToolUse (audit trail) | "What happened?" answered | Fleet observability |
| P2 | onSessionEnd (cleanup/metrics) | History stays lean | Cost tracking per-session |
| P2 | onUserPromptSubmitted (directives) | Directives persist | Prompt-level governance |
Success Metrics
- Plugin installs via marketplace: 50+ repos in first quarter
- Session failure rate for plugin users drops below 3% (from ~10%)
- Audit log adoption: 80%+ of plugin users enable audit trail
- Zero-config value: users who install with defaults see immediate improvement (auto-context + error recovery)
This PRD complements PRD #151 (core hooks). Both can coexist — the plugin uses Squad's public APIs, and if specific hooks prove essential, they can graduate into core.