test: SDK feature parity tests for #340 by bradygaster · Pull Request #422 · bradygaster/squad

bradygaster · 2026-03-15T19:42:57Z

Closes #340

Adds 22 real integration tests for SDK features with actual testable code paths:

✅ Features Tested

Worktree Awareness (9 tests)

resolveSquad() stops at .git boundary (supports worktree files)
resolveSquadPaths() handles local vs remote mode
isConsultMode() detection
ensureSquadPathDual() validates paths in dual-root mode
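
For reviewers unfamiliar with the ".git boundary" behavior, here is a minimal sketch of the idea, not the SDK's actual `resolveSquad()` implementation. The function name `findSquadDir`, the `.squad` directory name as the search target, and the injected `exists` predicate are assumptions made for illustration. The key point is that in a linked worktree `.git` is a file rather than a directory, so the boundary check only asks whether the entry exists.

```typescript
// Hypothetical sketch: walk upward looking for a .squad directory,
// but never search past the repository root.
// Assumes normalized absolute POSIX paths with no trailing slash.
type Exists = (path: string) => boolean;

function joinPath(dir: string, name: string): string {
  return dir === "/" ? `/${name}` : `${dir}/${name}`;
}

function parentOf(dir: string): string {
  const i = dir.lastIndexOf("/");
  return i <= 0 ? "/" : dir.slice(0, i);
}

function findSquadDir(startDir: string, exists: Exists): string | null {
  let dir = startDir;
  for (;;) {
    const candidate = joinPath(dir, ".squad");
    if (exists(candidate)) return candidate;

    // .git is a directory in a main checkout and a file in a linked
    // worktree; either one marks the boundary, so only existence matters.
    if (exists(joinPath(dir, ".git"))) return null;

    const parent = parentOf(dir);
    if (parent === dir) return null; // reached the filesystem root
    dir = parent;
  }
}

// Usage with an in-memory "filesystem" (hypothetical paths):
const entries = new Set(["/repo/.git", "/repo/.squad"]);
const found = findSquadDir("/repo/pkg/src", (p) => entries.has(p));
console.log(found);
```

Injecting `exists` keeps the sketch testable without touching the disk; a real implementation would presumably call into `fs` instead.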

Reviewer Lockout (8 tests)

Agent lockout/unlock from artifacts
Lockout registry queries
Hook blocks write tools when agent locked out
Multi-lockout support (2-error threshold)
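
To make the lockout scenarios concrete, the following is a rough sketch of a per-artifact lockout registry plus a guard that rejects write tools for locked-out agents. Everything here is hypothetical: `LockoutRegistry`, `lockedAgentsFor`, `makeWriteGuard`, the `ToolCall` shape, and the set of write tool names are illustrative and are not the SDK's API. The 2-error threshold is not modeled.

```typescript
// Hypothetical sketch of reviewer lockout; not the SDK's real types.
type AgentId = string;
type ArtifactId = string;

class LockoutRegistry {
  private locks = new Map<ArtifactId, Set<AgentId>>();

  lockout(artifact: ArtifactId, agent: AgentId): void {
    const agents = this.locks.get(artifact) ?? new Set<AgentId>();
    agents.add(agent);
    this.locks.set(artifact, agents);
  }

  // Returns true if the agent was locked out of the artifact before the call.
  unlock(artifact: ArtifactId, agent: AgentId): boolean {
    const agents = this.locks.get(artifact);
    if (!agents) return false;
    const removed = agents.delete(agent);
    if (agents.size === 0) this.locks.delete(artifact);
    return removed;
  }

  isLockedOut(artifact: ArtifactId, agent: AgentId): boolean {
    return this.locks.get(artifact)?.has(agent) ?? false;
  }

  lockedAgentsFor(artifact: ArtifactId): AgentId[] {
    return [...(this.locks.get(artifact) ?? [])];
  }
}

interface ToolCall {
  agent: AgentId;
  tool: string;
  artifact: ArtifactId;
}

type HookDecision = { allow: true } | { allow: false; reason: string };

// Assumed tool names; a real hook would use the SDK's own tool list.
const WRITE_TOOLS = new Set(["edit", "create", "delete"]);

function makeWriteGuard(registry: LockoutRegistry): (call: ToolCall) => HookDecision {
  return (call) => {
    const isWrite = WRITE_TOOLS.has(call.tool);
    if (isWrite && registry.isLockedOut(call.artifact, call.agent)) {
      return { allow: false, reason: `${call.agent} is locked out of ${call.artifact}` };
    }
    return { allow: true };
  };
}

// Usage: two agents locked out of the same artifact.
const registry = new LockoutRegistry();
registry.lockout("src/auth.ts", "agent-a");
registry.lockout("src/auth.ts", "agent-b");
const guard = makeWriteGuard(registry);
console.log(guard({ agent: "agent-a", tool: "edit", artifact: "src/auth.ts" }));
```

Read-only tools pass through the guard even for locked-out agents, which matches the "blocks write tools" wording above.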

Deadlock Handling (2 tests)

All-agents-locked-out scenario detection
Escalation via clearLockout()
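
The deadlock case can be pictured with a small sketch: a deadlock exists when every agent on the roster is locked out, and escalation releases one of them so work can continue. The names `isDeadlocked` and `escalate`, and the choice to release the first roster entry, are assumptions for illustration; the PR only states that escalation goes through `clearLockout()`.

```typescript
// Hypothetical sketch of all-agents-locked-out detection and escalation.
function isDeadlocked(roster: readonly string[], lockedOut: ReadonlySet<string>): boolean {
  // An empty roster is treated as "nothing to do", not as a deadlock.
  return roster.length > 0 && roster.every((agent) => lockedOut.has(agent));
}

// Clears one lockout when deadlocked and returns the released agent,
// or null when no escalation was needed.
function escalate(roster: readonly string[], lockedOut: Set<string>): string | null {
  if (!isDeadlocked(roster, lockedOut)) return null;
  const released = roster[0];
  if (released === undefined) return null;
  lockedOut.delete(released); // stand-in for a clearLockout()-style call
  return released;
}

// Usage with generic agent names:
const roster = ["agent-a", "agent-b"];
const lockedOut = new Set(["agent-a", "agent-b"]);
console.log(isDeadlocked(roster, lockedOut));
console.log(escalate(roster, lockedOut));
```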

Skill Confidence Lifecycle (3 tests)

Confidence levels (low/medium/high)
Routing match confidence propagation
Priority assignment based on confidence
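
As an illustration of confidence propagation, the sketch below carries a skill match's confidence through to a routed task and derives a priority from it. The types `SkillMatch` and `RoutedTask`, the function `routeByConfidence`, and the numeric mapping (1 is the highest priority) are all assumptions; the SDK may assign priorities differently.

```typescript
// Hypothetical sketch: confidence level -> priority, propagated through routing.
type Confidence = "low" | "medium" | "high";

interface SkillMatch {
  skill: string;
  confidence: Confidence;
}

interface RoutedTask extends SkillMatch {
  priority: number; // 1 = handled first (assumed convention)
}

const PRIORITY_BY_CONFIDENCE: Record<Confidence, number> = {
  high: 1,
  medium: 2,
  low: 3,
};

function routeByConfidence(matches: readonly SkillMatch[]): RoutedTask[] {
  return matches
    .map((m) => ({ ...m, priority: PRIORITY_BY_CONFIDENCE[m.confidence] }))
    .sort((a, b) => a.priority - b.priority);
}

// Usage with made-up skill names:
const routed = routeByConfidence([
  { skill: "docs", confidence: "low" },
  { skill: "testing", confidence: "high" },
  { skill: "refactor", confidence: "medium" },
]);
console.log(routed.map((t) => `${t.skill}:${t.priority}`));
```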

🚧 Manual Verification Items

Features that need human/environment verification (created as separate issues assigned to @bradygaster):

Manual verify: Watch mode (#32) #418 — Watch mode (long-running process + GitHub polling)
Manual verify: Ralph idle-watch (#31) #419 — Ralph idle-watch (timer-based triggers)
Manual verify: Client compatibility (#47) #420 — Client compatibility (VS Code extension host)
Manual verify: Human members (#36) #421 — Human members (coordinator → human handoff)

✅ Quality Gates

Build: ✅ Passes (0 errors)
Tests: ✅ 22 new tests pass, 4199 total passing, no regressions
Only failure: aspire-integration (pre-existing, Docker not available)

All tests follow the Product Isolation Rule: no agent names from the Apollo 13 squad are hardcoded.

Adds 22 integration tests for SDK features with actual code implementations:

**Worktree Awareness (9 tests):**
- resolveSquad() stops at .git boundary (worktree file support)
- resolveSquadPaths() local vs remote mode resolution
- isConsultMode() detection
- ensureSquadPathDual() path validation for dual-root mode

**Reviewer Lockout (8 tests):**
- lockout/unlock agent from artifact
- getLockedAgents() registry queries
- createHook() blocks write tools when locked out
- Multi-lockout support (2-error threshold simulation)

**Deadlock Handling (2 tests):**
- all-agents-locked-out detection
- clearLockout() escalation path

**Skill Confidence Lifecycle (3 tests):**
- confidence levels (low/medium/high)
- routing match confidence propagation
- priority assignment based on confidence

Manual verification items tracked as separate issues:
- #418 Watch mode
- #419 Ralph idle-watch
- #420 Client compatibility
- #421 Human members

Closes #340

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…er#341)

46 tests covering 4 features:
- bradygaster#31 Ralph Idle-Watch Mode (RalphMonitor): 11 tests
- bradygaster#47 Client Compatibility (Platform Detection): 16 tests
- bradygaster#45 Reviewer Lockout (deepened): 11 tests
- bradygaster#46 Deadlock Handling (deepened): 8 tests

Combined with batch 1 (PR bradygaster#422) and batch 2 (PR bradygaster#425), automated tests now cover 11 of 13 ⚠️ Needs Setup features from bradygaster#341.

* chore(squad): Phase 2 launch — thinking feedback, P0 bugs, dual telemetry

Phase 1 complete: 5 issues closed (bradygaster#325, bradygaster#326, bradygaster#327, bradygaster#328, bradygaster#329), 5 PRs merged. Phase 2 launched with Cheritto (thinking feedback), Hockney (P0 bugs), Saul (dual telemetry). Decision inbox merged and archived.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* chore(squad): Phase 2 Wave 1 merged, Wave 2 launched

Session: 2026-02-23T2145-phase2-wave2

Phase 2 Wave 1 complete (PRs bradygaster#351, bradygaster#352, bradygaster#353 merged). Wave 2 launched: Cheritto on ghost response detection (bradygaster#332), Hockney on error hardening (bradygaster#334).

Changes:
- Session log created: 2026-02-23T2145-phase2-wave2.md
- Merged 3 inbox decisions (Cheritto, Hockney, Saul)
- Deleted inbox files post-merge

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* chore(squad): Epic bradygaster#323 complete — all phases shipped 🎉

All 3 phases delivered:
- Phase 1 (Testing Wave): 6 issues closed
- Phase 2 (Improvement): 6 issues closed
- Phase 3 (Breathtaking): 7 issues closed
- 17 PRs merged, 19 issues closed total

Session log: 2026-02-23T2320-epic-complete.md
Decisions merged from inbox: P2 UX Polish, first-run wow moment

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* hostile QA: end-to-end quality assessment — 10 findings, 4 HIGH severity

Candid assessment requested by Brady. Traced every code path in cli-entry.ts, shell/index.ts, shell/commands.ts, App.tsx, coordinator.ts, spawn.ts, and the SDK adapter client.

Key findings:
- Dead sessions never evicted from agentSessions Map after connection drop
- No React ErrorBoundary — any render throw kills the shell
- Nasty-inputs corpus (95 strings) is never imported by any test
- No SIGTERM handler in interactive shell
- MemoryManager exported but never instantiated (dead code)
- Single streaming content slot clobbers multi-agent output
- User input silently dropped during processing (no type-ahead buffer)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* chore(squad): quality review findings — 7 issues filed

Quality audit complete: 5 agents assessed CLI across testing, coverage, stability, accessibility, UX. Results: 4 P0 blockers (bradygaster#365–bradygaster#368), 3 P1 items (bradygaster#369–bradygaster#371). Blocking: Waingro dead sessions, ErrorBoundary, dropped input; Marquez help text consistency.

Changes:
- Logged session summary to .squad/log/2026-02-24T0205-quality-review-complete.md
- Updated .squad/identity/now.md with quality review findings and new issue numbers

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* chore(squad): merge decision — Marquez UX audit findings

Quality assessment merged from inbox (Grade B): 11 improvements (3 P0, 4 P1, 4 P2). Help text, stub commands, vocabulary, separators, roster.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* chore(squad): test sprint launch

Session: 2026-02-24T0210-test-sprint

Changes:
- Logged test sprint: 5 agents, 7+ issues
- Branches: P0 fixes, stale tests, E2E, hostile/SDK, A11y

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix bradygaster#427: Add immediate shell launch indicator

Adds 'Loading Squad shell...' message at start of runShell() to eliminate 2-4 second launch dead air. Message clears once Ink mounts. Users now see feedback within 100ms instead of staring at blank terminal.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix bradygaster#422: Add context to thinking spinner

Changed ThinkingIndicator default label from 'Thinking...' to 'Routing to agent...' to give users meaningful feedback during SDK connection and initial routing phases. When activityHint is provided (e.g., 'Keaton thinking...'), it still takes priority. The new default eliminates the 'is it broken?' anxiety during the 3-5 second cold connection wait. Updated tests to reflect new default label.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: Update Marquez history with bradygaster#422 resolution

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix bradygaster#420/bradygaster#425: Add immediate SDK connection feedback

Before this fix, the first message sent to the REPL had 2-7 seconds of dead air while createSession() blocked on SDK connection. Users thought the shell was hung.

Changes:
- Set 'Connecting to SDK...' hint BEFORE createSession() in dispatchToCoordinator
- Set 'Connecting to <agent>...' hint BEFORE createSession() in dispatchToAgent
- Use setImmediate to give React a tick to render before blocking
- Update hint to 'Routing...' or 'thinking...' after connection completes

The ThinkingIndicator now displays immediately, eliminating perceived hang.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

bradygaster merged commit 0d86da3 into dev Mar 15, 2026
2 checks passed

bradygaster deleted the squad/340-sdk-feature-parity branch March 15, 2026 20:55

bradygaster mentioned this pull request Mar 15, 2026

SDK feature parity: 29 features need active exercise testing #340

Closed

This was referenced Mar 16, 2026

Shore up squad init --sdk: unified SDK init quality gate #347

Open

PRD: SDK-First Feature Parity — Full Test Results (32/50 verified, 6 gaps, 12 need setup) #341

Open

williamhallatt mentioned this pull request Mar 16, 2026

test: SDK feature parity batch 3 — 46 tests for #31, #47, #45, #46 #428

Merged

bradygaster merged 1 commit into dev from squad/340-sdk-feature-parity