test: SDK feature parity tests for #340 #422

Merged
bradygaster merged 1 commit into dev from squad/340-sdk-feature-parity
Mar 15, 2026

Conversation

@bradygaster
Owner

Closes #340

Adds 22 real integration tests for SDK features with actual testable code paths:

✅ Features Tested

Worktree Awareness (9 tests)

  • resolveSquad() stops at .git boundary (supports worktree files)
  • resolveSquadPaths() handles local vs remote mode
  • isConsultMode() detection
  • ensureSquadPathDual() validates paths in dual-root mode
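
To illustrate the ".git boundary" behaviour listed above, here is a minimal sketch of an upward directory walk. It is not the SDK's resolveSquad() implementation: the function name `findSquadDir`, the `.squad` directory lookup, the injected `exists` predicate, and the POSIX-style path handling are all assumptions made for illustration.

```typescript
// Hypothetical sketch: NOT the SDK's resolveSquad(). Walks up from `start`
// looking for a `.squad` entry and gives up at the first directory that
// contains `.git`, so the search never escapes the repository.
type Exists = (path: string) => boolean;

function parentOf(dir: string): string {
  const idx = dir.lastIndexOf("/");
  return idx <= 0 ? "/" : dir.slice(0, idx);
}

function joinPath(dir: string, name: string): string {
  return dir === "/" ? `/${name}` : `${dir}/${name}`;
}

function findSquadDir(start: string, exists: Exists): string | null {
  let dir = start;
  while (true) {
    const candidate = joinPath(dir, ".squad");
    if (exists(candidate)) return candidate;
    // In a git worktree, `.git` is a file rather than a directory.
    // Either form counts as the boundary here (assumption).
    if (exists(joinPath(dir, ".git"))) return null;
    const parent = parentOf(dir);
    if (parent === dir) return null; // reached the filesystem root
    dir = parent;
  }
}
```

Injecting `exists` keeps the sketch free of filesystem access, which is one way such a walk could be unit-tested without touching a real repository.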

Reviewer Lockout (8 tests)

  • Agent lockout/unlock from artifacts
  • Lockout registry queries
  • Hook blocks write tools when agent locked out
  • Multi-lockout support (2-error threshold)
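
The lockout behaviour above can be pictured with a small in-memory registry plus a guard function. This is only a sketch under stated assumptions: `LockoutRegistry`, `createWriteGuard`, the `ToolCall` shape, and the list of write tool names are hypothetical and are not taken from the SDK (only `getLockedAgents` echoes a name mentioned in this PR).

```typescript
// Hypothetical registry: artifact path -> set of locked-out agent names.
class LockoutRegistry {
  private readonly locks = new Map<string, Set<string>>();

  lockout(artifact: string, agent: string): void {
    const agents = this.locks.get(artifact) ?? new Set<string>();
    agents.add(agent);
    this.locks.set(artifact, agents);
  }

  // Returns true only if the agent was actually locked out of the artifact.
  unlock(artifact: string, agent: string): boolean {
    return this.locks.get(artifact)?.delete(agent) ?? false;
  }

  getLockedAgents(artifact: string): string[] {
    return Array.from(this.locks.get(artifact) ?? []);
  }

  isLockedOut(artifact: string, agent: string): boolean {
    return this.locks.get(artifact)?.has(agent) ?? false;
  }
}

// Assumed names of tools that modify files; the real list may differ.
const WRITE_TOOLS = new Set<string>(["edit", "write", "create"]);

interface ToolCall {
  agent: string;
  tool: string;
  artifact: string;
}

// Builds a guard that rejects write tools for locked-out agents and lets
// every other call through.
function createWriteGuard(registry: LockoutRegistry) {
  return (call: ToolCall): { allowed: boolean; reason?: string } => {
    if (WRITE_TOOLS.has(call.tool) && registry.isLockedOut(call.artifact, call.agent)) {
      return { allowed: false, reason: `${call.agent} is locked out of ${call.artifact}` };
    }
    return { allowed: true };
  };
}
```

Read-only tools stay available to a locked-out agent in this sketch; whether the real hook does the same is not stated in the PR.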

Deadlock Handling (2 tests)

  • All-agents-locked-out scenario detection
  • Escalation via clearLockout()
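
A rough sketch of the deadlock case follows, assuming "deadlock" means every agent on the roster is locked out of the same artifact. The helper names `isDeadlocked` and `escalateIfDeadlocked` are hypothetical, and clearing the set merely stands in for whatever clearLockout() does in the SDK.

```typescript
// Hypothetical: a deadlock exists when the roster is non-empty and every
// agent on it is locked out.
function isDeadlocked(roster: readonly string[], lockedOut: ReadonlySet<string>): boolean {
  return roster.length > 0 && roster.every((agent) => lockedOut.has(agent));
}

// Clears all lockouts when deadlocked and reports whether escalation happened.
function escalateIfDeadlocked(roster: readonly string[], lockedOut: Set<string>): boolean {
  if (!isDeadlocked(roster, lockedOut)) return false;
  lockedOut.clear(); // stand-in for the SDK's clearLockout()
  return true;
}
```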

Skill Confidence Lifecycle (3 tests)

  • Confidence levels (low/medium/high)
  • Routing match confidence propagation
  • Priority assignment based on confidence
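
One way confidence could drive priority is sketched below. The `RoutingMatch` shape, the `PRIORITY` table, and `rankMatches` are assumptions for illustration; the PR only states that the levels are low/medium/high and that priority is assigned from them.

```typescript
type Confidence = "low" | "medium" | "high";

interface RoutingMatch {
  skill: string;
  confidence: Confidence;
}

// Assumed mapping: lower number means higher priority.
const PRIORITY: Record<Confidence, number> = { high: 0, medium: 1, low: 2 };

// Returns a new array ordered from highest to lowest confidence,
// leaving the input untouched.
function rankMatches(matches: readonly RoutingMatch[]): RoutingMatch[] {
  return matches.slice().sort((a, b) => PRIORITY[a.confidence] - PRIORITY[b.confidence]);
}
```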

🚧 Manual Verification Items

Features that need human/environment verification (created as separate issues assigned to @bradygaster):

  • #418 Watch mode
  • #419 Ralph idle-watch
  • #420 Client compatibility
  • #421 Human members

✅ Quality Gates

  • Build: ✅ Passes (0 errors)
  • Tests: ✅ 22 new tests pass, 4199 total passing, no regressions
  • Only failure: aspire-integration (pre-existing, Docker not available)

All tests follow the Product Isolation Rule: no hardcoded agent names from the Apollo 13 squad.

Adds 22 integration tests for SDK features with actual code implementations:

**Worktree Awareness (9 tests):**
- resolveSquad() stops at .git boundary (worktree file support)
- resolveSquadPaths() local vs remote mode resolution
- isConsultMode() detection
- ensureSquadPathDual() path validation for dual-root mode

**Reviewer Lockout (8 tests):**
- lockout/unlock agent from artifact
- getLockedAgents() registry queries
- createHook() blocks write tools when locked out
- Multi-lockout support (2-error threshold simulation)

**Deadlock Handling (2 tests):**
- all-agents-locked-out detection
- clearLockout() escalation path

**Skill Confidence Lifecycle (3 tests):**
- confidence levels (low/medium/high)
- routing match confidence propagation
- priority assignment based on confidence

Manual verification items tracked as separate issues:
- #418 Watch mode
- #419 Ralph idle-watch
- #420 Client compatibility
- #421 Human members

Closes #340

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@bradygaster bradygaster merged commit 0d86da3 into dev Mar 15, 2026
2 checks passed
@bradygaster bradygaster deleted the squad/340-sdk-feature-parity branch March 15, 2026 20:55
williamhallatt added a commit to williamhallatt/squad that referenced this pull request Mar 16, 2026
…er#341)

46 tests covering 4 features:
- bradygaster#31 Ralph Idle-Watch Mode (RalphMonitor): 11 tests
- bradygaster#47 Client Compatibility (Platform Detection): 16 tests
- bradygaster#45 Reviewer Lockout (deepened): 11 tests
- bradygaster#46 Deadlock Handling (deepened): 8 tests

Combined with batch 1 (PR bradygaster#422) and batch 2 (PR bradygaster#425), automated
tests now cover 11 of 13 ⚠️ Needs Setup features from bradygaster#341.
tamirdresher pushed a commit to tamirdresher/squad that referenced this pull request Mar 16, 2026
* chore(squad): Phase 2 launch — thinking feedback, P0 bugs, dual telemetry

Phase 1 complete: 5 issues closed (bradygaster#325, bradygaster#326, bradygaster#327, bradygaster#328, bradygaster#329), 5 PRs merged.
Phase 2 launched with Cheritto (thinking feedback), Hockney (P0 bugs), Saul (dual telemetry).
Decision inbox merged and archived.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* chore(squad): Phase 2 Wave 1 merged, Wave 2 launched

Session: 2026-02-23T2145-phase2-wave2
Phase 2 Wave 1 complete (PRs bradygaster#351, bradygaster#352, bradygaster#353 merged).
Wave 2 launched: Cheritto on ghost response detection (bradygaster#332), Hockney on error hardening (bradygaster#334).

Changes:
- Session log created: 2026-02-23T2145-phase2-wave2.md
- Merged 3 inbox decisions (Cheritto, Hockney, Saul)
- Deleted inbox files post-merge

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* chore(squad): Epic bradygaster#323 complete — all phases shipped 🎉

All 3 phases delivered:
- Phase 1 (Testing Wave): 6 issues closed
- Phase 2 (Improvement): 6 issues closed
- Phase 3 (Breathtaking): 7 issues closed
- 17 PRs merged, 19 issues closed total

Session log: 2026-02-23T2320-epic-complete.md
Decisions merged from inbox: P2 UX Polish, first-run wow moment

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* hostile QA: end-to-end quality assessment — 10 findings, 4 HIGH severity

Candid assessment requested by Brady. Traced every code path in cli-entry.ts,
shell/index.ts, shell/commands.ts, App.tsx, coordinator.ts, spawn.ts, and the
SDK adapter client.

Key findings:
- Dead sessions never evicted from agentSessions Map after connection drop
- No React ErrorBoundary — any render throw kills the shell
- Nasty-inputs corpus (95 strings) is never imported by any test
- No SIGTERM handler in interactive shell
- MemoryManager exported but never instantiated (dead code)
- Single streaming content slot clobbers multi-agent output
- User input silently dropped during processing (no type-ahead buffer)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* chore(squad): quality review findings — 7 issues filed

Quality audit complete: 5 agents assessed CLI across testing, coverage, stability, accessibility, UX.
Results: 4 P0 blockers (bradygaster#365–bradygaster#368), 3 P1 items (bradygaster#369–bradygaster#371).
Blocking: Waingro dead sessions, ErrorBoundary, dropped input; Marquez help text consistency.

Changes:
- Logged session summary to .squad/log/2026-02-24T0205-quality-review-complete.md
- Updated .squad/identity/now.md with quality review findings and new issue numbers

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* chore(squad): merge decision — Marquez UX audit findings

Quality assessment merged from inbox (Grade B): 11 improvements (3 P0, 4 P1, 4 P2). help text, stub commands, vocabulary, separators, roster.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* chore(squad): test sprint launch

Session: 2026-02-24T0210-test-sprint
Changes:
- Logged test sprint: 5 agents, 7+ issues
- Branches: P0 fixes, stale tests, E2E, hostile/SDK, A11y

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix bradygaster#427: Add immediate shell launch indicator

Adds 'Loading Squad shell...' message at start of runShell() to eliminate
2-4 second launch dead air. Message clears once Ink mounts.

Users now see feedback within 100ms instead of staring at blank terminal.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
tamirdresher pushed a commit to tamirdresher/squad that referenced this pull request Mar 16, 2026
* chore(squad): Phase 2 launch — thinking feedback, P0 bugs, dual telemetry

Phase 1 complete: 5 issues closed (bradygaster#325, bradygaster#326, bradygaster#327, bradygaster#328, bradygaster#329), 5 PRs merged.
Phase 2 launched with Cheritto (thinking feedback), Hockney (P0 bugs), Saul (dual telemetry).
Decision inbox merged and archived.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* chore(squad): Phase 2 Wave 1 merged, Wave 2 launched

Session: 2026-02-23T2145-phase2-wave2
Phase 2 Wave 1 complete (PRs bradygaster#351, bradygaster#352, bradygaster#353 merged).
Wave 2 launched: Cheritto on ghost response detection (bradygaster#332), Hockney on error hardening (bradygaster#334).

Changes:
- Session log created: 2026-02-23T2145-phase2-wave2.md
- Merged 3 inbox decisions (Cheritto, Hockney, Saul)
- Deleted inbox files post-merge

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* chore(squad): Epic bradygaster#323 complete — all phases shipped 🎉

All 3 phases delivered:
- Phase 1 (Testing Wave): 6 issues closed
- Phase 2 (Improvement): 6 issues closed
- Phase 3 (Breathtaking): 7 issues closed
- 17 PRs merged, 19 issues closed total

Session log: 2026-02-23T2320-epic-complete.md
Decisions merged from inbox: P2 UX Polish, first-run wow moment

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* hostile QA: end-to-end quality assessment — 10 findings, 4 HIGH severity

Candid assessment requested by Brady. Traced every code path in cli-entry.ts,
shell/index.ts, shell/commands.ts, App.tsx, coordinator.ts, spawn.ts, and the
SDK adapter client.

Key findings:
- Dead sessions never evicted from agentSessions Map after connection drop
- No React ErrorBoundary — any render throw kills the shell
- Nasty-inputs corpus (95 strings) is never imported by any test
- No SIGTERM handler in interactive shell
- MemoryManager exported but never instantiated (dead code)
- Single streaming content slot clobbers multi-agent output
- User input silently dropped during processing (no type-ahead buffer)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* chore(squad): quality review findings — 7 issues filed

Quality audit complete: 5 agents assessed CLI across testing, coverage, stability, accessibility, UX.
Results: 4 P0 blockers (bradygaster#365–bradygaster#368), 3 P1 items (bradygaster#369–bradygaster#371).
Blocking: Waingro dead sessions, ErrorBoundary, dropped input; Marquez help text consistency.

Changes:
- Logged session summary to .squad/log/2026-02-24T0205-quality-review-complete.md
- Updated .squad/identity/now.md with quality review findings and new issue numbers

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* chore(squad): merge decision — Marquez UX audit findings

Quality assessment merged from inbox (Grade B): 11 improvements (3 P0, 4 P1, 4 P2). help text, stub commands, vocabulary, separators, roster.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* chore(squad): test sprint launch

Session: 2026-02-24T0210-test-sprint
Changes:
- Logged test sprint: 5 agents, 7+ issues
- Branches: P0 fixes, stale tests, E2E, hostile/SDK, A11y

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix bradygaster#422: Add context to thinking spinner

Changed ThinkingIndicator default label from 'Thinking...' to
'Routing to agent...' to give users meaningful feedback during
SDK connection and initial routing phases.

When activityHint is provided (e.g., 'Keaton thinking...'), it
still takes priority. The new default eliminates the 'is it broken?'
anxiety during the 3-5 second cold connection wait.

Updated tests to reflect new default label.
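
The label precedence described in this commit message can be summarised in a few lines. This is a sketch, not the component's code: `thinkingLabel` and `DEFAULT_THINKING_LABEL` are hypothetical names, and treating an empty hint as "not provided" is an assumption.

```typescript
const DEFAULT_THINKING_LABEL = "Routing to agent...";

// An explicit activity hint wins; otherwise fall back to the new default.
function thinkingLabel(activityHint?: string): string {
  return activityHint && activityHint.length > 0 ? activityHint : DEFAULT_THINKING_LABEL;
}
```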

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: Update Marquez history with bradygaster#422 resolution

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix bradygaster#420/bradygaster#425: Add immediate SDK connection feedback

Before this fix, the first message sent to the REPL had 2-7 seconds of
dead air while createSession() blocked on SDK connection. Users thought
the shell was hung.

Changes:
- Set 'Connecting to SDK...' hint BEFORE createSession() in dispatchToCoordinator
- Set 'Connecting to <agent>...' hint BEFORE createSession() in dispatchToAgent
- Use setImmediate to give React a tick to render before blocking
- Update hint to 'Routing...' or 'thinking...' after connection completes

The ThinkingIndicator now displays immediately, eliminating perceived hang.
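
The ordering described above (set the hint, yield once so the UI can render, then start the blocking connection) might look roughly like this. The names `dispatchWithFeedback`, `yieldToRenderer`, and `SetHint` are hypothetical. The commit uses setImmediate; this sketch swaps in setTimeout(..., 0) only to avoid a Node-specific dependency.

```typescript
type SetHint = (hint: string) => void;

// Yields one macrotask so a pending render can flush before blocking work.
// The commit uses setImmediate; setTimeout(0) is a stand-in here.
function yieldToRenderer(): Promise<void> {
  return new Promise<void>((resolve) => {
    setTimeout(resolve, 0);
  });
}

async function dispatchWithFeedback(
  setHint: SetHint,
  createSession: () => Promise<void>, // stand-in for the SDK's session setup
): Promise<void> {
  setHint("Connecting to SDK..."); // set synchronously, before any await
  await yieldToRenderer();
  await createSession();
  setHint("Routing...");
}
```

Because the first hint is set before the first `await`, a caller sees it as soon as `dispatchWithFeedback` is invoked, even if session creation takes several seconds.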

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>