fix: add timeout to az CLI exec preventing test hangs in CI#483

Merged
bradygaster merged 1 commit into bradygaster:dev from diberry:squad/fix-platform-adapter-timeouts
Mar 22, 2026

diberry commented Mar 21, 2026

🚨 Fix: platform-adapter test timeouts blocking all PRs

Impact: All 4 open PRs are failing CI with the same error. This unblocks them all.

Root Cause

validateWorkItemType tests (added in PR #444) call getAvailableWorkItemTypes(), which internally runs execFileSync('az', ['boards', 'work-item', ...]) with no timeout.

In CI, the Azure CLI is installed but there's no real ADO organization to reach. The synchronous az call blocks for longer than Vitest's 5-second test timeout, so the tests fail:

FAIL test/platform-adapter.test.ts
  validateWorkItemType > is case-insensitive — Error: Test timed out in 5000ms (line 807)
  validateWorkItemType > validates against available types — Error: Test timed out in 5000ms (line 751)

This affects every PR since #444 was merged into dev on 2026-03-20.

Fix

Added a 2-second timeout to the execFileSync call in getAvailableWorkItemTypes(). When the az CLI doesn't respond within 2s (no ADO credentials, unreachable org, etc.), it throws and falls through to the default work item types — which is the correct behavior.
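
The behavior described above can be sketched roughly as follows. This is a minimal illustration, not the code from azure-devops.ts: the helper name listTypesOrDefault, the DEFAULT_TYPES list, and the one-name-per-line output parsing are all assumptions made for the example. Only execFileSync and its timeout option come from Node's child_process module.

```typescript
import { execFileSync } from "node:child_process";

// Hypothetical fallback list; the real defaults in azure-devops.ts may differ.
const DEFAULT_TYPES: string[] = ["Bug", "Task", "User Story"];

// Hypothetical helper: runs `command args` and treats each non-empty output
// line as a work item type name. If the command is missing, exits non-zero,
// or runs longer than timeoutMs, the defaults are returned instead.
function listTypesOrDefault(
  command: string,
  args: string[],
  timeoutMs: number,
): string[] {
  try {
    const out = execFileSync(command, args, {
      encoding: "utf8",
      // When this elapses, Node kills the child process and execFileSync throws.
      timeout: timeoutMs,
      stdio: ["ignore", "pipe", "ignore"],
    });
    const types = out
      .split(/\r?\n/)
      .map((line) => line.trim())
      .filter((line) => line.length > 0);
    return types.length > 0 ? types : DEFAULT_TYPES;
  } catch {
    // Timeout, missing binary, and non-zero exit all land here.
    return DEFAULT_TYPES;
  }
}
```

Because execFileSync is synchronous, the timeout option is the only thing that bounds how long the caller is blocked; a test runner's own timeout cannot interrupt the call while it is in progress.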

Files Changed

packages/squad-sdk/src/platform/azure-devops.ts — 1 file, +3 -1 lines

Test Results

All 120 platform-adapter tests pass locally after the fix. The 2 previously-timing-out tests now complete in <100ms.

Note for maintainers

PR #444's CI test job showed as failed before merge. The timeout issue was present in the PR itself — it wasn't introduced by subsequent changes. Consider enforcing green CI before merge to prevent this pattern from recurring.

diberry force-pushed the squad/fix-platform-adapter-timeouts branch from af828e5 to 8fd1839 on March 21, 2026 14:25

The execFileSync call in getAvailableWorkItemTypes had no timeout,
causing CI test failures when az CLI is installed but slow to respond
for nonexistent orgs (network timeout > 5s Vitest default).

Adding a 2-second timeout ensures the catch block fires promptly,
returning the fallback work item types without blocking tests.

Fixes validateWorkItemType test timeouts on lines 751 and 807.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

diberry force-pushed the squad/fix-platform-adapter-timeouts branch from 8fd1839 to 30af0e2 on March 21, 2026 14:28

diberry commented Mar 21, 2026

@bradygaster This is ready for your review. Our team has reviewed and signed off:

  • Simon (Tester): ✅ Approved — verified all 120 platform-adapter tests pass, fallback behavior is clean, no production risk. One minor comment fix (already applied).
  • Root cause: PR #444 (ADO configurable work item types) added getAvailableWorkItemTypes(), which calls execFileSync('az', ...) with no timeout. In CI, the az CLI hangs reaching non-existent ADO orgs. Fix adds a 3s timeout so it falls through to defaults.
  • Impact: Unblocks all 4 open PRs currently failing CI.

Minimal change — 1 file, 3 lines.

bradygaster left a comment

EECOM review: Change is correct and minimal. The 3s timeout on execFileSync in getAvailableWorkItemTypes prevents CI hangs when az CLI can't reach a real ADO org. Spread pattern preserves existing EXEC_OPTS cleanly. Catch block already returns sensible defaults. Branch is clean on top of dev — no rebase needed. Ship it.
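
For readers unfamiliar with the spread pattern mentioned in the review, a rough sketch follows. The contents of EXEC_OPTS shown here are assumed for illustration and may not match the real constant; withTimeout is a hypothetical helper name, and 3000 simply mirrors the 3s figure quoted in the review.

```typescript
import type { ExecFileSyncOptions } from "node:child_process";

// Assumed shape of the shared options; the real EXEC_OPTS may hold other fields.
const EXEC_OPTS: ExecFileSyncOptions = {
  encoding: "utf8",
  stdio: ["pipe", "pipe", "pipe"],
};

// Hypothetical helper: returns a new options object with a per-call timeout
// layered on top. The base object is copied, not mutated.
function withTimeout(
  base: ExecFileSyncOptions,
  timeoutMs: number,
): ExecFileSyncOptions {
  return { ...base, timeout: timeoutMs };
}

// Other call sites that use EXEC_OPTS directly keep running without a timeout.
const optsWithTimeout = withTimeout(EXEC_OPTS, 3000);
```

Spreading into a fresh object keeps the change local to one call site, which fits a fix that touches only getAvailableWorkItemTypes.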

bradygaster merged commit 55db099 into bradygaster:dev on Mar 22, 2026
1 of 2 checks passed
chrislomonico pushed a commit to clomonico/squad that referenced this pull request Mar 26, 2026
…ster#483)

* test: human journey — Power user (bradygaster#396)

Add E2E journey test covering advanced shell features:
- /help and /status slash commands
- Tab completion for /commands and @agent names
- Ctrl+C cancel during processing
- Double Ctrl+C exit
- Multiple slash commands in sequence
- @agent direct routing with complex messages

Closes bradygaster#396

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* test: human journey — My first conversation (bradygaster#384)

Add E2E journey test covering a new user's first interaction with Squad:
- Welcome banner display (title, version, agents, help hint)
- First message submission and coordinator routing
- Thinking indicator during processing
- Receiving agent responses via ShellApi
- /help command exploration
- /status agent roster check
- @agent direct message routing
- Graceful exit (exit, quit, Ctrl+C)

28 tests across 8 journey steps using createShellHarness() pattern.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* test: human journey — Something went wrong (bradygaster#386)

Add E2E journey tests covering error scenarios users may encounter:

- SDK connection failure shows helpful error message
- Agent dispatch failure is caught and shown to user
- Invalid /command shows conversational hint with /help
- Network-like errors during streaming are handled gracefully
- ErrorBoundary catches React rendering errors
- Shell remains usable after errors (can submit new messages)
- Error messages are user-friendly, not raw stack traces

21 tests across 7 describe blocks verify the full error UX.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

3 participants