Skip to content

feat(cli): upgrade squad watch to full work monitor with --execute mode (#708)#709

Merged
tamirdresher merged 15 commits intodevfrom
squad/708-watch-work-monitor
Mar 31, 2026
Merged

feat(cli): upgrade squad watch to full work monitor with --execute mode (#708)#709
tamirdresher merged 15 commits intodevfrom
squad/708-watch-work-monitor

Conversation

@tamirdresher
Copy link
Copy Markdown
Collaborator

What

Upgrade squad watch (Ralph's polling loop) from triage-only to a full work monitor that can autonomously execute issues via --execute mode.

Why

Today squad watch only triages (labels issues + reports PR status). With --execute, Ralph can spawn Copilot sessions to actually work on eligible issues — closing the loop from triage to execution without human intervention.

Closes #708

How

cli-entry.ts — Parse five new flags (--execute, --copilot-flags, --agent-cmd (hidden), --max-concurrent, --timeout) and pass a WatchOptions object to runWatch.

watch.ts — Core changes:

  • WatchOptions interface — typed options bag replacing the bare intervalMinutes parameter
  • buildAgentCommand() — constructs {cmd, args} for the agent subprocess (default: gh copilot --message ...; hidden --agent-cmd override for custom agents)
  • executeIssue() — claims issue (@me), posts comment, spawns agent via execFile with 50 MB maxBuffer and configurable timeout
  • findExecutableIssues() — filters board for issues that are labelled for a squad member, unassigned, and not blocked (filters 9 blocking labels including status:blocked, pending-user, do-not-merge)
  • selfPull() — best-effort git fetch + pull --ff-only between rounds (never blocks)
  • Updated executeRound() — when execute mode is on: selfPull first, then after triage find & execute eligible issues up to maxConcurrent
  • Updated BoardState — added executed count; reportBoard shows 🚀 line
  • Updated startup banner — shows (Execute) mode indicator, copilot flags, and concurrency settings

Testing

  • When --execute is NOT passed, behaviour is 100% identical to current (no new code paths execute)
  • New functions are pure or isolated behind the options.execute guard
  • Manual testing: squad watch --execute --copilot-flags "--yolo" --max-concurrent 2 --timeout 15

Docs

Help text updated in cli-entry.ts for triage command. Hidden --agent-cmd flag intentionally omitted from help.

Exports

No new public exports from package.json. WatchOptions and helper functions are exported from the module for testability but not from the package entry point.

Breaking Changes

None. The runWatch signature changed from (dest, intervalMinutes) to (dest, WatchOptions) but the only call site (cli-entry.ts) is updated in this PR.

Waivers

  • (d) CHANGELOG: Will be added in a follow-up before release (this is a feature branch, not shipping yet)
  • (f) Samples: No sample needed — this is a CLI operational mode, not an API

…de (#708)

Add execute mode to the watch/triage loop so Ralph can autonomously
spawn Copilot sessions to work on eligible issues.

New CLI flags:
  --execute          enable work-execution mode (default: triage only)
  --copilot-flags    extra flags forwarded to copilot CLI
  --agent-cmd        hidden override for the full agent command
  --max-concurrent   parallel issue limit per round (default: 1)
  --timeout          max minutes per issue execution (default: 30)

New functions in watch.ts:
  buildAgentCommand  — construct cmd+args for the agent subprocess
  executeIssue       — claim issue, comment, spawn agent with timeout
  findExecutableIssues — filter board for ready-to-work issues
  selfPull           — best-effort git fetch+pull between rounds

Updated BoardState and reportBoard to include executed count.
When --execute is NOT passed, behaviour is 100% identical to before.

Part of #708

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 30, 2026 14:48
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds an “execute mode” to squad watch so Ralph can claim and run an agent subprocess against eligible issues, turning the existing triage loop into a full work monitor.

Changes:

  • Replace runWatch(dest, intervalMinutes) with runWatch(dest, WatchOptions) and add execute-mode config (--execute, --copilot-flags, --max-concurrent, --timeout, hidden --agent-cmd).
  • Implement execute-mode helpers in watch.ts (agent command building, executable-issue filtering, per-issue execution, best-effort git fetch/pull).
  • Extend board reporting with an executed counter for execute-mode rounds.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.

File Description
packages/squad-cli/src/cli/commands/watch.ts Adds WatchOptions, execute-mode helpers (executeIssue, buildAgentCommand, findExecutableIssues, selfPull), and reports per-round executed count.
packages/squad-cli/src/cli-entry.ts Parses new CLI flags for triage/watch and passes them into runWatch via WatchOptions.

Comment on lines 62 to 66
changesRequested: number;
ciFailures: number;
readyToMerge: number;
executed: number;
}
Copy link

Copilot AI Mar 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BoardState now requires an executed field, which will break existing TypeScript call sites that construct a board-state object literal (e.g. test/cli/watch.test.ts passes objects without executed). Either update those callers/tests to include executed: 0 or make executed optional with a default of 0 inside reportBoard/emptyBoardState to keep the type change non-breaking.

Copilot uses AI. Check for mistakes.
const prompt = `Work on issue #${issue.number}: ${issue.title}. Read the issue body for full details.`;

if (options.agentCmd) {
const parts = options.agentCmd.trim().split(/\s+/);
Copy link

Copilot AI Mar 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

buildAgentCommand treats any truthy options.agentCmd as valid, but a whitespace-only value (e.g. ' ') becomes an empty cmd after trim() and will cause execFile to fail with ENOENT. Consider validating agentCmd.trim().length > 0 (or throwing a clear error) before splitting/using it.

Suggested change
const parts = options.agentCmd.trim().split(/\s+/);
const raw = options.agentCmd.trim();
if (raw.length === 0) {
fatal('Invalid --agent-cmd: command is empty after trimming whitespace.');
}
const parts = raw.split(/\s+/);

Copilot uses AI. Check for mistakes.
Comment on lines +388 to +392
export function findExecutableIssues(
roster: ReturnType<typeof parseRoster>,
capabilities: MachineCapabilities | null,
issues: GhIssue[],
): GhIssue[] {
Copy link

Copilot AI Mar 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

findExecutableIssues receives capabilities but never uses it, so execute mode may pick up issues that runCheck would skip due to missing needs:* capabilities. Consider applying the same filterByCapabilities() gating here (or remove the unused param and pass only already-filtered issues) to avoid executing non-actionable work.

Copilot uses AI. Check for mistakes.
Comment thread packages/squad-cli/src/cli-entry.ts Outdated
const maxConcurrentIdx = args.indexOf('--max-concurrent');
const maxConcurrent = (maxConcurrentIdx !== -1 && args[maxConcurrentIdx + 1])
? parseInt(args[maxConcurrentIdx + 1]!, 10)
: 1;
Copy link

Copilot AI Mar 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

--max-concurrent is parsed with parseInt but never validated. If the user passes a non-number/0/negative, slice(0, maxConcurrent) can silently execute zero issues or behave unexpectedly. Consider validating it is a finite integer >= 1 and failing fast with a clear usage error.

Suggested change
: 1;
: 1;
if (!Number.isFinite(maxConcurrent) || !Number.isInteger(maxConcurrent) || maxConcurrent < 1) {
console.error(
`Error: --max-concurrent must be a positive integer (>= 1). Received: ${args[maxConcurrentIdx + 1] ?? maxConcurrent}`,
);
process.exit(1);
}

Copilot uses AI. Check for mistakes.
Comment thread packages/squad-cli/src/cli-entry.ts Outdated
Comment on lines +352 to +355
const issueTimeoutMinutes = (timeoutIdx !== -1 && args[timeoutIdx + 1])
? parseInt(args[timeoutIdx + 1]!, 10)
: 30;

Copy link

Copilot AI Mar 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

--timeout is parsed with parseInt but never validated. If it becomes NaN or <= 0, timeoutMs in executeIssue becomes NaN/non-sensical and the child-process timeout behavior will be unreliable. Consider validating it is a finite integer >= 1 and reporting a usage error.

Suggested change
const issueTimeoutMinutes = (timeoutIdx !== -1 && args[timeoutIdx + 1])
? parseInt(args[timeoutIdx + 1]!, 10)
: 30;
let issueTimeoutMinutes = (timeoutIdx !== -1 && args[timeoutIdx + 1])
? parseInt(args[timeoutIdx + 1]!, 10)
: 30;
if (!Number.isFinite(issueTimeoutMinutes) || !Number.isInteger(issueTimeoutMinutes) || issueTimeoutMinutes <= 0) {
console.error('Error: --timeout must be a positive integer number of minutes.');
process.exit(1);
}

Copilot uses AI. Check for mistakes.
Comment on lines +292 to +296
export function buildAgentCommand(
issue: GhIssue,
teamRoot: string,
options: WatchOptions,
): { cmd: string; args: string[] } {
Copy link

Copilot AI Mar 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New pure helpers (buildAgentCommand, findExecutableIssues) introduce non-trivial parsing/filtering logic, but the existing watch tests only cover reportBoard. Adding unit tests for these helpers would help prevent regressions (e.g., handling copilotFlags splitting, blocked labels, assignee filtering, and custom agentCmd parsing).

Copilot uses AI. Check for mistakes.
Copilot and others added 14 commits March 30, 2026 18:16
…ave dispatch, retro, hygiene (#708)

Add 9 new opt-in flags to squad watch / triage command:
- --monitor-teams: scan Teams via WorkIQ for actionable messages
- --monitor-email: scan email for CI failures, Dependabot, security alerts
- --board / --board-project N: project board lifecycle (reconcile, archive)
- --two-pass: lightweight list then hydrate actionable issues only
- --wave-dispatch: dependency-aware parallel sub-task execution
- --retro: enforce retrospective checks (Fridays or >7 days)
- --decision-hygiene: auto-merge decision inbox when >5 files
- --channel-routing: route notifications to Teams channels

All features disabled by default — existing behavior unchanged.
Includes subsquad discovery, buildAgentCommandFromPrompt helper,
and spawnWithTimeout utility. Every new function has try/catch.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- CHANGELOG entry for all new watch flags
- Ralph feature docs rewritten with full watch mode guide
- CLI reference updated with all new flags
- Unit tests for new watch functions
- Blog post: 'From Triage Bot to Work Monitor'

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Fix TS2339: cast execFile error to `Error & { killed?: boolean }` instead
  of NodeJS.ErrnoException (which lacks .killed property)
- Fix cspell: replace "vulns" with "vulnerabilities" in blog post

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
… wrapper (#708)

EECOM blocking issues resolved:
1. updateBoard: replaced $(gh repo view ...) shell substitution with
   pre-resolved execFileSync call — execFile doesn't expand shell syntax
2. twoPassScan: replaced raw execFileAsync('gh', ...) with existing
   ghIssueList wrapper for consistent error handling

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
New watch flags (--execute, --board, --monitor-teams, etc.) added help
text lines, bumping output from 110 to 113 lines. Increase threshold
from 110 to 125 to accommodate current and future flag additions.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Tests were importing from @bradygaster/squad-cli/commands/watch which
requires a built dist/. Changed to source path import so tests work
without a full build. All 30 watch tests pass (6 original + 8 new + 16 circuit-breaker).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…rt (#708)

Refactor watch.ts to use a WatchPlatform interface that abstracts
platform-specific operations (list work items, edit, PRs, comments).

- Add WatchPlatform interface with GitHubWatchPlatform and AdoWatchPlatform
- GitHubWatchPlatform wraps existing gh-cli calls (zero behavior change)
- AdoWatchPlatform wraps AzureDevOpsAdapter from squad-sdk (az CLI)
- Add createWatchPlatform factory with auto-detection from git remote
- Add --platform github|ado flag to cli-entry.ts watch handler
- Replace all direct gh-cli calls in watch.ts with platform calls
- Add tests for createWatchPlatform factory and platform options
- Update CLI reference and Ralph docs with --platform flag and ADO section
- Update CHANGELOG with ADO platform support entry

All existing tests pass — backward compatible (GitHub is the default).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Remove the custom WatchPlatform interface, GitHubWatchPlatform and
AdoWatchPlatform classes, and the createWatchPlatform factory in favor
of the SDK's createPlatformAdapter(). Add thin mapping helpers
(toWatchWorkItem, toWatchPullRequest, listWatchWorkItems,
listWatchPullRequests, editWorkItem) to bridge SDK types to the
internal WatchWorkItem/WatchPullRequest interfaces.

- Remove --platform CLI flag from cli-entry.ts and WatchOptions
- Replace platform.checkAvailable/checkAuthenticated with direct
  CLI checks based on adapter.type
- Scope rate-limit circuit breaker to GitHub only (adapter.type)
- Update CHANGELOG, CLI reference docs, and Ralph feature docs
- Remove createWatchPlatform and WatchPlatform test blocks

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The removal of createWatchPlatform and WatchPlatform test blocks
left the outer describe block unclosed, causing a parse error.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Replace monolithic watch.ts (1562 lines) with a plugin architecture
where each opt-in feature is a self-contained WatchCapability.

Architecture:
- watch/types.ts: WatchCapability interface, WatchContext, WatchPhase
- watch/registry.ts: CapabilityRegistry (register, getByPhase, all)
- watch/config.ts: loadWatchConfig merges .squad/config.json + CLI flags
- watch/capabilities/: 9 capability plugins extracted from watch.ts

Capabilities extracted:
- self-pull: git fetch/pull at round start (pre-scan phase)
- execute: spawn Copilot sessions for eligible issues (post-execute)
- board: project board lifecycle + reconciliation (post-execute)
- monitor-teams: Teams scanning via WorkIQ (housekeeping)
- monitor-email: email scanning + GitHub alerts (housekeeping)
- two-pass: lightweight scan then hydrate actionable (post-triage)
- wave-dispatch: parallel sub-task execution (post-execute)
- retro: retrospective enforcement (housekeeping)
- decision-hygiene: merge inbox cleanup (housekeeping)

Config system:
- .squad/config.json 'watch' section for persistent config
- Priority: CLI flag > config.json > default (off)
- --no-{capability} disables even if config enables it
- Auto-generates --help from registered capability descriptions

Backward compatibility:
- Legacy WatchOptions type still accepted by runWatch
- 'squad watch' with no flags = triage only (unchanged)
- 'squad watch --execute' still works as CLI flag
- All existing exports preserved via re-exports

The main loop is now a thin 4-phase orchestrator:
1. pre-scan → 2. core triage → 3. post-execute → 4. housekeeping

Each capability has try/catch in execute() — failures are warnings,
never crash the round. Capabilities are independently testable.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
watch.ts moved to watch/index.ts — update test import path.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
watch.ts → watch/index.ts moved the export. Update package.json
exports map to point to dist/cli/commands/watch/index.js.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
ralph-board.test.ts and persistent-ralph.test.ts still imported from
the old watch.js path. Updated to watch/index.js.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The cli-command-wiring test only scanned for top-level .ts files in
commands/, but the watch refactor moved watch.ts into watch/index.ts.
Now the test also discovers subdirectory commands with index.ts and
includes them in the existingFiles set.

Fixes test failure: watch/index.ts not found in commands/

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@diberry
Copy link
Copy Markdown
Collaborator

diberry commented Mar 31, 2026

🔄 Ralph PR status

Check Status
Mergeable ✅ Clean
Base dev
Commits 15
Changed files 33
Closes #708

Large feature PR — upgrades squad watch to full work monitor with --execute mode. Ready for review but scope is significant (3,825 additions). Consider squashing before merge.

@tamirdresher tamirdresher merged commit ab1333e into dev Mar 31, 2026
10 checks passed
@tamirdresher tamirdresher deleted the squad/708-watch-work-monitor branch March 31, 2026 17:17
diberry added a commit that referenced this pull request Apr 5, 2026
Add 51 tests covering 5 watch capabilities (execute, cleanup,
decision-hygiene, self-pull, board) that shipped with zero test
coverage in PR #709.

Test coverage:
- ExecuteCapability: buildAgentPrompt, findExecutableIssues edge cases,
  preflight, execute flow (adapter mock, agent dispatch, errors, timeouts)
- CleanupCapability: preflight, round-skipping, file pruning by date,
  stale inbox warnings, config validation
- DecisionHygieneCapability: preflight, threshold logic, merge trigger,
  timeout handling
- SelfPullCapability: preflight, git stash/fetch/pull flow, stash pop
  failure, source change detection
- BoardCapability: preflight checks

Closes #709

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
diberry added a commit that referenced this pull request Apr 6, 2026
Add 51 tests covering 5 watch capabilities (execute, cleanup,
decision-hygiene, self-pull, board) that shipped with zero test
coverage in PR #709.

Test coverage:
- ExecuteCapability: buildAgentPrompt, findExecutableIssues edge cases,
  preflight, execute flow (adapter mock, agent dispatch, errors, timeouts)
- CleanupCapability: preflight, round-skipping, file pruning by date,
  stale inbox warnings, config validation
- DecisionHygieneCapability: preflight, threshold logic, merge trigger,
  timeout handling
- SelfPullCapability: preflight, git stash/fetch/pull flow, stash pop
  failure, source change detection
- BoardCapability: preflight checks

Closes #709

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
diberry added a commit that referenced this pull request Apr 6, 2026
Add 51 tests covering 5 watch capabilities (execute, cleanup,
decision-hygiene, self-pull, board) that shipped with zero test
coverage in PR #709.

Test coverage:
- ExecuteCapability: buildAgentPrompt, findExecutableIssues edge cases,
  preflight, execute flow (adapter mock, agent dispatch, errors, timeouts)
- CleanupCapability: preflight, round-skipping, file pruning by date,
  stale inbox warnings, config validation
- DecisionHygieneCapability: preflight, threshold logic, merge trigger,
  timeout handling
- SelfPullCapability: preflight, git stash/fetch/pull flow, stash pop
  failure, source change detection
- BoardCapability: preflight checks

Closes #709

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
diberry added a commit that referenced this pull request Apr 7, 2026
Add 51 tests covering 5 watch capabilities (execute, cleanup,
decision-hygiene, self-pull, board) that shipped with zero test
coverage in PR #709.

Test coverage:
- ExecuteCapability: buildAgentPrompt, findExecutableIssues edge cases,
  preflight, execute flow (adapter mock, agent dispatch, errors, timeouts)
- CleanupCapability: preflight, round-skipping, file pruning by date,
  stale inbox warnings, config validation
- DecisionHygieneCapability: preflight, threshold logic, merge trigger,
  timeout handling
- SelfPullCapability: preflight, git stash/fetch/pull flow, stash pop
  failure, source change detection
- BoardCapability: preflight checks

Closes #709

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
tamirdresher pushed a commit that referenced this pull request Apr 21, 2026
Add 51 tests covering 5 watch capabilities (execute, cleanup,
decision-hygiene, self-pull, board) that shipped with zero test
coverage in PR #709.

Test coverage:
- ExecuteCapability: buildAgentPrompt, findExecutableIssues edge cases,
  preflight, execute flow (adapter mock, agent dispatch, errors, timeouts)
- CleanupCapability: preflight, round-skipping, file pruning by date,
  stale inbox warnings, config validation
- DecisionHygieneCapability: preflight, threshold logic, merge trigger,
  timeout handling
- SelfPullCapability: preflight, git stash/fetch/pull flow, stash pop
  failure, source change detection
- BoardCapability: preflight checks

Closes #709

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat(cli): upgrade squad watch to full work monitor — triage + execute + board lifecycle

3 participants