Skip to content

fix(watch): detect rate-limit errors instead of reporting Board is clear#893

Merged
tamirdresher merged 2 commits intodevfrom
squad/806-watch-rate-limit-detection
Apr 12, 2026
Merged

fix(watch): detect rate-limit errors instead of reporting Board is clear#893
tamirdresher merged 2 commits intodevfrom
squad/806-watch-rate-limit-detection

Conversation

@diberry
Copy link
Copy Markdown
Collaborator

@diberry diberry commented Apr 7, 2026

Summary

When squad watch hits a GitHub API rate limit, runCheck() catches the error and returns an empty board state. The watch loop then prints "Board is clear - Ralph is idling" - which is misleading because the board isn't clear, the API just couldn't be reached.

Changes

packages/squad-cli/src/cli/commands/watch/index.ts

  • Added RunCheckStatus discriminated union ('ok' | 'rate-limited' | 'error') and RunCheckResult interface
  • runCheck() now returns RunCheckResult with status, using isRateLimitError() to classify failures
  • reportBoard() accepts optional scanStatus - shows rate-limit or error warning instead of "Board is clear" on failed scans
  • Rate-limit/error warnings bypass notifyLevel suppression (always visible)
  • Main loop short-circuits remaining phases when scan fails
  • Rate-limited rounds do not count as circuit-breaker successes

test/watch-rate-limit.test.ts (new)

8 tests covering rate-limit warning, error warning, normal behavior, notifyLevel bypass, and busy-board fallthrough.

.changeset/watch-rate-limit-detection.md

Patch changeset for @bradygaster/squad-cli.

Closes #806

@diberry diberry added the bug Something isn't working label Apr 7, 2026
Copilot AI review requested due to automatic review settings April 7, 2026 03:36
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 7, 2026

🟡 Impact Analysis — PR #893

Risk tier: 🟡 MEDIUM

📊 Summary

Metric Count
Files changed 3
Files added 2
Files modified 1
Files deleted 0
Modules touched 3
Critical files 1

🎯 Risk Factors

  • 3 files changed (≤5 → LOW)
  • 3 modules touched (2-4 → MEDIUM)
  • Critical files touched: packages/squad-cli/src/cli/commands/watch/index.ts

📦 Modules Affected

root (1 file)
  • .changeset/watch-rate-limit-detection.md
squad-cli (1 file)
  • packages/squad-cli/src/cli/commands/watch/index.ts
tests (1 file)
  • test/watch-rate-limit.test.ts

⚠️ Critical Files

  • packages/squad-cli/src/cli/commands/watch/index.ts

This report is generated automatically for every PR. See #733 for details.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes misleading squad watch output by distinguishing successful board scans from GitHub API rate-limit / scan failures, and reporting a warning instead of “Board is clear”.

Changes:

  • Introduces RunCheckResult / RunCheckStatus and classifies runCheck() failures via isRateLimitError()
  • Updates reportBoard() and the watch round flow to short-circuit phases on failed scans and print rate-limit/error warnings
  • Adds vitest coverage for rate-limit/error messaging behavior and includes a patch changeset

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File Description
packages/squad-cli/src/cli/commands/watch/index.ts Adds scan status plumbing, warning output, and short-circuit behavior on failed scans
test/watch-rate-limit.test.ts New tests validating warning output and notifyLevel bypass behavior
.changeset/watch-rate-limit-detection.md Patch changeset describing the user-facing fix

Comment thread packages/squad-cli/src/cli/commands/watch/index.ts Outdated
Comment thread packages/squad-cli/src/cli/commands/watch/index.ts Outdated
@diberry diberry force-pushed the squad/806-watch-rate-limit-detection branch from d191504 to 24710fb Compare April 8, 2026 14:23
@diberry
Copy link
Copy Markdown
Collaborator Author

diberry commented Apr 8, 2026

Is this introducing issues that 916 is meant to fix?

@diberry diberry force-pushed the squad/806-watch-rate-limit-detection branch from 24710fb to dd73254 Compare April 8, 2026 15:59
@diberry diberry force-pushed the squad/806-watch-rate-limit-detection branch from dd73254 to fed9d48 Compare April 8, 2026 16:51
@diberry
Copy link
Copy Markdown
Collaborator Author

diberry commented Apr 8, 2026

🔍 Squad Review — Kaylee (Engineering)

# Check Status Notes
1 Changelog entry .changeset/watch-rate-limit-detection.md (CLI patch)
2 Squashed to 1 commit 1 commit
3 CI green All checks passed
4 Copilot comments resolved 2/2 threads resolved with fixes
5 No .squad/ files Clean
6 No unrelated files 3 files — watch index, rate-limit tests, changeset
7 Tests for changes est/watch-rate-limit.test.ts — 8 new tests
8 Not a duplicate/reversal Unique scope (closes #806)

Verdict: ✅ Ready to merge


Review by Squad AI team (Kaylee — Engineering) · requested by Dina Berry

@diberry
Copy link
Copy Markdown
Collaborator Author

diberry commented Apr 10, 2026

Review findings for PR #893

P1 required: isRateLimitError matches '403' too broadly - a 403 from insufficient token scopes or repo access denial is a permanent auth error, not a transient rate limit. Misclassifying it suppresses the error and tells the user to "retry next interval" instead of "check your token". Consider narrowing to '429' plus the explicit GitHub rate-limit message strings ('secondary rate limit', 'api rate limit exceeded').

P2 suggestion: The core isRateLimitError classifier in gh-cli.ts has no direct unit tests - only the reportBoard display logic is tested. A test asserting that a bare 403 Forbidden is NOT classified as rate-limited would catch this.

P2 suggestion: Circuit-breaker failure counter doesn't increment on generic 'error' status - repeated non-rate-limit scan errors would never trip the breaker.

P2 suggestion: notifyLevel: 'none' bypass not tested for 'error' status.

Nit: Test file in test/ root instead of test/cli/.

…t strings

Remove broad 403 match that misclassified auth errors as rate limits.
A bare 403 Forbidden (e.g. insufficient token scopes) is a permanent auth
error, not a transient rate limit. Now only matches 429 status codes and
explicit rate-limit message strings.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@diberry
Copy link
Copy Markdown
Collaborator Author

diberry commented Apr 10, 2026

✅ CI validation passed on fork: diberry#143 — 7/7 checks green

@tamirdresher tamirdresher merged commit 238c5db into dev Apr 12, 2026
@tamirdresher tamirdresher deleted the squad/806-watch-rate-limit-detection branch April 12, 2026 10:34
tamirdresher pushed a commit that referenced this pull request Apr 21, 2026
…ear (#893)

* fix(watch): detect rate-limit errors

* fix(watch): narrow rate-limit detection to 429 and explicit rate-limit strings

Remove broad 403 match that misclassified auth errors as rate limits.
A bare 403 Forbidden (e.g. insufficient token scopes) is a permanent auth
error, not a transient rate limit. Now only matches 429 status codes and
explicit rate-limit message strings.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Dina Berry <diberry@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug] Watch reports 'Board is clear' when GitHub API rate limit is hit

3 participants