Skip to content

fix(router): review agent trigger resilience and ack ordering#640

Merged
zbigniewsobiecki merged 1 commit intodevfrom
fix/review-trigger-resilience
Mar 7, 2026
Merged

fix(router): review agent trigger resilience and ack ordering#640
zbigniewsobiecki merged 1 commit intodevfrom
fix/review-trigger-resilience

Conversation

@zbigniewsobiecki
Copy link
Copy Markdown
Member

Summary

Fixes three bugs that caused the review agent to silently not run on llmist PR#425:

  • Check polling window too shortwaitForChecks polled 5×10s = 50s max, but CI took ~60s after the first webhook. Increased MAX_RETRIES from 5 → 12 (120s total polling window)
  • Orphaned ack comment — when pollWaitForChecks returned false, the handler silently returned without cleaning up the "👀 Reviewing" comment, leaving a misleading message on the PR. Now deletes the ack comment before returning
  • Ack posted before job enqueued — the router posted the ack comment (step 8) before addJob() (step 9). A router crash between them orphans the ack with no corresponding job. Reordered: enqueue first (durable in Redis), then post ack and patch ackCommentId onto the job via updateData()

Changes

File Change
src/triggers/github/check-suite-success.ts MAX_RETRIES 5 → 12
src/triggers/github/webhook-handler.ts Delete ack comment on waitForChecks failure
src/router/webhook-processor.ts Swap enqueue and ack ordering; patch job data after ack
src/router/platform-adapter.ts Remove ackCommentId/ackMessage from buildJob signature
src/router/adapters/github.ts Update buildJob — no ack params
src/router/adapters/trello.ts Update buildJob — no ack params
src/router/adapters/jira.ts Update buildJob — no ack params

Test plan

  • npm test — 3963 tests pass (234 files)
  • npm run lint — clean
  • npm run typecheck — clean
  • New test: webhook-processor enqueues before posting ack (call order assertion)
  • New test: webhook-processor patches ack info via updateData after ack
  • New test: webhook-handler deletes ack comment when pollWaitForChecks returns false
  • New test: webhook-handler skips deletion when no ackCommentId present
  • Adapter buildJob tests updated for new signature (github, trello, jira)

🤖 Generated with Claude Code

Three bugs caused the review agent to silently not run on PR#425:

1. Check polling window too short — `waitForChecks` polled 5×10s = 50s
   max, but CI took ~60s. Increase MAX_RETRIES from 5 to 12 (120s window).

2. Orphaned ack comment — when `pollWaitForChecks` returns false the
   handler returned without cleaning up the "👀 Reviewing" ack comment,
   leaving a misleading message on the PR. Now deletes the ack comment
   before returning.

3. Ack posted before job enqueued — the router posted the ack comment
   (step 8) before calling `addJob()` (step 9). A router crash between
   them orphans the ack with no corresponding job. Swap the ordering:
   enqueue first (durable in Redis), then post ack and patch the
   ackCommentId onto the job via `updateData()`.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@zbigniewsobiecki zbigniewsobiecki merged commit 93e4bea into dev Mar 7, 2026
6 checks passed
@zbigniewsobiecki zbigniewsobiecki deleted the fix/review-trigger-resilience branch March 7, 2026 09:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant