fix(review): correct external-fork PR checkout, swap pre-fetch for compact diffs, add SKIPPED FILES contract#1111
Merged
zbigniewsobiecki merged 9 commits intodevfrom Apr 15, 2026
Merged
Conversation
Spec captures the PR #1092 incident root cause (silent external-fork checkout failure + truncated context) and decomposes the fix into: Plan 1 — checkout-and-pagination: refs/pull/N/head fetch, fail-loud setup, HEAD SHA verification, paginate every paginated GH endpoint used in review setup. Plan 2 — context-rework: swap full-file pre-fetch for compact diffs, add structured SKIPPED FILES contract for the agent. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…tion Status: pending → wip. Implementation begins. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
AgentInput.headSha is already populated by triggers and consumed by worker-entry, sidecarManager, promptContext, etc. Adding a second prHeadSha field would be redundant. Plan now: SetupRepositoryOptions gains prHeadSha (helper signature), fed from input.headSha. Task 6 narrows from "add prHeadSha everywhere" to "verify headSha set at every PR trigger; gap-fill where missing". Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…dpoints Plan 001/1 (checkout-and-pagination). See the parent spec at docs/specs/001-pr-review-correctness.md and the plan at docs/plans/001-pr-review-correctness/1-checkout-and-pagination.md.done for full context. Problem. PR #1092 (first external contribution) exposed two latent defects that together produced a confidently wrong review: (1) the worker cloned `origin` (the base repo) and then tried to check out the PR by branch name, which silently 404s for fork PRs since the branch lives on the contributor's fork; the surrounding code discarded the non-zero git exit code, and the worker continued on `dev` while believing it was on the PR branch. (2) `getPRDiff` was capped at the first page of 100 files, so on the 129-file PR the agent's view of what changed was a strict subset, missing the specific files whose content would have refuted its fabricated claims. What this commit changes: - New `fetchAndCheckoutPR` helper in src/agents/shared/repository.ts fetches `+refs/pull/<N>/head:refs/remotes/pr/<N>` and detached- checks out `pr/<N>`. This is the canonical GitHub pattern and works uniformly for same-repo and external-fork PRs. The helper's signature accepts an scmProvider parameter so the GitLab extension (follow-up after PR #1092) is one-line. - `setupRepository` (cold-start path) and `refreshSnapshotWorkspace` (snapshot-reuse path) now call the helper when `prNumber` is set. Every git command throws on non-zero exit — there is no warn-and-continue in setup. When `prHeadSha` is supplied, a final `git rev-parse HEAD` comparison catches any remaining mismatch. - `SetupRepositoryOptions` gains `prNumber` and `prHeadSha`. The legacy `prBranch` field is retained only for human-readable logging; it is no longer used to drive checkout. - `backends/adapter.ts` forwards `prNumber` and the existing `input.headSha` (renamed to `prHeadSha` at the setup-options boundary). `AgentInput.headSha` is repurposed — its original check-failure-flow role is kept and it is now also the source of truth for post-checkout HEAD verification. - Two trigger handlers (`pr-comment-mention`, `pr-review-submitted`) were not populating `headSha` on `AgentInput`; they do now, so agents triggered by those events get HEAD verification. - All seven `per_page: 100` call sites in `src/github/client.ts` now use Octokit's `paginate` helper to read to completion: `getPRReviewComments`, `getPRReviews`, `getPRIssueComments`, `getPRDiff`, `getCheckSuiteStatus` (workflow runs + per-run jobs), `getFailedWorkflowRunJobs` (same two endpoints). - Structured logs added / clarified: `Fetching PR ref`, `PR checked out` (with prNumber, ref, headSha) in the helper; `Total changed files in PR` in the review context step — now accurate because the diff list is no longer truncated. - Plan divergence documented in-commit: `AgentInput.prHeadSha` was redundant with the existing `headSha` field, so the plan was edited destructively to reuse it (see `docs/plans/001-pr-review- correctness/1-checkout-and-pagination.md.done` for details). Tests. 26 new unit tests across repository, github client, adapter, and the two affected triggers. 7513/7513 unit tests pass; typecheck, build, and lint all clean. Integration tests require a PostgreSQL instance and are unaffected (their failures in CI without a DB are pre-existing infrastructure matters). Follow-ups (tracked separately): - Plan 001/2 (context-rework) swaps the worker's full-file pre-fetch for compact diffs and adds the SKIPPED FILES contract with the agent. - After PR #1092 merges, extend `fetchAndCheckoutPR` to handle GitLab MR refs. Docs: CLAUDE.md gains a "PR Checkout (worker)" subsection. CHANGELOG.md is created with the Unreleased entry. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Status: pending → wip. Implementation begins. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ES contract
Plan 001/2 (context-rework). See parent spec at
docs/specs/001-pr-review-correctness.md and the plan at
docs/plans/001-pr-review-correctness/2-context-rework.md.done.
Also bundles in-flight web-wizard work that was sitting in the working
tree (provider-switch credential cleanup); see "Bundled" section below.
## Plan 001/2 — context-rework
The review agent's pre-fetched PR context shape is reshaped from "echo
full file contents up to a 25k-token cap" to "compact per-file diffs +
explicit list of what was omitted." This scales with PR size rather
than repo size, mitigates LLM context rot, and ensures the agent is
aware of (rather than blind to) anything it didn't get.
What this commit changes:
- `src/config/reviewConfig.ts`: `REVIEW_FILE_CONTENT_TOKEN_LIMIT`
(25_000) replaced by `REVIEW_DIFF_CONTEXT_TOKEN_LIMIT` (200_000).
Diffs are denser than full files, so a much larger budget fits the
median PR comfortably while staying well under the model's window.
- `src/agents/shared/prFormatting.ts`: removed the filesystem-reading
`readPRFileContents` and its `PRFileContents` type. Added new types
`SkippedFile` and `PRDiffContext`, and four pure functions:
- `extractPRDiffs(prDiff)` — produces a `PRDiffContext` from
GitHub's PR-files response. Skip rules in order: `deleted` →
`no-patch` → `patch-too-large` (per-file cap = 10% of total
budget) → `over-budget`. No filesystem reads.
- `formatPRDiffContext(ctx)` — renders the included diffs as one
back-to-back text block with `### <filename> (<status>, +N -M)`
per-file headers.
- `formatSkippedFilesInjection(skipped, prNumber)` — renders the
skipped list as a SELF-DOCUMENTING block: filenames + reasons
plus the fetch guidance (`gh pr diff <N> -- <path>`, `Read`,
`Grep`). Inlining the guidance means agents that don't yet have
explicit prompt updates still see how to use the data.
- `countSkipsByReason(skipped)` — group-by for the log entry.
- `src/agents/definitions/contextSteps.ts`: `fetchPRContextStep`
rewritten. Instead of a per-file `ReadFile` injection loop, it now
pushes one `GetPRDiffContext` injection plus, conditionally, one
`SkippedFiles` injection. New log entry `PR context prepared` with
`included`, `skipped`, and `skipReasons` fields (replaces the
previous `File contents loaded`).
- `src/agents/definitions/review.yaml`: prompt expanded to describe
the compact-diff context shape and explicitly name the `SKIPPED
FILES` injection, with the same fetch guidance the injection
itself carries. (AC #7 specifically required the YAML mention.)
- Cross-agent impact: `fetchPRContextStep` is registered as
`prContext` in the strategy registry and is consumed by 5 agents
(`review`, `respond-to-ci`, `respond-to-pr-comment`,
`respond-to-review`, `resolve-conflicts`). All 5 receive the new
context shape. Because the SKIPPED FILES injection is
self-documenting, no prompt edits are needed in the other 4 YAMLs.
Tests: 49 new tests (28 prFormatting, 21 contextSteps incl 4 new on
fetchPRContextStep, 4 review.yaml prompt-contract, 17 reviewConfig,
plus updated agent-profiles tests). 7533/7533 unit tests pass; lint,
typecheck, and build all clean.
Docs: CLAUDE.md gains a "Review Agent — Context Shape" subsection
under "Debugging Production Sessions". CHANGELOG.md gains an
Unreleased "Changed" entry. docs/architecture/03-trigger-system.md
table row for `prContext` updated.
## Bundled — web pm-wizard provider-switch cleanup
Pre-existing in-flight work in the working tree (Linear/PM-wizard
related). When a user edits an existing project's PM integration and
switches provider mid-edit (e.g. Trello → Linear), the save flow now:
- Snapshots the originally-loaded provider as `previousProvider` at
`INIT_EDIT` time (`pm-wizard-state.ts`).
- Preserves `isEditing` + `previousProvider` across `SET_PROVIDER`
actions so the switch doesn't lose edit-mode context.
- On save (`pm-wizard-hooks.ts`), if `previousProvider !== provider`,
deletes the old provider's project credentials before persisting
the new ones. Uses `getCredentialRoles` from
`src/config/integrationRoles.ts` to enumerate which env-var keys
to clear.
`web/tsconfig.json` adds `../src/config/**/*` to its `include` list
so the web workspace can typecheck the shared `getCredentialRoles`
import.
Tests: `tests/unit/web/pm-wizard-state.test.ts` extended with
`previousProvider`-snapshot coverage (initial state default, INIT_EDIT
snapshot, SET_PROVIDER preserves it in edit mode) — included in the
7533 unit-test count above.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…mplete Plan 001/1 (checkout-and-pagination) and plan 001/2 (context-rework) have both shipped. Closing the spec. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Companion to the rename in the previous commit — the frontmatter edit hadn't been staged before the rename happened. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Three pre-existing complexity warnings were blocking a clean lint baseline. Each is addressed by extracting a focused helper without changing behavior: - src/api/routers/users.ts (was 16, max 15) Permission rules for the user `update` and `delete` mutations duplicated five if-throw branches across the two handlers. Extracted into a single `assertUserAccessAllowed(actor, target, options)` helper that centralizes the cross-org access rule, the superadmin-target rule, the no-self-role- change rule, and the role-elevation/demotion rule. Both mutations now call it once. Existing 41 users.test.ts tests cover both handlers' behavior end to end and remain unmodified. - src/api/routers/webhooks.ts (was 20, max 15) The `create` mutation handler had four near-identical create-or-skip blocks (Trello, JIRA, GitHub, Sentry/Linear display-only). Extracted three private helpers — `maybeCreateTrelloWebhook`, `maybeCreateJiraWebhook`, `maybeCreateGitHubWebhook` — each owning its own provider-specific duplicate detection (callbackURL vs url vs config.url) and returning either the created webhook, a duplicate marker, or undefined. Sentry and Linear build-display-info helpers extracted alongside. The mutation body becomes a flat sequence of helper calls + result assignments. Existing 36 webhooks.test.ts tests cover the same surface and remain unmodified. - web/src/components/projects/pm-wizard.tsx (was 20, max 15) Extracted the active-webhooks derivation (the `state.provider === 'trello' ? ... : state.provider === 'jira' ? ... : []` ternary chain) into a pure helper `deriveActiveWebhooks(provider, webhooksData)`, placed in `pm-wizard-state.ts` (zero React deps, fits the file's role as the wizard's pure-data layer). The component now reads `deriveActiveWebhooks(state.provider, webhooksQuery.data)` once. Tests: 5 new tests in pm-wizard-state.test.ts cover deriveActiveWebhooks (Trello mapping, JIRA mapping with renamed `enabled`→`active`, Linear returns empty, undefined input, numeric ID coercion). 7538/7538 unit tests pass; lint is now clean (0 warnings, 0 errors); typecheck clean. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
5 tasks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Implements spec 001 (pr-review-correctness) end-to-end. The review agent was caught producing a confidently fabricated review on PR #1092 (CASCADE's first external contribution from @suda). Root cause: two latent defects compounded — silent
git checkoutfailure on external-fork PRs + a pre-fetch that loaded full file contents capped at 25k tokens (only 20 of 129 files visible to the agent). This PR fixes the entire class.Also bundles in-flight Linear/PM-wizard work that was sitting in the working tree, plus three pre-existing biome cognitive-complexity warnings (lint baseline now clean).
What's in the PR
1. Worker-side: external-fork PR checkout (plan 001/1)
fetchAndCheckoutPRhelper using GitHub's canonicalrefs/pull/N/headref. Works uniformly for same-repo branches and external-fork branches.prNumberis set, and now fail loudly on any non-zero git exit code (no more warn-and-continue).headShais supplied. Mismatches throw before any review work begins.per_page: 100call sites insrc/github/client.tsnow use Octokit'spaginatehelper to read to completion (getPRDiff,getPRReviewComments,getPRReviews,getPRIssueComments,getCheckSuiteStatus— workflow runs + per-run jobs,getFailedWorkflowRunJobs— same two endpoints).pr-comment-mention,pr-review-submitted) were not populatingheadShaonAgentInput; they do now, so the agents they fire get HEAD verification.Fetching PR ref,PR checked out(withprNumber+ref+headSha),Total changed files in PR.2. Agent-side: compact-diff context + SKIPPED FILES contract (plan 001/2)
readPRFileContents(full-file, 25k cap) replaced byextractPRDiffs(compact per-file diffs from GitHub'sfile.patch, 200k cap). Scales with PR size, not repo size; mitigates LLM context rot; aligns with industry best practice.SkippedFile+PRDiffContexttypes and four pure helpers (extractPRDiffs,formatPRDiffContext,formatSkippedFilesInjection,countSkipsByReason).fetchPRContextStepnow pushes oneGetPRDiffContextinjection plus, conditionally, oneSkippedFilesinjection. The skipped-files payload is self-documenting — it lists each filename + reason and inlines the fetch guidance (gh pr diff,Read,Grep), so agents that consume the same step (respond-to-ci,respond-to-pr-comment,respond-to-review,resolve-conflicts) get the contract automatically without per-YAML prompt edits.review.yamlprompt updated to explicitly name the SKIPPED FILES injection (per AC ci: add GitHub Actions for CI and Fly.io deployment #7).PR context preparedwithincluded,skipped, andskipReasonsmap (replacesFile contents loaded).readPRFileContents,PRFileContents,REVIEW_FILE_CONTENT_TOKEN_LIMIT. Grep confirms no remaining references insrc/.3. Bundled — pm-wizard provider-switch credential cleanup
Pre-existing in-flight Linear/PM-wizard work in the working tree. When a user edits a project's PM integration and switches provider mid-edit (e.g. Trello → Linear), the save flow now snapshots the original provider as
previousProvideratINIT_EDITand, on save, deletes the old provider's project credentials before persisting the new ones. Uses the centralgetCredentialRoleshelper.web/tsconfig.jsonextends itsincludeto cover../src/config/**/*so the web workspace can typecheck the import.4. Cleanup — biome complexity warnings
Three pre-existing cognitive-complexity warnings cleared (no behavior changes):
src/api/routers/users.ts— extractedassertUserAccessAllowedhelper that centralizes the cross-org / superadmin-target / no-self-role-change / role-elevation rules; bothupdateanddeletemutations now call it once.src/api/routers/webhooks.ts— extractedmaybeCreateTrelloWebhook,maybeCreateJiraWebhook,maybeCreateGitHubWebhook,buildSentryDisplayInfo,buildLinearDisplayInfofrom thecreatemutation. The mutation body becomes a flat sequence of helper calls.web/src/components/projects/pm-wizard.tsx— extractedderiveActiveWebhookstopm-wizard-state.ts(pure data, no React deps).Tests
fetchAndCheckoutPR, 9extractPRDiffs, 9 pagination, 4 review.yaml prompt contract, 4fetchPRContextStep, 7 setup-path behavior including HEAD verification, 5deriveActiveWebhooks, 6 trigger / adapter coverage).dev).users.test.ts:41,webhooks.test.ts:36) cover the same surface end-to-end and remain unmodified — the helpers are tested via the existing test cases.TEST_DATABASE_URLis unreachable).Verification
npm run buildnpm run typechecknpm run lint— 0 warnings, 0 errors (was 3 pre-existing complexity warnings)npm test— 378 files, 7538 tests passDocumentation
CLAUDE.md— new "PR Checkout (worker)" subsection under Development, new "Review Agent — Context Shape" subsection under Debugging Production Sessions.CHANGELOG.md— Unreleased entries for the Fixed (checkout + pagination) and Changed (compact diffs + SKIPPED FILES) categories.docs/architecture/03-trigger-system.md—prContexttable row updated to reflect the compact-diff shape and SKIPPED FILES injection.docs/specs/001-pr-review-correctness.md.done— spec marked done; both plans.done.Follow-up work (not in this PR)
fetchAndCheckoutPRwith GitLab MR-ref support. The helper signature already takes thescmProviderparameter (one-line follow-up).followups-pr-1092.md(kept in cwd, intentionally untracked per session conventions): gadget test coverage parity, personas dedup,glabchecksum.Test plan
npm test— 7538/7538 passnpm run typecheckcleannpm run lintclean (0 warnings)npm run buildsucceeds