Skip to content

feat: add /fix workflow — iterative review → fix → re-review loop#640

Closed
PureWeen wants to merge 10 commits intomainfrom
feat/review-fix-workflow
Closed

feat: add /fix workflow — iterative review → fix → re-review loop#640
PureWeen wants to merge 10 commits intomainfrom
feat/review-fix-workflow

Conversation

@PureWeen
Copy link
Copy Markdown
Owner

Summary

Adds a new gh-aw workflow that implements an iterative review → fix → re-review loop, triggered via the /fix slash command on any PR.

How It Works

/fix on a PR
  ↓
┌─────────────────────────────────┐
│  Round 1: 3-model review        │
│  (Opus + Sonnet + Codex)        │
│  → Adversarial consensus        │
│  → CRITICAL/MODERATE findings?  │
│     YES → fix locally, commit   │
│     NO  → done ✅                │
└────────────┬────────────────────┘
             ↓
┌─────────────────────────────────┐
│  Round 2: Re-review updated code│
│  → New findings?                │
│     YES → fix, commit           │
│     NO  → done ✅                │
└────────────┬────────────────────┘
             ↓
        (up to 3 rounds)
             ↓
┌─────────────────────────────────┐
│  Push all commits at once       │
│  Post summary + final review    │
└─────────────────────────────────┘

Key Design Decisions

  1. Single-run iteration — All review/fix cycles happen within one agent session. push-to-pull-request-branch pushes all commits at the end (safe-output runs after agent exits).

  2. Max 3 fix rounds — Hard limit prevents infinite loops. After 3 rounds, remaining findings are posted as unresolved inline comments.

  3. MINOR findings don't block — Only 🔴 CRITICAL and 🟡 MODERATE trigger another cycle. 🟢 MINOR are fixed opportunistically.

  4. Shared concurrency group — Uses review-<PR> group, same as /review. A /fix cancels any in-progress /review on the same PR (and vice versa).

  5. Protected filesprotected-files: fallback-to-issue prevents the agent from modifying .github/, package manifests, etc. If it tries, an issue is created for human review instead.

  6. One commit per finding — Keeps git history reviewable. Each commit references the finding it addresses.

Files

File Purpose
.github/workflows/fix.agent.md Slash command entry point (/fix)
.github/workflows/shared/fix-shared.md Iterative loop prompt + safe-outputs
.github/workflows/fix.agent.lock.yml Auto-generated lock file

Safe Outputs

  • push-to-pull-request-branch (max 1) — pushes fix commits
  • submit-pull-request-review (max 1, COMMENT only)
  • create-pull-request-review-comment (max 30) — unresolved findings
  • add-comment (max 5, hide-older-comments) — iteration summary

PureWeen and others added 3 commits April 19, 2026 21:24
Adds a new gh-aw workflow triggered by the /fix slash command that:

1. Runs 3-model adversarial review (Opus, Sonnet, Codex)
2. Applies fixes for CRITICAL and MODERATE findings locally
3. Runs tests to verify fixes
4. Re-reviews the updated code
5. Iterates up to 3 rounds until clean
6. Pushes all fix commits via push-to-pull-request-branch
7. Posts a summary comment with full iteration history

Architecture:
- fix.agent.md — slash command entry point (/fix)
- shared/fix-shared.md — iterative loop prompt + safe-outputs config
- Uses push-to-pull-request-branch (max 1) with protected-files: fallback-to-issue
- Shares concurrency group with review workflows (intentional)
- COMMENT-only reviews (no REQUEST_CHANGES per stale review policy)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Each round now includes: full findings table (severity, consensus,
file, line), discarded findings, actions taken with commit refs,
and previous-findings status tracking on re-reviews.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Triggers on pull_request closed (merged only). Analyzes all expert
review runs on the PR to identify:

- Missed skill invocations (which skills were relevant but not used)
- False positives (incorrect findings that wasted reviewer time)
- False negatives (real issues the review should have caught)
- Improvement suggestions for review prompts and skill content

Only creates an issue if there are actionable findings. Uses
close-older-issues and expires: 30 for lifecycle management.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions
Copy link
Copy Markdown
Contributor

🔍 Additional Design-Level Observations

These findings are not tied to specific diff lines but are relevant to the workflow design:

🟢 cancel-in-progress silently discards pre-push fix commits (3/3 reviewers after follow-up)

If /fix is triggered twice rapidly on the same PR, the first run is cancelled mid-execution. Since fixes are accumulated as local git commits and pushed only at the end (Step 4 via push-to-pull-request-branch), all uncommitted progress is silently lost — potentially 60–100 minutes of agent work with no trace on the PR.

This is the correct pattern for review-only workflows but the lost-work window is significant for a mutating workflow. Consider either:

  • Adding a prompt note: "If interrupted before pushing, fixes cannot be recovered — user must re-trigger /fix"
  • Restructuring to push after each fix round (requires push-to-pull-request-branch: max: 3)

🟢 allowed-events: [COMMENT] not enforced at infrastructure level (2/3 reviewers, informational)

The compiled lock file shows "submit_pull_request_review":{"max":1} with no allowed_events restriction in the handler config. The source declares allowed-events: [COMMENT] but this isn't propagated to the runtime handler in gh-aw v0.62.2. The restriction relies on LLM prompt compliance.

Note: The existing review-shared.md compiles identically (also missing enforcement), so this is a platform gap, not unique to this PR. The stronger intent ([COMMENT] only vs. [COMMENT, REQUEST_CHANGES]) is equally unenforced in both cases.

Generated by Expert Code Review (auto) for issue #640 ·

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Expert Code Review — PR #640

Methodology: 3 independent reviewers with adversarial consensus. Findings flagged by only 1 reviewer were validated through follow-up rounds with the other 2 models.

Findings Summary

# Severity Finding Consensus File
1 🔴 CRITICAL workflow_dispatch commits fixes against wrong branch — no PR checkout for manual dispatch 2/3 fix.agent.md:10
2 🟡 MODERATE Concurrency group review-(N) doesn't match review workflow's auto-generated group — no cross-cancellation 2/3 fix.agent.md:28
3 🟡 MODERATE add-comment missing target: "*"workflow_dispatch may fail to resolve target 3/3 after follow-up fix-shared.md:28
4 🟡 MODERATE dotnet test --no-restore fails in fresh agent workspace + security risk of running build on untrusted code 2/3 each fix-shared.md:110
5 🟢 MINOR "fix rounds < 3" counter ambiguity — LLM may miscount rounds 2/3 after follow-up fix-shared.md:98
6 🟢 MINOR Re-review diff git diff origin/main...HEAD may include unrelated changes 2/3 fix-shared.md:127
7 🟢 MINOR Step 4c inline comment validation underspecified vs. review workflow 3/3 after follow-up fix-shared.md:170
8 🟢 MINOR cancel-in-progress silently discards pre-push fix commits 3/3 after follow-up design-level
9 🟢 MINOR allowed-events: [COMMENT] not compiled into handler config (platform gap) 2/3 design-level

CI Status

No CI checks are configured for this PR type (workflow-only changes).

Test Coverage

No tests apply — these are gh-aw workflow prompt files (.md + auto-generated .lock.yml). The workflow's correctness is validated at runtime by the gh-aw platform.

Key Recommendations

  1. Fix #1 is the most important — add a steps: or pre-agent-steps: block to handle PR checkout for workflow_dispatch
  2. Fix #4 — remove --no-restore from the test command
  3. Fix #2 — either add matching concurrency to review.agent.md or update the comment to reflect that cross-cancellation doesn't actually work
  4. Fix #3 — add target: "*" to add-comment safe output

Generated by Expert Code Review (auto) for issue #640

name: fix
events: [pull_request_comment]
workflow_dispatch:
inputs:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 CRITICAL — workflow_dispatch commits fixes against wrong branch (Flagged by: 2/3 reviewers)

When triggered via workflow_dispatch, the platform's checkout_pr_branch.cjs step is skipped (it only runs for pull_request/issue_comment events). The workspace stays on the default branch (main).

Failing scenario: User triggers /fix via workflow_dispatch with a PR number. The agent reads the PR diff via API (works), but makes git commit calls against main-branch files — not the PR branch. The resulting patch from push-to-pull-request-branch is built from the wrong base and will either fail to apply or apply incorrect changes.

Fix: Add a steps: or pre-agent-steps: block that handles checkout for workflow_dispatch, or use the shared Checkout-GhAwPr.ps1 pattern per .github/instructions/gh-aw-workflows.instructions.md.


**After all fixes are committed**, go back to **Step 2** for a re-review of the updated code. For re-reviews, generate the diff with:
```bash
git diff origin/main...HEAD
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟢 MINOR — Re-review diff may include unrelated changes (Flagged by: 2/3 reviewers)

git diff origin/main...HEAD computes the three-dot diff from the merge base. For PRs targeting non-main branches or rebased branches, this may include changes unrelated to the PR.

Suggested fix: Replace with gh pr diff (number) for re-review rounds, which always returns exactly the PR diff regardless of git topology.

Comment thread .github/workflows/shared/fix-shared.md Outdated
### 2c. Decision Gate

- **Zero 🔴 or 🟡 findings?** → Go to Step 4 (done — post results)
- **Has 🔴 or 🟡 findings AND fix rounds < 3?** → Go to Step 3 (fix)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟢 MINOR — "fix rounds" counter ambiguity (Flagged by: 2/3 reviewers after follow-up)

"fix rounds < 3" vs "fix rounds = 3" is ambiguous: does it count completed rounds (retrospective) or the about-to-start round (prospective)? An LLM tracking state across multiple sub-agent dispatches may miscount.

Suggested fix: Restate as: "After completing your Nth fix round, if N ≥ 3, go to Step 4 regardless of findings. Otherwise continue to Step 3."

allowed-events: [COMMENT]
add-comment:
max: 5
hide-older-comments: true
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 MODERATE — add-comment missing target: "*" for workflow_dispatch (Flagged by: 3/3 reviewers after follow-up)

The gh-aw instructions state target: "*" is "Required for workflow_dispatch (no triggering PR context)". Without it, the add_comment safe output may fail to resolve the target PR when triggered via manual dispatch.

Note: The existing review-shared.md has the same omission — both should be fixed.

Fix:

  add-comment:
    max: 5
    target: "*"
    hide-older-comments: true


### 4c. Inline Comments (if any remain)

For any **unresolved** findings (e.g., too complex to auto-fix, or design-level):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟢 MINOR — Inline comment validation underspecified vs. review workflow (Flagged by: 3/3 reviewers after follow-up)

Compare with review-shared.md which includes: (1) explicit gh pr diff --name-only path validation, (2) @@ hunk line range parsing, (3) the critical fallback "move to add_comment", and (4) the warning "A single invalid inline comment causes the entire submit_pull_request_review to fail and ALL inline comments are lost."

This failure mode is especially important in the fix workflow since losing the review submission also loses the summary of committed fixes.

Fix: Port the full validation block from review-shared.md Step 4.


# Intentional: shared prefix with review workflows — /fix cancels in-progress /review.
concurrency:
group: "review-${{ github.event.issue.number || inputs.pr_number || github.run_id }}"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 MODERATE — Concurrency group does not cross-cancel /review as claimed (Flagged by: 2/3 reviewers)

The comment on line 26 states this shares a prefix with review workflows, but the groups are different:

  • /fix compiles to: review-(N) (custom group)
  • /review compiles to: gh-aw-Expert Code Review-(N) (compiler-generated default — review.agent.md has no custom concurrency: block)

Failing scenario: Running /fix on a PR while /review is in-progress. Both run simultaneously instead of /fix cancelling the /review. Interleaved comments, stale review findings, and potential race on push-to-pull-request-branch vs. review comment posting.

Fix (either):

  1. Add a matching concurrency: block to review.agent.md so both use review-(N)
  2. Or remove the custom concurrency block here — the compiler will generate gh-aw-Review & Fix-(N) (but then they won't cross-cancel either, which matches the current reality)

3. **Verify the fix** — re-read the file to confirm the change is correct
4. **Run tests if applicable** — look for test commands in `.github/copilot-instructions.md`. For this repo:
```bash
cd PolyPilot.Tests && dotnet test --no-restore --verbosity quiet
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 MODERATE — Two issues with test command (Flagged by: 2/3 reviewers each)

Issue A: --no-restore will fail in agent workspace. The gh-aw agent runs in a fresh checkout with no NuGet cache. --no-restore skips the implicit restore, causing "unable to find package" failures. The agent will burn fix rounds trying to "fix" restore failures. Repo convention (copilot-instructions.md) is dotnet test without --no-restore.

Issue B: Executing dotnet test on untrusted PR code. The gh-aw instructions warn: "DO NOT run dotnet build, npm install, or any build command on untrusted PR code inside the agent — build tool hooks (MSBuild targets, postinstall scripts) can read COPILOT_TOKEN." dotnet test implicitly runs dotnet build.

Mitigations already present: roles: [admin, maintainer, write] limits triggers to trusted collaborators; no forks: blocks fork PRs; protected-files blocks .csproj modifications.

Fix for Issue A: Remove --no-restore:

cd PolyPilot.Tests && dotnet test --verbosity quiet

For Issue B: Accept the risk given role restrictions and add a comment documenting the tradeoff, or move test execution to pre-agent-steps:.

PureWeen and others added 7 commits April 19, 2026 22:06
Every finding matters — MINOR issues (naming, docs, suboptimal patterns)
are fixed with the same rigor as CRITICAL/MODERATE. The decision gate
now requires zero findings at any severity before stopping, not just
zero CRITICAL/MODERATE.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
After the agent pushes fixes, it dispatches verify-build.yml which
runs in parallel on:
- macOS: tests + Mac Catalyst build
- Windows: Windows target build

The verify workflow posts results as a PR comment with per-platform
pass/fail status. Only dispatched when fixes were actually pushed.

Uses dispatch-workflow safe output (same-repo, compile-time validated).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The report job now gathers previous automated review comments from the
PR and includes a summary showing that the build verification validates
all review-driven fixes compile across platforms. Adds diagnostic links
when builds fail.

Note: Running Copilot CLI directly in GHA for AI-powered verification
is not currently feasible — Copilot CLI authenticates via gh auth
(OAuth/keyring), not via GITHUB_TOKEN. The gh-aw engine handles this
internally but only runs on Linux. For now, cross-platform verification
is build+test only; AI review runs via gh-aw on Linux.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Major sync with MAUI PR 35027 latest commits (11 commits):

gh-aw-guide/SKILL.md:
- 21-row anti-patterns table (was 15) — new rows for concurrency
  races, approve-and-run gate, cancel-in-progress: false,
  narrow slash_command events, LabelOps
- LabelOps section: label_command vs names: filtering, remove_label
- 4 security-critical patterns with code examples

gh-aw-guide/references/architecture.md:
- Authorization Model section (on.roles defaults, roles: all warning)
- Concurrency Race Conditions (slash_command + cancel-in-progress)
- Approve and Run gate risk (pull_request trigger)
- Expanded workflow author rules

instruction-drift:
- Full 646-line Check-Staleness.ps1 with: Get-ContentHash,
  Get-IndexPageLinks (crawl doc indexes), Find-UntrackedPages,
  Get-RecentClosedIssues (90-day window), Get-GitHubLatestRelease
- Coverage gaps tracking via manifest coverage_gaps: field
- P0-P3 classification in SKILL.md
- JSON report output format

sync.yaml:
- coverage_gaps fields on source URLs
- Expanded to 86 lines

PolyPilot-specific divergences preserved:
- COMMENT-only reviews (not REQUEST_CHANGES)
- Known Limitation: Stale Blocking Reviews
- Issue state comparison fix (actual vs expected)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Mines github/gh-aw commits for high-signal changes to docs, compiler,
safe-outputs, triggers, and security. Tracks a watermark (last checked
SHA) and on each run:

1. Fetches commits since watermark
2. Filters to high-signal (feat:, BREAKING, safe-output, trigger,
   compiler, security, engine changes)
3. Categorizes by type with commit refs
4. Samples 20 random shared/ workflow configs for real-world
   safe-output patterns
5. Outputs JSON knowledge report

Complements Check-Staleness.ps1: staleness tells you THAT something
changed, this tells you WHAT changed at the commit level.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Check-WorkflowSecurity.ps1 scans workflow .md files for dangerous
trigger patterns and missing security gates. Adds Dangerous Triggers
Checklist to architecture.md documenting gh-aw#23769 fixes and
remaining risks.

Fixes found by scanner in our own workflows:
- fix.agent.md: cancel-in-progress true to false (slash_command)
- dep-update.md: added protected-files: fallback-to-issue

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Three gaps identified from the tailspin-toys workshop:
- checkout: false — skip repo checkout for API-only workflows
- rate-limit: max/window — throttle slash commands to prevent abuse
- web-fetch tool — for workflows that call external APIs

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
PureWeen added a commit that referenced this pull request Apr 22, 2026
Adds /fix slash command for iterative review/fix cycles on PRs:
- 3-model adversarial review (Opus + Sonnet + Codex)
- Auto-fix all findings (CRITICAL, MODERATE, MINOR)
- Re-review after fixes, up to 3 rounds
- Push all commits via push-to-pull-request-branch safe output
- Cross-platform verification via verify-build.yml

Also adds:
- review-retro.agent.md: Post-merge review retrospective
- verify-build.yml: Cross-platform build + test workflow
- noop safe-output for clean PRs

All 9 review findings from PR #640 addressed:
- workflow_dispatch checkout, concurrency groups, target: *,
  --no-restore removed, explicit round counter, scoped re-review
  diff, inline validation aligned, cancel-in-progress documented

Supersedes #640.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
PureWeen added a commit that referenced this pull request Apr 22, 2026
Adds /fix slash command, verify-build, review-retro workflows. All 9 review findings from #640 addressed. Supersedes #640.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@PureWeen
Copy link
Copy Markdown
Owner Author

Superseded by #661 (merged). Clean rebased version with all 9 review findings addressed.

@PureWeen PureWeen closed this Apr 22, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant