feat: auto-fix feedback loop — tests fail → /fix → retest#750
Conversation
When verify-build or polypilot-integration completes with failure
on a PR branch, this workflow:
1. Finds the associated PR
2. Posts a comment with the failure context
3. Dispatches /fix on the PR to review and fix the failure
4. /fix pushes fixes → CI re-runs → if fails again → loop
5. Max 3 attempts to prevent infinite loops
This closes the feedback loop:
agent-fix → PR → tests fail → /fix → push fix → tests re-run → pass ✅
↑ │
└────── fail ────────┘
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Expert Code Review — PR #750: Auto-Fix Feedback Loop
Methodology: 3 independent reviewers with adversarial consensus. Findings confirmed by ≥2/3 reviewers are included; single-reviewer findings were cross-examined by the other two before inclusion or discard.
Findings by Severity
| # | Severity | Finding | Consensus | Line(s) |
|---|---|---|---|---|
| 1 | 🔴 CRITICAL | pull-requests: read blocks all gh pr comment calls — workflow always 403s |
3/3 | 26 |
| 2 | 🟡 MODERATE | Script injection: ${{ }} expressions spliced into shell without env: indirection |
3/3 | 32–34 |
| 3 | 🟡 MODERATE | No concurrency guard → parallel triggers cause duplicate /fix dispatches (TOCTOU race) |
3/3 | 20–22 |
| 4 | 🟡 MODERATE | Fragile substring-based loop guard — spoofable, no author filter, no hidden marker | 3/3 | 65 |
| 5 | 🟡 MODERATE | branches-ignore: main may silently suppress all workflow_run triggers |
2/3 | 13–14 |
| 6 | 🟡 MODERATE | --ref main dispatch ignores PR-side changes to fix agent (1 reviewer argues this is intentional security) |
2/3 | 99 |
| 7 | 🟡 MODERATE | gh pr list --head unreliable for fork PRs (proposed fix also limited) |
2/3 | 41 |
Discarded (single reviewer, under cap): Redundant GH_TOKEN env declaration (1/3, 🟢 MINOR).
CI & Test Coverage
- CI status: Pending (agent checks in progress, no build/test workflows triggered on this workflow-only change)
- Test coverage: This is a GitHub Actions workflow file — no unit test coverage applicable. The loop guard and permissions should be validated by a manual test run.
- Prior reviews: None
Key Recommendation
Finding #1 (pull-requests: write) is a blocking bug — without it, the workflow silently fails on every run. Findings #2–4 (injection, race, fragile counter) are important hardening items. Finding #5 (branches-ignore) should be validated against the actual dispatch behavior of upstream workflows.
Generated by Expert Code Review (auto) for issue #750 · ● 7.7M
| if: github.event.workflow_run.conclusion == 'failure' | ||
| permissions: | ||
| actions: write | ||
| pull-requests: read |
There was a problem hiding this comment.
🔴 CRITICAL — pull-requests: read is insufficient for gh pr comment (Flagged by: 3/3 reviewers)
Both the "Dispatch /fix" step (line 88) and the "Post max-attempts notice" step (line 106) call gh pr comment, which requires pull-requests: write. With read, the GITHUB_TOKEN will receive HTTP 403 on every comment attempt, silently breaking the entire feedback loop.
Fix:
pull-requests: write| branches-ignore: | ||
| - main |
There was a problem hiding this comment.
🟡 MODERATE — branches-ignore: main may silently suppress all triggers (Flagged by: 2/3 reviewers)
If the upstream watched workflows (verify-build, polypilot-integration) are dispatched with --ref main (common for workflow_dispatch), the workflow_run event's head_branch will be main, and this filter silently drops the event. The auto-fix loop would never fire.
Fix: Remove the branches-ignore filter and rely on the existing "Find associated PR" step's no-PR-found early exit to skip non-PR runs. Alternatively, move the filtering into the job's if: condition where you can inspect the actual PR branch.
| # Count how many auto-fix comments already exist on this PR | ||
| # to prevent infinite loops | ||
| FIX_COUNT=$(gh pr view "$PR" --repo ${{ github.repository }} \ | ||
| --json comments --jq '[.comments[].body | select(contains("Auto-fix triggered"))] | length') |
There was a problem hiding this comment.
🟡 MODERATE — Loop guard uses fragile substring match (Flagged by: 3/3 reviewers)
contains("Auto-fix triggered") matches any comment containing that phrase — including quotes, review discussions, or bot responses. False positives prematurely disable auto-fix; deleted comments weaken the guard. If the gh pr comment call fails (e.g., the permissions bug), no comment is posted, FIX_COUNT stays 0, and the loop runs unboundedly.
Fix: Use a hidden HTML marker and filter by author:
# In the comment body:
"<!-- auto-fix-marker -->\n## 🔄 Auto-fix triggered ..."
# In the count check:
'[.comments[] | select(.author.login == "github-actions[bot]") | .body | select(contains("<!-- auto-fix-marker -->"))] | length'| check-and-fix: | ||
| name: Check results and trigger fix if needed | ||
| runs-on: ubuntu-latest |
There was a problem hiding this comment.
🟡 MODERATE — No concurrency guard → duplicate /fix dispatches (Flagged by: 3/3 reviewers)
If both "Verify Build" and "PolyPilot Integration Test" fail for the same push, two workflow_run events fire simultaneously. Both instances read FIX_COUNT=0 (TOCTOU race), both pass the < 3 check, and both dispatch /fix — causing duplicate conflicting fix agents on the same PR.
Fix: Add a concurrency group at the job level:
concurrency:
group: auto-fix-$\{\{ github.event.workflow_run.head_branch }}
cancel-in-progress: true| - name: Find associated PR | ||
| id: find-pr | ||
| run: | | ||
| BRANCH="${{ github.event.workflow_run.head_branch }}" |
There was a problem hiding this comment.
🟡 MODERATE — Script injection via $\{\{ }} expression in shell context (Flagged by: 3/3 reviewers after follow-up)
$\{\{ github.event.workflow_run.head_branch }} is interpolated into the shell script by the Actions runner before bash executes it. A branch name containing backticks or $(...) would execute as a shell command inside the double quotes. The same risk applies to workflow_run.name on line 33.
Fix: Pass untrusted values via the step's env: mapping:
env:
HEAD_BRANCH: $\{\{ github.event.workflow_run.head_branch }}
WORKFLOW_NAME: $\{\{ github.event.workflow_run.name }}
RUN_ID: $\{\{ github.event.workflow_run.id }}
run: |
BRANCH="$HEAD_BRANCH"
FAILED_WORKFLOW="$WORKFLOW_NAME"
FAILED_RUN="$RUN_ID"| # Dispatch the fix.agent workflow against this PR | ||
| gh workflow run fix.agent.lock.yml \ | ||
| --repo ${{ github.repository }} \ | ||
| --ref main \ |
There was a problem hiding this comment.
🟡 MODERATE — Fix agent always dispatched from main (Flagged by: 2/3 reviewers — 1 dissented, noting this is an intentional security pattern)
--ref main means if the PR being fixed also modifies fix.agent.lock.yml, the dispatched agent uses the old main version. One reviewer argued this is the correct security posture (prevents untrusted PRs from modifying the fix agent itself). Worth documenting the intent explicitly if this is by design.
|
|
||
| # Find open PR for this branch | ||
| PR_NUMBER=$(gh pr list --repo ${{ github.repository }} \ | ||
| --head "$BRANCH" --state open \ |
There was a problem hiding this comment.
🟡 MODERATE — gh pr list --head may miss fork PRs (Flagged by: 2/3 reviewers — 1 dissented)
For fork PRs, --head "$BRANCH" uses just the branch name, but fork PRs are qualified as owner:branch. One reviewer suggested using github.event.workflow_run.pull_requests[0].number instead, but another noted that field is also empty for cross-repo forks (GitHub omits it for security). Neither approach fully solves the fork case — the current approach is adequate for same-repo branches, which is the primary use case here.
Closes the loop between test failures and the fix agent.
Uses
workflow_runtrigger on verify-build + polypilot-integration completion. Max 3 attempts with comment tracking to prevent infinite loops.