fix: auto-fix-on-failure never triggers for workflow_dispatch runs by PureWeen · Pull Request #756 · PureWeen/PolyPilot

PureWeen · 2026-04-23T18:54:00Z

Summary

The auto-fix feedback loop (auto-fix-on-failure.yml) has never triggered (0 runs) because two bugs prevented it from working with workflow_dispatch-based integration tests.

Bug 1: `branches-ignore: main`

Integration tests dispatched by the agent-fix workflow use workflow_dispatch, which always runs on main (the ref input only controls which code is checked out). The branches-ignore: main filter excluded all such runs.

Bug 2: PR lookup by `head_branch`

For workflow_dispatch runs, head_branch is main, not the PR branch. The original code searched for a PR with --head main, finding nothing.

Fix: Read pr_number from the failed run's inputs via the API first, fall back to branch-based lookup for push/PR-triggered runs. Also resolve the actual PR branch name for the comment.

Bug 3: Insufficient permissions

pull-requests: read → pull-requests: write (needed for gh pr comment).

Testing

This was discovered while validating the end-to-end agent-fix pipeline for PR #745. The agent correctly dispatched integration tests, but when they failed (due to workflow YAML bugs fixed in #755), the auto-fix loop never fired.

Two bugs prevented the auto-fix feedback loop from ever triggering: 1. branches-ignore: main — workflow_dispatch integration tests always run on main (the ref input only controls checkout, not head_branch). Removed the filter entirely; the PR-lookup step already handles non-PR runs by exiting early. 2. PR lookup by head_branch — for workflow_dispatch runs, head_branch is 'main', not the PR branch. Now reads pr_number from the failed run's inputs via the API first, falls back to branch-based lookup. Also fixed permissions: pull-requests: read → write (needed for gh pr comment). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

github-actions · 2026-04-23T19:06:57Z

Design-Level Findings (outside diff hunks)

🔴 CRITICAL — Shell injection via ${{ }} interpolation in downstream steps (Flagged by 3/3 reviewers after consensus round)

Several steps downstream of the find-pr step interpolate its outputs directly into shell scripts using ${{ steps.find-pr.outputs.* }}. GitHub Actions expands these expressions before bash interprets the script, so if any output contains shell metacharacters, they execute as commands with GH_TOKEN in scope.

Affected locations (outside this PR's diff but consuming the outputs it sets):

check-count step: PR=${{ steps.find-pr.outputs.pr_number }}
Dispatch /fix step: PR=${{ steps.find-pr.outputs.pr_number }}, BRANCH="${{ steps.find-pr.outputs.branch }}"
post max-attempts notice step: similar pattern

Fix: Pass step outputs via env: blocks instead of inline ${{ }}:

      - name: Check fix count
        env:
          PR: ${{ steps.find-pr.outputs.pr_number }}
        run: |
          # $PR is now safely quoted as an env var

This is a pre-existing pattern but worsened by this PR's scope expansion (removing branches-ignore: main increases the attack surface).

🟢 MINOR — Concurrent failure race condition in fix-count guard (Flagged by 3/3 reviewers after consensus round)

With branches-ignore: main removed, both PR-branch builds and workflow_dispatch integration tests can now trigger this workflow for the same PR simultaneously. Two concurrent failures will both read FIX_COUNT from PR comments before either posts its "Auto-fix triggered" comment, both see count=0, and both dispatch the fix agent.

Fix: Add a concurrency group keyed on the PR number to serialize runs:

    concurrency:
      group: auto-fix-${{ github.event.workflow_run.head_branch }}
      cancel-in-progress: false

Discarded findings (single reviewer only)

|| true swallows jq error output — 1/3 reviewers. Two follow-up reviewers disagreed: jq errors go to stderr (already redirected to /dev/null), and GitHub API always returns well-formed JSON, making this scenario unrealistic.
Extra runs from removing branches-ignore: main — 1/3 reviewers. Acknowledged as a necessary trade-off by the reviewer who flagged it.

Generated by Expert Code Review (auto) for issue #756 · ● 5.5M · ◷

github-actions

Review Summary — `.github/workflows/auto-fix-on-failure.yml`

The three bug fixes are correct and well-reasoned. I traced the full data flow (trigger → PR lookup → comment → dispatch) and the upstream input definitions. Two findings worth noting:

🟡 MODERATE — `GITHUB_OUTPUT` injection via unsanitized `pr_number` (line 40)

PR_NUMBER=$(gh api "repos/.../actions/runs/$FAILED_RUN" \
  --jq '.inputs.pr_number // empty' 2>/dev/null || true)
...
echo "pr_number=$PR_NUMBER" >> $GITHUB_OUTPUT

The value from .inputs.pr_number is written to $GITHUB_OUTPUT with simple echo "key=value" syntax. If the value contained a newline character, it could inject additional key-value pairs into the step outputs.

Mitigating factors: Both upstream workflows (verify-build.yml line 5, polypilot-integration.yml line 6) declare pr_number with type: number, so GitHub validates at dispatch time. Also, all outputs (has_pr, branch, failed_workflow, failed_run) are written after pr_number, so duplicates would be overwritten (last-write-wins). Practical exploitation is very unlikely.

Suggested hardening (defense-in-depth): Validate before use:

if ! [[ "$PR_NUMBER" =~ ^[0-9]+$ ]]; then
  PR_NUMBER=""
fi

Or use the delimiter syntax for setting outputs.

🟢 MINOR — Removing `branches-ignore: main` broadens trigger scope

Removing branches-ignore: main is necessary for the fix (dispatch runs always report head_branch: main), but it also means the workflow now fires for direct-push-to-main failures. These are handled correctly — the PR lookup returns empty and the job exits with has_pr=false — but they consume a runner for ~15-30s unnecessarily.

No action required; just something to be aware of if CI costs become a concern.

✅ Verified correct

pull-requests: write — Needed for gh pr comment. The original read permission would have caused the comment step to fail silently (stderr suppressed). Good catch.
API-first PR lookup — Correctly reads pr_number from the failed run's inputs, then falls back to branch-based lookup. This handles both workflow_dispatch and push/pull_request-triggered runs.
PR_BRANCH resolution — Using gh pr view to get the real head branch ensures the comment references the correct branch, not main. The || echo "$BRANCH" fallback is safe.
Loop guard — The 3-attempt limit (via comment counting) remains intact and works with the new lookup logic.
Shell quoting — All variable expansions in gh commands are properly double-quoted, preventing command injection even with unexpected input values.

Generated by Expert Code Review (auto) for issue #756 · ● 5.5M

github-actions · 2026-04-23T19:06:58Z

+          # workflow_dispatch runs execute on main but carry the PR number
+          # in their inputs. Try reading it from the API first.
+          PR_NUMBER=$(gh api "repos/${{ github.repository }}/actions/runs/$FAILED_RUN" \
+            --jq '.inputs.pr_number // empty' 2>/dev/null || true)


🟡 MODERATE — Defense-in-depth: validate pr_number before use

The value from .inputs.pr_number is user-controllable (set at workflow dispatch time) and written to $GITHUB_OUTPUT with simple echo "key=value" syntax. A newline in the value could inject extra output keys.

Practical risk is low because both upstream workflows (verify-build.yml, polypilot-integration.yml) declare this input as type: number, and all subsequent outputs overwrite any injected duplicates. Still, a one-line guard follows the principle of least trust:

Suggested change

--jq '.inputs.pr_number // empty' 2>/dev/null || true)

PR_NUMBER=$(gh api "repos/$\{\{ github.repository }}/actions/runs/$FAILED_RUN" \

--jq '.inputs.pr_number // empty' 2>/dev/null || true)

# Sanitize: must be a bare integer (defense-in-depth)

[[ "$PR_NUMBER" =~ ^[0-9]+$ ]] || PR_NUMBER=""

github-actions · 2026-04-23T19:06:58Z

          fi

-          echo "Found PR #$PR_NUMBER"
+          # Resolve the PR's head branch for the comment
+          PR_BRANCH=$(gh pr view "$PR_NUMBER" --repo ${{ github.repository }} \


🟡 MODERATE — PR_NUMBER should be validated as numeric before use (Flagged by 3/3 reviewers)

PR_NUMBER originates from the failed run's .inputs.pr_number (line 39), which is attacker-influenceable if the dispatch chain can be reached from a PR author. Before using it in API calls or writing it to $GITHUB_OUTPUT, validate it is a bare integer:

if ! [[ "$PR_NUMBER" =~ ^[0-9]+$ ]]; then echo "Invalid PR number '$PR_NUMBER' — skipping" echo "has_pr=false" >> $GITHUB_OUTPUT exit 0 fi

This also mitigates the 🔴 shell injection finding below by sanitizing the value at the source before it flows into downstream steps.

## Problem PR #756 fixed the `branches-ignore: main` blocker and added API-based input reading, but the GitHub API returns `inputs: null` for completed `workflow_dispatch` runs. The auto-fix-on-failure workflow still can't find the PR number. ## Fix 1. **`polypilot-integration.yml`** and **`verify-build.yml`**: Add `run-name` that embeds `PR #NNN` in the display title when `pr_number` input is provided 2. **`auto-fix-on-failure.yml`**: Parse PR number from `github.event.workflow_run.display_title` instead of trying to read inputs from the API This completes the auto-fix feedback loop: agent-fix dispatches tests → tests fail → auto-fix-on-failure finds the PR from the run name → dispatches `/fix`. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

## Summary Closes the last gaps in the agent-fix pipeline so it runs **fully end-to-end** without human intervention: **label issue → agent implements fix → self-review → non-draft PR → integration tests + expert review dispatched → auto-fix if tests fail** ## Changes 1. **`draft: false`** — PRs are now created as non-draft, so `review-on-open.agent` can trigger on `ready_for_review` 2. **Dispatch `review.agent` directly** — bypasses the `action_required` approval gate that blocks `pull_request`-triggered workflows for bot-created PRs 3. **Increase dispatch max** from 2 → 3 (verify-build + integration + review) 4. **Updated Step 8** — instructions now include dispatching the expert review with `pr_number` 5. **Recompiled lock file** — `gh aw compile` auto-detected `review.agent` as a gh-aw workflow and added `.lock.yml` extension mapping ## Pipeline Flow (after this PR) ``` Issue labeled 'agent-fix' → Agent reads issue, explores code, implements fix → 3-model self-review (Opus 4.6, Sonnet 4.6, GPT-5.3-Codex) → Non-draft PR created → Dispatches: verify-build + polypilot-integration + review.agent → If tests fail: auto-fix-on-failure dispatches /fix → Result: fully reviewed, tested PR ready for merge ``` ## Context This is the final piece of the pipeline work started across PRs #755 (integration test fixes), #756 (auto-fix-on-failure), and #757 (run-name embedding for PR discovery). Those PRs fixed bugs that prevented the existing pipeline from completing. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

PureWeen merged commit 2da5ae1 into main Apr 23, 2026
1 check passed

PureWeen deleted the fix/auto-fix-trigger-bug branch April 23, 2026 18:54

github-actions Bot reviewed Apr 23, 2026

View reviewed changes

PureWeen mentioned this pull request Apr 23, 2026

fix: auto-fix PR discovery via run-name embedding #757

Merged

PureWeen mentioned this pull request Apr 23, 2026

feat: make agent-fix pipeline fully autonomous (end-to-end) #758

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: auto-fix-on-failure never triggers for workflow_dispatch runs#756

fix: auto-fix-on-failure never triggers for workflow_dispatch runs#756
PureWeen merged 1 commit intomainfrom
fix/auto-fix-trigger-bug

PureWeen commented Apr 23, 2026

Uh oh!

Uh oh!

github-actions Bot commented Apr 23, 2026

Uh oh!

github-actions Bot left a comment

Uh oh!

github-actions Bot Apr 23, 2026

Uh oh!

github-actions Bot Apr 23, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

-            --jq '.inputs.pr_number // empty' 2>/dev/null || true)
+          PR_NUMBER=$(gh api "repos/$\{\{ github.repository }}/actions/runs/$FAILED_RUN" \
+            --jq '.inputs.pr_number // empty' 2>/dev/null || true)
+          # Sanitize: must be a bare integer (defense-in-depth)
+          [[ "$PR_NUMBER" =~ ^[0-9]+$ ]] || PR_NUMBER=""

Conversation

PureWeen commented Apr 23, 2026

Summary

Bug 1: branches-ignore: main

Bug 2: PR lookup by head_branch

Bug 3: Insufficient permissions

Testing

Uh oh!

Uh oh!

github-actions Bot commented Apr 23, 2026

Design-Level Findings (outside diff hunks)

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Review Summary — .github/workflows/auto-fix-on-failure.yml

🟡 MODERATE — GITHUB_OUTPUT injection via unsanitized pr_number (line 40)

🟢 MINOR — Removing branches-ignore: main broadens trigger scope

✅ Verified correct

Uh oh!

github-actions Bot Apr 23, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot Apr 23, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Bug 1: `branches-ignore: main`

Bug 2: PR lookup by `head_branch`

Review Summary — `.github/workflows/auto-fix-on-failure.yml`

🟡 MODERATE — `GITHUB_OUTPUT` injection via unsanitized `pr_number` (line 40)

🟢 MINOR — Removing `branches-ignore: main` broadens trigger scope