Skip to content

fix: auto-fix-on-failure never triggers for workflow_dispatch runs#756

Merged
PureWeen merged 1 commit intomainfrom
fix/auto-fix-trigger-bug
Apr 23, 2026
Merged

fix: auto-fix-on-failure never triggers for workflow_dispatch runs#756
PureWeen merged 1 commit intomainfrom
fix/auto-fix-trigger-bug

Conversation

@PureWeen
Copy link
Copy Markdown
Owner

Summary

The auto-fix feedback loop (auto-fix-on-failure.yml) has never triggered (0 runs) because two bugs prevented it from working with workflow_dispatch-based integration tests.

Bug 1: branches-ignore: main

Integration tests dispatched by the agent-fix workflow use workflow_dispatch, which always runs on main (the ref input only controls which code is checked out). The branches-ignore: main filter excluded all such runs.

Bug 2: PR lookup by head_branch

For workflow_dispatch runs, head_branch is main, not the PR branch. The original code searched for a PR with --head main, finding nothing.

Fix: Read pr_number from the failed run's inputs via the API first, fall back to branch-based lookup for push/PR-triggered runs. Also resolve the actual PR branch name for the comment.

Bug 3: Insufficient permissions

pull-requests: readpull-requests: write (needed for gh pr comment).

Testing

This was discovered while validating the end-to-end agent-fix pipeline for PR #745. The agent correctly dispatched integration tests, but when they failed (due to workflow YAML bugs fixed in #755), the auto-fix loop never fired.

Two bugs prevented the auto-fix feedback loop from ever triggering:

1. branches-ignore: main — workflow_dispatch integration tests always
   run on main (the ref input only controls checkout, not head_branch).
   Removed the filter entirely; the PR-lookup step already handles
   non-PR runs by exiting early.

2. PR lookup by head_branch — for workflow_dispatch runs, head_branch
   is 'main', not the PR branch. Now reads pr_number from the failed
   run's inputs via the API first, falls back to branch-based lookup.

Also fixed permissions: pull-requests: read → write (needed for
gh pr comment).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@PureWeen PureWeen merged commit 2da5ae1 into main Apr 23, 2026
1 check passed
@PureWeen PureWeen deleted the fix/auto-fix-trigger-bug branch April 23, 2026 18:54
@github-actions
Copy link
Copy Markdown
Contributor

Design-Level Findings (outside diff hunks)

🔴 CRITICAL — Shell injection via ${{ }} interpolation in downstream steps (Flagged by 3/3 reviewers after consensus round)

Several steps downstream of the find-pr step interpolate its outputs directly into shell scripts using ${{ steps.find-pr.outputs.* }}. GitHub Actions expands these expressions before bash interprets the script, so if any output contains shell metacharacters, they execute as commands with GH_TOKEN in scope.

Affected locations (outside this PR's diff but consuming the outputs it sets):

  • check-count step: PR=${{ steps.find-pr.outputs.pr_number }}
  • Dispatch /fix step: PR=${{ steps.find-pr.outputs.pr_number }}, BRANCH="${{ steps.find-pr.outputs.branch }}"
  • post max-attempts notice step: similar pattern

Fix: Pass step outputs via env: blocks instead of inline ${{ }}:

      - name: Check fix count
        env:
          PR: ${{ steps.find-pr.outputs.pr_number }}
        run: |
          # $PR is now safely quoted as an env var

This is a pre-existing pattern but worsened by this PR's scope expansion (removing branches-ignore: main increases the attack surface).


🟢 MINOR — Concurrent failure race condition in fix-count guard (Flagged by 3/3 reviewers after consensus round)

With branches-ignore: main removed, both PR-branch builds and workflow_dispatch integration tests can now trigger this workflow for the same PR simultaneously. Two concurrent failures will both read FIX_COUNT from PR comments before either posts its "Auto-fix triggered" comment, both see count=0, and both dispatch the fix agent.

Fix: Add a concurrency group keyed on the PR number to serialize runs:

    concurrency:
      group: auto-fix-${{ github.event.workflow_run.head_branch }}
      cancel-in-progress: false

Discarded findings (single reviewer only)
  • || true swallows jq error output — 1/3 reviewers. Two follow-up reviewers disagreed: jq errors go to stderr (already redirected to /dev/null), and GitHub API always returns well-formed JSON, making this scenario unrealistic.
  • Extra runs from removing branches-ignore: main — 1/3 reviewers. Acknowledged as a necessary trade-off by the reviewer who flagged it.

Generated by Expert Code Review (auto) for issue #756 · ● 5.5M ·

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Summary — .github/workflows/auto-fix-on-failure.yml

The three bug fixes are correct and well-reasoned. I traced the full data flow (trigger → PR lookup → comment → dispatch) and the upstream input definitions. Two findings worth noting:


🟡 MODERATE — GITHUB_OUTPUT injection via unsanitized pr_number (line 40)

PR_NUMBER=$(gh api "repos/.../actions/runs/$FAILED_RUN" \
  --jq '.inputs.pr_number // empty' 2>/dev/null || true)
...
echo "pr_number=$PR_NUMBER" >> $GITHUB_OUTPUT

The value from .inputs.pr_number is written to $GITHUB_OUTPUT with simple echo "key=value" syntax. If the value contained a newline character, it could inject additional key-value pairs into the step outputs.

Mitigating factors: Both upstream workflows (verify-build.yml line 5, polypilot-integration.yml line 6) declare pr_number with type: number, so GitHub validates at dispatch time. Also, all outputs (has_pr, branch, failed_workflow, failed_run) are written after pr_number, so duplicates would be overwritten (last-write-wins). Practical exploitation is very unlikely.

Suggested hardening (defense-in-depth): Validate before use:

if ! [[ "$PR_NUMBER" =~ ^[0-9]+$ ]]; then
  PR_NUMBER=""
fi

Or use the delimiter syntax for setting outputs.


🟢 MINOR — Removing branches-ignore: main broadens trigger scope

Removing branches-ignore: main is necessary for the fix (dispatch runs always report head_branch: main), but it also means the workflow now fires for direct-push-to-main failures. These are handled correctly — the PR lookup returns empty and the job exits with has_pr=false — but they consume a runner for ~15-30s unnecessarily.

No action required; just something to be aware of if CI costs become a concern.


✅ Verified correct

  • pull-requests: write — Needed for gh pr comment. The original read permission would have caused the comment step to fail silently (stderr suppressed). Good catch.
  • API-first PR lookup — Correctly reads pr_number from the failed run's inputs, then falls back to branch-based lookup. This handles both workflow_dispatch and push/pull_request-triggered runs.
  • PR_BRANCH resolution — Using gh pr view to get the real head branch ensures the comment references the correct branch, not main. The || echo "$BRANCH" fallback is safe.
  • Loop guard — The 3-attempt limit (via comment counting) remains intact and works with the new lookup logic.
  • Shell quoting — All variable expansions in gh commands are properly double-quoted, preventing command injection even with unexpected input values.

Generated by Expert Code Review (auto) for issue #756 · ● 5.5M

# workflow_dispatch runs execute on main but carry the PR number
# in their inputs. Try reading it from the API first.
PR_NUMBER=$(gh api "repos/${{ github.repository }}/actions/runs/$FAILED_RUN" \
--jq '.inputs.pr_number // empty' 2>/dev/null || true)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 MODERATE — Defense-in-depth: validate pr_number before use

The value from .inputs.pr_number is user-controllable (set at workflow dispatch time) and written to $GITHUB_OUTPUT with simple echo "key=value" syntax. A newline in the value could inject extra output keys.

Practical risk is low because both upstream workflows (verify-build.yml, polypilot-integration.yml) declare this input as type: number, and all subsequent outputs overwrite any injected duplicates. Still, a one-line guard follows the principle of least trust:

Suggested change
--jq '.inputs.pr_number // empty' 2>/dev/null || true)
PR_NUMBER=$(gh api "repos/$\{\{ github.repository }}/actions/runs/$FAILED_RUN" \
--jq '.inputs.pr_number // empty' 2>/dev/null || true)
# Sanitize: must be a bare integer (defense-in-depth)
[[ "$PR_NUMBER" =~ ^[0-9]+$ ]] || PR_NUMBER=""

Comment on lines 53 to +56
fi

echo "Found PR #$PR_NUMBER"
# Resolve the PR's head branch for the comment
PR_BRANCH=$(gh pr view "$PR_NUMBER" --repo ${{ github.repository }} \
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 MODERATE — PR_NUMBER should be validated as numeric before use (Flagged by 3/3 reviewers)

PR_NUMBER originates from the failed run's .inputs.pr_number (line 39), which is attacker-influenceable if the dispatch chain can be reached from a PR author. Before using it in API calls or writing it to $GITHUB_OUTPUT, validate it is a bare integer:

          if ! [[ "$PR_NUMBER" =~ ^[0-9]+$ ]]; then
            echo "Invalid PR number '$PR_NUMBER' — skipping"
            echo "has_pr=false" >> $GITHUB_OUTPUT
            exit 0
          fi

This also mitigates the 🔴 shell injection finding below by sanitizing the value at the source before it flows into downstream steps.

PureWeen added a commit that referenced this pull request Apr 23, 2026
## Problem

PR #756 fixed the `branches-ignore: main` blocker and added API-based
input reading, but the GitHub API returns `inputs: null` for completed
`workflow_dispatch` runs. The auto-fix-on-failure workflow still can't
find the PR number.

## Fix

1. **`polypilot-integration.yml`** and **`verify-build.yml`**: Add
`run-name` that embeds `PR #NNN` in the display title when `pr_number`
input is provided
2. **`auto-fix-on-failure.yml`**: Parse PR number from
`github.event.workflow_run.display_title` instead of trying to read
inputs from the API

This completes the auto-fix feedback loop: agent-fix dispatches tests →
tests fail → auto-fix-on-failure finds the PR from the run name →
dispatches `/fix`.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
PureWeen added a commit that referenced this pull request Apr 23, 2026
## Summary

Closes the last gaps in the agent-fix pipeline so it runs **fully
end-to-end** without human intervention:

**label issue → agent implements fix → self-review → non-draft PR →
integration tests + expert review dispatched → auto-fix if tests fail**

## Changes

1. **`draft: false`** — PRs are now created as non-draft, so
`review-on-open.agent` can trigger on `ready_for_review`
2. **Dispatch `review.agent` directly** — bypasses the `action_required`
approval gate that blocks `pull_request`-triggered workflows for
bot-created PRs
3. **Increase dispatch max** from 2 → 3 (verify-build + integration +
review)
4. **Updated Step 8** — instructions now include dispatching the expert
review with `pr_number`
5. **Recompiled lock file** — `gh aw compile` auto-detected
`review.agent` as a gh-aw workflow and added `.lock.yml` extension
mapping

## Pipeline Flow (after this PR)

```
Issue labeled 'agent-fix'
  → Agent reads issue, explores code, implements fix
  → 3-model self-review (Opus 4.6, Sonnet 4.6, GPT-5.3-Codex)
  → Non-draft PR created
  → Dispatches: verify-build + polypilot-integration + review.agent
  → If tests fail: auto-fix-on-failure dispatches /fix
  → Result: fully reviewed, tested PR ready for merge
```

## Context

This is the final piece of the pipeline work started across PRs #755
(integration test fixes), #756 (auto-fix-on-failure), and #757 (run-name
embedding for PR discovery). Those PRs fixed bugs that prevented the
existing pipeline from completing.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant