feat: make agent-fix pipeline fully autonomous (end-to-end) by PureWeen · Pull Request #758 · PureWeen/PolyPilot

PureWeen · 2026-04-23T19:33:18Z

Summary

Closes the last gaps in the agent-fix pipeline so it runs fully end-to-end without human intervention:

label issue → agent implements fix → self-review → non-draft PR → integration tests + expert review dispatched → auto-fix if tests fail

Changes

draft: false — PRs are now created as non-draft, so review-on-open.agent can trigger on ready_for_review
Dispatch review.agent directly — bypasses the action_required approval gate that blocks pull_request-triggered workflows for bot-created PRs
Increase dispatch max from 2 → 3 (verify-build + integration + review)
Updated Step 8 — instructions now include dispatching the expert review with pr_number
Recompiled lock file — gh aw compile auto-detected review.agent as a gh-aw workflow and added .lock.yml extension mapping

Pipeline Flow (after this PR)

Issue labeled 'agent-fix'
  → Agent reads issue, explores code, implements fix
  → 3-model self-review (Opus 4.6, Sonnet 4.6, GPT-5.3-Codex)
  → Non-draft PR created
  → Dispatches: verify-build + polypilot-integration + review.agent
  → If tests fail: auto-fix-on-failure dispatches /fix
  → Result: fully reviewed, tested PR ready for merge

Context

This is the final piece of the pipeline work started across PRs #755 (integration test fixes), #756 (auto-fix-on-failure), and #757 (run-name embedding for PR discovery). Those PRs fixed bugs that prevented the existing pipeline from completing.

- Change draft: true → draft: false so PRs are created as non-draft
- Add review.agent to dispatch-workflow list (dispatches expert code review directly, bypassing the action_required approval gate that blocks pull_request-triggered workflows for bot-created PRs)
- Increase dispatch max from 2 to 3 (verify-build + integration + review)
- Update Step 8 instructions to dispatch review.agent with pr_number
- Recompile lock file via gh aw compile

This completes the end-to-end pipeline: label issue → agent implements fix → self-review → PR created (non-draft) → integration tests + expert review dispatched → auto-fix-on-failure if tests fail.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

github-actions

Review Summary

Reviewed the full workflow source, lock file, review-on-open.agent, review.agent, and auto-fix-on-failure.yml.

✅ No issues found

Lock file consistency: Both config.json and GH_AW_SAFE_OUTPUTS_HANDLER_CONFIG correctly reflect draft: false, max: 3, and the added review.agent workflow. The review_agent tool schema, aw_context_workflows, and workflow_files entries are all correct.
No infinite loop risk: review.agent does NOT have dispatch_workflow capability — it can only post review comments. It cannot re-trigger itself or agent-fix.
No race conditions: The 3 dispatched workflows (verify-build, polypilot-integration, review.agent) are fully independent. The review workflows share a concurrency group with cancel-in-progress: false, so concurrent runs serialize safely rather than conflicting.
Auto-fix loop protection: auto-fix-on-failure.yml only triggers on verify-build and polypilot-integration failures, not review workflow completions, so the review dispatch doesn't create a new failure loop vector.

Findings (2)

| # | Sev | Issue |
|---|-----|-------|
| 1 | 🟡 | Duplicate reviews: `draft: false` causes `review-on-open.agent` to trigger on PR creation (`opened` event), AND Step 8 explicitly dispatches `review.agent`. Both share concurrency group `review-{PR}` with `cancel-in-progress: false`, so they serialize and produce two full review runs on the same diff. |
| 2 | 🟡 | `pr_number` type mismatch: Prompt example shows string `"<PR number>"` but the tool schema declares `type: number`. Agent may send a string value that fails schema validation. |

Generated by Expert Code Review (auto) for issue #758 · ● 9.5M

github-actions · 2026-04-23T19:44:52Z

  create-pull-request:
    auto-merge: false
-    draft: true
+    draft: false


🟡 MODERATE — Duplicate review runs

With draft: false, the PR is created as non-draft immediately. This fires the pull_request: [opened] event, which triggers review-on-open.agent (it matches opened + draft == false). Then in Step 8, agent-fix also explicitly dispatches review.agent via workflow_dispatch.

Both workflows share the concurrency group review-${{ PR_NUMBER }} with cancel-in-progress: false, so they serialize rather than deduplicate: the second review queues behind the first and runs in full. This doubles compute cost (~90 min × 2) and produces two separate review comment threads on the same PR for the same diff.

Mitigation options (pick one):

Keep draft: true and rely solely on the explicit review.agent dispatch (original approach minus the approval-gate problem).

Keep draft: false but remove the explicit dispatch_workflow for review.agent in Step 8, relying on review-on-open.agent to auto-trigger.

Add a conditional label or environment flag so review-on-open.agent skips PRs created by agent-fix (e.g., if: github.actor != 'copilot-agentic-workflow[bot]').
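The serialization behaviour described above can be sketched as a small model. This is a simplified illustration, not GitHub's scheduler: it assumes both runs arrive in the same concurrency group while the first is still in progress, and the function name `completed_runs` and the 90-minute figure (taken from the estimate above) are used only for illustration.

```python
def completed_runs(arrivals, cancel_in_progress):
    """Return the runs that finish, given arrivals in one concurrency group.

    Simplified model (an assumption, not GitHub's actual implementation):
    every run after the first arrives while the first is still in progress.
    """
    if not arrivals:
        return []
    if cancel_in_progress:
        # Each new arrival cancels whatever is running, so only the last survives.
        return [arrivals[-1]]
    if len(arrivals) == 1:
        return list(arrivals)
    # The first run finishes; the most recent pending run then executes in full.
    return [arrivals[0], arrivals[-1]]


REVIEW_MINUTES = 90  # rough per-review cost quoted in the comment above

runs = ["review-on-open.agent", "review.agent"]
serialized = completed_runs(runs, cancel_in_progress=False)
total_minutes = REVIEW_MINUTES * len(serialized)
```

With `cancel_in_progress=False` both reviews complete, which is the duplicate-review case; none of the three mitigation options changes this setting, they instead stop the second run from being queued at all.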

github-actions · 2026-04-23T19:44:52Z

+    "pr_number": "<PR number>"
+  }
+})
 ```


🟡 MODERATE — pr_number type mismatch between prompt example and tool schema

The prompt example passes pr_number as a string ("<PR number>"), but the compiled tool schema in the lock file defines pr_number with "type": "number". If the agent sends a string (e.g. "42" instead of 42), schema validation may reject the dispatch call at runtime.

The review.agent.md frontmatter also declares pr_number as type: number, so the lock file schema is correct. However, the prompt example should make it clear this must be an integer, not a quoted string.

Suggested fix: Change the example to use an unquoted placeholder:

"pr_number": <PR number>

or add a note: "Pass as an integer, not a string."
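To make the failure mode concrete, here is a minimal sketch of a strict type check in the spirit of JSON Schema `"type": "number"`. The real safe-outputs handler is not shown in this PR, so `validate_dispatch_inputs` and its error messages are hypothetical, as is the assumption that `pr_number` is required.

```python
def is_json_number(value):
    """True if value would count as a JSON number under a strict check."""
    # bool is a subclass of int in Python, but JSON booleans are not numbers.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def validate_dispatch_inputs(inputs):
    """Hypothetical strict validation of the review.agent dispatch inputs."""
    errors = []
    if "pr_number" not in inputs:
        # Assumption: pr_number is treated as a required input.
        errors.append("pr_number is required")
    elif not is_json_number(inputs["pr_number"]):
        kind = type(inputs["pr_number"]).__name__
        errors.append(f"pr_number must be a number, got {kind}")
    return errors


ok = validate_dispatch_inputs({"pr_number": 42})     # unquoted integer
bad = validate_dispatch_inputs({"pr_number": "42"})  # quoted, as in the prompt example
```

Under a strict check like this one, the quoted form is rejected while the unquoted form passes, which is why the placeholder in the prompt should not be wrapped in quotes.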

github-actions · 2026-04-23T19:44:52Z

  create-pull-request:
    auto-merge: false
-    draft: true
+    draft: false


🟡 MODERATE — Duplicate reviews on bot-created PRs (Flagged by: 3/3 reviewers)

With draft: false, a non-draft PR created by the agent will immediately trigger review-on-open.agent (via pull_request: opened / ready_for_review). Then Step 8 also explicitly dispatches review.agent via workflow_dispatch.

Both workflows share the concurrency group review-<PR#> with cancel-in-progress: false, so they serialize rather than cancel — both run to completion, posting independent review comment sets on the same PR and doubling compute cost.

Concrete scenario: Agent-fix creates PR #100 (non-draft) → review-on-open.agent triggers on opened → Step 8 dispatches review.agent → two full expert reviews post on the same PR.

Suggested fix: Add a bot-user guard to review-on-open.agent (e.g., && github.event.pull_request.user.type != 'Bot') so it skips bot-created PRs where the explicit dispatch is the intended path. Alternatively, remove the explicit dispatch from Step 8 and rely solely on review-on-open.agent.
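The suggested guard can be sketched as a predicate over the webhook payload. This is only an illustration of the intended logic, assuming review-on-open.agent currently runs on `opened` and `ready_for_review` for non-draft PRs as described above; the function name `should_run_review_on_open` is hypothetical, and the actual workflow would express this as an `if:` condition rather than Python.

```python
def should_run_review_on_open(event):
    """Decide whether the auto-triggered review should run for this event.

    `event` mirrors the relevant parts of a pull_request webhook payload.
    """
    pr = event["pull_request"]
    if event["action"] not in ("opened", "ready_for_review"):
        return False
    if pr.get("draft", False):
        return False
    # Proposed guard: skip bot-created PRs, which get an explicit dispatch instead.
    if pr["user"]["type"] == "Bot":
        return False
    return True


human_pr = {"action": "opened",
            "pull_request": {"draft": False, "user": {"type": "User"}}}
bot_pr = {"action": "opened",
          "pull_request": {"draft": False, "user": {"type": "Bot"}}}
```

With the guard in place, `bot_pr` is skipped and only the Step 8 dispatch reviews it, while `human_pr` is still reviewed on open.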

github-actions · 2026-04-23T19:44:52Z

+
+dispatch_workflow({


🟢 MINOR — Review dispatched concurrently with build; may review stale code (Flagged by: 1/3 reviewers initially, confirmed 3/3 after follow-up)

All three dispatches (verify-build, polypilot-integration, review.agent) fire simultaneously. If verify-build fails, auto-fix-on-failure dispatches fix.agent which pushes new commits — but the review has already started (or completed) on the pre-fix code. The resulting review comments will refer to code that no longer exists.

This is correctly a minor concern: the happy path (build passes) is unaffected, and adding sequencing would increase latency for every pipeline run. But it may be worth documenting as a known trade-off with a comment here.

github-actions · 2026-04-23T19:44:52Z

+dispatch_workflow({
+  "workflow": "review.agent",
+  "inputs": {
+    "pr_number": "<PR number>"


🟡 MODERATE — pr_number placeholder will be interpreted as a string (Flagged by: 3/3 reviewers)

The placeholder "<PR number>" is enclosed in double-quotes, making it look like a JSON string. The compiled lock file's schema for review_agent declares "pr_number": { "type": "number" }.

An LLM following this template will likely substitute as "pr_number": "42" (string) instead of "pr_number": 42 (number), which may fail strict JSON Schema validation in the safe-outputs handler.

Suggested fix: Remove the quotes to signal numeric intent:

"pr_number": <PR number as integer, e.g. 42>

PureWeen merged commit 393b73c into main Apr 23, 2026
1 check passed

PureWeen deleted the fix/agent-fix-end-to-end branch April 23, 2026 19:33

github-actions Bot reviewed Apr 23, 2026


github-actions Bot mentioned this pull request Apr 23, 2026

[review-retro] Review Retrospective - PR #758 (feat: make agent-fix pipeline fully autonomous) #759

Open
