From 56f9ae2867e5097593edc7407da79bf89c846e82 Mon Sep 17 00:00:00 2001 From: Sasa Junuzovic Date: Tue, 17 Mar 2026 12:51:49 -0700 Subject: [PATCH 1/4] docs: update changelog, agentic-workflows, and test-analysis schedule - Changelog: added entries for PRs #140-#143, dependabot bumps, orchestrator v1/v2, PR #113 auto-merge success - Agentic-workflows: replaced removed orchestrator section with current bash implementation, updated agent inventory, added history entry - Test-analysis: changed cron from weekly to daily (0 9 * * *) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- .github/workflows/test-analysis.lock.yml | 5 +-- .github/workflows/test-analysis.md | 2 +- docs/agentic-workflows.md | 49 ++++++++++++++++++------ docs/changelog.md | 35 +++++++++++++++++ 4 files changed, 76 insertions(+), 15 deletions(-) diff --git a/.github/workflows/test-analysis.lock.yml b/.github/workflows/test-analysis.lock.yml index 1e2377a..5fea440 100644 --- a/.github/workflows/test-analysis.lock.yml +++ b/.github/workflows/test-analysis.lock.yml @@ -22,13 +22,12 @@ # For more information: https://github.github.com/gh-aw/introduction/overview/ # # -# gh-aw-metadata: {"schema_version":"v2","frontmatter_hash":"b9b1d4c9873e57676d3d4c1863c16aecf1033ce5568c4146fdf839ebc2774f19","compiler_version":"v0.60.0","strict":true} +# gh-aw-metadata: {"schema_version":"v2","frontmatter_hash":"fe7f86f8f8350e9d3c3e504af3f1dfd742745dece8a685f5e9c30d65d87840d2","compiler_version":"v0.60.0","strict":true} name: "Test Suite Analysis" "on": schedule: - - cron: "24 1 * * 1" - # Friendly format: weekly on monday (scattered) + - cron: "0 9 * * *" workflow_dispatch: permissions: {} diff --git a/.github/workflows/test-analysis.md b/.github/workflows/test-analysis.md index eeabff7..1aecac9 100644 --- a/.github/workflows/test-analysis.md +++ b/.github/workflows/test-analysis.md @@ -1,7 +1,7 @@ --- # Weekly test suite analysis on: - schedule: weekly on monday + schedule: '0 9 * * *' workflow_dispatch: permissions: diff --git a/docs/agentic-workflows.md b/docs/agentic-workflows.md index 3be3aad..0b3ca61 100644 --- a/docs/agentic-workflows.md +++ b/docs/agentic-workflows.md @@ -388,23 +388,38 @@ done This is a known limitation for solo repos. Agent PRs don't need this — the quality gate approves them. -### Pipeline Orchestrator (removed) +### Pipeline Orchestrator (bash-based) -We built a gh-aw agent orchestrator (`pipeline-orchestrator.md`, PR #130) to shepherd PRs through the full lifecycle. It was removed (PR #137) after failing in production — 7-10 minute runs for deterministic logic, auth failures preventing thread resolution, wrong action ordering. See `docs/auto_pr_orchestrator_aw.md` for the full postmortem. +The pipeline orchestrator (`pipeline-orchestrator.yml`) is a regular GitHub Action (pure bash, no LLM) that automates PR lifecycle management for `aw`-labeled PRs. Runs in seconds, not minutes. -**Replacement**: A regular GitHub Action (`pipeline-orchestrator.yml`) will handle the same logic in bash. See issue #135. +**Triggers:** +- `workflow_run`: after Review Responder completes +- `push: branches: [main]`: after merges (rebases stale PRs) +- `schedule: */15 * * * *`: cron (issue dispatch, cleanup) — planned for v3 +- `workflow_dispatch`: manual with optional `pr_number` input -**Current state**: Issue dispatch (implementer) and CI fixer dispatch are inactive until #135 is implemented. Review-responder and quality-gate continue to work normally. +**Capabilities (current):** +- **v1 — Thread resolution**: Queries review threads via GraphQL, resolves unresolved threads where the last commenter is not `copilot-pull-request-reviewer` (meaning someone addressed the comment) +- **v2 — Auto-rebase**: Detects PRs behind main via `mergeStateStatus: BEHIND`, rebases onto `origin/main`, force-pushes with lease. On conflict: adds `aw-needs-rebase` label + comment for human -### Review Responder Thread ID Lookup +**Planned:** +- **v3** — Issue dispatch (find `aw` issues, dispatch implementer), cron trigger, review-fix loop management +- **v4** — CI fixer dispatch +- **v5** — Stale PR cleanup -The Review Responder instructions include a step to query real `PRRT_` thread IDs via `gh api graphql` before resolving. Without this, the agent hallucinates thread IDs because the MCP server doesn't expose them (#114). The responder runs `gh api graphql` to fetch thread IDs upfront, then uses those real IDs in resolve calls. +**Auth:** Uses `GH_AW_WRITE_TOKEN` (PAT) for all operations. All user inputs passed via env vars to prevent script injection. -No `bash:` tool config is needed — the responder already has `--allow-all-tools` from the compiler default (adding explicit `bash:` would restrict the allowlist and break CI commands like `uv`, `ruff`, `pyright`, `pytest`). +**Concurrency:** `cancel-in-progress: true` prevents cron runs from stacking up. -**Limitation**: This fix only applies to PRs whose head branch has the updated `.md` (loaded via `{{#runtime-import}}`). Existing PRs need a rebase onto main to pick it up. +See issue #135 for the full roadmap. See `docs/auto_pr_orchestrator_aw.md` for the postmortem on the previous gh-aw agent approach. -This is a workaround until gh-aw upgrades their pinned MCP server (`github/gh-aw#21130`). +### Review Responder — Thread Resolution Delegation + +The review-responder no longer attempts to resolve threads. The `resolve-pull-request-review-thread` safe-output was removed (PR #141) because it never worked reliably — the MCP server doesn't expose `PRRT_` thread IDs, causing the agent to hallucinate IDs. + +Thread resolution is now handled by the pipeline orchestrator (`pipeline-orchestrator.yml`), which queries real thread IDs via `gh api graphql` and resolves them directly. This runs automatically after the responder completes (via `workflow_run` trigger). + +The responder's job is now: fix code, reply to threads, push changes. No thread resolution. @@ -516,8 +531,9 @@ gh run view --log-failed # View failed job logs | `code-health.md` | schedule (daily) / manual | Find refactoring/cleanup opportunities | `create-issue` (max 2), `dispatch-workflow` (implementer) | | `issue-implementer.md` | `workflow_dispatch` (issue number) | Implement fix from issue spec, open PR | `create-pull-request` (draft: false, auto-merge), `push-to-pull-request-branch` | | `ci-fixer.md` | `workflow_dispatch` (PR number) | Fix CI failures on agent PRs | `push-to-pull-request-branch`, `add-labels`, `add-comment` | -| `review-responder.md` | `pull_request_review` | Address Copilot review comments | `push-to-pull-request-branch`, `reply-to-review-comment`, `resolve-thread`, `add-labels` | -| `quality-gate.md` | `pull_request_review` | Evaluate quality + blast radius, approve or block | `submit-pull-request-review`, `add-comment` | +| `review-responder.md` | `pull_request_review` | Address Copilot review comments | `push-to-pull-request-branch`, `reply-to-review-comment`, `add-labels` | +| `quality-gate.md` | `pull_request_review` | Evaluate quality + blast radius, approve or block | `submit-pull-request-review`, `close-pull-request`, `add-comment`, `add-labels` | +| `pipeline-orchestrator.yml` | `workflow_run` / `push` / cron / manual | Resolve threads, rebase PRs, dispatch implementer | N/A (bash, not gh-aw) | ### Loop prevention @@ -631,4 +647,15 @@ The enhanced PR rescue (#116) went through three complete rewrites: - Removed dispatch references from code-health and test-analysis prompts. - **Key lesson**: gh-aw agents are for judgment (code review, implementation). Deterministic orchestration (check state → dispatch → resolve) should be regular bash workflows. +### 2026-03-17 — Pipeline orchestrator (bash) replaces gh-aw agent + +- PR #140: Quality gate can now close poor-quality PRs (`close-pull-request` safe-output). Aligned gh-aw version to v0.60.0. +- PR #141: Pipeline orchestrator v1 — thread resolver. Bash GitHub Action resolves addressed review threads via GraphQL. Removed broken `resolve-pull-request-review-thread` from review-responder. Closes #117. +- PR #142: Pipeline orchestrator v2 — auto-rebase. Detects PRs behind main, rebases and force-pushes. Security hardened (env vars, no script injection). +- PR #143: Fixed git fetch not creating remote tracking refs for rebase. +- PR #113: First end-to-end success — orchestrator rebased a stuck PR, CI passed, auto-merge fired. 7 seconds total. +- Dependabot PRs #125, #124, #123 merged (checkout v6, setup-uv v7, codeql-action v4). +- Issues closed: #89, #88, #117, #66. +- **Key insight**: Bash orchestrator in 7 seconds vs gh-aw agent in 7-10 minutes. Same logic, 60x faster. + diff --git a/docs/changelog.md b/docs/changelog.md index ec00f80..2eb761d 100644 --- a/docs/changelog.md +++ b/docs/changelog.md @@ -4,6 +4,41 @@ Append-only history of repo-level changes (CI, infra, shared config). Tool-speci --- +## feat: pipeline orchestrator (bash) + quality gate close + version alignment — 2026-03-17 + +**PR #140** — Quality gate close + gh-aw v0.60.0 alignment: +- Added `close-pull-request` safe-output to quality-gate so it can close poor-quality PRs instead of just commenting +- Aligned `copilot-setup-steps.yml` gh-aw version from v0.58.1 to v0.60.0 (matching local CLI compiler and lock files) +- Closes #89 + +**PRs #125, #124, #123** — Dependabot bumps: +- actions/checkout v4→v6 +- astral-sh/setup-uv v4→v7 +- github/codeql-action v3→v4 +- All rebased onto main and merged sequentially + +**PR #141** — Pipeline orchestrator v1 (thread resolver): +- New `pipeline-orchestrator.yml` — bash GitHub Action (no LLM) +- Resolves addressed review threads via GraphQL `resolveReviewThread` mutation +- Triggered by `workflow_run` when review-responder completes +- Removed broken `resolve-pull-request-review-thread` safe-output from review-responder (never worked due to MCP thread ID limitation) +- Closes #117 + +**PR #142** — Pipeline orchestrator v2 (auto-rebase): +- Added rebase capability: detects PRs behind main via `mergeStateStatus`, rebases and force-pushes +- Added `push: branches: [main]` trigger +- Security hardened: all step outputs via env vars, no `${{ }}` in run blocks + +**PR #143** — Orchestrator rebase fetch fix: +- Fixed `git fetch` not creating remote tracking refs + +**PR #113** — Auto-merged after orchestrator rebase: +- First end-to-end test: orchestrator rebased PR #113 (stuck behind main for days), CI passed, auto-merge fired. 7 seconds total. + +**Issues closed**: #89, #88, #117, #66 (via PR #113) + +--- + ## chore: remove pipeline orchestrator agent — 2026-03-17 **Problem**: The gh-aw orchestrator agent (PR #130) took 7-10 minutes per run for deterministic if/else logic. Over 22+ overnight runs it failed to resolve a single thread — auth failures, wrong action ordering, redundant review requests. Burned significant Opus inference tokens with no results. From efb6a0e41912d818a9d7980d5cf0c3548e8be90f Mon Sep 17 00:00:00 2001 From: Sasa Junuzovic Date: Tue, 17 Mar 2026 12:58:05 -0700 Subject: [PATCH 2/4] =?UTF-8?q?feat:=20pipeline=20orchestrator=20v3=20?= =?UTF-8?q?=E2=80=94=20issue=20dispatch,=20cron,=20review=20loop?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Adds three capabilities to the pipeline orchestrator: 1. Cron trigger (*/15 * * * *) — orchestrator runs automatically 2. Issue dispatch — finds oldest aw-labeled issue without aw-dispatched or aw-stuck labels, dispatches implementer 3. Review loop management — after resolving threads, removes review-response-attempted label to re-enable responder for next Copilot review round. Counter labels (review-response-N) cap at 3 rounds before marking PR as aw-stuck. Also: aw-stuck label makes orchestrator ignore stuck PRs so they don't block new issue dispatch. Permissions updated to issues:write and actions:write for label management and workflow dispatch. Part of #135 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- .github/workflows/pipeline-orchestrator.yml | 158 ++++++++++++++------ 1 file changed, 113 insertions(+), 45 deletions(-) diff --git a/.github/workflows/pipeline-orchestrator.yml b/.github/workflows/pipeline-orchestrator.yml index 8ca611f..a450312 100644 --- a/.github/workflows/pipeline-orchestrator.yml +++ b/.github/workflows/pipeline-orchestrator.yml @@ -3,9 +3,7 @@ # Automates PR lifecycle management for agent-created (aw-labeled) PRs: # v1: Resolves addressed review threads after responder replies # v2: Auto-rebases PRs that fall behind main -# -# Future versions will add: issue dispatch, CI fixer dispatch, stale -# PR cleanup, and cron-based scheduling. See issue #135. +# v3: Dispatches implementer for unworked issues, manages review loop # # Related issues: # #135 — Pipeline orchestrator (main tracking issue) @@ -16,6 +14,8 @@ name: "Pipeline Orchestrator" on: + schedule: + - cron: "*/15 * * * *" workflow_run: workflows: ["Review Responder"] types: [completed] @@ -35,15 +35,16 @@ concurrency: permissions: contents: write pull-requests: write - issues: read + issues: write + actions: write jobs: orchestrate: runs-on: ubuntu-latest - # Run on: manual dispatch, push to main, or successful responder run if: >- github.event_name == 'workflow_dispatch' || github.event_name == 'push' || + github.event_name == 'schedule' || github.event.workflow_run.conclusion == 'success' env: GH_TOKEN: ${{ secrets.GH_AW_WRITE_TOKEN }} @@ -69,29 +70,75 @@ jobs: if [[ "$EVENT_NAME" == "workflow_dispatch" && -n "$PR_NUMBER_INPUT" ]]; then PR_NUMBERS="$PR_NUMBER_INPUT" else + # Find open aw PRs, excluding stuck ones PR_NUMBERS=$(gh api graphql -f query=' query($owner: String!, $repo: String!) { repository(owner: $owner, name: $repo) { pullRequests(labels: ["aw"], states: OPEN, first: 10) { - nodes { number } + nodes { + number + labels(first: 10) { nodes { name } } + } } } }' -f owner="$OWNER" -f repo="$REPO" \ - --jq '.data.repository.pullRequests.nodes[].number') + --jq '[.data.repository.pullRequests.nodes[] | select([.labels.nodes[].name] | index("aw-stuck") | not) | .number] | .[]') fi if [[ -z "$PR_NUMBERS" ]]; then - echo "No open aw-labeled PRs found. Nothing to do." + echo "No active aw-labeled PRs found." echo "has_prs=false" >> "$GITHUB_OUTPUT" + else + echo "has_prs=true" >> "$GITHUB_OUTPUT" + echo "pr_numbers=$(echo $PR_NUMBERS | tr '\n' ' ')" >> "$GITHUB_OUTPUT" + echo "Found active PRs: $PR_NUMBERS" + fi + + - name: Dispatch implementer for unworked issues + if: steps.find-prs.outputs.has_prs == 'false' + run: | + set -euo pipefail + + OWNER="${GITHUB_REPOSITORY_OWNER}" + REPO="${GITHUB_REPOSITORY#*/}" + + # Find oldest aw-labeled issue without aw-dispatched or aw-stuck + ISSUE=$(gh api graphql -f query=' + query($owner: String!, $repo: String!) { + repository(owner: $owner, name: $repo) { + issues(labels: ["aw"], states: OPEN, first: 20, orderBy: {field: CREATED_AT, direction: ASC}) { + nodes { + number + title + labels(first: 10) { nodes { name } } + } + } + } + }' -f owner="$OWNER" -f repo="$REPO" \ + --jq '[.data.repository.issues.nodes[] | select( + ([.labels.nodes[].name] | index("aw-dispatched") | not) + and ([.labels.nodes[].name] | index("aw-stuck") | not) + )] | .[0]') + + if [[ "$ISSUE" == "null" || -z "$ISSUE" ]]; then + echo "No eligible issues to dispatch. Pipeline idle." exit 0 fi - echo "has_prs=true" >> "$GITHUB_OUTPUT" - # Space-delimit PR numbers for safe single-line output - echo "pr_numbers=$(echo $PR_NUMBERS | tr '\n' ' ')" >> "$GITHUB_OUTPUT" - echo "Found PRs: $PR_NUMBERS" + ISSUE_NUM=$(echo "$ISSUE" | jq -r '.number') + ISSUE_TITLE=$(echo "$ISSUE" | jq -r '.title') + + echo "Dispatching implementer for issue #${ISSUE_NUM}: ${ISSUE_TITLE}" + + # Add aw-dispatched label to prevent re-dispatch + gh issue edit "$ISSUE_NUM" --add-label "aw-dispatched" + + # Dispatch the implementer + gh workflow run issue-implementer.lock.yml -f issue_number="$ISSUE_NUM" + + echo "✅ Dispatched implementer for issue #${ISSUE_NUM}" - - name: Resolve addressed review threads + - name: Resolve threads and manage review loop if: steps.find-prs.outputs.has_prs == 'true' env: PR_NUMBERS: ${{ steps.find-prs.outputs.pr_numbers }} @@ -103,12 +150,14 @@ jobs: TOTAL_RESOLVED=0 for PR in $PR_NUMBERS; do - echo "::group::Resolving threads on PR #${PR}" + echo "::group::Processing threads on PR #${PR}" - THREADS=$(gh api graphql -f query=' + # Query threads and current labels + PR_INFO=$(gh api graphql -f query=' query($owner: String!, $repo: String!, $pr: Int!) { repository(owner: $owner, name: $repo) { pullRequest(number: $pr) { + labels(first: 15) { nodes { name } } reviewThreads(first: 100) { nodes { id @@ -123,9 +172,16 @@ jobs: } } } - }' -f owner="$OWNER" -f repo="$REPO" -F pr="$PR" \ - --jq '.data.repository.pullRequest.reviewThreads.nodes') + }' -f owner="$OWNER" -f repo="$REPO" -F pr="$PR") || { + echo " ⚠️ PR #${PR}: Failed to query. Skipping." + echo "::endgroup::" + continue + } + + LABELS=$(echo "$PR_INFO" | jq -r '[.data.repository.pullRequest.labels.nodes[].name]') + THREADS=$(echo "$PR_INFO" | jq -r '.data.repository.pullRequest.reviewThreads.nodes') + # Resolve addressed threads RESOLVABLE=$(echo "$THREADS" | jq -r ' [.[] | select( .isResolved == false @@ -133,30 +189,49 @@ jobs: and (.comments.nodes[-1].author?.login // "" | . != "copilot-pull-request-reviewer") )] | .[].id') - if [[ -z "$RESOLVABLE" ]]; then - echo "PR #${PR}: No threads to resolve." - echo "::endgroup::" - continue + if [[ -n "$RESOLVABLE" ]]; then + COUNT=0 + for THREAD_ID in $RESOLVABLE; do + echo " Resolving thread: ${THREAD_ID}" + gh api graphql -f query=' + mutation($threadId: ID!) { + resolveReviewThread(input: {threadId: $threadId}) { + thread { isResolved } + } + }' -f threadId="$THREAD_ID" \ + --jq '.data.resolveReviewThread.thread.isResolved' || { + echo " ⚠️ Failed to resolve thread ${THREAD_ID}" + continue + } + COUNT=$((COUNT + 1)) + done + echo " Resolved ${COUNT} thread(s)." + TOTAL_RESOLVED=$((TOTAL_RESOLVED + COUNT)) + else + echo " No threads to resolve." fi - COUNT=0 - for THREAD_ID in $RESOLVABLE; do - echo " Resolving thread: ${THREAD_ID}" - gh api graphql -f query=' - mutation($threadId: ID!) { - resolveReviewThread(input: {threadId: $threadId}) { - thread { isResolved } - } - }' -f threadId="$THREAD_ID" \ - --jq '.data.resolveReviewThread.thread.isResolved' || { - echo " ⚠️ Failed to resolve thread ${THREAD_ID}" - continue - } - COUNT=$((COUNT + 1)) - done + # Review loop management + HAS_RESPONSE_LABEL=$(echo "$LABELS" | jq -r 'if index("review-response-attempted") then "yes" else "" end') + + if [[ -n "$HAS_RESPONSE_LABEL" ]]; then + # Count existing review-response-N labels + ROUND=$(echo "$LABELS" | jq -r '[.[] | select(startswith("review-response-"))] | length') + + if [[ "$ROUND" -ge 3 ]]; then + # Max rounds reached — mark as stuck + echo " ⚠️ PR #${PR}: Review-response loop hit 3 rounds. Marking as stuck." + gh pr edit "$PR" --add-label "aw-stuck" + gh pr comment "$PR" --body "⚠️ Pipeline orchestrator: review-response loop reached 3 rounds without resolution. Marking as stuck for human review." + else + # Remove review-response-attempted to re-enable responder + NEXT_ROUND=$((ROUND + 1)) + echo " Enabling review round ${NEXT_ROUND} for PR #${PR}" + gh pr edit "$PR" --remove-label "review-response-attempted" + gh pr edit "$PR" --add-label "review-response-${NEXT_ROUND}" + fi + fi - echo "PR #${PR}: Resolved ${COUNT} thread(s)." - TOTAL_RESOLVED=$((TOTAL_RESOLVED + COUNT)) echo "::endgroup::" done @@ -179,7 +254,6 @@ jobs: for PR in $PR_NUMBERS; do echo "::group::Checking rebase for PR #${PR}" - # Get PR branch and merge state PR_DATA=$(gh api graphql -f query=' query($owner: String!, $repo: String!, $pr: Int!) { repository(owner: $owner, name: $repo) { @@ -202,14 +276,12 @@ jobs: echo " Branch: $BRANCH | Merge state: $MERGE_STATE" - # Only rebase if behind main or has conflicts (DIRTY) if [[ "$MERGE_STATE" != "BEHIND" && "$MERGE_STATE" != "DIRTY" ]]; then echo " PR #${PR}: State is $MERGE_STATE — no rebase needed." echo "::endgroup::" continue fi - # Skip if already flagged for human intervention if [[ -n "$HAS_REBASE_LABEL" ]]; then echo " PR #${PR}: Already has aw-needs-rebase label. Skipping." echo "::endgroup::" @@ -218,7 +290,6 @@ jobs: echo " Attempting rebase of $BRANCH onto main..." - # Fetch latest refs for main and the PR branch git fetch origin main "$BRANCH":"refs/remotes/origin/$BRANCH" || { echo " ⚠️ PR #${PR}: Failed to fetch branch. Skipping." echo "::endgroup::" @@ -227,7 +298,6 @@ jobs: git checkout -B "$BRANCH" "origin/$BRANCH" if git rebase origin/main; then - # Rebase succeeded — force push if git push origin "$BRANCH" --force-with-lease; then echo " ✅ PR #${PR}: Rebased and pushed successfully." TOTAL_REBASED=$((TOTAL_REBASED + 1)) @@ -237,14 +307,12 @@ jobs: gh pr edit "$PR" --add-label "aw-needs-rebase" fi else - # Rebase failed — conflicts git rebase --abort 2>/dev/null || true echo " ❌ PR #${PR}: Rebase conflicts detected." gh pr comment "$PR" --body "❌ Pipeline orchestrator: rebase onto main failed due to conflicts. Manual rebase needed." gh pr edit "$PR" --add-label "aw-needs-rebase" fi - # Return to detached HEAD so next iteration is clean git checkout --detach HEAD 2>/dev/null || true echo "::endgroup::" From 748bb0e8eb6a6e3ec76afd8e4d35e146049c7dd2 Mon Sep 17 00:00:00 2001 From: Sasa Junuzovic Date: Tue, 17 Mar 2026 13:51:46 -0700 Subject: [PATCH 3/4] fix: responder only adds label when addressing threads + orchestrator detects noop Responder: moved review-response-attempted label to after confirming unresolved threads exist. Previously the label was added on noop paths (e.g. quality gate approval trigger), locking out the responder from addressing actual Copilot review comments. Orchestrator: detects when review-response-attempted label is present but all unresolved threads are still from copilot-pull-request-reviewer (nobody addressed them). Removes the label so the responder can retry. Discovered during live test: quality gate approval triggered responder, responder noop'd (no comments), but still got the label. Part of #135 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- .github/workflows/pipeline-orchestrator.yml | 42 ++++++++++++++------- .github/workflows/review-responder.md | 14 ++++--- 2 files changed, 37 insertions(+), 19 deletions(-) diff --git a/.github/workflows/pipeline-orchestrator.yml b/.github/workflows/pipeline-orchestrator.yml index a450312..1797f06 100644 --- a/.github/workflows/pipeline-orchestrator.yml +++ b/.github/workflows/pipeline-orchestrator.yml @@ -214,21 +214,37 @@ jobs: # Review loop management HAS_RESPONSE_LABEL=$(echo "$LABELS" | jq -r 'if index("review-response-attempted") then "yes" else "" end') + # Check if there are unresolved threads that nobody has addressed yet + UNADDRESSED=$(echo "$THREADS" | jq -r ' + [.[] | select( + .isResolved == false + and (.comments.nodes | length > 0) + and (.comments.nodes[-1].author?.login // "" | . == "copilot-pull-request-reviewer") + )] | length') + if [[ -n "$HAS_RESPONSE_LABEL" ]]; then - # Count existing review-response-N labels - ROUND=$(echo "$LABELS" | jq -r '[.[] | select(startswith("review-response-"))] | length') - - if [[ "$ROUND" -ge 3 ]]; then - # Max rounds reached — mark as stuck - echo " ⚠️ PR #${PR}: Review-response loop hit 3 rounds. Marking as stuck." - gh pr edit "$PR" --add-label "aw-stuck" - gh pr comment "$PR" --body "⚠️ Pipeline orchestrator: review-response loop reached 3 rounds without resolution. Marking as stuck for human review." - else - # Remove review-response-attempted to re-enable responder - NEXT_ROUND=$((ROUND + 1)) - echo " Enabling review round ${NEXT_ROUND} for PR #${PR}" + if [[ "$UNADDRESSED" -gt 0 && -z "$RESOLVABLE" ]]; then + # Label present but threads untouched — responder noop'd (e.g. triggered + # by a non-comment review like quality gate approval). Remove the label + # so the responder can try again on the next review with actual comments. + echo " 🔄 PR #${PR}: response label present but $UNADDRESSED threads still unaddressed. Removing label." gh pr edit "$PR" --remove-label "review-response-attempted" - gh pr edit "$PR" --add-label "review-response-${NEXT_ROUND}" + else + # Count existing review-response-N labels + ROUND=$(echo "$LABELS" | jq -r '[.[] | select(startswith("review-response-"))] | length') + + if [[ "$ROUND" -ge 3 ]]; then + # Max rounds reached — mark as stuck + echo " ⚠️ PR #${PR}: Review-response loop hit 3 rounds. Marking as stuck." + gh pr edit "$PR" --add-label "aw-stuck" + gh pr comment "$PR" --body "⚠️ Pipeline orchestrator: review-response loop reached 3 rounds without resolution. Marking as stuck for human review." + else + # Remove review-response-attempted to re-enable responder + NEXT_ROUND=$((ROUND + 1)) + echo " Enabling review round ${NEXT_ROUND} for PR #${PR}" + gh pr edit "$PR" --remove-label "review-response-attempted" + gh pr edit "$PR" --add-label "review-response-${NEXT_ROUND}" + fi fi fi diff --git a/.github/workflows/review-responder.md b/.github/workflows/review-responder.md index 9fa6920..c55d41a 100644 --- a/.github/workflows/review-responder.md +++ b/.github/workflows/review-responder.md @@ -47,13 +47,13 @@ This workflow runs when a review is submitted on a pull request. 1. First, check if the PR has the `aw` label. If it does NOT have the `aw` label, stop immediately — this workflow only handles agent-created PRs. -2. Check the review that triggered this workflow. If the review has no comments (e.g., a plain approval with no inline comments), stop — there is nothing to address. +2. Check the review that triggered this workflow. If the review has no comments (e.g., a plain approval with no inline comments), stop — there is nothing to address. Do NOT add any labels when stopping here. -3. Check if the PR already has the label `review-response-attempted`. If it does, add a comment to the PR saying "Review response already attempted — stopping to prevent loops. Manual intervention needed." and stop. +3. Read the unresolved review comment threads on the PR (not just the latest review — get all unresolved threads). If there are no unresolved threads, stop — there is nothing to address. Do NOT add any labels when stopping here. -4. Add the label `review-response-attempted` to the PR. +4. Check if the PR already has the label `review-response-attempted`. If it does, add a comment to the PR saying "Review response already attempted — stopping to prevent loops. Manual intervention needed." and stop. -5. Read the unresolved review comment threads on the PR (not just the latest review — get all unresolved threads). If there are more than 10 unresolved threads, address the first 10 and leave a summary comment on the PR noting how many remain for manual follow-up. +5. Add the label `review-response-attempted` to the PR. Only add this label HERE — after confirming there are actual threads to address. 6. For each unresolved review comment thread (up to 10): a. Read the comment and understand what change is being requested @@ -61,9 +61,11 @@ This workflow runs when a review is submitted on a pull request. c. Make the requested fix in the code (edit the file locally — do NOT push yet) d. Reply to the comment thread explaining what you changed -7. After addressing all comments, run the CI checks locally to make sure your fixes don't break anything: `uv sync && uv run ruff check --fix . && uv run ruff format . && uv run pyright && uv run pytest --cov --cov-fail-under=80 -v` +7. If there are more than 10 unresolved threads, leave a summary comment on the PR noting how many remain for manual follow-up. -8. Push all changes in a single commit with message "fix: address review comments". Reply to all threads BEFORE pushing — replies after a push will appear on outdated code. +8. After addressing all comments, run the CI checks locally to make sure your fixes don't break anything: `uv sync && uv run ruff check --fix . && uv run ruff format . && uv run pyright && uv run pytest --cov --cov-fail-under=80 -v` + +9. Push all changes in a single commit with message "fix: address review comments". Reply to all threads BEFORE pushing — replies after a push will appear on outdated code. If a review comment requests a change that would be architecturally significant or you're unsure about, reply to the thread explaining your concern rather than making the change blindly. From 43331b6fbd573989ba8b19e60bbbc18dafa03ed5 Mon Sep 17 00:00:00 2001 From: Sasa Junuzovic Date: Tue, 17 Mar 2026 14:02:58 -0700 Subject: [PATCH 4/4] fix: handle unaddressed threads from noop'd responder runs MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Responder: step 2 now checks for existing unresolved threads even when the triggering review has no comments. Previously a clean re-review (e.g. from Copilot re-request) would noop even with unaddressed threads from a prior review. Orchestrator: requests Copilot re-review when unaddressed threads exist and no review-response-attempted label is present. This triggers the event chain: Copilot review → responder → orchestrator. Also requests re-review when removing a stale response label. Discovered during live test: quality gate approval triggered responder noop, leaving 4 Copilot threads permanently unaddressed. Part of #135 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- .github/workflows/pipeline-orchestrator.yml | 8 ++++++++ .github/workflows/review-responder.md | 2 +- 2 files changed, 9 insertions(+), 1 deletion(-) diff --git a/.github/workflows/pipeline-orchestrator.yml b/.github/workflows/pipeline-orchestrator.yml index 1797f06..384e3f1 100644 --- a/.github/workflows/pipeline-orchestrator.yml +++ b/.github/workflows/pipeline-orchestrator.yml @@ -229,6 +229,9 @@ jobs: # so the responder can try again on the next review with actual comments. echo " 🔄 PR #${PR}: response label present but $UNADDRESSED threads still unaddressed. Removing label." gh pr edit "$PR" --remove-label "review-response-attempted" + # Request Copilot re-review to trigger the responder + echo " 🔄 Requesting Copilot re-review to trigger responder." + gh pr edit "$PR" --add-reviewer "@copilot" 2>/dev/null || true else # Count existing review-response-N labels ROUND=$(echo "$LABELS" | jq -r '[.[] | select(startswith("review-response-"))] | length') @@ -246,6 +249,11 @@ jobs: gh pr edit "$PR" --add-label "review-response-${NEXT_ROUND}" fi fi + elif [[ "$UNADDRESSED" -gt 0 ]]; then + # No response label and unaddressed threads — responder hasn't run yet + # or was never triggered. Request Copilot re-review to kick off the chain. + echo " 🔄 PR #${PR}: $UNADDRESSED unaddressed threads, no response label. Requesting Copilot review." + gh pr edit "$PR" --add-reviewer "@copilot" 2>/dev/null || true fi echo "::endgroup::" diff --git a/.github/workflows/review-responder.md b/.github/workflows/review-responder.md index c55d41a..6d8ac9c 100644 --- a/.github/workflows/review-responder.md +++ b/.github/workflows/review-responder.md @@ -47,7 +47,7 @@ This workflow runs when a review is submitted on a pull request. 1. First, check if the PR has the `aw` label. If it does NOT have the `aw` label, stop immediately — this workflow only handles agent-created PRs. -2. Check the review that triggered this workflow. If the review has no comments (e.g., a plain approval with no inline comments), stop — there is nothing to address. Do NOT add any labels when stopping here. +2. Check the review that triggered this workflow. If the review has no comments (e.g., a plain approval with no inline comments), check if there are any unresolved review threads on the PR. If the review has no comments AND there are no unresolved threads, stop — there is nothing to address. Do NOT add any labels when stopping here. If the review has no comments BUT there ARE unresolved threads from a previous review, continue to step 3 to address those threads. 3. Read the unresolved review comment threads on the PR (not just the latest review — get all unresolved threads). If there are no unresolved threads, stop — there is nothing to address. Do NOT add any labels when stopping here.