Skip to content

DO NOT MERGE — responder test sandbox v3#159

Closed
microsasa wants to merge 9 commits intomainfrom
test/responder-sandbox
Closed

DO NOT MERGE — responder test sandbox v3#159
microsasa wants to merge 9 commits intomainfrom
test/responder-sandbox

Conversation

@microsasa
Copy link
Owner

Test PR for responder agent with REST API fallback instruction. DO NOT MERGE.

Sasa Junuzovic and others added 9 commits March 17, 2026 21:28
Exact match to the version that successfully addressed review
threads (run 23117978475, 2026-03-15), with only two changes:
1. Added if: label gate for aw PRs
2. Removed resolve-pull-request-review-thread safe-output and
   step 6e (orchestrator handles resolution)

No other changes. This commit should not be modified.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Replace pull_request_review trigger with workflow_dispatch + pr_number
input. No more loops from stray review events. Orchestrator or
manual dispatch controls when responder runs.

Removed aw label check and triggering review check — no longer
applicable with workflow_dispatch. Renumbered steps.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
If CI checks fail after addressing review comments, fix the
issues and re-run until they pass. Do not push broken code.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The reply-to-pull-request-review-comment and push-to-pull-request-branch
safe output handlers default to target 'triggering' which requires a
pull_request event context. Setting target to '*' lets the agent
specify the PR number in each message, enabling workflow_dispatch.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Same fix as the responder: set target '*' on push-to-pull-request-branch
so the agent specifies the PR number in its message (no event context
needed). Added labels: [aw] to restrict pushes. Added checkout fetch
config so all remote branches are available.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
MCP tools intermittently return empty results for review data.
The agent sometimes falls back to REST API on its own (PR #152)
but not always (PR #158). Explicitly instruct it to use the
REST API endpoints for comments and reviews.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Intentionally imperfect code for testing the review-responder agent.
DO NOT MERGE this branch to main.
Fix broken_function return type mismatch: wrap int result with str()
to match declared return type.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@microsasa microsasa added the aw Created by agentic workflow label Mar 19, 2026
Copilot AI review requested due to automatic review settings March 19, 2026 07:34
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR is a responder-agent sandbox/testing change set, introducing an intentionally imperfect Python “sandbox” module and altering gh-aw agent workflows to be manually dispatched (with expanded safe-output targeting and checkout ref fetching) for testing a REST-API-based review responder flow.

Changes:

  • Add tests/test_responder_sandbox.py with simple helper functions intended to be “imperfect” for responder testing.
  • Convert review-responder from a PR-review-triggered workflow to a workflow_dispatch workflow with pr_number input.
  • Update review-responder and ci-fixer workflows (and compiled lock files) to fetch full history/refs and set safe-outputs target: "*" (plus add aw labels via config).

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 9 comments.

Show a summary per file
File Description
tests/test_responder_sandbox.py Adds a sandbox module under tests/ with intentionally imperfect code (but currently not real pytest tests).
.github/workflows/review-responder.md Switches responder to manual dispatch and updates instructions + safe-output targeting.
.github/workflows/review-responder.lock.yml Compiled workflow updates reflecting new trigger/config (fetch depth/refs, safe-output target).
.github/workflows/ci-fixer.md Adds checkout ref fetching and safe-output targeting (and config label addition).
.github/workflows/ci-fixer.lock.yml Compiled workflow updates reflecting checkout ref fetching and safe-output target.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +5 to +10
total = 0
for i in lst:
total = total + i
return total


Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Replaced the manual loop/accumulator with sum(lst) for simplicity.

Generated by Review Responder

return result


def broken_function(value: int) -> str:
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Renamed broken_function to increment_to_string to accurately describe its behavior (increments an int and returns it as a string).

Generated by Review Responder

Comment on lines 35 to 38
push-to-pull-request-branch:
target: "*"
labels: [aw]
github-token: ${{ secrets.GH_AW_WRITE_TOKEN }}
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Partially addressed. Added an explicit aw label gate as step 1 in the workflow instructions — the agent will now verify the target PR has the aw label and stop if it doesn't.

However, removing labels: [aw] from the push-to-pull-request-branch frontmatter config requires recompilation of the .lock.yml file via gh aw compile, which I cannot run. This should be done manually as a follow-up.

Generated by Review Responder

Comment on lines 36 to 39
push-to-pull-request-branch:
target: "*"
labels: [aw]
github-token: ${{ secrets.GH_AW_WRITE_TOKEN }}
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Partially addressed. Added an explicit aw label gate as step 1 in the ci-fixer instructions — the agent will now verify the target PR has the aw label before proceeding.

Same as the review-responder: removing labels: [aw] from the push-to-pull-request-branch frontmatter config requires .lock.yml recompilation via gh aw compile, which I cannot run. This should be done manually as a follow-up.

Generated by Review Responder

Comment on lines +62 to 66
4. For each unresolved review comment thread (up to 10):
a. Read the comment and understand what change is being requested
b. Read the relevant file and surrounding code context
c. Make the requested fix in the code (edit the file locally — do NOT push yet)
c. Make the requested fix in the code
d. Reply to the comment thread explaining what you changed
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Added step 4: "Fetch and checkout the PR's head branch before making any code changes. Use the PR number from $\{\{ inputs.pr_number }} to determine the head ref and switch to it." This ensures the agent explicitly checks out the correct branch rather than relying on the automatic checkout step (which doesn't fire for workflow_dispatch).

Generated by Review Responder

Comment on lines +16 to +19
checkout:
fetch: ["*"]
fetch-depth: 0

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Added step 6: "Fetch and checkout the PR's head branch before making any code changes. Use the PR number from $\{\{ github.event.inputs.pr_number }} to determine the head ref and switch to it." This ensures the ci-fixer agent explicitly checks out the PR branch before applying fixes.

Generated by Review Responder

Comment on lines +1 to +4
# Responder Sandbox — intentionally imperfect code for testing


def calculate_total(lst: list[int]) -> int:
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Renamed the file from tests/test_responder_sandbox.py to tests/responder_sandbox.py (removed the test_ prefix) so pytest no longer collects it as a test module.

Generated by Review Responder

Comment on lines +12 to +16
result = ""
for item in data:
suffix = "\n" if verbose else ", "
result = result + str(item) + suffix
return result
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Replaced the quadratic string concatenation loop with separator.join(data). This also eliminates the trailing delimiter issue since join only places the separator between elements.

Generated by Review Responder

2. Add the label `review-response-attempted` to the PR.

5. Read the unresolved review comment threads on the PR (not just the latest review — get all unresolved threads). If there are more than 10 unresolved threads, address the first 10 and leave a summary comment on the PR noting how many remain for manual follow-up.
3. Read the unresolved review comment threads on the PR using the GitHub REST API: fetch `https://api.github.com/repos/$OWNER/$REPO/pulls/$PR_NUMBER/comments` and `https://api.github.com/repos/$OWNER/$REPO/pulls/$PR_NUMBER/reviews`. If there are more than 10 unresolved threads, address the first 10 and leave a summary comment on the PR noting how many remain for manual follow-up.
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Rewrote step 5 (previously step 3) to:

  • Use {owner}/{repo}/{pr_number} placeholder notation instead of non-existent $OWNER/$REPO/$PR_NUMBER env vars
  • Note that the REST API doesn't expose thread resolution state, and instruct the agent to use in_reply_to_id to identify top-level threads
  • Add guidance to fall back to REST when MCP tools return empty results
  • Removed the /reviews endpoint reference since comments are sufficient for identifying threads

Generated by Review Responder

@microsasa
Copy link
Owner Author

Quality Gate — Manual Review Required

Decision: NOT auto-approving (HIGH impact + explicit "DO NOT MERGE")

Evaluation Summary

Dimension Assessment
aw label ✅ Present
Copilot review trigger ✅ Confirmed
PR title ⚠️ Explicitly says "DO NOT MERGE"
Code quality N/A (test sandbox)
Impact HIGH

High-Impact Areas

  1. Workflow trigger change (review-responder.md): Switches from event-driven pull_request_review trigger to workflow_dispatch. This fundamentally changes how the responder is invoked and removes the automatic label gate (if: contains(...labels..., 'aw')).

  2. Safe-outputs target: "*" (both workflows): Adds target: "*" to push-to-pull-request-branch and reply-to-pull-request-review-comment, broadening which PRs these agents can push to. This is a security-sensitive configuration change.

  3. Full repo fetch (fetch: ["*"], fetch-depth: 0): Both workflows now fetch full history and all branches, which increases checkout time and attack surface.

  4. Not real tests: tests/test_responder_sandbox.py contains sandbox functions for testing the responder agent — no actual test_ functions or assertions.

Recommendation

This is a test sandbox PR explicitly marked "DO NOT MERGE." A human reviewer should evaluate whether the workflow configuration changes (especially target: "*" and the trigger model change) are intentional and safe before any merge consideration.

Generated by Quality Gate for issue #159 ·

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

aw Created by agentic workflow review-response-attempted

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants