From 9cad783cff13738ead6739a0ebbb20b83b7f86f9 Mon Sep 17 00:00:00 2001 From: Verky Yi Date: Wed, 22 Apr 2026 18:54:42 +0000 Subject: [PATCH 01/14] agent-team: wire Context7 MCP into planner + implementer MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Gives both agents fresh library/API docs via Upstash's Context7 MCP (stdio, npx-launched). Planner uses it to ground plan file paths against real framework surfaces; implementer uses it to write calls against current APIs instead of training-data ones. Planner: widened network from `defaults` to include the `node` preset (for the npm fetch of @upstash/context7-mcp) + context7.com. Implementer: `node` already allowed, just appended context7.com. Prompt bodies unchanged — Claude discovers MCP tools automatically; explicit "check Context7 first" nudges tend to cause over-lookup. Co-Authored-By: Claude Opus 4.7 (1M context) --- catalog/agent-team/implementer-agent.md | 7 +++++++ catalog/agent-team/planner-agent.md | 12 +++++++++++- 2 files changed, 18 insertions(+), 1 deletion(-) diff --git a/catalog/agent-team/implementer-agent.md b/catalog/agent-team/implementer-agent.md index 163bde6..fe1a899 100644 --- a/catalog/agent-team/implementer-agent.md +++ b/catalog/agent-team/implementer-agent.md @@ -41,6 +41,7 @@ network: - rust - dotnet - java + - "context7.com" checkout: fetch-depth: 0 @@ -52,6 +53,12 @@ tools: bash: true web-fetch: +mcp-servers: + context7: + command: npx + args: ["-y", "@upstash/context7-mcp"] + allowed: [resolve-library-id, get-library-docs] + safe-outputs: # Trusted-input pipeline (dispatched by the planner in our own repo). # Skip the ~1-min threat-detection classifier to save wall-clock per run. 
diff --git a/catalog/agent-team/planner-agent.md b/catalog/agent-team/planner-agent.md index 527ad79..f61bf71 100644 --- a/catalog/agent-team/planner-agent.md +++ b/catalog/agent-team/planner-agent.md @@ -29,7 +29,11 @@ permissions: contents: read issues: read -network: defaults +network: + allowed: + - defaults + - node + - "context7.com" tools: github: @@ -37,6 +41,12 @@ tools: min-integrity: none bash: true +mcp-servers: + context7: + command: npx + args: ["-y", "@upstash/context7-mcp"] + allowed: [resolve-library-id, get-library-docs] + safe-outputs: # Trusted-input pipeline (dispatched by the spec-agent in our own repo). # Skip the ~1-min threat-detection classifier to save wall-clock per run. From 1342e48a024761e18c6cab6c7edb6477e14cddd0 Mon Sep 17 00:00:00 2001 From: Verky Yi Date: Thu, 23 Apr 2026 00:05:35 +0000 Subject: [PATCH 02/14] docs: spec for agent-team auto-rebase (v0.2) Closes the "PR conflicts after human leaves" gap: implementer rebases at start of its own runs, and a scheduled sweep rebases open agent-team PRs that fall behind main. Mechanical conflicts resolve silently; semantic ones escalate via state:blocked. --- ...026-04-23-agent-team-auto-rebase-design.md | 120 ++++++++++++++++++ 1 file changed, 120 insertions(+) create mode 100644 docs/superpowers/specs/2026-04-23-agent-team-auto-rebase-design.md diff --git a/docs/superpowers/specs/2026-04-23-agent-team-auto-rebase-design.md b/docs/superpowers/specs/2026-04-23-agent-team-auto-rebase-design.md new file mode 100644 index 0000000..5a90004 --- /dev/null +++ b/docs/superpowers/specs/2026-04-23-agent-team-auto-rebase-design.md @@ -0,0 +1,120 @@ +# agent-team auto-rebase — design + +**Status**: draft +**Date**: 2026-04-23 +**Scope**: v0.2 addition to the `agent-team` catalog entry + +## Problem + +Today the agent-team pipeline produces a draft PR, approves it, and stops. A human merges. 
If `main` advances between "approved" and "merged" and the changes touch overlapping files, the PR goes stale or conflicts, and the human has to rebase manually — a chore that should be the team's responsibility, not the stakeholder's. + +The agent-team should behave like a real engineering team: keep its own PRs mergeable, and only surface to the human when there's something to *decide* (semantic conflict, ambiguous intent), not when there's something to *do* (mechanical rebase). + +## Goals + +- Any `agent-team:pr` draft PR that falls behind `main` is rebased automatically without human intervention. +- The human sees the PR only when it is ready to merge or genuinely blocked on judgment. +- Mechanical conflicts (non-overlapping hunks, trivially resolvable text conflicts with tests still green) are resolved silently. +- Semantic conflicts (overlapping logic, test regressions after resolve, or low-confidence resolution) escalate via the existing `state:blocked` channel. +- No new agent role; reuse the implementer with an added mode. + +## Non-goals + +- Auto-merging PRs. Humans still merge. +- Resolving conflicts on non-agent-team PRs. +- Webhook-driven rebase on every push to `main` (stampede risk; scheduled sweep is enough for v0.2). +- Rebasing draft PRs that aren't labeled `agent-team:pr`. + +## Design + +### Trigger model + +Two triggers, covering every window in which `main` can advance: + +| Window | Handled by | +|---|---| +| Before / during an implementer run (initial impl or kickback) | Implementer rebases onto `main` as the first step of its own run | +| PR waiting in review, or approved and waiting for human merge | Scheduled sweep dispatches implementer in rebase-only mode | + +No label, no slash command. Rebase is a chore, not a decision — the human doesn't opt in. + +### Implementer: new `mode` input + +Add an input to `implementer-agent.md`: + +```yaml +mode: + description: Implementer behavior mode. 
`impl` (default) runs the normal spec→plan→PR flow. `rebase` rebases the existing PR onto main and stops. + required: false + type: string + default: "impl" +``` + +**`mode: impl`** (unchanged from today, plus one addition): after checking out the branch in step 2, run `git fetch origin main && git rebase origin/main` before making any edits. On conflict, try Claude-led resolution; on failure, escalate (see "Escalation rules" below). + +**`mode: rebase`**: +1. `pr_number` is required; fail fast if empty. +2. Check out the PR branch. +3. `git fetch origin main` +4. `git merge-base --is-ancestor origin/main HEAD` → if true, PR is already current; post nothing, exit cleanly. +5. Otherwise: `git rebase origin/main`. On conflict: Claude-led resolution (see below). +6. Run the project's test command once (reuse the same detection logic as normal impl). +7. If clean + tests pass: `git push --force-with-lease`, post one short comment on the PR: `🤖 agent-team / rebase: rebased onto main at , tests green.` +8. Do **not** read the spec or plan. Do **not** dispatch the reviewer. Rebase mode is terminal. + +### Sweep workflow + +New file: `catalog/agent-team/sweep-agent.md` (or a plain `.github/workflows/agent-team-sweep.yml` — see "Open questions"). Engine-less — it's just `gh` + bash, no LLM needed for enumeration. Trigger: + +```yaml +on: + schedule: + - cron: "17 */6 * * *" # every 6h, offset to avoid peak minutes + workflow_dispatch: +``` + +Logic: + +1. `gh pr list --label agent-team:pr --state open --draft --json number,headRefName,headRefOid,headRepository` +2. For each PR: + - `git fetch origin main` + - `git merge-base --is-ancestor origin/main ` → if true, skip (already current). + - Otherwise: `gh workflow run implementer-agent.yml -f issue_number= -f pr_number= -f iteration= -f mode=rebase` +3. The existing `concurrency: agent-team-issue-` group on the implementer serializes the sweep behind any live impl/review cycle — no extra locking needed. 
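The sweep logic above can be sketched in shell as follows. This is a minimal illustration, not the workflow's actual implementation: the function names `sweep_action` and `sweep_dry_run` are hypothetical, and it assumes `gh` is authenticated and that PR head commits live in the same repository, so that `git fetch origin` makes them available locally. It only prints decisions and dispatches nothing.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the sweep's per-PR decision (names are illustrative).

# Map the exit status of `git merge-base --is-ancestor origin/main <head>`
# to a sweep action. Status 0 means main is already contained in the PR head.
sweep_action() {
  case "$1" in
    0) echo "skip" ;;      # PR already contains main: nothing to do
    1) echo "dispatch" ;;  # PR is behind main: implementer should rebase it
    *) echo "error" ;;     # e.g. 128 (unknown commit): leave the PR alone
  esac
}

# Enumerate candidate PRs and print the action for each one.
# Assumption: head commits are reachable after a plain `git fetch origin`.
sweep_dry_run() {
  git fetch --quiet origin
  gh pr list --label agent-team:pr --state open --draft \
    --json number,headRefOid \
    --jq '.[] | "\(.number) \(.headRefOid)"' |
  while read -r number head_oid; do
    status=0
    git merge-base --is-ancestor origin/main "$head_oid" 2>/dev/null || status=$?
    echo "PR #${number}: $(sweep_action "$status")"
  done
}
```

Calling `sweep_dry_run` inside a checkout lists one line per candidate PR. Treating any exit status other than 0 or 1 as an error, rather than as "behind", avoids dispatching a rebase for a PR whose head commit could not be resolved.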
+ +### Escalation rules (shared between both modes) + +After attempting `git rebase origin/main`: + +- **Clean rebase, tests pass** → push, short success comment, stop. +- **Conflict, Claude resolves, tests pass** → push, short comment noting `resolved N conflict(s)`, stop. +- **Conflict Claude declines or resolves with low confidence** → `git rebase --abort`, add `state:blocked` to the issue, comment on the PR with the conflicted files + why it escalated, stop. +- **Rebase succeeds but tests fail** → `git reset --hard ORIG_HEAD` (the rebase has already completed, so `git rebase --abort` no longer applies), add `state:blocked`, comment on the PR with the failing test output, stop. + +"Low confidence" is defined by the resolution itself: if Claude can't resolve without substantially rewriting either side's logic (not just merging parallel edits), it's semantic and belongs to the human. + +### Reviewer: no change + +The reviewer stays read-only and role-pure. If the reviewer is mid-run when the sweep fires, the concurrency group makes the sweep wait. If `main` advances *during* a reviewer run (rare, minutes-long window), the next sweep catches it — acceptable latency. + +## File changes + +- `catalog/agent-team/implementer-agent.md` — add `mode` input; add rebase-at-start to `impl` mode; add `mode: rebase` branch and escalation rules. +- `catalog/agent-team/sweep-agent.md` *(new)* — sweep workflow as described. If a plain workflow YAML fits the catalog pattern better, use `.github/workflows/agent-team-sweep.yml` instead. +- `catalog/agent-team/README.md` — one section on how rebases are handled, escalation signals, and sweep cadence. +- `skills/install-agent-team/SKILL.md` — mention the new sweep workflow in the install list. +- `tests/test-install-agent-team.sh` and `tests/test-e2e-install-agent-team.sh` — extend to cover the sweep file being installed. + +## Open questions + +1. **Sweep cadence.** Every 6h is a starting point. If the user merges PRs frequently, shorter (every 2h) is fine; if rarely, daily. Adjust after observing real usage. +2. 
**Sweep as a gh-aw agent vs plain workflow.** The sweep doesn't need an LLM. Plain YAML is simpler and cheaper. But keeping all agent-team files under `catalog/agent-team/` as `.md` gh-aw sources is more consistent. Decide during implementation. +3. **Iteration input for sweep-dispatched rebase.** The sweep needs to pass *some* iteration value. Options: (a) always `"1"` since rebase isn't a "review attempt," (b) query the last impl run's iteration and reuse. (a) is simpler; iteration is used only by the guard in `impl` mode, which rebase-mode skips. +4. **Force-push safety.** `--force-with-lease` protects against overwriting concurrent pushes. Good enough for draft PRs the agent owns; revisit if humans start pushing to agent-team branches directly. +5. **Claude-led conflict resolution prompt.** Needs a concrete, bounded prompt template ("here are the conflict markers, here's each side's intent from the spec and from the last N commits on main; resolve only if the intents don't overlap"). Draft during implementation. + +## Deferred to later + +- Webhook trigger on push-to-main — revisit if 6h sweep latency turns out to matter in practice. Rejected for v0.2 due to stampede risk and speculative work on non-conflicting PRs. +- A dedicated `maintainer-agent` role — revisit if the implementer prompt gets unwieldy from carrying two modes. Rejected for v0.2 as premature role split. +- Reviewer-driven final-rebase-on-approve — revisit if the sweep turns out to be too slow for fresh approvals. Rejected for v0.2 because it breaks the reviewer's read-only role boundary and doesn't help the common "PR waiting for human merge" window. 
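The escalation rules in the design above reduce to a small decision table. The sketch below is one possible encoding, offered as an illustration only: the function name `rebase_outcome` and the outcome labels are hypothetical and do not appear in the spec. It takes the rebase result (`clean`, `resolved`, or `unresolved`) and the test result (`pass` or `fail`) and prints what the implementer should do next.

```shell
#!/usr/bin/env bash
# Hypothetical encoding of the spec's escalation rules (labels are illustrative).
#
# rebase_outcome <rebase> [tests]
#   rebase: clean | resolved | unresolved
#   tests:  pass | fail   (ignored when rebase is "unresolved")
rebase_outcome() {
  case "$1:${2:-}" in
    clean:pass)    echo "push" ;;                # short success comment
    resolved:pass) echo "push-with-note" ;;      # comment notes resolved conflicts
    unresolved:*)  echo "block-conflict" ;;      # state:blocked + conflicted files
    clean:fail|resolved:fail)
                   echo "block-tests-failed" ;;  # state:blocked + failing test output
    *)             echo "invalid"; return 2 ;;   # unknown combination of inputs
  esac
}
```

Keeping the mapping in one place makes it easier to check that `impl` mode and `rebase` mode really do share the same escalation behaviour, which is what the spec asks for.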
From e2e220d57365e5fd358651383858bb3c0b7be7eb Mon Sep 17 00:00:00 2001 From: Verky Yi Date: Thu, 23 Apr 2026 00:11:02 +0000 Subject: [PATCH 03/14] docs: plan for agent-team auto-rebase (v0.2) Eight-task breakdown: mode input + rebase-at-start (T1), rebase-only mode (T2), sweep-agent (T3), install-skill updates (T4), README (T5), install behavior tests (T6), e2e install test (T7), playground smoke (T8). Each task scoped to a single file or test file and ends in a commit. --- .../2026-04-23-agent-team-auto-rebase.md | 680 ++++++++++++++++++ 1 file changed, 680 insertions(+) create mode 100644 docs/superpowers/plans/2026-04-23-agent-team-auto-rebase.md diff --git a/docs/superpowers/plans/2026-04-23-agent-team-auto-rebase.md b/docs/superpowers/plans/2026-04-23-agent-team-auto-rebase.md new file mode 100644 index 0000000..9281e49 --- /dev/null +++ b/docs/superpowers/plans/2026-04-23-agent-team-auto-rebase.md @@ -0,0 +1,680 @@ +# agent-team Auto-Rebase Implementation Plan + +> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking. + +**Goal:** Keep agent-team draft PRs rebased onto `main` without human intervention, escalating only when conflicts are semantic. + +**Architecture:** The implementer gains a `mode` input. In default `impl` mode it rebases the branch onto `main` at the start of every run. A new `rebase` mode skips spec/plan and only rebases-and-pushes. A new `sweep-agent` workflow runs on a 6-hour cron (and on-demand), enumerates open `agent-team:pr` draft PRs that are behind `main`, and dispatches the implementer in `rebase` mode for each. Mechanical conflicts resolve silently; semantic conflicts escalate via `state:blocked`. + +**Tech Stack:** gh-aw (`.md` → `.lock.yml` compilation), GitHub Actions `workflow_dispatch` + `schedule`, `gh` CLI, `git rebase`, bash. 
+ +**Spec:** [`docs/superpowers/specs/2026-04-23-agent-team-auto-rebase-design.md`](../specs/2026-04-23-agent-team-auto-rebase-design.md) + +--- + +## Working mode conventions + +These files are gh-aw workflow prompts (markdown) and GitHub Actions config (YAML) — not runnable code. "Test" in this plan means: + +1. **Syntactic validation:** `gh aw compile ` produces a `.lock.yml` without errors. +2. **Semantic validation:** `gh aw validate` succeeds against the compiled output. +3. **Shape check:** grep/inspect the generated lock file to confirm frontmatter fields (triggers, permissions, safe-outputs) compiled as intended. +4. **Smoke test (final task):** run the sweep against the playground repo manually via `gh workflow run` and watch the dispatched implementer run. + +Unit-testing the prompt body itself is not possible; correctness is verified by the smoke test. + +--- + +## Task 1: Add `mode` input to implementer-agent and rebase-at-start to `impl` mode + +**Files:** +- Modify: `catalog/agent-team/implementer-agent.md` (frontmatter + "Normal path" section) + +Purpose of this task: give the implementer a new input it can be invoked with, and make the default path (`impl` mode) rebase onto `main` before it starts editing. The new `rebase` mode branch is added in Task 2. + +- [ ] **Step 1: Read the current file to locate the exact edit points** + +Run: `sed -n '1,50p' catalog/agent-team/implementer-agent.md` to see the frontmatter; `sed -n '100,130p' catalog/agent-team/implementer-agent.md` to see the "Normal path" steps 1-3. + +Confirm: +- The `workflow_dispatch.inputs` block is at lines 10-26 (after the opening `---`). +- The "Normal path" section begins at line 108. +- Step 2 (branch selection) is at lines 117-119. + +- [ ] **Step 2: Add the `mode` input to frontmatter** + +In `catalog/agent-team/implementer-agent.md`, inside `on.workflow_dispatch.inputs` (after the `pr_number` input, before the closing of the inputs block, i.e. 
after line 26), insert: + +```yaml + mode: + description: >- + Implementer behavior mode. `impl` (default) runs the normal spec→plan→PR flow and rebases onto main at the start. + `rebase` skips spec/plan and only rebases the existing PR onto main, runs tests, and pushes. + required: false + type: string + default: "impl" +``` + +(Indentation: two spaces deeper than `pr_number`, matching YAML block structure.) + +- [ ] **Step 3: Add a top-level mode dispatch at the start of the prompt body** + +Immediately after the opening line `You are the **implementer**...` (currently line 93), insert a new section **before** "## Iteration guard": + +```markdown +## Mode dispatch + +Check `inputs.mode`: +- `impl` (default) or empty → follow the **Normal path** below. +- `rebase` → follow the **Rebase-only mode** section instead; skip the Normal path entirely. + +Any other value → add `state:blocked` to `inputs.issue_number`, post `🛑 agent-team: unknown implementer mode "".` on the issue, stop. +``` + +(The "Rebase-only mode" section itself is added in Task 2. It's fine for this task to reference a section that doesn't exist yet; Task 2 adds it before compilation of the final result.) + +- [ ] **Step 4: Add the rebase-at-start step inside the Normal path** + +In the "Normal path" section, between current step 2 (branch pick) and current step 3 (implement), insert a new step numbered 2.5 (and renumber remaining steps 3→4, 4→5, etc., **or** insert it as step 3 and renumber everything after — whichever minimizes diff). Use this exact content: + +```markdown +3. **Rebase the branch onto `main` before editing**: + - `git fetch origin main` + - If this is a fresh branch (inputs.pr_number empty) and you just branched from `main`, this is a no-op — skip. + - Otherwise: `git rebase origin/main`. + - **Clean rebase** → proceed. + - **Rebase produces conflicts** → attempt resolution (see "Conflict resolution" below). If resolved and the project's tests still pass after resolution, proceed. 
If not, `git rebase --abort`, add `state:blocked` to the issue, comment on the PR (or on the issue if no PR yet) with the conflicting file list and a one-sentence reason, and stop. Do not dispatch the reviewer. +``` + +Renumber subsequent steps accordingly (what was step 3 becomes step 4, etc., through what was step 7 becoming step 8). + +- [ ] **Step 5: Add a "Conflict resolution" subsection under "## Rules"** + +At the end of the file, before the final `## Rules` list items (or as a new `## Conflict resolution` section between "## Rules" and the end of file), append: + +```markdown +## Conflict resolution + +When `git rebase origin/main` produces conflicts (either in `impl` mode's rebase-at-start step or in `rebase` mode): + +1. Read each conflicted file. Look at the conflict markers (`<<<<<<<`, `=======`, `>>>>>>>`). +2. **Resolve only if the two sides edit disjoint concerns** — e.g. one side renames a variable, the other side adds an unrelated function nearby. Keep both changes. +3. **Do not resolve** if either side changed the same logic (e.g. both sides modified the same function body in ways that affect behavior). That's a semantic conflict requiring human judgment. +4. After resolving, `git add ` and `git rebase --continue`. +5. After all conflicts are resolved (or none existed), run the project's test command **once**. If tests pass → push. If tests fail → `git rebase --abort` (or `git reset --hard ORIG_HEAD` if already past rebase), escalate via `state:blocked` with the failing test output. + +Escalation format (when blocking due to unresolvable conflict or test failure after resolve): +- Add `state:blocked` to `inputs.issue_number`. +- Comment on the PR (or issue, if no PR yet) — body: + ``` + 🛑 agent-team / : rebase onto main blocked. + + **Reason**: | tests failed after mechanical resolve> + **Conflicting files**: + **What I tried**: + **Next**: human resolves locally, then removes state:blocked to re-enter the pipeline. + ``` +- Stop. 
Do not dispatch downstream. +``` + +- [ ] **Step 6: Compile and validate** + +Run: +```bash +cd /tmp && mkdir -p aw-compile-check && cd aw-compile-check +cp /home/dev/projects/github-agent-runner/catalog/agent-team/implementer-agent.md . +gh aw compile implementer-agent.md +``` + +Expected: an `implementer-agent.lock.yml` is produced, no errors. + +Run: +```bash +grep -c '"mode"' implementer-agent.lock.yml +``` + +Expected: at least 1 (the new input shows up in the compiled `workflow_dispatch.inputs`). + +Then `gh aw validate` on the lock file. Expected: no errors. + +- [ ] **Step 7: Commit** + +```bash +cd /home/dev/projects/github-agent-runner +git add catalog/agent-team/implementer-agent.md +git commit -m "agent-team: add mode input + rebase-at-start to implementer" +``` + +--- + +## Task 2: Add rebase-only mode to implementer-agent + +**Files:** +- Modify: `catalog/agent-team/implementer-agent.md` (add "Rebase-only mode" section) + +- [ ] **Step 1: Add the "Rebase-only mode" section** + +In `catalog/agent-team/implementer-agent.md`, insert a new `## Rebase-only mode` section **between** the "Mode dispatch" section (added in Task 1) and the "## Iteration guard" section. Content: + +```markdown +## Rebase-only mode + +Triggered when `inputs.mode == "rebase"`. Purpose: keep an existing PR current with `main` without doing any implementation work. Called by the sweep workflow (and can be invoked manually via `gh workflow run`). + +**Preconditions** (fail fast): +- `inputs.pr_number` must be set. If empty, add `state:blocked` to `inputs.issue_number`, comment `🛑 agent-team / rebase: mode=rebase requires pr_number.`, stop. + +**Steps**: + +1. Check out the PR branch: + - `gh pr view --json headRefName,state,isDraft` — confirm the PR is open and draft. If closed or merged, stop silently (nothing to do). + - `git fetch origin && git checkout ` + +2. 
Fetch `main`: + - `git fetch origin main` + - If `main` is already an ancestor of `HEAD` (`git merge-base --is-ancestor origin/main HEAD`), the PR is current. Stop silently — post no comment. + +3. Rebase: + - `git rebase origin/main`. + - On conflicts → follow "Conflict resolution" (same rules as `impl` mode). Clean rebase or successful mechanical resolve → continue to step 4. + +4. Run the project's test command once. Use the same test-command detection as `impl` mode (read `package.json` / `Makefile` / CI files). If no test command is detectable, skip and note that in step 5. + - Tests pass → continue to step 5. + - Tests fail → `git rebase --abort` (or reset if already past), escalate per "Conflict resolution" escalation format, stop. + +5. Push: + - `git push --force-with-lease origin HEAD:` + - Post one comment on the PR — body: + ``` + 🤖 agent-team / rebase: rebased onto main at . + + - Rebase: > + - Tests: <✅ passed | ⚠ skipped — no test command detected> + ``` + +6. Stop. **Do not dispatch the reviewer.** Rebase mode is terminal. + +**Rules for this mode**: +- Never read the spec or plan. This mode addresses no requirements changes. +- Never dispatch downstream. The PR stays in whatever state it was in (`state:review-needed`, `state:done`, etc.) — a rebase doesn't reset review. +- Never touch files beyond what `git rebase` modifies. No spec-driven edits. +- Force-push uses `--force-with-lease` so a concurrent human push isn't clobbered. +``` + +- [ ] **Step 2: Update `safe-outputs` if needed** + +The existing `safe-outputs` block already allows `push-to-pull-request-branch`, `add-comment`, `add-labels`, `dispatch-workflow`. Rebase mode uses: +- `push-to-pull-request-branch` — already present, `max: 1` is enough. +- `add-comment` — already present, `max: 2`, enough for the success comment + any escalation comment. +- `add-labels` with `state:blocked` — already present. + +Confirm by reading lines 62-88 of the current file. 
No changes needed if those are present. If the current `max` on any is tight, bump `add-comment.max` to `3` (rebase comment + escalation comment + buffer). + +- [ ] **Step 3: Compile and validate** + +```bash +cd /tmp/aw-compile-check +cp /home/dev/projects/github-agent-runner/catalog/agent-team/implementer-agent.md . +gh aw compile implementer-agent.md +gh aw validate implementer-agent.lock.yml +``` + +Expected: no errors. Grep the output: +```bash +grep -c "Rebase-only mode" implementer-agent.lock.yml +``` +Expected: at least 1 (the prompt body is embedded in the lock file). + +- [ ] **Step 4: Commit** + +```bash +cd /home/dev/projects/github-agent-runner +git add catalog/agent-team/implementer-agent.md +git commit -m "agent-team: add rebase-only mode to implementer" +``` + +--- + +## Task 3: Create `sweep-agent.md` + +**Files:** +- Create: `catalog/agent-team/sweep-agent.md` + +- [ ] **Step 1: Create the file with the full contents below** + +Exact contents for `catalog/agent-team/sweep-agent.md`: + +```markdown +--- +engine: claude +description: | + Sweep agent for the agent-team pattern. Runs on a schedule and on demand, + enumerates open draft PRs labeled `agent-team:pr`, and dispatches the + implementer in `rebase` mode for any that have fallen behind `main`. + No LLM reasoning on the diffs themselves — it's enumerate + ancestry + check + dispatch. + +on: + schedule: + - cron: "17 */6 * * *" + workflow_dispatch: {} + +concurrency: + group: agent-team-sweep + cancel-in-progress: false + +timeout-minutes: 5 + +permissions: + contents: read + issues: read + pull-requests: read + +network: + allowed: + - defaults + +checkout: + fetch-depth: 0 + +tools: + github: + toolsets: [default] + min-integrity: none + bash: true + +safe-outputs: + threat-detection: false + add-comment: + max: 1 + target: "*" + dispatch-workflow: + workflows: [implementer-agent] + max: 20 +--- + +# Sweep Agent + +You are the **sweep** for the agent-team pipeline. 
Your job: find open draft PRs labeled `agent-team:pr` that are behind `main`, and dispatch the implementer in `rebase` mode for each. + +## Steps + +1. List candidate PRs: + ``` + gh pr list --label agent-team:pr --state open --draft \ + --json number,headRefName,headRefOid,body --limit 50 + ``` + +2. For each PR in the list: + + a. Derive `issue_number` by parsing `Closes #` from the PR body. If no `Closes #N` marker exists, **skip that PR** (log the skip; do not dispatch). + + b. Check if the PR is behind `main`: + ``` + git fetch origin main --quiet + git merge-base --is-ancestor origin/main + ``` + - Exit code `0` → PR is current, skip it. + - Exit code `1` → PR is behind, dispatch (next step). + + c. Dispatch the implementer in rebase mode via the `dispatch-workflow` safe-output: + - workflow: `implementer-agent` + - inputs: + - `issue_number`: `` (from step 2a) + - `pr_number`: `` + - `iteration`: `"1"` (rebase mode bypasses the iteration guard; any value works) + - `mode`: `"rebase"` + +3. After the loop, if at least one dispatch was emitted, post one summary comment on the **repository's dashboard issue** (optional — only if a dashboard issue is configured; otherwise skip). Default: post no comment. The dispatched runs' logs are the audit trail. + +## Rules + +- Sweep never edits code, never rebases itself, never dispatches anything except the implementer in `rebase` mode. +- If `gh pr list` returns zero PRs, stop silently — no comment, no dispatch. +- If more than 20 PRs are behind (unusually large), dispatch the first 20 only. The next sweep run (6h later) picks up the rest. Prevents dispatch-workflow cap from erroring out. +- Sweep is idempotent — running it back-to-back produces zero extra dispatches (the second run sees all PRs current). +- Footer comment (only if one is posted): `🤖 agent-team / sweep`. 
+``` + +- [ ] **Step 2: Compile and validate** + +```bash +cd /tmp/aw-compile-check +cp /home/dev/projects/github-agent-runner/catalog/agent-team/sweep-agent.md . +gh aw compile sweep-agent.md +gh aw validate sweep-agent.lock.yml +``` + +Expected: no errors. + +- [ ] **Step 3: Verify trigger shape in the compiled lock file** + +```bash +grep -A3 "^on:" sweep-agent.lock.yml | head -10 +``` + +Expected output includes `schedule:` with the `cron: "17 */6 * * *"` and `workflow_dispatch:`. + +- [ ] **Step 4: Commit** + +```bash +cd /home/dev/projects/github-agent-runner +git add catalog/agent-team/sweep-agent.md +git commit -m "agent-team: add sweep-agent for periodic rebase dispatch" +``` + +--- + +## Task 4: Update install-agent-team skill + +**Files:** +- Modify: `skills/install-agent-team/SKILL.md` + +- [ ] **Step 1: Update the description frontmatter** + +In `skills/install-agent-team/SKILL.md`, change the frontmatter `description` from: + +``` +Install the full four-role agent-team pattern (spec → plan → impl → review) into the current repo as one unified setup. +``` + +to: + +``` +Install the full agent-team pattern (spec → plan → impl → review + periodic sweep) into the current repo as one unified setup. +``` + +- [ ] **Step 2: Update the opening paragraph** + +Change the first paragraph under `# install-agent-team` from "Install all four agent-team workflows..." to "Install all five agent-team workflows..." (or similar minimal edit that keeps the rest). + +Actual edit — replace line 8: +``` +Install all four agent-team workflows into the current repo in one pass: fetch, wire auth once, apply the OAuth tweak to every lockfile, create the labels, validate, commit. +``` +with: +``` +Install all five agent-team workflows (spec, planner, implementer, reviewer, and the sweep that keeps PRs rebased) into the current repo in one pass: fetch, wire auth once, apply the OAuth tweak to every lockfile, create the labels, validate, commit. 
+``` + +- [ ] **Step 3: Update step 1 ("Explain what's about to happen")** + +Change "four workflows will be added" to "five workflows will be added (four pipeline roles + a sweep that runs every 6 hours to keep PRs rebased)". + +- [ ] **Step 4: Update step 4 (the install sequence)** + +After the existing `gh aw add ... reviewer-agent.md@main` line, add: + +```bash +gh aw add verkyyi/github-agent-runner/catalog/agent-team/sweep-agent.md@main +``` + +Update the error-handling sentence that follows — change "The four are a unit" to "The five are a unit" (or equivalent). + +- [ ] **Step 5: Update step 5 (OAuth tweak)** + +Change "For each of the four `.lock.yml` files" to "For each of the five `.lock.yml` files". + +- [ ] **Step 6: Update step 8 (summary)** + +Change "Four files added" to "Five files added" and list each `.md` + `.lock.yml` pair (the existing text says "name each" — the engineer filling this in should add `sweep-agent` to whatever list format is used). + +- [ ] **Step 7: Update "Hard rules" section** + +Change "If any of the four `gh aw add` calls fails" to "If any of the five `gh aw add` calls fails". + +- [ ] **Step 8: Commit** + +```bash +cd /home/dev/projects/github-agent-runner +git add skills/install-agent-team/SKILL.md +git commit -m "install-agent-team: include sweep-agent in install flow" +``` + +--- + +## Task 5: Update the catalog README + +**Files:** +- Modify: `catalog/agent-team/README.md` + +- [ ] **Step 1: Update the top paragraph** + +Change the first paragraph from "A four-workflow pattern..." to "A five-workflow pattern for a spec → plan → implement → review pipeline, plus a sweep that keeps draft PRs rebased. Each role is a separate gh-aw workflow...". Leave the rest of the sentence intact. + +- [ ] **Step 2: Add a row to the Files table** + +Find the `## Files` section (around line 64). 
Add a fifth row: + +```markdown +| `sweep-agent.md` | `schedule` (every 6h) + `workflow_dispatch` | `implementer-agent` in `rebase` mode, per stale PR | +``` + +- [ ] **Step 3: Add a "Rebase handling" subsection** + +Between `## The handoff model` (ends ~line 49) and `## The comment contract` (starts ~line 51), insert a new subsection: + +```markdown +## Rebase handling + +Draft PRs drift out of date as `main` advances. Two mechanisms keep them current, no human action required: + +1. **Rebase at start of every implementer run** — `impl` mode begins with `git fetch origin main && git rebase origin/main`. Catches drift within the pipeline (initial impl, kickback cycles). +2. **Scheduled sweep** — `sweep-agent.md` runs every 6 hours (and on-demand via `workflow_dispatch`). It lists open `agent-team:pr` draft PRs, checks each for `main`-ancestry, and dispatches the implementer in `rebase` mode for any that are behind. Catches the common "PR sat waiting for human merge, main moved" case. + +Both paths share the same escalation rule: mechanical conflicts resolve silently; semantic conflicts (overlapping logic, tests fail after resolve) escalate via `state:blocked` with a targeted comment. The human sees the PR only when it's ready to merge or when their judgment is needed. +``` + +- [ ] **Step 4: Update the "Install" manual block** + +Under the `<details>
` "Manual install (advanced)" block, add a fifth `gh aw add` line: + +```bash +gh aw add verkyyi/github-agent-runner/catalog/agent-team/sweep-agent.md@main +``` + +- [ ] **Step 5: Update the flow diagram (optional but recommended)** + +The ASCII diagram under `## The handoff model` currently ends at the reviewer's kickback loop. Add, after the existing diagram, a small note: + +``` + (Separately, on a 6-hour cron) + ┌─────────────┐ dispatch (issue_number, pr_number, mode=rebase) + │ sweep-agent │─────────────────────────► implementer-agent (for each stale PR) + └─────────────┘ +``` + +- [ ] **Step 6: Commit** + +```bash +cd /home/dev/projects/github-agent-runner +git add catalog/agent-team/README.md +git commit -m "agent-team: document sweep + rebase-handling in README" +``` + +--- + +## Task 6: Update install-agent-team behavior tests + +**Files:** +- Modify: `tests/test-install-agent-team.sh` + +These are prompt-based tests that ask Claude questions about the skill and assert keywords in the response. Two tests need adjustment for the fifth workflow; one new test verifies the rebase concept is mentioned. + +- [ ] **Step 1: Update Test 1 ("names all four roles")** + +In `tests/test-install-agent-team.sh`, find the Test 1 block (lines 15-25). Change: +- The prompt text from "list the four roles it installs" to "list the five workflows it installs". +- `assert_contains "$output" "four" ...` to `assert_contains "$output" "five" "Mentions five (workflows)"`. +- Add: `assert_contains "$output" "sweep" "Names the sweep workflow" || exit 1` + +- [ ] **Step 2: Update Test 3 ("atomic install")** + +Find Test 3 (around lines 46-51). The prompt text "if one of the four 'gh aw add' calls fails" should become "if one of the five 'gh aw add' calls fails". No other assertions change. + +- [ ] **Step 3: Update Test 4 ("auth wired once")** + +Find Test 4 (around lines 55-60). In `assert_contains`, change the failure message "four" to "five" if present in the regex options list. 
+
+- [ ] **Step 4: Add a new Test 7 for rebase behavior awareness**
+
+At the end of the file, before `echo "=== All install-agent-team tests passed ==="`, add:
+
+```bash
+# Test 7: Rebase behavior is part of the installed pipeline.
+echo "Test 7: Skill mentions automatic rebase handling..."
+output=$(run_claude "Load the install-agent-team skill. Does the installed pipeline do anything automatic about keeping draft PRs rebased onto main, or does the user have to rebase by hand? Quote the specific workflow or behavior." 180)
+assert_contains "$output" "sweep|rebase" "Mentions sweep or rebase" || exit 1
+assert_contains "$output" "automat|without|silently|no.*action" "Frames it as automatic" || exit 1
+
+echo ""
+```
+
+- [ ] **Step 5: Run the tests locally against the updated skill**
+
+```bash
+cd /home/dev/projects/github-agent-runner
+./tests/test-install-agent-team.sh
+```
+
+Expected: all tests pass. If any fails because Claude's answer phrasing doesn't match the assertion regex, adjust the regex (not the skill) — the skill is the source of truth.
+
+Cost note: these are live Claude calls. Budget ~5-10 min wall-clock and a few thousand tokens.
+
+- [ ] **Step 6: Commit**
+
+```bash
+git add tests/test-install-agent-team.sh
+git commit -m "tests: cover sweep-agent in install-agent-team behavior suite"
+```
+
+---
+
+## Task 7: Update the e2e install test
+
+**Files:**
+- Modify: `tests/test-e2e-install-agent-team.sh`
+
+- [ ] **Step 1: Update the PROMPT string**
+
+Find the `PROMPT=` line (around line 120). Change "install all four agent-team workflows" to "install all five agent-team workflows (including the sweep)".
+
+- [ ] **Step 2: Update the assertion loop**
+
+Find the `for wf in spec-agent planner-agent implementer-agent reviewer-agent; do` line (around line 138).
Change to:
+
+```bash
+for wf in spec-agent planner-agent implementer-agent reviewer-agent sweep-agent; do
+```
+
+No other changes needed — the loop body already checks for `.md`, `.lock.yml`, and OAuth-tweak grep counts for each workflow name.
+
+- [ ] **Step 3: (Optional) Add a quick sweep-trigger smoke assertion**
+
+After the main assertion loop, add a check that the sweep workflow was actually registered with GitHub Actions:
+
+```bash
+# Sweep workflow is registered and dispatchable
+if gh workflow list --repo "$FULL" --json name,path --jq '.[] | select(.path | contains("sweep-agent")) | .name' | grep -q .; then
+  pass "sweep-agent workflow registered with Actions"
+else
+  fail "sweep-agent workflow not registered"
+fi
+```
+
+Insert this before the existing "Labels" check.
+
+- [ ] **Step 4: Run the e2e test locally (optional — costs ~5-8 min + creates a throwaway repo)**
+
+```bash
+cd /home/dev/projects/github-agent-runner
+./tests/test-e2e-install-agent-team.sh
+```
+
+Expected: all assertions pass, including the five-workflow loop and the new sweep-registration check. If running this is too costly, skip and rely on the Task 8 smoke test instead.
+
+- [ ] **Step 5: Commit**
+
+```bash
+git add tests/test-e2e-install-agent-team.sh
+git commit -m "tests: extend e2e install check to cover sweep-agent"
+```
+
+---
+
+## Task 8: Smoke-test the sweep against the playground repo
+
+**Files:** none modified — this is manual verification.
+
+Goal: confirm the sweep actually dispatches an implementer-in-rebase-mode run against a real stale PR, and that the rebase either pushes silently or escalates cleanly.
+
+- [ ] **Step 1: Install the updated pipeline into `verkyyi/agent-team-playground`**
+
+From the playground repo, run `/install-agent-team` with the plugin pointed at the current dev branch (not `main` — use the feature branch with the new sweep). The install should report 5 files added, OAuth tweak applied to 5 lockfiles.
+
+If the playground already has an older agent-team install, remove the four existing `.lock.yml` files + `.md` sources first, then reinstall.
+
+- [ ] **Step 2: Create a stale PR deliberately**
+
+In the playground:
+1. Open an issue with label `agent-team` describing a trivial change (e.g. "add a blank line to README").
+2. Let the full pipeline run — spec, plan, impl, review, approve. You now have a draft PR.
+3. Without merging, push an unrelated commit directly to `main` (e.g. edit a different file). This makes the PR "behind main."
+
+- [ ] **Step 3: Trigger the sweep manually**
+
+```bash
+gh workflow run sweep-agent.lock.yml --repo verkyyi/agent-team-playground
+```
+
+Watch the run (`gh run watch`). Expected in the logs:
+- `gh pr list` returns the one open draft PR.
+- Ancestry check fails (PR is behind).
+- `dispatch-workflow` fires for `implementer-agent` with `mode=rebase`.
+
+- [ ] **Step 4: Verify the implementer-in-rebase-mode run**
+
+Watch the dispatched implementer run. Expected:
+- The Mode dispatch section routes to "Rebase-only mode".
+- `git rebase origin/main` runs. Since `main`'s new commit is in an unrelated file, rebase is clean.
+- Tests run once, pass.
+- `git push --force-with-lease` updates the PR branch.
+- One comment appears on the PR: `🤖 agent-team / rebase: rebased onto main at , tests green.`
+- The reviewer is **not** dispatched.
+
+- [ ] **Step 5: Verify the escalation path with a real conflict**
+
+Repeat steps 2-4, but this time the commit to `main` must touch the same lines the PR touches. Expected:
+- Implementer rebase-mode hits a conflict.
+- Per "Conflict resolution": if the conflict is over the same logic, rebase is aborted, `state:blocked` is added to the issue, and a PR comment explains. No push happens.
+
+- [ ] **Step 6: Record the smoke-test results**
+
+Paste the two run URLs (the successful rebase and the blocked conflict) into the PR description for this feature work, as evidence the sweep works end-to-end.
+
+- [ ] **Step 7: Open the PR for review**
+
+```bash
+gh pr create --base main --draft \
+  --title "agent-team v0.2: auto-rebase via sweep + implementer rebase mode" \
+  --body-file docs/superpowers/plans/2026-04-23-agent-team-auto-rebase.md
+```
+
+(Body-file is a placeholder — write a proper PR body summarizing spec, tasks, smoke results.)
+
+---
+
+## Self-review checklist (run by the plan author, not the implementer)
+
+- [x] Every spec requirement maps to a task:
+  - "Implementer rebases at start of `impl`" → Task 1
+  - "New `rebase` mode" → Task 2
+  - "Sweep workflow on schedule + dispatch" → Task 3
+  - "Install skill includes sweep" → Task 4
+  - "README documents behavior" → Task 5
+  - "Tests cover the new shape" → Tasks 6, 7
+  - "Smoke test on playground" → Task 8
+  - "Escalation via state:blocked" → covered in Task 1 (Conflict resolution section), referenced by Task 2
+- [x] No TBD/TODO/"similar to earlier" placeholders; every prompt body, YAML block, and shell command is written out.
+- [x] Type consistency: input name `mode` is used identically in the implementer frontmatter (Task 1), the dispatch section (Task 1), the rebase-only-mode body (Task 2), and the sweep's `dispatch-workflow` call (Task 3). Values `"impl"` and `"rebase"` are used consistently. The escalation comment format is identical across both modes.
+- [x] Test strategy is honest: no claim that prompt bodies are "tested" in the code-unit-test sense — validation is `gh aw compile` + `gh aw validate` + behavior-probe tests (Task 6) + e2e file-presence (Task 7) + manual playground smoke (Task 8).
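
Reviewer note on the type-consistency item: the `mode` input carries one of two values, and the implementer prompt in this series routes on it (`impl` or an empty value takes the normal path, `rebase` takes the rebase-only path, anything else is blocked). A minimal sketch of that routing follows. `route_mode` and the labels it prints are hypothetical names used only for illustration; in the actual pipeline the agent does this routing by reading its prompt, not by running a script.

```bash
#!/usr/bin/env bash
# Hypothetical helper: map the `mode` workflow input to the prompt section
# the implementer should follow. The printed labels are illustrative only.
route_mode() {
  case "${1:-}" in
    ""|impl) echo "normal-path" ;;        # default: spec -> plan -> PR flow
    rebase)  echo "rebase-only" ;;        # keep an existing PR current with main
    *)       echo "blocked"; return 1 ;;  # unknown value: escalate via state:blocked
  esac
}
```

For example, `route_mode rebase` prints `rebase-only`, while `route_mode bogus` prints `blocked` and returns a non-zero status.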
From 5f3235a3ed0a537edef22a9f4b79bfc41ae6f2f9 Mon Sep 17 00:00:00 2001 From: Verky Yi Date: Thu, 23 Apr 2026 00:17:45 +0000 Subject: [PATCH 04/14] agent-team: add mode input + rebase-at-start to implementer Co-Authored-By: Claude Sonnet 4.6 --- catalog/agent-team/implementer-agent.md | 61 +++++++++++++++++++++---- 1 file changed, 53 insertions(+), 8 deletions(-) diff --git a/catalog/agent-team/implementer-agent.md b/catalog/agent-team/implementer-agent.md index fe1a899..4680ad6 100644 --- a/catalog/agent-team/implementer-agent.md +++ b/catalog/agent-team/implementer-agent.md @@ -24,6 +24,13 @@ on: required: false type: string default: "" + mode: + description: >- + Implementer behavior mode. `impl` (default) runs the normal spec→plan→PR flow and rebases onto main at the start. + `rebase` skips spec/plan and only rebases the existing PR onto main, runs tests, and pushes. + required: false + type: string + default: "impl" concurrency: group: agent-team-issue-${{ inputs.issue_number }} @@ -97,6 +104,14 @@ Inputs: - `inputs.iteration` — attempt number. - `inputs.pr_number` — if non-empty, you're being re-invoked after a reviewer kickback and should **push updates to the existing PR branch**, not open a new PR. +## Mode dispatch + +Check `inputs.mode`: +- `impl` (default) or empty → follow the **Normal path** below. +- `rebase` → follow the **Rebase-only mode** section instead; skip the Normal path entirely. + +Any other value → add `state:blocked` to `inputs.issue_number`, post `🛑 agent-team: unknown implementer mode "".` on the issue, stop. + ## Iteration guard (do this first) If `inputs.iteration` is greater than 3: @@ -118,14 +133,21 @@ If `inputs.iteration` is greater than 3: - If `inputs.pr_number` is empty → create a new branch: `agent-team/issue--`. - If `inputs.pr_number` is set → check out the existing PR's branch (via `gh pr view --json headRefName`) and push updates to it. -3. Implement **only what the plan says** (plus any kickback requested changes). 
Do not expand scope. +3. **Rebase the branch onto `main` before editing**: + - `git fetch origin main` + - If this is a fresh branch (inputs.pr_number empty) and you just branched from `main`, this is a no-op — skip. + - Otherwise: `git rebase origin/main`. + - **Clean rebase** → proceed. + - **Rebase produces conflicts** → attempt resolution (see "Conflict resolution" below). If resolved and the project's tests still pass after resolution, proceed. If not, `git rebase --abort`, add `state:blocked` to the issue, comment on the PR (or on the issue if no PR yet) with the conflicting file list and a one-sentence reason, and stop. Do not dispatch the reviewer. + +4. Implement **only what the plan says** (plus any kickback requested changes). Do not expand scope. - **Trust the plan.** The planner already explored the repo, confirmed file paths exist, and identified the exact edits. Do NOT re-read surrounding files to "understand the codebase" or "check for patterns." Read only the files the plan names under `Files to change`, plus `AGENTS.md` / `CLAUDE.md` / `CONTRIBUTING.md` once for convention reminders. - **Edit, don't explore.** For each step, make the edit directly. If a file's current content surprises you relative to the plan, stop (see the "plan is wrong" rule below) — do not start investigating. - **Run tests ONCE at the end**, not after each edit. Find the command by reading `package.json` / `Makefile` / CI files on the first pass; cache it. Commands to look for: `npm test`, `pytest`, `cargo test`, `go test ./...`, `make test`. - If tests fail due to your changes, fix and re-run (still one additional run, not per-edit). Unrelated infrastructure failures → document under `## Test status`. - **Budget check**: if this task feels like it needs more than ~5 tool calls for reading or more than 2 test runs, the plan is probably wrong or you're over-exploring. Stop and re-read this section. -4. Produce the PR: +5. 
Produce the PR: - **New PR** (first impl attempt): use `create-pull-request`. - Title: `` (the workflow adds the `[agent-team] ` prefix). - Body: @@ -136,14 +158,14 @@ If `inputs.iteration` is greater than 3: - Footer: `🤖 agent-team / implementer`. - **Kickback update** (pr_number was set): use `push-to-pull-request-branch` to push the fix commits to the existing PR. Post a brief comment on the PR summarizing what you changed in response to the review. -5. Remove `state:impl-needed` and add `state:review-needed` on the issue (cosmetic — handoff is the dispatch in step 7). +6. Remove `state:impl-needed` and add `state:review-needed` on the issue (cosmetic — handoff is the dispatch in step 8). -6. Capture the PR number: - - New PR: the PR number comes from the `create-pull-request` safe output. Use it in step 7. +7. Capture the PR number: + - New PR: the PR number comes from the `create-pull-request` safe output. Use it in step 8. - Kickback: use `inputs.pr_number` as-is. -7. **Dispatch the reviewer-agent workflow** with: - - `pr_number`: the number from step 6 +8. **Dispatch the reviewer-agent workflow** with: + - `pr_number`: the number from step 7 - `issue_number`: passed through from your input - `iteration`: passed through from your input (do NOT bump) @@ -153,4 +175,27 @@ If `inputs.iteration` is greater than 3: - Never add dependencies that aren't in the plan. If the plan implies one, pick the minimal option and document in PR body. - If the plan is wrong (contradicts the spec, impossible in this repo): stop, do NOT open a partial PR. Add `state:blocked` on the issue and post a comment explaining what's wrong with the plan. A human will resolve. - One concern per PR. If the plan isn't scoped that way, that's a planner bug — report via state:blocked + comment. -- The dispatch in step 7 is the real handoff. `state:review-needed` is decorative. +- The dispatch in step 8 is the real handoff. `state:review-needed` is decorative. 
+ +## Conflict resolution + +When `git rebase origin/main` produces conflicts (either in `impl` mode's rebase-at-start step or in `rebase` mode): + +1. Read each conflicted file. Look at the conflict markers (`<<<<<<<`, `=======`, `>>>>>>>`). +2. **Resolve only if the two sides edit disjoint concerns** — e.g. one side renames a variable, the other side adds an unrelated function nearby. Keep both changes. +3. **Do not resolve** if either side changed the same logic (e.g. both sides modified the same function body in ways that affect behavior). That's a semantic conflict requiring human judgment. +4. After resolving, `git add ` and `git rebase --continue`. +5. After all conflicts are resolved (or none existed), run the project's test command **once**. If tests pass → push. If tests fail → `git rebase --abort` (or `git reset --hard ORIG_HEAD` if already past rebase), escalate via `state:blocked` with the failing test output. + +Escalation format (when blocking due to unresolvable conflict or test failure after resolve): +- Add `state:blocked` to `inputs.issue_number`. +- Comment on the PR (or issue, if no PR yet) — body: + ``` + 🛑 agent-team / : rebase onto main blocked. + + **Reason**: | tests failed after mechanical resolve> + **Conflicting files**: + **What I tried**: + **Next**: human resolves locally, then removes state:blocked to re-enter the pipeline. + ``` +- Stop. Do not dispatch downstream. 
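
Reviewer note: the escalate branch of the conflict rules above can be sketched in shell. This is a minimal sketch, not the workflow's implementation. It makes no attempt at mechanical resolution and only shows how the conflicting file list for the escalation comment might be gathered before aborting. `rebase_or_report` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Hypothetical helper: rebase the current branch onto a target ref.
# On a clean rebase it prints "clean" and returns 0. On conflict it prints
# the conflicted paths (one per line), aborts the rebase so the branch
# returns to its pre-rebase state, and returns 1.
rebase_or_report() {
  local target="$1"
  if git rebase "$target" >/dev/null 2>&1; then
    echo "clean"
    return 0
  fi
  # --diff-filter=U restricts the listing to unmerged (conflicted) paths.
  git diff --name-only --diff-filter=U
  git rebase --abort >/dev/null 2>&1
  return 1
}
```

A caller would run `rebase_or_report origin/main` after `git fetch origin main` and, on a non-zero return, paste the printed paths into the escalation comment.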
From fe1422fa8cb92b65d68c5d76408758dbfb20d2ec Mon Sep 17 00:00:00 2001 From: Verky Yi Date: Thu, 23 Apr 2026 00:24:15 +0000 Subject: [PATCH 05/14] agent-team: tighten implementer rebase prompt per review - CR step 5 clarifies push-vs-proceed across modes - Inputs paragraph lists mode - Step 3 defers escalation to CR's template - Skip condition uses ancestry check --- catalog/agent-team/implementer-agent.md | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/catalog/agent-team/implementer-agent.md b/catalog/agent-team/implementer-agent.md index 4680ad6..6c935b7 100644 --- a/catalog/agent-team/implementer-agent.md +++ b/catalog/agent-team/implementer-agent.md @@ -103,6 +103,7 @@ Inputs: - `inputs.issue_number` — the issue you're implementing against. - `inputs.iteration` — attempt number. - `inputs.pr_number` — if non-empty, you're being re-invoked after a reviewer kickback and should **push updates to the existing PR branch**, not open a new PR. +- `inputs.mode` — behavior mode; `impl` (default) runs the normal spec→plan→PR flow, `rebase` skips to the Rebase-only mode section. ## Mode dispatch @@ -135,10 +136,10 @@ If `inputs.iteration` is greater than 3: 3. **Rebase the branch onto `main` before editing**: - `git fetch origin main` - - If this is a fresh branch (inputs.pr_number empty) and you just branched from `main`, this is a no-op — skip. + - If `inputs.pr_number` is empty and `git merge-base --is-ancestor origin/main HEAD` exits 0, the branch is already current — skip the rebase. - Otherwise: `git rebase origin/main`. - **Clean rebase** → proceed. - - **Rebase produces conflicts** → attempt resolution (see "Conflict resolution" below). If resolved and the project's tests still pass after resolution, proceed. If not, `git rebase --abort`, add `state:blocked` to the issue, comment on the PR (or on the issue if no PR yet) with the conflicting file list and a one-sentence reason, and stop. Do not dispatch the reviewer. 
+ - **Rebase produces conflicts** → follow the "Conflict resolution" section below. On successful mechanical resolution, proceed with the normal flow. On escalation (unresolvable conflict or test failure), do not dispatch the reviewer. 4. Implement **only what the plan says** (plus any kickback requested changes). Do not expand scope. - **Trust the plan.** The planner already explored the repo, confirmed file paths exist, and identified the exact edits. Do NOT re-read surrounding files to "understand the codebase" or "check for patterns." Read only the files the plan names under `Files to change`, plus `AGENTS.md` / `CLAUDE.md` / `CONTRIBUTING.md` once for convention reminders. @@ -185,7 +186,7 @@ When `git rebase origin/main` produces conflicts (either in `impl` mode's rebase 2. **Resolve only if the two sides edit disjoint concerns** — e.g. one side renames a variable, the other side adds an unrelated function nearby. Keep both changes. 3. **Do not resolve** if either side changed the same logic (e.g. both sides modified the same function body in ways that affect behavior). That's a semantic conflict requiring human judgment. 4. After resolving, `git add ` and `git rebase --continue`. -5. After all conflicts are resolved (or none existed), run the project's test command **once**. If tests pass → push. If tests fail → `git rebase --abort` (or `git reset --hard ORIG_HEAD` if already past rebase), escalate via `state:blocked` with the failing test output. +5. After all conflicts are resolved (or none existed), run the project's test command **once**. If tests pass → return to the caller's next step (in `impl` mode, proceed with the normal flow; in `rebase` mode, push and comment). If tests fail → `git rebase --abort` (or `git reset --hard ORIG_HEAD` if already past rebase), escalate via `state:blocked` with the failing test output. 
Escalation format (when blocking due to unresolvable conflict or test failure after resolve): - Add `state:blocked` to `inputs.issue_number`. From cdca42cb13d84c7764b3cef8387bca4ce71fc545 Mon Sep 17 00:00:00 2001 From: Verky Yi Date: Thu, 23 Apr 2026 00:25:45 +0000 Subject: [PATCH 06/14] agent-team: add rebase-only mode to implementer Co-Authored-By: Claude Sonnet 4.6 --- catalog/agent-team/implementer-agent.md | 47 ++++++++++++++++++++++++- 1 file changed, 46 insertions(+), 1 deletion(-) diff --git a/catalog/agent-team/implementer-agent.md b/catalog/agent-team/implementer-agent.md index 6c935b7..b7ce360 100644 --- a/catalog/agent-team/implementer-agent.md +++ b/catalog/agent-team/implementer-agent.md @@ -113,7 +113,7 @@ Check `inputs.mode`: Any other value → add `state:blocked` to `inputs.issue_number`, post `🛑 agent-team: unknown implementer mode "".` on the issue, stop. -## Iteration guard (do this first) +## Iteration guard (impl mode only) If `inputs.iteration` is greater than 3: - Add `state:blocked` to issue `inputs.issue_number`. @@ -121,6 +121,51 @@ If `inputs.iteration` is greater than 3: - Do **not** dispatch the reviewer. - Stop. +## Rebase-only mode + +Triggered when `inputs.mode == "rebase"`. Purpose: keep an existing PR current with `main` without doing any implementation work. Called by the sweep workflow (and can be invoked manually via `gh workflow run`). + +The iteration guard does not apply to this mode — a rebase is not a review attempt. + +**Preconditions** (fail fast): +- `inputs.pr_number` must be set. If empty, add `state:blocked` to `inputs.issue_number`, comment `🛑 agent-team / rebase: mode=rebase requires pr_number.`, stop. + +**Steps**: + +1. Check out the PR branch: + - `gh pr view --json headRefName,state,isDraft` — confirm the PR is open and draft. If closed or merged, stop silently (nothing to do). + - `git fetch origin && git checkout ` + +2. 
Fetch `main`: + - `git fetch origin main` + - If `main` is already an ancestor of `HEAD` (`git merge-base --is-ancestor origin/main HEAD`), the PR is current. Stop silently — post no comment. + +3. Rebase: + - `git rebase origin/main`. + - On conflicts → follow the "Conflict resolution" section below. Clean rebase or successful mechanical resolve → continue to step 4. + +4. Run the project's test command once. Use the same test-command detection as `impl` mode (read `package.json` / `Makefile` / CI files). If no test command is detectable, skip and note that in step 5. + - Tests pass → continue to step 5. + - Tests fail → escalate per "Conflict resolution" escalation format, stop. + +5. Push: + - `git push --force-with-lease origin HEAD:` + - Post one comment on the PR — body: + ``` + 🤖 agent-team / rebase: rebased onto main at . + + - Rebase: > + - Tests: <✅ passed | ⚠ skipped — no test command detected> + ``` + +6. Stop. **Do not dispatch the reviewer.** Rebase mode is terminal. + +**Rules for this mode**: +- Never read the spec or plan. This mode addresses no requirements changes. +- Never dispatch downstream. The PR stays in whatever state it was in (`state:review-needed`, `state:done`, etc.) — a rebase doesn't reset review. +- Never touch files beyond what `git rebase` modifies. No spec-driven edits. +- Force-push uses `--force-with-lease` so a concurrent human push isn't clobbered. + ## Normal path 1. Fetch the issue (`gh issue view `). 
Extract: From a9b15f76930a2f19d3fa4c9c28fa1c107530b0f7 Mon Sep 17 00:00:00 2001 From: Verky Yi Date: Thu, 23 Apr 2026 00:28:59 +0000 Subject: [PATCH 07/14] agent-team: tighten rebase-mode prompt per review - Iteration guard explicitly skipped for rebase mode - Rebase-only mode stops on non-draft PRs - Test-failure cleanup uses reset (not rebase --abort) --- catalog/agent-team/implementer-agent.md | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/catalog/agent-team/implementer-agent.md b/catalog/agent-team/implementer-agent.md index b7ce360..23e8ea7 100644 --- a/catalog/agent-team/implementer-agent.md +++ b/catalog/agent-team/implementer-agent.md @@ -115,6 +115,8 @@ Any other value → add `state:blocked` to `inputs.issue_number`, post `🛑 age ## Iteration guard (impl mode only) +Skip this section entirely if `inputs.mode == "rebase"` — a rebase is not a review attempt. + If `inputs.iteration` is greater than 3: - Add `state:blocked` to issue `inputs.issue_number`. - Post one comment on that issue: `🛑 agent-team: max iterations reached at impl stage.` @@ -133,7 +135,7 @@ The iteration guard does not apply to this mode — a rebase is not a review att **Steps**: 1. Check out the PR branch: - - `gh pr view --json headRefName,state,isDraft` — confirm the PR is open and draft. If closed or merged, stop silently (nothing to do). + - `gh pr view --json headRefName,state,isDraft` — confirm the PR is open and draft. Stop silently (post no comment) if any of: PR is closed, merged, or `isDraft: false` (a human promoted it out of draft; they control it from here). - `git fetch origin && git checkout ` 2. Fetch `main`: @@ -146,7 +148,7 @@ The iteration guard does not apply to this mode — a rebase is not a review att 4. Run the project's test command once. Use the same test-command detection as `impl` mode (read `package.json` / `Makefile` / CI files). If no test command is detectable, skip and note that in step 5. - Tests pass → continue to step 5. 
- - Tests fail → escalate per "Conflict resolution" escalation format, stop. + - Tests fail → the rebase itself is already complete, so the CR section's `git rebase --abort` does not apply. Run `git reset --hard ORIG_HEAD` to restore the pre-rebase state, then escalate per "Conflict resolution" escalation format, stop. 5. Push: - `git push --force-with-lease origin HEAD:` From a4571dd29611ad25d0319bd815cd5f1e6edffcd7 Mon Sep 17 00:00:00 2001 From: Verky Yi Date: Thu, 23 Apr 2026 00:31:18 +0000 Subject: [PATCH 08/14] agent-team: add sweep-agent for periodic rebase dispatch Co-Authored-By: Claude Sonnet 4.6 --- catalog/agent-team/sweep-agent.md | 89 +++++++++++++++++++++++++++++++ 1 file changed, 89 insertions(+) create mode 100644 catalog/agent-team/sweep-agent.md diff --git a/catalog/agent-team/sweep-agent.md b/catalog/agent-team/sweep-agent.md new file mode 100644 index 0000000..c7373b6 --- /dev/null +++ b/catalog/agent-team/sweep-agent.md @@ -0,0 +1,89 @@ +--- +engine: claude +description: | + Sweep agent for the agent-team pattern. Runs on a schedule and on demand, + enumerates open draft PRs labeled `agent-team:pr`, and dispatches the + implementer in `rebase` mode for any that have fallen behind `main`. + No LLM reasoning on the diffs themselves — it's enumerate + ancestry + check + dispatch. + +on: + schedule: + - cron: "17 */6 * * *" + workflow_dispatch: {} + +concurrency: + group: agent-team-sweep + cancel-in-progress: false + +timeout-minutes: 5 + +permissions: + contents: read + issues: read + pull-requests: read + +network: + allowed: + - defaults + +checkout: + fetch-depth: 0 + +tools: + github: + toolsets: [default] + min-integrity: none + bash: true + +safe-outputs: + threat-detection: false + add-comment: + max: 1 + target: "*" + dispatch-workflow: + workflows: [implementer-agent] + max: 20 +--- + +# Sweep Agent + +You are the **sweep** for the agent-team pipeline. 
Your job: find open draft PRs labeled `agent-team:pr` that are behind `main`, and dispatch the implementer in `rebase` mode for each. + +## Steps + +1. List candidate PRs: + ``` + gh pr list --label agent-team:pr --state open --draft \ + --json number,headRefName,headRefOid,body --limit 50 + ``` + +2. For each PR in the list: + + a. Derive `issue_number` by parsing `Closes #` from the PR body. If no `Closes #N` marker exists, **skip that PR** (log the skip; do not dispatch). + + b. Check if the PR is behind `main`: + ``` + git fetch origin main --quiet + git merge-base --is-ancestor origin/main + ``` + - Exit code `0` → PR is current, skip it. + - Exit code `1` → PR is behind, dispatch (next step). + + c. Dispatch the implementer in rebase mode via the `dispatch-workflow` safe-output: + - workflow: `implementer-agent` + - inputs: + - `issue_number`: `` (from step 2a) + - `pr_number`: `` + - `iteration`: `"1"` (rebase mode bypasses the iteration guard; any value works) + - `mode`: `"rebase"` + +3. After the loop, if at least one dispatch was emitted, post one summary comment on the **repository's dashboard issue** (optional — only if a dashboard issue is configured; otherwise skip). Default: post no comment. The dispatched runs' logs are the audit trail. + +## Rules + +- Sweep never edits code, never rebases itself, never dispatches anything except the implementer in `rebase` mode. +- If `gh pr list` returns zero PRs, stop silently — no comment, no dispatch. +- If more than 20 PRs are behind (unusually large), dispatch the first 20 only. The next sweep run (6h later) picks up the rest. Prevents dispatch-workflow cap from erroring out. +- Sweep is idempotent — running it back-to-back produces zero extra dispatches (the second run sees all PRs current). +- Footer comment (only if one is posted): `🤖 agent-team / sweep`. 
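
Reviewer note: steps 2a and 2b of the sweep prompt above can be sketched in shell, under stated assumptions. `extract_issue_number`, `classify_ancestry_exit`, and `pr_status` are hypothetical helper names. Treating exit codes other than 0 and 1 as errors follows the documented behavior of `git merge-base --is-ancestor`, not anything the sweep prompt states.

```bash
#!/usr/bin/env bash
# Hypothetical helpers illustrating the sweep's per-PR checks.

# Step 2a: print N from the first "Closes #N" marker in a PR body.
# Prints nothing and returns 1 when the body has no such marker.
extract_issue_number() {
  local marker
  marker=$(printf '%s\n' "$1" | grep -oE 'Closes #[0-9]+' | head -n 1)
  [ -n "$marker" ] || return 1
  printf '%s\n' "${marker#"Closes #"}"
}

# Step 2b: translate a `git merge-base --is-ancestor` exit code.
# 0 = main is an ancestor of the PR head (current), 1 = PR is behind,
# any other code = git error, which should not trigger a dispatch.
classify_ancestry_exit() {
  case "$1" in
    0) echo "current" ;;
    1) echo "behind" ;;
    *) echo "error" ;;
  esac
}

# Run the ancestry check for one PR head commit and print its status.
pr_status() {
  local rc=0
  git merge-base --is-ancestor origin/main "$1" 2>/dev/null || rc=$?
  classify_ancestry_exit "$rc"
}
```

Separating the exit-code translation from the git call keeps the "skip on error" case explicit, so a missing commit or a shallow fetch is not mistaken for a stale PR.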
From f54c56cd786213d8a6eee1b8215da2dee59f1cdd Mon Sep 17 00:00:00 2001 From: Verky Yi Date: Thu, 23 Apr 2026 00:36:28 +0000 Subject: [PATCH 09/14] agent-team: tighten sweep prompt per review - Fetch main once before the loop (was per-PR) - Tighten idempotency wording to 're-entrant safe' - Remove dashboard-issue comment path + unused add-comment safe-output --- catalog/agent-team/sweep-agent.md | 13 +++++-------- 1 file changed, 5 insertions(+), 8 deletions(-) diff --git a/catalog/agent-team/sweep-agent.md b/catalog/agent-team/sweep-agent.md index c7373b6..da9581a 100644 --- a/catalog/agent-team/sweep-agent.md +++ b/catalog/agent-team/sweep-agent.md @@ -38,9 +38,6 @@ tools: safe-outputs: threat-detection: false - add-comment: - max: 1 - target: "*" dispatch-workflow: workflows: [implementer-agent] max: 20 @@ -52,8 +49,10 @@ You are the **sweep** for the agent-team pipeline. Your job: find open draft PRs ## Steps -1. List candidate PRs: +1. Fetch `main` once and list candidate PRs: ``` + git fetch origin main --quiet + gh pr list --label agent-team:pr --state open --draft \ --json number,headRefName,headRefOid,body --limit 50 ``` @@ -64,7 +63,6 @@ You are the **sweep** for the agent-team pipeline. Your job: find open draft PRs b. Check if the PR is behind `main`: ``` - git fetch origin main --quiet git merge-base --is-ancestor origin/main ``` - Exit code `0` → PR is current, skip it. @@ -78,12 +76,11 @@ You are the **sweep** for the agent-team pipeline. Your job: find open draft PRs - `iteration`: `"1"` (rebase mode bypasses the iteration guard; any value works) - `mode`: `"rebase"` -3. After the loop, if at least one dispatch was emitted, post one summary comment on the **repository's dashboard issue** (optional — only if a dashboard issue is configured; otherwise skip). Default: post no comment. The dispatched runs' logs are the audit trail. +3. After the loop, post no comment. 
The dispatched runs' logs (visible in the Actions tab, linked from each dispatched workflow run) are the audit trail. ## Rules - Sweep never edits code, never rebases itself, never dispatches anything except the implementer in `rebase` mode. - If `gh pr list` returns zero PRs, stop silently — no comment, no dispatch. - If more than 20 PRs are behind (unusually large), dispatch the first 20 only. The next sweep run (6h later) picks up the rest. Prevents dispatch-workflow cap from erroring out. -- Sweep is idempotent — running it back-to-back produces zero extra dispatches (the second run sees all PRs current). -- Footer comment (only if one is posted): `🤖 agent-team / sweep`. +- Sweep is re-entrant safe — back-to-back dispatches of the same PR produce at most one actual rebase push. If two sweeps overlap before the first's dispatched rebases complete, the second may re-dispatch; the implementer's rebase-mode ancestry check then exits silently with no push. From 69e450cf835481fb366d7efceb16238b92b4d37b Mon Sep 17 00:00:00 2001 From: Verky Yi Date: Thu, 23 Apr 2026 00:37:55 +0000 Subject: [PATCH 10/14] install-agent-team: include sweep-agent in install flow Co-Authored-By: Claude Sonnet 4.6 --- skills/install-agent-team/SKILL.md | 21 +++++++++++---------- 1 file changed, 11 insertions(+), 10 deletions(-) diff --git a/skills/install-agent-team/SKILL.md b/skills/install-agent-team/SKILL.md index 7a26c1e..530fbf6 100644 --- a/skills/install-agent-team/SKILL.md +++ b/skills/install-agent-team/SKILL.md @@ -1,11 +1,11 @@ --- name: install-agent-team -description: Install the full four-role agent-team pattern (spec → plan → impl → review) into the current repo as one unified setup. Use when the user wants an end-to-end agent pipeline driven by a single issue label, or types /install-agent-team. +description: Install the full agent-team pattern (spec → plan → impl → review + periodic sweep) into the current repo as one unified setup. 
Use when the user wants an end-to-end agent pipeline driven by a single issue label, or types /install-agent-team. --- # install-agent-team -Install all four agent-team workflows into the current repo in one pass: fetch, wire auth once, apply the OAuth tweak to every lockfile, create the labels, validate, commit. +Install all five agent-team workflows (spec, planner, implementer, reviewer, and the sweep that keeps PRs rebased) into the current repo in one pass: fetch, wire auth once, apply the OAuth tweak to every lockfile, create the labels, validate, commit. The result: the user dispatches a task by adding a single `agent-team` label to any issue. The four agents coordinate across the thread via structured comments and a small internal state machine. @@ -24,7 +24,7 @@ The result: the user dispatches a task by adding a single `agent-team` label to ### 1. Explain what's about to happen -One paragraph: four workflows will be added, one auth secret will be set, seven labels will be created, nothing runs until the user opens an issue and adds `agent-team`. Ask for explicit confirmation to proceed. The user must opt in — workflows run on push. +One paragraph: five workflows will be added (four pipeline roles + a sweep that runs every 6 hours to keep PRs rebased), one auth secret will be set, seven labels will be created, nothing runs until the user opens an issue and adds `agent-team`. Ask for explicit confirmation to proceed. The user must opt in — workflows run on push. ### 2. Preflight @@ -47,9 +47,9 @@ Pick the path per `skills/install-workflow/auth.md`: - Check `gh secret list` first — if `CLAUDE_CODE_OAUTH_TOKEN` (OAuth) or `ANTHROPIC_API_KEY` (API) already exists, reuse it. Do not re-prompt. - Otherwise guide the user through `claude setup-token` + `gh secret set CLAUDE_CODE_OAUTH_TOKEN`, or `gh secret set ANTHROPIC_API_KEY` directly. -Never echo or store the token. One secret covers all four workflows. +Never echo or store the token. 
One secret covers all five workflows.

-### 4. Install all four workflows
+### 4. Install all five workflows

 Run, in sequence (each `gh aw add` auto-compiles):

@@ -58,13 +58,14 @@ gh aw add verkyyi/github-agent-runner/catalog/agent-team/spec-agent.md@main
 gh aw add verkyyi/github-agent-runner/catalog/agent-team/planner-agent.md@main
 gh aw add verkyyi/github-agent-runner/catalog/agent-team/implementer-agent.md@main
 gh aw add verkyyi/github-agent-runner/catalog/agent-team/reviewer-agent.md@main
+gh aw add verkyyi/github-agent-runner/catalog/agent-team/sweep-agent.md@main
 ```

-If any fails, stop and surface the exact error — do not proceed with a partial install. The four are a unit; a half-installed pipeline dead-ends on the first handoff.
+If any fails, stop and surface the exact error — do not proceed with a partial install. The five are a unit; a half-installed pipeline dead-ends on the first handoff.

 ### 5. Apply the OAuth tweak (OAuth path only)

-For each of the four `.lock.yml` files just generated, apply the two-pass sed from `skills/install-workflow/auth.md` Step 3. Then verify grep counts on each file per auth.md Step 4. API-key path skips this step entirely.
+For each of the five `.lock.yml` files just generated, apply the two-pass sed from `skills/install-workflow/auth.md` Step 3. Then verify grep counts on each file per auth.md Step 4. API-key path skips this step entirely.

 ### 6. Create the labels

@@ -94,7 +95,7 @@ Runs against all lock files. Safe (no recompile).

 Show the user, in this order:

-- Four files added under `.github/workflows/` (name each `.md` + `.lock.yml` pair)
+- Five files added under `.github/workflows/` (name each `.md` + `.lock.yml` pair)
 - Secret configured (name only, never value) or reused
 - Tweak applied to N lock files (or "skipped — API-key path")
 - Seven labels created (or "N already existed, skipped")
@@ -105,7 +106,7 @@ Then ask whether to commit and push.
Do not commit without explicit confirmation

 ## Hard rules

-- **All or nothing**. If any of the four `gh aw add` calls fails, stop and back out. A half-installed pipeline is worse than none — users will dispatch tasks that stall silently.
+- **All or nothing**. If any of the five `gh aw add` calls fails, stop and back out. A half-installed pipeline is worse than none — users will dispatch tasks that stall silently.
 - Never write the workflow YAML by hand. Always delegate to `gh aw add`. The `.md` sources live in this plugin's `catalog/agent-team/`.
 - Never store or echo the auth token. Pipe through `gh secret set` stdin.
 - Never skip the `--exclude-env ANTHROPIC_API_KEY` carve-out when applying the OAuth tweak. See `skills/install-workflow/auth.md` for why.
@@ -128,7 +129,7 @@ Escape hatches at any time: remove a state label to pause, edit a comment to ste

 ## Out of scope for v0.1

-- Uninstalling the pipeline (remove the four `.md`/`.lock.yml` files + labels manually)
+- Uninstalling the pipeline (remove the five `.md`/`.lock.yml` files + labels manually)
 - Cross-repo install
 - Customizing max iterations without editing the workflow source
 - Turning individual roles on/off — the four are designed to work as a unit

From 87b77c9f24653c9a009b36f727024a92aa3eb5c1 Mon Sep 17 00:00:00 2001
From: Verky Yi
Date: Thu, 23 Apr 2026 00:39:39 +0000
Subject: [PATCH 11/14] agent-team: document sweep + rebase-handling in README

Co-Authored-By: Claude Sonnet 4.6
---
 catalog/agent-team/README.md | 20 ++++++++++++++++++--
 1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/catalog/agent-team/README.md b/catalog/agent-team/README.md
index 3aeb296..dbf222c 100644
--- a/catalog/agent-team/README.md
+++ b/catalog/agent-team/README.md
@@ -1,6 +1,6 @@
 # agent-team

-A four-workflow pattern for a spec → plan → implement → review pipeline on a single GitHub issue.
Each role is a separate gh-aw workflow; they coordinate by **dispatching the next workflow** via gh-aw's `dispatch-workflow` safe-output, passing typed inputs (issue number, iteration counter, optional PR number).
+A five-workflow pattern for a spec → plan → implement → review pipeline on a single GitHub issue, plus a sweep that keeps draft PRs rebased onto `main`. Each role is a separate gh-aw workflow; they coordinate by **dispatching the next workflow** via gh-aw's `dispatch-workflow` safe-output, passing typed inputs (issue number, iteration counter, optional PR number).

 > **Status**: reference pattern. Templates only — `.lock.yml` files are generated when you install into a target repo.

@@ -44,10 +44,24 @@ Each agent finishes its work by **emitting a `dispatch-workflow` safe-output** n
 issue_number, pr_number, iteration = iteration + 1 )
+
+ (Separately, on a 6-hour cron)
+ ┌─────────────┐ dispatch (issue_number, pr_number, mode=rebase)
+ │ sweep-agent │─────────────────────────► implementer-agent (for each stale PR)
+ └─────────────┘
 ```

 `state:*` labels (`plan-needed`, `impl-needed`, `review-needed`, `done`, `blocked`) are **cosmetic breadcrumbs for humans** — they let the GitHub UI show pipeline progress at a glance. They do **not** drive control flow; the `dispatch-workflow` safe-outputs do.

+## Rebase handling
+
+Draft PRs drift out of date as `main` advances. Two mechanisms keep them current, no human action required:
+
+1. **Rebase at start of every implementer run** — `impl` mode begins with `git fetch origin main && git rebase origin/main`. Catches drift within the pipeline (initial impl, kickback cycles).
+2. **Scheduled sweep** — `sweep-agent.md` runs every 6 hours (and on-demand via `workflow_dispatch`). It lists open `agent-team:pr` draft PRs, checks each for `main`-ancestry, and dispatches the implementer in `rebase` mode for any that are behind. Catches the common "PR sat waiting for human merge, main moved" case.
+
+Both paths share the same escalation rule: mechanical conflicts resolve silently; semantic conflicts (overlapping logic, tests fail after resolve) escalate via `state:blocked` with a targeted comment. The human sees the PR only when it's ready to merge or when their judgment is needed.
+
 ## The comment contract

 Agents communicate their work product via fenced HTML-comment blocks, which downstream agents grep out of the issue body + comments. Never rely on prose ordering.
@@ -69,6 +83,7 @@ Sections: `spec`, `plan`, `review`. Each carries the `iteration` at the time it
 | `planner-agent.md` | `workflow_dispatch` (issue_number, iteration) | `implementer-agent` (issue_number, iteration) |
 | `implementer-agent.md` | `workflow_dispatch` (issue_number, iteration, pr_number?) | `reviewer-agent` (issue_number, pr_number, iteration) |
 | `reviewer-agent.md` | `workflow_dispatch` (pr_number, issue_number, iteration) | `implementer-agent` on kickback (iteration+1), else nothing |
+| `sweep-agent.md` | `schedule` (every 6h) + `workflow_dispatch` | `implementer-agent` in `rebase` mode, per stale PR |

 ## Install

@@ -78,7 +93,7 @@ Use the bundled skill — it's the supported path:
 /install-agent-team
 ```

-One flow installs all four workflows, wires auth once, applies the OAuth tweak to every lockfile, and creates the seven labels. See [`skills/install-agent-team/SKILL.md`](../../skills/install-agent-team/SKILL.md).
+One flow installs all five workflows, wires auth once, applies the OAuth tweak to every lockfile, and creates the seven labels. See [`skills/install-agent-team/SKILL.md`](../../skills/install-agent-team/SKILL.md).
 Manual install (advanced)
@@ -88,6 +103,7 @@ gh aw add verkyyi/github-agent-runner/catalog/agent-team/spec-agent.md@main
 gh aw add verkyyi/github-agent-runner/catalog/agent-team/planner-agent.md@main
 gh aw add verkyyi/github-agent-runner/catalog/agent-team/implementer-agent.md@main
 gh aw add verkyyi/github-agent-runner/catalog/agent-team/reviewer-agent.md@main
+gh aw add verkyyi/github-agent-runner/catalog/agent-team/sweep-agent.md@main
 ```

 Then apply the OAuth token tweak to each `.lock.yml` per [`skills/install-workflow/auth.md`](../../skills/install-workflow/auth.md), and create the labels (see the skill file for the exact `gh label create` commands).

From a34105aaee3bccc9645e0820db3ffd70a3b8d1b3 Mon Sep 17 00:00:00 2001
From: Verky Yi
Date: Thu, 23 Apr 2026 00:41:03 +0000
Subject: [PATCH 12/14] tests: cover sweep-agent in install-agent-team behavior suite

Co-Authored-By: Claude Sonnet 4.6
---
 tests/test-install-agent-team.sh | 23 ++++++++++++++++-------
 1 file changed, 16 insertions(+), 7 deletions(-)

diff --git a/tests/test-install-agent-team.sh b/tests/test-install-agent-team.sh
index a86af93..20ae18b 100755
--- a/tests/test-install-agent-team.sh
+++ b/tests/test-install-agent-team.sh
@@ -11,17 +11,18 @@ source "$SCRIPT_DIR/test-helpers.sh"
 echo "=== Test: install-agent-team skill ==="
 echo ""

-# Test 1: Load skill + name all four roles.
-echo "Test 1: Skill loads and names all four roles..."
-output=$(run_claude "Use the Skill tool to load the github-agent-runner plugin's install-agent-team skill (full SKILL.md, not just the description). Then list the four roles it installs by name — the skill calls them out explicitly. Count and confirm there are exactly four." 180)
+# Test 1: Load skill + name all five workflows.
+echo "Test 1: Skill loads and names all five workflows..."
+output=$(run_claude "Use the Skill tool to load the github-agent-runner plugin's install-agent-team skill (full SKILL.md, not just the description).
Then list the five workflows it installs by name — the skill calls them out explicitly. Count and confirm there are exactly five." 180)
 # Deterministic: check the session transcript for a Skill tool invocation
 # against install-agent-team.
 assert_skill_used "install-agent-team" "Skill tool invoked for install-agent-team" || exit 1
-assert_contains "$output" "four" "Mentions four (roles/workflows/agents)" || exit 1
+assert_contains "$output" "five" "Mentions five (workflows)" || exit 1
 assert_contains "$output" "spec" "Names the spec role" || exit 1
 assert_contains "$output" "plan" "Names the planner role" || exit 1
 assert_contains "$output" "implement" "Names the implementer role" || exit 1
 assert_contains "$output" "review" "Names the reviewer role" || exit 1
+assert_contains "$output" "sweep" "Names the sweep workflow" || exit 1

 echo ""

@@ -45,7 +46,7 @@ echo ""

 # Test 3: Atomic install.
 echo "Test 3: All-or-nothing install; partial state is unacceptable..."
-output=$(run_claude "Load the install-agent-team skill. Per its Hard rules, if one of the four 'gh aw add' calls fails mid-install, what does the skill do — continue, skip, or abort? And why (what does the skill say about a half-installed pipeline)? Quote the exact hard-rule text." 180)
+output=$(run_claude "Load the install-agent-team skill. Per its Hard rules, if one of the five 'gh aw add' calls fails mid-install, what does the skill do — continue, skip, or abort? And why (what does the skill say about a half-installed pipeline)? Quote the exact hard-rule text." 180)
 assert_contains "$output" "stop|abort|back out|halt|all.or.nothing|not proceed|does not proceed" "Stops on partial failure" || exit 1
 assert_contains "$output" "half|partial|stall|dead.?end|unit" "Explains why a partial install is bad" || exit 1

@@ -53,9 +54,9 @@ echo ""

 # Test 4: Auth configured once; tweak applied to every lockfile.
 echo "Test 4: Auth wired once; OAuth tweak applied to every lockfile..."
-output=$(run_claude "Load the install-agent-team skill. Does it configure the Claude auth secret once (reused across all four workflows) or set it separately per workflow? And does it apply the OAuth post-compile tweak to one .lock.yml file or to every generated .lock.yml? Quote the specific steps from the skill." 180)
+output=$(run_claude "Load the install-agent-team skill. Does it configure the Claude auth secret once (reused across all five workflows) or set it separately per workflow? And does it apply the OAuth post-compile tweak to one .lock.yml file or to every generated .lock.yml? Quote the specific steps from the skill." 180)
 assert_contains "$output" "once|single|one.*secret|reuse" "Auth configured once" || exit 1
-assert_contains "$output" "all|every|each|four" "Tweak applied to every lockfile" || exit 1
+assert_contains "$output" "all|every|each|five" "Tweak applied to every lockfile" || exit 1
 assert_contains "$output" "lock\\.yml|\\.lock\\.yml|lockfile" "References the lock files" || exit 1

 echo ""
@@ -76,5 +77,13 @@ output=$(run_claude "Load the install-agent-team skill. Quote its Hard rules abo
 assert_contains "$output" "never|does not|doesn't|no" "Hard rules are upheld" || exit 1
 assert_contains "$output" "gh aw add|delegate" "Delegates workflow generation" || exit 1

+echo ""
+
+# Test 7: Rebase behavior is part of the installed pipeline.
+echo "Test 7: Skill mentions automatic rebase handling..."
+output=$(run_claude "Load the install-agent-team skill. Does the installed pipeline do anything automatic about keeping draft PRs rebased onto main, or does the user have to rebase by hand? Quote the specific workflow or behavior."
180)
+assert_contains "$output" "sweep|rebase" "Mentions sweep or rebase" || exit 1
+assert_contains "$output" "automat|without|silently|no.*action" "Frames it as automatic" || exit 1
+
 echo ""
 echo "=== All install-agent-team tests passed ==="

From bcee624209867f6a8bf4ad85333f5c29d2717e6a Mon Sep 17 00:00:00 2001
From: Verky Yi
Date: Thu, 23 Apr 2026 00:42:02 +0000
Subject: [PATCH 13/14] tests: extend e2e install check to cover sweep-agent

Co-Authored-By: Claude Sonnet 4.6
---
 tests/test-e2e-install-agent-team.sh | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/tests/test-e2e-install-agent-team.sh b/tests/test-e2e-install-agent-team.sh
index d3fc5d0..a983b79 100755
--- a/tests/test-e2e-install-agent-team.sh
+++ b/tests/test-e2e-install-agent-team.sh
@@ -3,7 +3,7 @@
 #
 # Unlike test-e2e.sh (which tests an already-installed pipeline), this test
 # verifies the SKILL itself: given a clean repo, does invoking install-agent-team
-# produce the expected end-state (4 compiled lockfiles with OAuth tweak + 7
+# produce the expected end-state (5 compiled lockfiles with OAuth tweak + 7
 # labels + no hand-edits needed)?
 #
 # Usage:
@@ -117,7 +117,7 @@ echo " Set CLAUDE_CODE_OAUTH_TOKEN"
 # 3. Invoke the skill via claude -p against the repo clone
 echo ""
 echo "-- Invoking /install-agent-team via claude -p --"
-PROMPT="We are in a fresh clone of github repo $FULL. The repo already has CLAUDE_CODE_OAUTH_TOKEN set as a secret (skip the 'claude setup-token' step in your install flow — confirm via gh secret list and proceed). Execute the /install-agent-team skill end-to-end: install all four agent-team workflows, apply the OAuth tweak to every lockfile, create the seven labels, validate. Commit and push all changes to origin/main. Do not pause for confirmations — proceed autonomously. When done, print 'SKILL_E2E_DONE' on its own line."
+PROMPT="We are in a fresh clone of github repo $FULL.
The repo already has CLAUDE_CODE_OAUTH_TOKEN set as a secret (skip the 'claude setup-token' step in your install flow — confirm via gh secret list and proceed). Execute the /install-agent-team skill end-to-end: install all five agent-team workflows (including the sweep), apply the OAuth tweak to every lockfile, create the seven labels, validate. Commit and push all changes to origin/main. Do not pause for confirmations — proceed autonomously. When done, print 'SKILL_E2E_DONE' on its own line."

 cd "$WORKDIR/repo"
 claude -p "$PROMPT" \
@@ -135,7 +135,7 @@ rm -rf "$WORKDIR/verify"
 git clone "https://github.com/$FULL.git" "$WORKDIR/verify" --quiet
 cd "$WORKDIR/verify"

-for wf in spec-agent planner-agent implementer-agent reviewer-agent; do
+for wf in spec-agent planner-agent implementer-agent reviewer-agent sweep-agent; do
   [ -f ".github/workflows/${wf}.md" ] && pass "workflow source committed: ${wf}.md" \
     || fail "missing workflow source: ${wf}.md"
   [ -f ".github/workflows/${wf}.lock.yml" ] && pass "lockfile committed: ${wf}.lock.yml" \
@@ -153,6 +153,13 @@ done

 cd - >/dev/null

+# Sweep workflow registered with Actions
+if gh workflow list --repo "$FULL" --json name,path --jq '.[] | select(.path | contains("sweep-agent")) | .name' | grep -q .; then
+  pass "sweep-agent workflow registered with Actions"
+else
+  fail "sweep-agent workflow not registered"
+fi
+
 # Labels
 want_labels=(agent-team state:plan-needed state:impl-needed state:review-needed state:done state:blocked agent-team:reviewed)
 have=$(gh label list --repo "$FULL" --limit 50 --json name --jq '[.[].name] | join(",")')

From b21a660a1147215d62b5c0cc6fe6f08b21f8add2 Mon Sep 17 00:00:00 2001
From: Verky Yi
Date: Thu, 23 Apr 2026 00:47:48 +0000
Subject: [PATCH 14/14] agent-team: create agent-team:pr label at install time
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Without this label the sweep never finds PRs to rebase — the implementer already tags new PRs with
agent-team:pr but the install skill wasn't creating the label, so gh pr list --label agent-team:pr returned empty.

Also gitignore stray catalog/**/*.lock.yml artifacts left by local gh aw compile during development.

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 .gitignore                           | 3 +++
 catalog/agent-team/README.md         | 4 ++--
 skills/install-agent-team/SKILL.md   | 7 ++++---
 tests/test-e2e-install-agent-team.sh | 6 +++---
 tests/test-install-agent-team.sh     | 4 ++--
 5 files changed, 14 insertions(+), 10 deletions(-)

diff --git a/.gitignore b/.gitignore
index 4db4729..146e620 100644
--- a/.gitignore
+++ b/.gitignore
@@ -3,3 +3,6 @@ node_modules/
 *.log
 .env
 .env.local
+
+# gh aw compile artifacts in catalog sources (local dev only)
+catalog/**/*.lock.yml
diff --git a/catalog/agent-team/README.md b/catalog/agent-team/README.md
index dbf222c..c51498d 100644
--- a/catalog/agent-team/README.md
+++ b/catalog/agent-team/README.md
@@ -93,7 +93,7 @@ Use the bundled skill — it's the supported path:
 /install-agent-team
 ```

-One flow installs all five workflows, wires auth once, applies the OAuth tweak to every lockfile, and creates the seven labels. See [`skills/install-agent-team/SKILL.md`](../../skills/install-agent-team/SKILL.md).
+One flow installs all five workflows, wires auth once, applies the OAuth tweak to every lockfile, and creates the eight labels. See [`skills/install-agent-team/SKILL.md`](../../skills/install-agent-team/SKILL.md).
 Manual install (advanced)
@@ -113,7 +113,7 @@ Then apply the OAuth token tweak to each `.lock.yml` per [`skills/install-workfl

 - Repo Actions settings: **Read and write** permissions + **Allow GitHub Actions to create and approve pull requests**.
 - Either `CLAUDE_CODE_OAUTH_TOKEN` (subscription) or `ANTHROPIC_API_KEY` repo secret.
-- Labels (`agent-team`, `state:plan-needed`, `state:impl-needed`, `state:review-needed`, `state:done`, `state:blocked`, `agent-team:reviewed`) — the install skill creates them.
+- Labels (`agent-team`, `agent-team:pr`, `state:plan-needed`, `state:impl-needed`, `state:review-needed`, `state:done`, `state:blocked`, `agent-team:reviewed`) — the install skill creates them.

 ## Kicking off a task

diff --git a/skills/install-agent-team/SKILL.md b/skills/install-agent-team/SKILL.md
index 530fbf6..621843a 100644
--- a/skills/install-agent-team/SKILL.md
+++ b/skills/install-agent-team/SKILL.md
@@ -24,7 +24,7 @@ The result: the user dispatches a task by adding a single `agent-team` label to

 ### 1. Explain what's about to happen

-One paragraph: five workflows will be added (four pipeline roles + a sweep that runs every 6 hours to keep PRs rebased), one auth secret will be set, seven labels will be created, nothing runs until the user opens an issue and adds `agent-team`. Ask for explicit confirmation to proceed. The user must opt in — workflows run on push.
+One paragraph: five workflows will be added (four pipeline roles + a sweep that runs every 6 hours to keep PRs rebased), one auth secret will be set, eight labels will be created, nothing runs until the user opens an issue and adds `agent-team`. Ask for explicit confirmation to proceed. The user must opt in — workflows run on push.

 ### 2.
Preflight

@@ -73,6 +73,7 @@ Create these labels (idempotent — ignore "already exists" errors):

 ```bash
 gh label create agent-team --color 7C3AED --description "Opt-in marker for the agent-team pipeline" --force
+gh label create agent-team:pr --color 8B5CF6 --description "agent-team: PR tag for sweep enumeration" --force
 gh label create state:plan-needed --color FEF08A --description "agent-team: ready for the planner" --force
 gh label create state:impl-needed --color FCD34D --description "agent-team: ready for the implementer" --force
 gh label create state:review-needed --color FDBA74 --description "agent-team: ready for the reviewer" --force
@@ -81,7 +82,7 @@ gh label create state:blocked --color F87171 --description "agent-team
 gh label create agent-team:reviewed --color A7F3D0 --description "agent-team: PR has been reviewed" --force
 ```

-(The implementer also adds an `agent-team` label to the PR it opens. Same label as the issue entry — one label, two roles: opt-in on issues, reviewer trigger on PRs.)
+(The implementer applies both `agent-team` and `agent-team:pr` labels to PRs it opens. `agent-team` marks it as belonging to the pipeline; `agent-team:pr` is what the sweep filters on to find draft PRs needing rebase.)

 ### 7. Validate

@@ -98,7 +99,7 @@ Show the user, in this order:
 - Five files added under `.github/workflows/` (name each `.md` + `.lock.yml` pair)
 - Secret configured (name only, never value) or reused
 - Tweak applied to N lock files (or "skipped — API-key path")
-- Seven labels created (or "N already existed, skipped")
+- Eight labels created (or "N already existed, skipped")
 - **How to dispatch a task**: *"Open an issue describing what you want built. Add the `agent-team` label. Done."*
 - Reminder: `gh aw compile` reverts the OAuth tweak. Re-apply on every recompile. `gh aw validate` is safe.
diff --git a/tests/test-e2e-install-agent-team.sh b/tests/test-e2e-install-agent-team.sh
index a983b79..39147d4 100755
--- a/tests/test-e2e-install-agent-team.sh
+++ b/tests/test-e2e-install-agent-team.sh
@@ -3,7 +3,7 @@
 #
 # Unlike test-e2e.sh (which tests an already-installed pipeline), this test
 # verifies the SKILL itself: given a clean repo, does invoking install-agent-team
-# produce the expected end-state (5 compiled lockfiles with OAuth tweak + 7
+# produce the expected end-state (5 compiled lockfiles with OAuth tweak + 8
 # labels + no hand-edits needed)?
 #
 # Usage:
@@ -117,7 +117,7 @@ echo " Set CLAUDE_CODE_OAUTH_TOKEN"
 # 3. Invoke the skill via claude -p against the repo clone
 echo ""
 echo "-- Invoking /install-agent-team via claude -p --"
-PROMPT="We are in a fresh clone of github repo $FULL. The repo already has CLAUDE_CODE_OAUTH_TOKEN set as a secret (skip the 'claude setup-token' step in your install flow — confirm via gh secret list and proceed). Execute the /install-agent-team skill end-to-end: install all five agent-team workflows (including the sweep), apply the OAuth tweak to every lockfile, create the seven labels, validate. Commit and push all changes to origin/main. Do not pause for confirmations — proceed autonomously. When done, print 'SKILL_E2E_DONE' on its own line."
+PROMPT="We are in a fresh clone of github repo $FULL. The repo already has CLAUDE_CODE_OAUTH_TOKEN set as a secret (skip the 'claude setup-token' step in your install flow — confirm via gh secret list and proceed). Execute the /install-agent-team skill end-to-end: install all five agent-team workflows (including the sweep), apply the OAuth tweak to every lockfile, create the eight labels, validate. Commit and push all changes to origin/main. Do not pause for confirmations — proceed autonomously. When done, print 'SKILL_E2E_DONE' on its own line."
 cd "$WORKDIR/repo"
 claude -p "$PROMPT" \
@@ -161,7 +161,7 @@ else
 fi

 # Labels
-want_labels=(agent-team state:plan-needed state:impl-needed state:review-needed state:done state:blocked agent-team:reviewed)
+want_labels=(agent-team agent-team:pr state:plan-needed state:impl-needed state:review-needed state:done state:blocked agent-team:reviewed)
 have=$(gh label list --repo "$FULL" --limit 50 --json name --jq '[.[].name] | join(",")')
 for lbl in "${want_labels[@]}"; do
   echo "$have" | grep -q "$lbl" && pass "label created: $lbl" || fail "label missing: $lbl"
diff --git a/tests/test-install-agent-team.sh b/tests/test-install-agent-team.sh
index 20ae18b..d6a44c6 100755
--- a/tests/test-install-agent-team.sh
+++ b/tests/test-install-agent-team.sh
@@ -62,8 +62,8 @@ assert_contains "$output" "lock\\.yml|\\.lock\\.yml|lockfile" "References the lo
 echo ""

 # Test 5: Label creation is part of install.
-echo "Test 5: All seven labels created by the installer (entry + state:* + reviewed)..."
-output=$(run_claude "Load the install-agent-team skill. Does it create the required GitHub labels as part of installation, or does it expect the user to create them? List the label names it creates (per the skill's 'Create the labels' step) — there should be seven." 180)
+echo "Test 5: All eight labels created by the installer (entry + pr tag + state:* + reviewed)..."
+output=$(run_claude "Load the install-agent-team skill. Does it create the required GitHub labels as part of installation, or does it expect the user to create them? List the label names it creates (per the skill's 'Create the labels' step) — there should be eight." 180)
 assert_contains "$output" "creates?|gh label create|sets? up" "Installer creates labels" || exit 1
 assert_contains "$output" "agent-team" "Creates the agent-team entry label" || exit 1
 assert_contains "$output" "state:plan-needed|state:impl-needed|state:review-needed" "Creates the state:* labels" || exit 1