From bf9d997f3db58875e6589d01693a7ad6eebad9ed Mon Sep 17 00:00:00 2001
From: "github-actions[bot]"
 <41898282+github-actions[bot]@users.noreply.github.com>
Date: Tue, 21 Apr 2026 11:00:07 +0000
Subject: [PATCH] docs: fix label names, add Publishing section, correct test
 runner flags

Four documentation gaps found during routine doc sweep:

- README.md: Uninstall section listed `state:in-progress` for agent-team
  label cleanup, but the label the install skill actually creates is
  `agent-team:reviewed`. Fixed to match the skill and catalog/agent-team
  README verbatim.

- README.md: Added missing `## Publishing` section. CONTRIBUTING.md has a
  "Publishing (maintainers only)" heading that links to `README.md#publishing`,
  but that anchor did not exist, leaving a broken reference.

- CONTRIBUTING.md: Testing section said "no automated test harness" but
  `tests/run-tests.sh` runs tier-2 invariants + tier-1 skill assertions on
  every CI run. Rewrote the section to reference the runner and point to
  tests/README.md.

- tests/README.md: Table row for `test-e2e.sh` said to opt in via `--tier3`,
  a flag that does not exist in `run-tests.sh`. Clarified that `test-e2e.sh`
  must be invoked directly and is never managed by the runner.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
---
 CONTRIBUTING.md | 18 ++++++++++++++----
 README.md       | 10 +++++++++-
 tests/README.md |  2 +-
 3 files changed, 24 insertions(+), 6 deletions(-)
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index 1c2721e..f08163a 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -74,12 +74,22 @@ Edit the relevant `SKILL.md` or data file. Test by running the skill locally wit
 
 ## Testing
 
-There is no automated test harness for skills — they are instruction sets interpreted by Claude Code, not code with unit tests. The validation steps are:
+Skills are instruction sets interpreted by Claude Code, so the test suite uses headless Claude invocations to assert skill behavior rather than unit tests. Run all fast checks with:
+
+```bash
+./tests/run-tests.sh                    # tier-2 invariants + tier-1 skill assertions (~4-5 min)
+./tests/run-tests.sh --verbose          # show per-assertion output
+./tests/run-tests.sh --test test-install-workflow.sh
+./tests/run-tests.sh --with-e2e        # also run the daily E2E (~5-7 min extra)
+```
+
+See [tests/README.md](tests/README.md) for a full tier breakdown, expected runtimes, and how to add a test.
+
+Additional validation steps for things the test suite doesn't cover:
 
 1. **Load the plugin**: `claude --plugin-dir .` — confirm no startup errors.
-2. **Run the skill manually**: invoke `/discover-workflows` or `/install-workflow` and walk through the flow.
-3. **Validate lock files** (if you changed `.lock.yml` files): `gh aw validate` — safe, does not recompile.
-4. **Check grep counts** (if you applied the OAuth tweak): see [skills/install-workflow/auth.md](skills/install-workflow/auth.md#step-4--verify-the-tweak-shape).
+2. **Validate lock files** (if you changed `.lock.yml` files): `gh aw validate` — safe, does not recompile.
+3. **Check grep counts** (if you applied the OAuth tweak): see [skills/install-workflow/auth.md](skills/install-workflow/auth.md#step-4--verify-the-tweak-shape).
 
 Never test by committing untested changes to `main`. The installed workflows run on push to `main`, so a broken install skill or a bad `.lock.yml` will trigger a live workflow run.
 
diff --git a/README.md b/README.md
index e18e51a..a5878f5 100644
--- a/README.md
+++ b/README.md
@@ -80,7 +80,7 @@ To remove workflows this plugin installed into your target repo:
 
 - `gh aw remove <workflow>` for each installed workflow (deletes both the `.md` source and the compiled `.lock.yml`), then commit the deletion.
 - `gh secret delete CLAUDE_CODE_OAUTH_TOKEN` — or `ANTHROPIC_API_KEY`, whichever path you used — to unset the auth secret.
-- For `agent-team` specifically, also delete the seven labels: `gh label delete agent-team` plus `gh label delete state:<name>` for each of `plan-needed`, `impl-needed`, `review-needed`, `done`, `blocked`, and `in-progress`.
+- For `agent-team` specifically, also delete the seven labels: `gh label delete agent-team`, `gh label delete agent-team:reviewed`, plus `gh label delete state:<name>` for each of `plan-needed`, `impl-needed`, `review-needed`, `done`, and `blocked`.
 
 Nothing else is persisted — the plugin writes only to your target repo (under user approval) and holds no local state outside Claude Code's own plugin directory.
 
@@ -123,6 +123,14 @@ catalog/
 `.lock.yml` files are marked `linguist-generated` and `merge=ours` in `.gitattributes` to prevent spurious merge conflicts.
 </details>
 
+## Publishing (maintainers only)
+
+1. Bump `"version"` in `.claude-plugin/plugin.json` and update the `v<version>` status badge near the top of this README (the `test-invariants` check enforces they stay in sync).
+2. Commit to `main`.
+3. Create and push a tag: `git tag v<version> && git push origin v<version>`.
+4. Draft a GitHub release from the tag — the release body is the human-readable changelog entry.
+5. The marketplace URL (`https://raw.githubusercontent.com/verkyyi/github-agent-runner/main/.claude-plugin/marketplace.json`) is stable; existing users pick up the new version automatically on next plugin refresh. For new registry listings, follow the submission flows at [claude-plugins.dev](https://claude-plugins.dev) and [ClaudePluginHub](https://claudepluginhub.com).
+
 ## Credits
 
 Built on two open-source projects from the [GitHub Next](https://githubnext.com) team:
diff --git a/tests/README.md b/tests/README.md
index 8208b3e..c246c2c 100644
--- a/tests/README.md
+++ b/tests/README.md
@@ -48,7 +48,7 @@ Each run burns modest tokens against your Claude account.
 | 1 | `test-discover-workflows.sh` | Skill loads; mentions `githubnext/agentics`; runtime fetch (no static catalog); fail-stop on upstream error | ~1min |
 | 1 | `test-install-workflow.sh`   | Skill loads; mentions `gh aw add` + `gh secret set`; documents both auth paths; understands the `--exclude-env` carve-out; hard rules (never writes YAML by hand, never stores tokens) | ~2min |
 | 1 | `test-install-agent-team.sh` | Skill loads; pitches the four roles (spec/plan/impl/review); one-label dispatch via `agent-team`; atomic install (all-or-nothing); auth wired once; OAuth tweak applied to every lockfile; creates the `state:*` label set; inherits the no-hand-written-YAML / no-token-echo hard rules | ~2min |
-| 3 | `test-e2e.sh` (opt-in) | Real pipeline run on `verkyyi/agent-team-playground`: opens a unique canned issue, labels `agent-team`, polls until terminal state, asserts PR exists with `Closes #N` + test-status section + reviewer verdict + pipeline-summary comment. Collects per-stage timings and compares to last run — yellow flag if any stage or total wall-clock exceeds 150% of baseline. **Not** run by default; opt in via `--tier3` (see below). | ~20-35min |
+| 3 | `test-e2e.sh` (opt-in) | Real pipeline run on `verkyyi/agent-team-playground`: opens a unique canned issue, labels `agent-team`, polls until terminal state, asserts PR exists with `Closes #N` + test-status section + reviewer verdict + pipeline-summary comment. Collects per-stage timings and compares to last run — yellow flag if any stage or total wall-clock exceeds 150% of baseline. **Never** in the runner — invoke directly: `./tests/test-e2e.sh` (see below). | ~20-35min |
 | 3-skill | `test-e2e-install-workflow.sh` (opt-in, destructive) | Skill E2E for `/install-workflow`: `gh repo create`s a throwaway private repo, pre-seeds the OAuth secret from SSM, invokes the skill via `claude -p --plugin-dir <repo>` with a target workflow (default `daily-repo-status`). Asserts the `.md` + `.lock.yml` are committed, frontmatter has `engine: claude` + `source: githubnext/agentics`, the two-pass OAuth tweak was correctly applied (API≥2, OAUTH≥5, `--exclude-env ANTHROPIC_API_KEY` preserved), and skill printed its completion marker. Single-workflow path is the plugin's core; test exercises Step-5's engine-fix-and-recompile flow since upstream agentics omits `engine:`. Deletes the repo on success (`--keep` to retain). | ~2-4min |
 | 3-skill | `test-e2e-install-agent-team.sh` (opt-in, destructive) | Skill E2E for `/install-agent-team` (the multi-workflow installer). Same throwaway-repo pattern: asserts all four workflows committed + OAuth tweak on every lockfile + all seven labels created + skill printed its completion marker. | ~5-8min |