18 changes: 14 additions & 4 deletions CONTRIBUTING.md
@@ -74,12 +74,22 @@ Edit the relevant `SKILL.md` or data file. Test by running the skill locally wit

## Testing

There is no automated test harness for skills — they are instruction sets interpreted by Claude Code, not code with unit tests. The validation steps are:
Skills are instruction sets interpreted by Claude Code, so the test suite uses headless Claude invocations to assert skill behavior rather than unit tests. Run all fast checks with:

```bash
./tests/run-tests.sh # tier-2 invariants + tier-1 skill assertions (~4-5 min)
./tests/run-tests.sh --verbose # show per-assertion output
./tests/run-tests.sh --test test-install-workflow.sh
./tests/run-tests.sh --with-e2e # also run the daily E2E (~5-7 min extra)
```

See [tests/README.md](tests/README.md) for a full tier breakdown, expected runtimes, and how to add a test.
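
As a rough illustration of what "asserting skill behavior" can look like, here is a minimal sketch of a substring assertion over captured output. The helper name `assert_contains` and the prompt text are hypothetical and are not taken from the actual test scripts; the real suite may be structured differently. A canned string stands in for the headless `claude -p --plugin-dir .` call so the sketch runs without a Claude account.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not from the real suite): succeed when the captured
# output in $1 contains the fixed string in $2.
assert_contains() {
  if printf '%s\n' "$1" | grep -qF -- "$2"; then
    echo "PASS: found '$2'"
  else
    echo "FAIL: missing '$2'"
    return 1
  fi
}

# In a real test the output would come from a headless run, for example:
#   output="$(claude -p --plugin-dir . "How does /install-workflow add a workflow?")"
# The prompt above is an assumption. A canned reply is used here instead.
output='The skill runs gh aw add, then gh secret set for the auth token.'

assert_contains "$output" "gh aw add"
assert_contains "$output" "gh secret set"
```

Fixed-string matching (`grep -F`) keeps the assertion from misreading characters in the expected text as regex syntax.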

Additional validation steps for things the test suite doesn't cover:

1. **Load the plugin**: `claude --plugin-dir .` — confirm no startup errors.
2. **Run the skill manually**: invoke `/discover-workflows` or `/install-workflow` and walk through the flow.
3. **Validate lock files** (if you changed `.lock.yml` files): `gh aw validate` — safe, does not recompile.
4. **Check grep counts** (if you applied the OAuth tweak): see [skills/install-workflow/auth.md](skills/install-workflow/auth.md#step-4--verify-the-tweak-shape).
2. **Validate lock files** (if you changed `.lock.yml` files): `gh aw validate` — safe, does not recompile.
3. **Check grep counts** (if you applied the OAuth tweak): see [skills/install-workflow/auth.md](skills/install-workflow/auth.md#step-4--verify-the-tweak-shape).

Never test by committing untested changes to `main`. The installed workflows run on push to `main`, so a broken install skill or a bad `.lock.yml` will trigger a live workflow run.

10 changes: 9 additions & 1 deletion README.md
@@ -80,7 +80,7 @@ To remove workflows this plugin installed into your target repo:

- `gh aw remove <workflow>` for each installed workflow (deletes both the `.md` source and the compiled `.lock.yml`), then commit the deletion.
- `gh secret delete CLAUDE_CODE_OAUTH_TOKEN` — or `ANTHROPIC_API_KEY`, whichever path you used — to unset the auth secret.
- For `agent-team` specifically, also delete the seven labels: `gh label delete agent-team` plus `gh label delete state:<name>` for each of `plan-needed`, `impl-needed`, `review-needed`, `done`, `blocked`, and `in-progress`.
- For `agent-team` specifically, also delete the seven labels: `gh label delete agent-team`, `gh label delete agent-team:reviewed`, plus `gh label delete state:<name>` for each of `plan-needed`, `impl-needed`, `review-needed`, `done`, and `blocked`.

Nothing else is persisted — the plugin writes only to your target repo (under user approval) and holds no local state outside Claude Code's own plugin directory.
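
The `agent-team` label cleanup above can be sketched as a small dry run. The function name `agent_team_labels` is hypothetical, and the label list follows the updated wording (`agent-team`, `agent-team:reviewed`, and five `state:*` labels). The script only prints the commands; remove the `echo` to actually run them against the current repo.

```bash
#!/usr/bin/env bash
# Hypothetical helper: list the seven labels named in the uninstall steps.
agent_team_labels() {
  printf '%s\n' agent-team agent-team:reviewed
  for state in plan-needed impl-needed review-needed done blocked; do
    printf 'state:%s\n' "$state"
  done
}

# Dry run: print one `gh label delete` command per label.
agent_team_labels | while read -r label; do
  echo gh label delete "$label"
done
```

Printing first makes it easy to confirm the list before deleting anything, since label deletion cannot be undone from the CLI.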

@@ -123,6 +123,14 @@ catalog/
`.lock.yml` files are marked `linguist-generated` and `merge=ours` in `.gitattributes` to prevent spurious merge conflicts.
</details>

## Publishing (maintainers only)

1. Bump `"version"` in `.claude-plugin/plugin.json` and update the `v<version>` status badge near the top of this README (the `test-invariants` check enforces they stay in sync).
2. Commit to `main`.
3. Create and push a tag: `git tag v<version> && git push origin v<version>`.
4. Draft a GitHub release from the tag — the release body is the human-readable changelog entry.
5. The marketplace URL (`https://raw.githubusercontent.com/verkyyi/github-agent-runner/main/.claude-plugin/marketplace.json`) is stable; existing users pick up the new version automatically on next plugin refresh. For new registry listings, follow the submission flows at [claude-plugins.dev](https://claude-plugins.dev) and [ClaudePluginHub](https://claudepluginhub.com).
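
For step 1, a local pre-commit check along the following lines may help catch a version/badge mismatch before CI does. This is only a sketch: the function names are hypothetical, it assumes `plugin.json` keeps `"version": "x.y.z"` on a single line, and it assumes the badge text contains the literal string `v<version>`. It is not the logic of the `test-invariants` check itself.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the first "version" value found in a JSON file.
# Assumes the key and value share one line; this is not a general JSON parser.
plugin_version() {
  sed -n 's/.*"version"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1" 2>/dev/null | head -n 1
}

# Succeed when the README ($2) contains v<version> from the plugin file ($1).
# Substring match only: v1.2.3 would also match inside v1.2.30.
versions_in_sync() {
  version="$(plugin_version "$1")"
  [ -n "$version" ] && grep -qF -- "v$version" "$2"
}

if versions_in_sync .claude-plugin/plugin.json README.md; then
  echo "version and badge agree"
else
  echo "version and badge differ (or a file is missing)"
fi
```

Run it from the repo root after bumping the version and before committing to `main`.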

## Credits

Built on two open-source projects from the [GitHub Next](https://githubnext.com) team:
2 changes: 1 addition & 1 deletion tests/README.md
@@ -48,7 +48,7 @@ Each run burns modest tokens against your Claude account.
| 1 | `test-discover-workflows.sh` | Skill loads; mentions `githubnext/agentics`; runtime fetch (no static catalog); fail-stop on upstream error | ~1min |
| 1 | `test-install-workflow.sh` | Skill loads; mentions `gh aw add` + `gh secret set`; documents both auth paths; understands the `--exclude-env` carve-out; hard rules (never writes YAML by hand, never stores tokens) | ~2min |
| 1 | `test-install-agent-team.sh` | Skill loads; pitches the four roles (spec/plan/impl/review); one-label dispatch via `agent-team`; atomic install (all-or-nothing); auth wired once; OAuth tweak applied to every lockfile; creates the `state:*` label set; inherits the no-hand-written-YAML / no-token-echo hard rules | ~2min |
| 3 | `test-e2e.sh` (opt-in) | Real pipeline run on `verkyyi/agent-team-playground`: opens a unique canned issue, labels `agent-team`, polls until terminal state, asserts PR exists with `Closes #N` + test-status section + reviewer verdict + pipeline-summary comment. Collects per-stage timings and compares to last run — yellow flag if any stage or total wall-clock exceeds 150% of baseline. **Not** run by default; opt in via `--tier3` (see below). | ~20-35min |
| 3 | `test-e2e.sh` (opt-in) | Real pipeline run on `verkyyi/agent-team-playground`: opens a unique canned issue, labels `agent-team`, polls until terminal state, asserts PR exists with `Closes #N` + test-status section + reviewer verdict + pipeline-summary comment. Collects per-stage timings and compares to last run — yellow flag if any stage or total wall-clock exceeds 150% of baseline. **Never** in the runner — invoke directly: `./tests/test-e2e.sh` (see below). | ~20-35min |
| 3-skill | `test-e2e-install-workflow.sh` (opt-in, destructive) | Skill E2E for `/install-workflow`: `gh repo create`s a throwaway private repo, pre-seeds the OAuth secret from SSM, invokes the skill via `claude -p --plugin-dir <repo>` with a target workflow (default `daily-repo-status`). Asserts the `.md` + `.lock.yml` are committed, frontmatter has `engine: claude` + `source: githubnext/agentics`, the two-pass OAuth tweak was correctly applied (API≥2, OAUTH≥5, `--exclude-env ANTHROPIC_API_KEY` preserved), and skill printed its completion marker. Single-workflow path is the plugin's core; test exercises Step-5's engine-fix-and-recompile flow since upstream agentics omits `engine:`. Deletes the repo on success (`--keep` to retain). | ~2-4min |
| 3-skill | `test-e2e-install-agent-team.sh` (opt-in, destructive) | Skill E2E for `/install-agent-team` (the multi-workflow installer). Same throwaway-repo pattern: asserts all four workflows committed + OAuth tweak on every lockfile + all seven labels created + skill printed its completion marker. | ~5-8min |
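
The OAuth-tweak thresholds quoted in the table (API≥2, OAUTH≥5, `--exclude-env ANTHROPIC_API_KEY` preserved) could be checked with something like the sketch below. It rests on assumptions: that "API" and "OAUTH" mean the number of lines mentioning `ANTHROPIC_API_KEY` and `CLAUDE_CODE_OAUTH_TOKEN`, and that counting matching lines (rather than individual occurrences) is what the real test does. The function name `check_oauth_tweak` is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical check of the tweak shape on one compiled lockfile ($1).
# Assumption: the thresholds refer to counts of matching lines (grep -c).
check_oauth_tweak() {
  lockfile="$1"
  [ -f "$lockfile" ] || return 1

  # grep -c prints 0 and exits non-zero on no match, hence the `|| true`.
  api="$(grep -cF -- 'ANTHROPIC_API_KEY' "$lockfile" || true)"
  oauth="$(grep -cF -- 'CLAUDE_CODE_OAUTH_TOKEN' "$lockfile" || true)"

  [ "$api" -ge 2 ] && [ "$oauth" -ge 5 ] &&
    grep -qF -- '--exclude-env ANTHROPIC_API_KEY' "$lockfile"
}
```

Call it with the path of a compiled lockfile, for example `check_oauth_tweak path/to/workflow.lock.yml && echo "tweak shape looks right"`, where the path is a placeholder.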
