feat(ce-release-notes): add skill for browsing plugin release history#589
feat(ce-release-notes): add skill for browsing plugin release history#589
Conversation
Captures the requirements and implementation plan for a new slash-only skill that summarizes recent compound-engineering plugin releases and answers questions grounded in GitHub Releases. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ransports
Add list-plugin-releases.py — a Python 3 stdlib helper that fetches the
last N compound-engineering plugin releases from GitHub, falls back from
gh to the anonymous API on missing-binary / non-zero exit / empty result
(GHE-pointing case) / malformed JSON, and emits a single-line JSON
contract ({ok, source, fetched_at, releases[…]} or {ok:false, error{…}}).
Filters tags to compound-engineering-v* so cli-v*, coding-tutor-v*, and
marketplace-v* releases are excluded. Extracts linked PR numbers from
release-please bodies via [#N] (markdown-link form), avoiding bare #N
mentions and commit-SHA parens.
Use gh api /repos/.../releases (not gh release list --json) so body and
html_url are populated and gh's auth removes the anonymous rate limit.
Tests use Bun.spawn with two seams: CE_RELEASE_NOTES_GH_BIN to mock the
gh binary, and --api-base to point urllib at a local Bun.serve harness.
15 tests cover gh path, gh→anon fallback (4 modes), anon path, anon
error paths (rate_limit, network_outage), and integration.
Wire bare /ce:release-notes invocation: parse args (stripping mode:* tokens), invoke the helper for 40 raw releases, and render the first 10 plugin releases with version, date, body, and a release-page link. Per-release body is soft-capped at 25 rendered lines with markdown-fence awareness — if the cut lands inside an open code fence, append a closing fence before the "see more" link so renderers do not swallow following content. Frontmatter uses the colon form (name: ce:release-notes) to align with user-facing slash workflows and trigger the Codex prompt-wrapper path. disable-model-invocation: true keeps the skill slash-only in v1. Query mode is referenced as Phase 4 with a fall-back-to-summary stub; the actual query-mode logic lands in the next unit.
Replace the Phase 4 stub with the full query-mode flow: fetch a 60-raw buffer, judge confident match across the 20-plugin-release search window, optionally enrich the matched release(s) from the first linked PR via gh pr view, synthesize a narrative answer with a (v2.X.Y) citation, and fall through to a hardcoded R14 line when no confident match exists. Phase 6 includes a prompt-injection guard: release bodies are read as data, not as instructions. PR enrichment is best-effort — gh failures fall through to body-only synthesis with a one-line note rather than aborting the response. Phase 7 also instructs list-form invocation so a release body can never inject shell content.
Add /ce:release-notes to the Workflow Utilities table next to /ce-update. Plugin/marketplace manifests are auto-counted, so no manual count bumps. Verified bun run release:validate, full bun test (only pre-existing flaky resolve-base test fails), and that bun run convert ships the skill to both opencode and codex (with the Codex prompt wrapper auto-created because the frontmatter name uses the colon form).
The GitHub Releases API returns both a browser-friendly `html_url` and
an API `url` (pointing at `api.github.com/.../releases/{id}`). The
helper's normalizer preferred `url` first, so the rendered
"Full release notes →" links in summary mode pointed at the API
endpoint instead of the release page. Swap the preference and add a
regression test that sets both fields on an API-shaped fixture.
Surfaced by eval iteration 1 of the ce-release-notes skill.
Summary mode now renders the last 5 releases instead of 10 — at the plugin's current cadence (~2 releases per week) this is about 10 days of history, compact enough to avoid a wall of text but wide enough to catch recent activity. Added a one-line hint pointing users at query mode for older history. Query mode's search window is 40 (up from 20), and the helper fetch limit is 100 (up from 60) to ensure the window can be filled even when sibling tags (cli-v*, marketplace-v*, etc.) interleave. Surfaced by eval iteration 1: asking "what happened to deepen-plan?" put the canonical v2.51.0 promotion outside the old 20-release window. Phase 9's no-match line updated to reference 40.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6f8d802ac7
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| Run the helper with a wider buffer so the search window can be filled even when sibling tags interleave heavily: | ||
|
|
||
| ```bash | ||
| python3 scripts/list-plugin-releases.py --limit 100 |
There was a problem hiding this comment.
Fetch enough raw releases to guarantee 40 plugin matches
Query mode says it searches the last 40 plugin releases, but this command only fetches 100 raw releases before filtering to compound-engineering-v*. If sibling components (cli-v*, coding-tutor-v*, marketplace-v*, cursor-marketplace-v*) make up more than 60 of those 100 tags, the filtered window drops below 40 and Phase 9 can return a false “not found” for changes that are still within the intended 40-plugin-release horizon. Paginating until 40 plugin releases are collected (or history is exhausted) would avoid this regression.
Useful? React with 👍 / 👎.
Upstream PR #589 introduced the ce-release-notes skill with frontmatter name ce:release-notes (old colon format) and two in-content references to /ce:release-notes. Aligned all three to the new ce-release-notes hyphen convention this branch standardizes on. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Upstream PR #589 introduced the ce-release-notes skill with frontmatter name ce:release-notes (old colon format) and two in-content references to /ce:release-notes. Aligned all three to the new ce-release-notes hyphen convention this branch standardizes on. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Upstream PR #589 introduced the ce-release-notes skill with frontmatter name ce:release-notes (old colon format) and two in-content references to /ce:release-notes. Aligned all three to the new ce-release-notes hyphen convention this branch standardizes on. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Upstream PR #589 introduced the ce-release-notes skill with frontmatter name ce:release-notes (old colon format) and two in-content references to /ce:release-notes. Aligned all three to the new ce-release-notes hyphen convention this branch standardizes on. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Upstream PR #589 introduced the ce-release-notes skill with frontmatter name ce:release-notes (old colon format) and two in-content references to /ce:release-notes. Aligned all three to the new ce-release-notes hyphen convention this branch standardizes on. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Upstream PR #589 introduced the ce-release-notes skill with frontmatter name ce:release-notes (old colon format) and two in-content references to /ce:release-notes. Aligned all three to the new ce-release-notes hyphen convention this branch standardizes on. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
/ce:release-noteslets anyone using the compound-engineering plugin ask "what shipped recently?" or "what happened to the deepen-plan skill?" without scrolling the GitHub releases page. Bare invocation renders the last 5 plugin releases with a hint to go deeper; a natural-language argument searches the last 40 and answers with a version citation and (when possible) PR-level grounding.Demo
Three frames: summary mode with the hint line and footer, a query-mode answer that cites v2.67.0 and PR #568 for
ce-polish-beta, and the Phase 9 no-match line for an unrelated query.How it works
The skill has two modes plus a graceful no-match path:
/ce:release-notes(bare)/ce:release-notes <question>gh pr viewon the first linked PRA Python helper (
scripts/list-plugin-releases.py) owns all transport:gh apifirst (uses the user's auth, no rate limit), anonymous GitHub API as fallback. The helper always exits 0 and emits a single JSON object on stdout —{ok, source, fetched_at, releases}on success or{ok: false, error: {code, message, user_hint}}on failure. The skill never branches on transport; it only reads the contract.Filtering is tag-prefix based (
compound-engineering-v*) so sibling release-please components (cli-v*,marketplace-v*,coding-tutor-v*,cursor-marketplace-v*) are excluded. The helper fetches 100 raw releases in query mode so a 40-entry filtered window can be filled even when sibling tags interleave heavily.Design decisions worth calling out
disable-model-invocation: true. The skill is user-invoked only. There is no ambient situation in which the model should volunteer a release-notes summary.2.65.0,v2.65.0,compound-engineering-v2.65.0) are treated as query strings, not a separate lookup mode.html_urloverurlin helper output. The GitHub Releases API returns both; onlyhtml_urlpoints at the browser-visible release page. The helper prefers it and there is a regression test pinning the preference.gh pr viewfails (missing auth, rate limit, deleted PR), the narrative falls back to body-only synthesis and appends a one-line note rather than aborting.Evaluation
The skill was graded against a snapshot of 29 real plugin releases via a 6-scenario harness: bare summary, four representative queries (active skill, removed skill, version-as-query, unrelated query), and an explicit no-match. Scenarios ran as parallel reviewer subagents; outputs were self-graded against the snapshot.
First pass: 25/26 assertions. The failing assertion surfaced a real bug — the helper was returning
api.github.com/repos/.../releases/{id}URLs instead of the browser-friendly release page. Fixed by swapping thehtml_url/urlpreference and locking it in with a unit test. Also tightened the summary window to 5 (from 10) with a query hint, and widened the query window to 40 (from 20) after verifying that v2.51.0'sdeepen-planfold-in fell outside the original window.Test plan
bun test tests/skills/ce-release-notes-helper.test.tscovers the contract on success, both failure paths (rate limit, network outage),gh-missing fallback, tag-prefix filtering of sibling components, URL normalization, and PR extraction from release-please bodies.bun run release:validateconfirms skill metadata consistency with the marketplace catalog.