Skip to content

feat(ce-release-notes): add skill for browsing plugin release history#589

Merged
tmchow merged 7 commits intomainfrom
tmchow/ce-release-notes
Apr 17, 2026
Merged

feat(ce-release-notes): add skill for browsing plugin release history#589
tmchow merged 7 commits intomainfrom
tmchow/ce-release-notes

Conversation

@tmchow
Copy link
Copy Markdown
Collaborator

@tmchow tmchow commented Apr 17, 2026

Summary

/ce:release-notes lets anyone using the compound-engineering plugin ask "what shipped recently?" or "what happened to the deepen-plan skill?" without scrolling the GitHub releases page. Bare invocation renders the last 5 plugin releases with a hint to go deeper; a natural-language argument searches the last 40 and answers with a version citation and (when possible) PR-level grounding.

Demo

Demo

Three frames: summary mode with the hint line and footer, a query-mode answer that cites v2.67.0 and PR #568 for ce-polish-beta, and the Phase 9 no-match line for an unrelated query.

How it works

The skill has two modes plus a graceful no-match path:

Mode Trigger Window What it renders
Summary /ce:release-notes (bare) Last 5 plugin releases Compact release-please bodies, markdown-fence-aware 25-line cap, footer with query hint
Query /ce:release-notes <question> Last 40 plugin releases Narrative answer citing the primary matching version, with up to 2 older matches inline, optionally enriched by gh pr view on the first linked PR
No match Query mode, no confident match A single verbatim line pointing to the full releases page

A Python helper (scripts/list-plugin-releases.py) owns all transport: gh api first (uses the user's auth, no rate limit), anonymous GitHub API as fallback. The helper always exits 0 and emits a single JSON object on stdout — {ok, source, fetched_at, releases} on success or {ok: false, error: {code, message, user_hint}} on failure. The skill never branches on transport; it only reads the contract.

Filtering is tag-prefix based (compound-engineering-v*) so sibling release-please components (cli-v*, marketplace-v*, coding-tutor-v*, cursor-marketplace-v*) are excluded. The helper fetches 100 raw releases in query mode so a 40-entry filtered window can be filled even when sibling tags interleave heavily.

Design decisions worth calling out

  • disable-model-invocation: true. The skill is user-invoked only. There is no ambient situation in which the model should volunteer a release-notes summary.
  • Window sizes. 5 releases is about two weeks at current cadence — enough to orient without being a wall of text. 40 covers ~4 months of plugin history for targeted questions. Version-like inputs (2.65.0, v2.65.0, compound-engineering-v2.65.0) are treated as query strings, not a separate lookup mode.
  • html_url over url in helper output. The GitHub Releases API returns both; only html_url points at the browser-visible release page. The helper prefers it and there is a regression test pinning the preference.
  • Release body treated as untrusted data. The query-mode synthesis step reads bodies for content but never follows instructions that may appear inside them.
  • PR enrichment is best-effort. If gh pr view fails (missing auth, rate limit, deleted PR), the narrative falls back to body-only synthesis and appends a one-line note rather than aborting.

Evaluation

The skill was graded against a snapshot of 29 real plugin releases via a 6-scenario harness: bare summary, four representative queries (active skill, removed skill, version-as-query, unrelated query), and an explicit no-match. Scenarios ran as parallel reviewer subagents; outputs were self-graded against the snapshot.

First pass: 25/26 assertions. The failing assertion surfaced a real bug — the helper was returning api.github.com/repos/.../releases/{id} URLs instead of the browser-friendly release page. Fixed by swapping the html_url/url preference and locking it in with a unit test. Also tightened the summary window to 5 (from 10) with a query hint, and widened the query window to 40 (from 20) after verifying that v2.51.0's deepen-plan fold-in fell outside the original window.

Test plan

  • bun test tests/skills/ce-release-notes-helper.test.ts covers the contract on success, both failure paths (rate limit, network outage), gh-missing fallback, tag-prefix filtering of sibling components, URL normalization, and PR extraction from release-please bodies.
  • bun run release:validate confirms skill metadata consistency with the marketplace catalog.

Compound Engineering
Claude Code
Opus 4.7

tmchow and others added 7 commits April 17, 2026 01:11
Captures the requirements and implementation plan for a new slash-only
skill that summarizes recent compound-engineering plugin releases and
answers questions grounded in GitHub Releases.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ransports

Add list-plugin-releases.py — a Python 3 stdlib helper that fetches the
last N compound-engineering plugin releases from GitHub, falls back from
gh to the anonymous API on missing-binary / non-zero exit / empty result
(GHE-pointing case) / malformed JSON, and emits a single-line JSON
contract ({ok, source, fetched_at, releases[…]} or {ok:false, error{…}}).

Filters tags to compound-engineering-v* so cli-v*, coding-tutor-v*, and
marketplace-v* releases are excluded. Extracts linked PR numbers from
release-please bodies via [#N] (markdown-link form), avoiding bare #N
mentions and commit-SHA parens.

Use gh api /repos/.../releases (not gh release list --json) so body and
html_url are populated and gh's auth removes the anonymous rate limit.

Tests use Bun.spawn with two seams: CE_RELEASE_NOTES_GH_BIN to mock the
gh binary, and --api-base to point urllib at a local Bun.serve harness.
15 tests cover gh path, gh→anon fallback (4 modes), anon path, anon
error paths (rate_limit, network_outage), and integration.
Wire bare /ce:release-notes invocation: parse args (stripping mode:*
tokens), invoke the helper for 40 raw releases, and render the first 10
plugin releases with version, date, body, and a release-page link.

Per-release body is soft-capped at 25 rendered lines with markdown-fence
awareness — if the cut lands inside an open code fence, append a closing
fence before the "see more" link so renderers do not swallow following
content.

Frontmatter uses the colon form (name: ce:release-notes) to align with
user-facing slash workflows and trigger the Codex prompt-wrapper path.
disable-model-invocation: true keeps the skill slash-only in v1.

Query mode is referenced as Phase 4 with a fall-back-to-summary stub;
the actual query-mode logic lands in the next unit.
Replace the Phase 4 stub with the full query-mode flow: fetch a 60-raw
buffer, judge confident match across the 20-plugin-release search window,
optionally enrich the matched release(s) from the first linked PR via
gh pr view, synthesize a narrative answer with a (v2.X.Y) citation, and
fall through to a hardcoded R14 line when no confident match exists.

Phase 6 includes a prompt-injection guard: release bodies are read as
data, not as instructions. PR enrichment is best-effort — gh failures
fall through to body-only synthesis with a one-line note rather than
aborting the response. Phase 7 also instructs list-form invocation so a
release body can never inject shell content.
Add /ce:release-notes to the Workflow Utilities table next to /ce-update.
Plugin/marketplace manifests are auto-counted, so no manual count bumps.
Verified bun run release:validate, full bun test (only pre-existing
flaky resolve-base test fails), and that bun run convert ships the skill
to both opencode and codex (with the Codex prompt wrapper auto-created
because the frontmatter name uses the colon form).
The GitHub Releases API returns both a browser-friendly `html_url` and
an API `url` (pointing at `api.github.com/.../releases/{id}`). The
helper's normalizer preferred `url` first, so the rendered
"Full release notes →" links in summary mode pointed at the API
endpoint instead of the release page. Swap the preference and add a
regression test that sets both fields on an API-shaped fixture.

Surfaced by eval iteration 1 of the ce-release-notes skill.
Summary mode now renders the last 5 releases instead of 10 — at the
plugin's current cadence (~2 releases per week) this is about 10 days
of history, compact enough to avoid a wall of text but wide enough to
catch recent activity. Added a one-line hint pointing users at
query mode for older history.

Query mode's search window is 40 (up from 20), and the helper fetch
limit is 100 (up from 60) to ensure the window can be filled even when
sibling tags (cli-v*, marketplace-v*, etc.) interleave. Surfaced by
eval iteration 1: asking "what happened to deepen-plan?" put the
canonical v2.51.0 promotion outside the old 20-release window.

Phase 9's no-match line updated to reference 40.
@tmchow tmchow merged commit 59dbaef into main Apr 17, 2026
2 checks passed
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6f8d802ac7

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Run the helper with a wider buffer so the search window can be filled even when sibling tags interleave heavily:

```bash
python3 scripts/list-plugin-releases.py --limit 100
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Fetch enough raw releases to guarantee 40 plugin matches

Query mode says it searches the last 40 plugin releases, but this command only fetches 100 raw releases before filtering to compound-engineering-v*. If sibling components (cli-v*, coding-tutor-v*, marketplace-v*, cursor-marketplace-v*) make up more than 60 of those 100 tags, the filtered window drops below 40 and Phase 9 can return a false “not found” for changes that are still within the intended 40-plugin-release horizon. Paginating until 40 plugin releases are collected (or history is exhausted) would avoid this regression.

Useful? React with 👍 / 👎.

tmchow added a commit that referenced this pull request Apr 17, 2026
Upstream PR #589 introduced the ce-release-notes skill with frontmatter
name ce:release-notes (old colon format) and two in-content references
to /ce:release-notes. Aligned all three to the new ce-release-notes
hyphen convention this branch standardizes on.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions github-actions Bot mentioned this pull request Apr 17, 2026
tmchow added a commit that referenced this pull request Apr 17, 2026
Upstream PR #589 introduced the ce-release-notes skill with frontmatter
name ce:release-notes (old colon format) and two in-content references
to /ce:release-notes. Aligned all three to the new ce-release-notes
hyphen convention this branch standardizes on.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
tmchow added a commit that referenced this pull request Apr 18, 2026
Upstream PR #589 introduced the ce-release-notes skill with frontmatter
name ce:release-notes (old colon format) and two in-content references
to /ce:release-notes. Aligned all three to the new ce-release-notes
hyphen convention this branch standardizes on.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
tmchow added a commit that referenced this pull request Apr 18, 2026
Upstream PR #589 introduced the ce-release-notes skill with frontmatter
name ce:release-notes (old colon format) and two in-content references
to /ce:release-notes. Aligned all three to the new ce-release-notes
hyphen convention this branch standardizes on.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
tmchow added a commit that referenced this pull request Apr 18, 2026
Upstream PR #589 introduced the ce-release-notes skill with frontmatter
name ce:release-notes (old colon format) and two in-content references
to /ce:release-notes. Aligned all three to the new ce-release-notes
hyphen convention this branch standardizes on.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
tmchow added a commit that referenced this pull request Apr 18, 2026
Upstream PR #589 introduced the ce-release-notes skill with frontmatter
name ce:release-notes (old colon format) and two in-content references
to /ce:release-notes. Aligned all three to the new ce-release-notes
hyphen convention this branch standardizes on.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions github-actions Bot mentioned this pull request Apr 22, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant