CI: Check versioned release notes exist before releasing #1907

Open
cpcloud wants to merge 3 commits into NVIDIA:main from cpcloud:issue-1326

@cpcloud (Contributor) commented Apr 14, 2026

Summary

  • Adds a check-release-notes job to the release workflow that verifies the versioned release-notes file (e.g. 13.1.0-notes.rst) exists and is non-empty for each package being released
  • Blocks doc, upload-archive, and publish-testpypi jobs via needs: gates so releases cannot proceed with missing notes
  • .postN tags are silently skipped (no notes file expected)
  • Helper script at toolshed/check_release_notes.py with 20 pytest tests
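
For readers who want a concrete picture, the check described above could look roughly like the sketch below. It is not the contents of toolshed/check_release_notes.py: the function names, the tag regex, the component mapping, and the assumption that every package keeps its notes under docs/source/release/ are all illustrative.

```python
import re
from pathlib import Path

# Hypothetical mapping from release component to package directories.
COMPONENT_DIRS = {
    "cuda-bindings": ["cuda_bindings"],
    "cuda-python": ["cuda_python"],
    "cuda-core": ["cuda_core"],
    "cuda-pathfinder": ["cuda_pathfinder"],
}

# Assumed tag shapes: v13.1.0, cuda-core-v0.7.0, cuda-pathfinder-v1.5.3.
TAG_RE = re.compile(
    r"^(?:cuda-core-|cuda-pathfinder-)?v(?P<version>\d+\.\d+\.\d+\S*)$"
)


def parse_version(git_tag):
    """Return the version string embedded in a release tag, or None."""
    match = TAG_RE.match(git_tag)
    return match.group("version") if match else None


def missing_notes(git_tag, component, repo_root):
    """List the expected notes files that are absent or empty."""
    version = parse_version(git_tag)
    if version is None:
        raise ValueError(f"unrecognized tag: {git_tag!r}")
    if re.search(r"\.post\d+$", version):
        return []  # post-releases are skipped: no notes file is expected
    problems = []
    for pkg_dir in COMPONENT_DIRS[component]:
        notes = (
            Path(repo_root) / pkg_dir / "docs" / "source" / "release"
            / f"{version}-notes.rst"
        )
        if not notes.is_file() or notes.stat().st_size == 0:
            problems.append(notes)
    return problems
```

A CLI wrapper would exit non-zero whenever missing_notes returns a non-empty list, which is what lets the needs: gates block the downstream jobs.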

Test plan

  • 20/20 pytest tests pass locally (tag parsing, component mapping, missing/empty/post detection, CLI exit codes)
  • Verify check-release-notes job runs in CI on a test release dispatch
  • Confirm .postN tags skip without failure

Closes #1326

🤖 Generated with Claude Code

@cpcloud cpcloud added this to the cuda.core v1.0.0 milestone Apr 14, 2026
@cpcloud cpcloud added P0 High priority - Must do! CI/CD CI/CD infrastructure labels Apr 14, 2026
@cpcloud cpcloud self-assigned this Apr 14, 2026
cpcloud and others added 3 commits April 14, 2026 18:15
Add a check-release-notes job to the release workflow that verifies
the versioned release-notes file (e.g. 13.1.0-notes.rst) exists and
is non-empty for each package being released. The job blocks doc,
upload-archive, and publish-testpypi via needs: gates.

Helper script at toolshed/check_release_notes.py parses the git tag,
maps component to package directories, and checks file presence.
Post-release tags (.postN) are silently skipped. Tests cover tag
parsing, component mapping, missing/empty detection, and the CLI.

Refs NVIDIA#1326

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Ensures the release-notes check validates the tagged tree, not the
default branch HEAD. Without this, manually triggered runs could
validate the wrong commit.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

@rwgk (Collaborator) left a comment

I started my review by asking Cursor to look only at the purely technical aspects. For completeness, I'm pasting the response below (no need to read it yet). The main thing it surfaced is what looks like a pre-existing workflow design issue: in the current release.yml on main, component=all paired with a single git-tag is not really usable. I'll post a separate comment with a more detailed analysis of that point.

I think it would be most useful to decide first, at a high level, what we want to do about component=all, before revisiting the lower-level details.

I played around with some possible UI/input shapes in PR #1947 (see screenshots in the PR description). I'm not sure whether you want to make that kind of change in this PR. A simpler alternative that would seem fine to me would be to remove the all option for now, in this PR, along with the corresponding all handling in the new check_release_notes.py script. If we go that route, I think that would leave only the much simpler tag-parsing finding to worry about.

Separate high-level question: is toolshed/ the intended home for this script, or would ci/tools/ be a better fit? I could imagine either, but I expected ci/tools/ first.

  Findings
  • Medium: toolshed/check_release_notes.py:27 makes component=all mean “all four packages must have release notes for the same tag version”, and toolshed/tests/test_check_release_notes.py:77 explicitly enshrines that with v13.1.0. That does not match this repo’s actual versioning split: the issue being fixed is about the shared v13.x line for cuda-bindings/cuda-python, while cuda-core and cuda-pathfinder release under their own tag families and note files such as cuda_core/docs/source/release/0.7.0-notes.rst and cuda_pathfinder/docs/source/release/1.5.3-notes.rst. I verified locally that toolshed/check_release_notes.py --git-tag v13.1.0 --component all now fails by demanding nonexistent cuda_core/docs/source/release/13.1.0-notes.rst and cuda_pathfinder/docs/source/release/13.1.0-notes.rst, so if all is still meant to support repo-level releases, this gate is incorrect.
  • Low: toolshed/check_release_notes.py:31 introduces tag parsing that disagrees with the parser already used later in the same release flow at ci/tools/validate-release-wheels:25. For example, a historical tag like cuda-core-v0.1.1rc1 is interpreted here as version 0.1.1rc1, while wheel validation interprets the same tag as 0.1.1; .post1 tags are skipped here but still normalized later. That gives one workflow two different definitions of the release version, which is a maintenance trap and a likely source of false failures if rc/post tags are ever reused.
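
To make the Low finding concrete, here is a small sketch of how two parsers can disagree on the same tag. Both functions and their regexes are hypothetical stand-ins; they only mimic the behavior described above and are not copied from toolshed/check_release_notes.py or ci/tools/validate-release-wheels.

```python
import re


def version_for_notes(git_tag):
    """Stand-in for the notes check: keeps any suffix after X.Y.Z."""
    match = re.search(r"v(\d+\.\d+\.\d+.*)$", git_tag)
    return match.group(1) if match else None


def version_for_wheels(git_tag):
    """Stand-in for wheel validation: keeps only the X.Y.Z core."""
    match = re.search(r"v(\d+\.\d+\.\d+)", git_tag)
    return match.group(1) if match else None


tag = "cuda-core-v0.1.1rc1"
# The two stand-ins return different versions for the same tag.
print(version_for_notes(tag), version_for_wheels(tag))
```

Routing both steps through one shared parsing helper would remove this class of mismatch, whichever interpretation is chosen.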

@rwgk (Collaborator) commented Apr 17, 2026

Generated with Cursor GPT-5.4 Extra High Fast


Analysis: Can all in the current release.yml workflow on main actually work?

Assume a single commit is tagged with all three release-family tags:

  • v13.3.0
  • cuda-core-v0.8.0
  • cuda-pathfinder-v1.6.0

Now assume we manually run the current release.yml workflow on main, choose component=all, and supply any one of those tags as git-tag.

The concrete question is:

Would that produce a complete and valid set of releases for all four package families?

  • cuda-bindings == 13.3.0
  • cuda-python == 13.3.0
  • cuda-core == 0.8.0
  • cuda-pathfinder == 1.6.0

Why This Is Subtle

At first glance, all plus a single tag looks nonsensical because the repo no longer has a single version namespace. But there is a real reason to hesitate before concluding that: a single tag-triggered CI run on a multi-tagged commit can still build all four wheel families.

The reason is that CI and release are using tags differently.

  • /.github/workflows/ci.yml triggers on all three tag families: v*, cuda-core-v*, and cuda-pathfinder-v*.
  • On non-PR events, ci.yml does not do path-based narrowing; it runs the full pipeline.
  • /.github/workflows/build-wheel.yml builds all four package families in that run:
    • cuda.pathfinder
    • cuda.bindings
    • cuda.core
    • cuda-python

So a CI run triggered by a tag push is not scoped to just one package family.

Why The Wheels Can Still Be Correctly Versioned

The packages do not all derive their version from the same tag family.

  • cuda_bindings/pyproject.toml uses setuptools-scm with tag_regex = ^(?P<version>v...) and git describe --match v*.
  • cuda_python/setup.py does the same shared-line lookup for v*.
  • cuda_core/pyproject.toml uses only cuda-core-v*.
  • cuda_pathfinder/pyproject.toml uses only cuda-pathfinder-v*.

That means version resolution is family-specific, not "whatever tag triggered the workflow."

So if one commit really has all three tags, then the packages can resolve like this:

  • cuda-bindings -> 13.3.0
  • cuda-python -> 13.3.0
  • cuda-core -> 0.8.0
  • cuda-pathfinder -> 1.6.0

This is not just theoretical. The repo already has historical commits that carry both a shared v... tag and a family-specific tag on the same commit, for example:

  • v13.0.2 together with cuda-core-v0.4.0
  • v13.2.0 together with cuda-pathfinder-v1.4.2

And git describe --match ... resolves the expected tag family separately on those commits.
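
The family-specific resolution can be sketched in a few lines. This is a deliberate simplification: it only looks at the tags sitting on one commit, whereas git describe --match also walks history to the nearest reachable tag, and setuptools-scm applies further normalization. The FAMILY_PATTERNS table and the resolve_versions name are assumptions made for illustration.

```python
from fnmatch import fnmatchcase

# Assumed glob per package family, mirroring the --match patterns above.
FAMILY_PATTERNS = {
    "cuda-bindings": "v*",
    "cuda-python": "v*",
    "cuda-core": "cuda-core-v*",
    "cuda-pathfinder": "cuda-pathfinder-v*",
}


def resolve_versions(tags_on_commit):
    """Pick, per package, the version from the tag in that package's family."""
    resolved = {}
    for package, pattern in FAMILY_PATTERNS.items():
        prefix = pattern[:-1]  # the glob without its trailing "*"
        matches = [t for t in tags_on_commit if fnmatchcase(t, pattern)]
        resolved[package] = matches[0][len(prefix):] if matches else None
    return resolved


print(resolve_versions(["v13.3.0", "cuda-core-v0.8.0", "cuda-pathfinder-v1.6.0"]))
```

Because each package filters by its own pattern, the tag that happened to trigger the workflow plays no role in the result.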

So the answer to the narrow build question is:

Yes, a single CI run on a multi-tagged commit can plausibly build all four wheel families with their own correct versions.

Where It Breaks

The current release workflow is not built around "one commit with several independently meaningful tags." It is built around "one input tag defines one release version."

That assumption shows up in several places.

1. The release workflow selects a CI run by the exact tag

ci/tools/lookup-run-id resolves the input tag to a commit SHA, then filters GitHub Actions runs for the successful push run whose headBranch equals that exact tag.

So the release workflow is not saying "give me a successful CI run for this commit." It is saying "give me the successful CI run for this exact tag ref."
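
A sketch of that selection logic, operating on run records shaped like the JSON that gh run list can emit, shows why the choice of tag matters. The select_run function is hypothetical and not the actual ci/tools/lookup-run-id script; apart from headBranch, the field names are assumptions.

```python
def select_run(runs, tag, sha):
    """Return the id of the successful push run for this exact tag ref, if any."""
    for run in runs:
        if (
            run.get("headSha") == sha
            and run.get("headBranch") == tag  # exact tag ref, not just the commit
            and run.get("event") == "push"
            and run.get("conclusion") == "success"
        ):
            return run["databaseId"]
    return None


# Two tag pushes on the same commit produce two distinct runs.
runs = [
    {"databaseId": 101, "headSha": "abc123", "headBranch": "v13.3.0",
     "event": "push", "conclusion": "success"},
    {"databaseId": 102, "headSha": "abc123", "headBranch": "cuda-core-v0.8.0",
     "event": "push", "conclusion": "success"},
]
print(select_run(runs, "cuda-core-v0.8.0", "abc123"))
```

With this kind of filter, a tag whose own push run is missing or failed yields nothing, even if another tag's run on the same commit succeeded.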

2. The release workflow still creates exactly one GitHub Release

/.github/workflows/release.yml and /.github/workflows/release-upload.yml use inputs.git-tag as the release identifier and upload target.

So even before thinking about wheel validation, the workflow still has only one GitHub Release object in mind:

  • either the release for v13.3.0
  • or the release for cuda-core-v0.8.0
  • or the release for cuda-pathfinder-v1.6.0

It has no notion of "release all four families under their own tags."

3. component=all enforces one version across all downloaded wheels

This is the decisive point.

ci/tools/download-wheels with component=all downloads all wheel artifacts from the selected CI run.

Then ci/tools/validate-release-wheels parses exactly one expected version from inputs.git-tag and applies that expected version to all distributions in the all set:

  • cuda_core
  • cuda_bindings
  • cuda_pathfinder
  • cuda_python

That means:

  • if git-tag=v13.3.0, then validation expects all four distributions to be version 13.3.0
  • if git-tag=cuda-core-v0.8.0, then validation expects all four distributions to be version 0.8.0
  • if git-tag=cuda-pathfinder-v1.6.0, then validation expects all four distributions to be version 1.6.0

Under the example above, each of those cases fails:

  • git-tag=v13.3.0 fails because cuda-core and cuda-pathfinder are not 13.3.0
  • git-tag=cuda-core-v0.8.0 fails because cuda-bindings and cuda-python are not 0.8.0
  • git-tag=cuda-pathfinder-v1.6.0 fails because cuda-bindings, cuda-python, and cuda-core are not 1.6.0

So even though the CI run may have built the right wheels, the release workflow cannot interpret them correctly when component=all.
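
The failure mode can be reproduced with a toy version of the validation step. The mismatched function is a hypothetical stand-in for ci/tools/validate-release-wheels, reduced to the one rule described above: a single expected version is applied to every distribution.

```python
def mismatched(expected_version, wheel_versions):
    """Return the distributions whose version differs from the expected one."""
    return sorted(
        name for name, version in wheel_versions.items()
        if version != expected_version
    )


# Versions the multi-tagged commit in the example would build.
BUILT = {
    "cuda_bindings": "13.3.0",
    "cuda_python": "13.3.0",
    "cuda_core": "0.8.0",
    "cuda_pathfinder": "1.6.0",
}

for expected in ("13.3.0", "0.8.0", "1.6.0"):
    print(expected, "->", mismatched(expected, BUILT))
```

No single expected version leaves the mismatch list empty, which is why every choice of git-tag fails under component=all.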

Bottom Line

There are really two different questions here.

  1. Can one multi-tagged commit support a CI run that builds all four package families with correct versions?

Yes, plausibly.

  2. Does the current release.yml workflow make sense with component=all and a single git-tag?

No, not really.

In the current workflow, all does not mean:

release all independently versioned package families from one commit

Instead, it behaves more like:

release all downloaded artifacts as though they belong to one version namespace derived from one tag

That assumption no longer matches the structure of this repo.

So the statement

"all paired with just one git tag does not make sense"

is reasonable for the current workflow on main.

Final Caveat

Even the more optimistic part, "a single tag-triggered CI run can build all four correctly," is somewhat operationally fragile.

If the CI run triggered by, say, v13.3.0 starts before cuda-core-v0.8.0 and cuda-pathfinder-v1.6.0 are present on the remote, then the family-specific setuptools-scm lookups may not see those tags yet. In that situation, cuda-core and cuda-pathfinder may resolve against older family tags and produce dev/local versions instead of the intended release versions.

So all is not just semantically awkward in the current design; it is also timing-sensitive and therefore fragile in practice.

Development

Successfully merging this pull request may close these issues.

CI: The release workflow should check if the versioned release note is missing

2 participants