fix(ci): deduplicate conformance coverage in GPU CI #577
Merged
yuanchen8911 merged 1 commit into NVIDIA:main on Apr 15, 2026
Follow-up proposal after this narrow PR: the remaining GPU CI shape should be made symmetric, so that training and inference both run the same core conformance coverage and differ only in platform-specific controller/gateway coverage and the inference smoke-test tail. Proposed follow-up changes:
Target steady state:
That would remove the remaining accidental drift:
This was referenced Apr 15, 2026
dims approved these changes Apr 15, 2026
Summary
Deduplicate conformance coverage in GPU CI by removing the standalone H100 conformance workflow and preserving the unique trigger coverage it carried for the remaining GPU training and inference workflows.
Motivation / Context
The standalone H100x2 conformance workflow duplicates the conformance coverage already exercised by the training and inference workflows. Deleting it reduces redundant GPU CI usage, while consolidating the trigger coverage that would otherwise be lost into the two surviving workflows.
This PR is intentionally narrow in runtime behavior. It removes duplicate conformance execution without changing the training or inference workflow steps, and it leaves broader GPU CI symmetry work to a follow-up.
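As a rough illustration of what preserving trigger coverage means here, a path filter on one of the surviving workflows might look like the sketch below. This is not the actual workflow file: the workflow name, the `pull_request` trigger, and the workflow's own path entry are assumptions made for the example. Only the `.github/actions/setup-build-tools/**` glob comes from this PR.

```yaml
# Hypothetical excerpt of a surviving GPU workflow (name and trigger are assumed).
name: gpu-training-example
on:
  pull_request:
    paths:
      # Assumed entry: the workflow's own definition file.
      - ".github/workflows/gpu-training-example.yaml"
      # From this PR: previously only the deleted H100 conformance workflow
      # watched this path, so it is carried over to keep the trigger.
      - ".github/actions/setup-build-tools/**"
```

With a filter like this, a change that touches only the shared build-tools action still starts a GPU run after the standalone workflow is deleted.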
Fixes: #554
Issue: #554
Related: #541
Type of Change
Component(s) Affected
- (`cmd/aicr`, `pkg/cli`)
- (`cmd/aicrd`, `pkg/api`, `pkg/server`)
- (`pkg/recipe`)
- (`pkg/bundler`, `pkg/component/*`)
- (`pkg/collector`, `pkg/snapshotter`)
- (`pkg/validator`)
- (`pkg/errors`, `pkg/k8s`)
- (`docs/`, `examples/`)
Implementation Notes
- Deleted `.github/workflows/gpu-h100-conformance-test.yaml`.
- Added `.github/actions/setup-build-tools/**` to both remaining GPU workflow path filters because the deleted workflow was the only one carrying that trigger coverage.
- Added trigger coverage for the `tests/chainsaw/ai-conformance` helper and imported assert files that the training workflow executes indirectly via `kind-training/chainsaw-test.yaml`, so deleting the standalone workflow does not create a trigger gap for training.
Testing
# Commands run (prefer `make qualify` for non-trivial changes)
make qualify

- `make test` ✅
- `make lint` ✅
- `make qualify` ❌ at `make scan`

`make scan` currently reports dependency vulnerabilities in the local environment:
- `go1.26.1`: CVEs fixed in `1.26.2` / `1.25.9`
- `pygments 2.19.2`: fixed in `2.20.0`
Risk Assessment
Rollout notes: N/A
Checklist
- (`make test` with `-race`)
- (`make lint`)
- (`git commit -S`), GPG signing info