fix(ci): wire enable_coverage into pytest-local pytest invocation#9559
Open
nv-tusharma wants to merge 6 commits into
Open
fix(ci): wire enable_coverage into pytest-local pytest invocation#9559nv-tusharma wants to merge 6 commits into
nv-tusharma wants to merge 6 commits into
Conversation
The pytest-local composite action declared an `enable_coverage` input and
gated its upload-artifact step on it, but never threaded the corresponding
`--cov` flags into the pytest command. As a result, nightly-ci jobs that
opted into coverage uploaded artifacts containing only JUnit XML -- no
`.coverage*` data files -- and the downstream `coverage-report` job's
`coverage combine` step had nothing to merge, producing 0% coverage.
Mirror the pattern from `.github/actions/pytest/action.yml`:
- Build COVERAGE_FLAGS with the two `--cov` targets and an empty
`--cov-report=` so pytest-cov writes only the binary `.coverage` data
file (the merge job generates the human-readable reports).
- Export COVERAGE_FILE under TEST_RESULTS_DIR so the data file and any
per-worker `.coverage.*` shards (from `parallel=True` in .coveragerc)
land inside the uploaded artifact directory.
- Extend PYTHONPATH with the component and bindings src roots so
coverage's source resolution lines up with `.coveragerc`'s `[paths]`
mapping.
- Inject \${COVERAGE_FLAGS} into PYTEST_CMD before the \`-m\` selector,
matching the order used by the older action.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Contributor
WalkthroughThe ChangesConditional Coverage Collection
Estimated code review effort🎯 1 (Trivial) | ⏱️ ~3 minutes 🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
dynamo-ops
reviewed
May 14, 2026
jh-nv
approved these changes
May 14, 2026
When `max_vram_gib` is set, tests/conftest.py hands off to the VRAM-aware orchestrator at tests/utils/pytest_parallel_gpu.py, which spawns each profiled test as a separate pytest subprocess. The child argv is built from a hardcoded whitelist in conftest.py and did not include `--cov*`, so child sessions ran without pytest-cov loaded and emitted no coverage data. Result: in nightly vllm-test and sglang-test single-GPU jobs (run_gpu_parallel_tests=true), profiled tests still contributed 0% after the prior pytest-local action wiring fix. Two contained changes: - tests/conftest.py: forward --cov and --cov-report values to children when the _cov plugin is loaded in the parent. - tests/utils/pytest_parallel_gpu.py: give each child a unique COVERAGE_FILE (suffixed with worker id + safe test name) so its session-end combine doesn't clobber siblings. pytest-cov sets data_suffix=True internally and combines per-process shards into the unsuffixed path at session end; with a shared path, the last child wins. Unique paths let every child's combined data survive, and the existing job-level merge in nightly-ci.yml's coverage-report job picks them all up alongside the parent's data. Validated by reproducing the orchestrator path inside the CI test image with the patched worktree mounted: forwarded EXTRA_PYTEST_ARGS contains both `--cov=...` entries plus `--cov-report=`, and the per-child COVERAGE_FILE block is intact. Per review by dynamo-ops and dmitry-tokarev-nv on #9559. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…erge)
Flips enable_coverage:true on the PR pipeline's vllm-test job so we can
observe the patched pytest-local action + orchestrator forwarding
end-to-end against real GPU runners. vllm-test already runs with
run_gpu_parallel_tests=true and gpu_parallel_max_vram_gib=24, so it
exercises both the sequential pytest path (action wiring fix) and the
VRAM-aware orchestrator path (conftest + pytest_parallel_gpu fixes).
After CI completes:
- Open the coverage-python-vllm-* artifacts from the workflow run.
- Confirm each contains .coverage* files (parent and per-child shards).
- Confirm the parallel-GPU stage produces .coverage.w{N}.{test_name}*
shards (proof the orchestrator forwarding works).
Revert this commit before merging.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…before merge)" This reverts commit 29dd849.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Overview:
Fix Python coverage being reported as 0% in nightly CI. Two related gaps caused the same symptom:
pytest-localcomposite action declared anenable_coverageinput and gated its upload-artifact step on it, but never threaded--covflags into the pytest command. Nightly jobs uploaded artifacts containing only JUnit XML, so the downstreamcoverage-reportjob'scoverage combinehad nothing to merge and printed 0%.max_vram_gibis set (vllm/sglang single-GPU jobs withrun_gpu_parallel_tests=true),tests/conftest.pyhands off to the VRAM-aware orchestrator intests/utils/pytest_parallel_gpu.py. Even after fixing Update README.md #1, profiled GPU tests still ran without coverage because the orchestrator builds child pytest argv from a hardcoded whitelist that did not include--cov*.Details:
.github/actions/pytest-local/action.ymlIn the normal-mode
elsebranch, wheninputs.enable_coverage == 'true':COVERAGE_FLAGSwith--cov=components/src/dynamo --cov=lib/bindings/python/src/dynamo --cov-report=. The empty--cov-report=tells pytest-cov to write only the binary.coveragedata file — the nightlycoverage-reportjob generates the human-readable reports aftercoverage combine.COVERAGE_FILE="${TEST_RESULTS_DIR}/.coverage"so the data file (and any per-worker.coverage.*shards produced byparallel=Truein.coveragerc) lands in the uploaded artifact directory.PYTHONPATHwith${CONTAINER_WORKSPACE}/components/src:${CONTAINER_WORKSPACE}/lib/bindings/python/srcso coverage's source resolution matches the[paths]mapping in.coveragerc.${COVERAGE_FLAGS}intoPYTEST_CMDjust before the-mmarker selector.This mirrors the pre-existing pattern in
.github/actions/pytest/action.yml(lines 138–145).tests/conftest.pyIn the orchestrator's child-argv builder (
pytest_runtestloop), forward--cov/--cov-reportvalues when the_covplugin is loaded in the parent. Without this, child sessions ran without pytest-cov loaded and contributed no.coveragedata to the nightly merge.tests/utils/pytest_parallel_gpu.pyGive each child a unique
COVERAGE_FILE(suffixed with worker id + safe test name) so its session-end combine doesn't clobber siblings. pytest-cov writes per-process suffixed shards during the run and combines them into the unsuffixed path at session end; with a shared path, the last child wins. Unique paths let every child's combined data survive, and the parent's own session-endcoverage combinesweeps the per-child files into the merged.coverage(which the nightlycoverage-reportjob then merges across all jobs).What's intentionally not touched
.coveragerc,pyproject.toml, and thecoverage-reportjob innightly-ci.ymlare all correct as-is — they just had nothing to merge before this fix. The dind-as-sidecar branch in.github/actions/pytest/action.ymlhas the same latent gap aspytest-local, butshared-test.ymldoesn't use it, so this PR is kept tightly scoped.Where should the reviewer start?
.github/actions/pytest-local/action.ymllines 217–226: the coverage block and${COVERAGE_FLAGS}interpolation inPYTEST_CMD.tests/conftest.pyaround line 266: the new pytest-cov forwarding block in the orchestrator argv builder.tests/utils/pytest_parallel_gpu.pyaround line 532: the per-childCOVERAGE_FILEsuffix.Test plan:
Validated locally inside the actual CI test image (
-vllm-runtime-cuda12-test):PYTEST_CMDagainsttest_vllm_unit.py..coveragedata file written (77 KiB),coverage reportproduced real per-file numbers withTOTAL = 25 %.run_paralleland capturedEXTRA_PYTEST_ARGS— confirmed--cov=components/src/dynamo,--cov=lib/bindings/python/src/dynamo, and--cov-report=are forwarded to children, plus per-childCOVERAGE_FILEblock is intact.Validated end-to-end on this PR's pre-merge CI (via a temporary
enable_coverage: trueflip onvllm-test, now reverted):.coverageartifact contains real data — 201 files, 4727 bytes of bitmap,coverage reportshowsTOTAL = 37 %(17120 statements, 10752 missed)..coveragefile via the same wiring..coveragefile. The merged data is roughly equal to the parent's collection-time imports — see "Known limitation" below.Known limitation (follow-up needed for full e2e coverage):
The orchestrator forwarding fix is mechanically correct but contributes little measurable coverage for the current test mix because the GPU-parallel pool is entirely e2e tests (
test_serve_deployment[*],test_router_e2e_*). These tests spawn dynamo binaries as separate subprocesses and black-box them over HTTP — the dynamo source executes inside the spawned binary, which pytest-cov in the test process cannot trace.This was confirmed empirically on this PR: the sequential GPU stage ran 226 tests in-process with
--covcorrectly wired, and its.coveragewas bytewise-identical to the CPU stage's. The test execution adds essentially nothing on top of pytest's collection-time imports, because the test bodies don't exercise dynamo code in the pytest process.For meaningful coverage of dynamo internals from e2e tests, a follow-up will need to add
COVERAGE_PROCESS_START+ a sitecustomize hook so spawned Python subprocesses (the dynamo binaries themselves) also record coverage. That's dmitry's "Option B" from review — broader scope, separate PR.This PR is the right floor: unit-test coverage (which is what the CPU stages produce) now flows through end-to-end, and the orchestrator wiring is ready for the day someone adds in-process profiled tests.
🤖 Generated with Claude Code
Summary by CodeRabbit