[Misc] Add coverage report to PRs, including kernels #470

Merged
hughperkins merged 140 commits into main from hp/pr-coverage-w-kernels
Apr 25, 2026
@hughperkins hughperkins commented Apr 10, 2026

Issue: #

Brief Summary

Summary

This PR adds kernel code coverage to quadrants — tracking which lines and branches actually
execute inside compiled kernels on GPU/CPU, not just host-side Python code. The coverage data is
written in standard coverage.py format, so it integrates with pytest-cov, diff-cover, and
IDE coverage viewers out of the box.

What's included

Core feature (_kernel_coverage.py, 275 lines):

  • AST rewriting inserts lightweight field-store probes into @qd.kernel and @qd.func at each
    source line. Probes inside if/else bodies give true runtime branch coverage.
  • Zero overhead when disabled: the module is never imported unless QD_KERNEL_COVERAGE=1.
  • Survives qd.init() resets by harvesting probe data before runtime destruction.
  • Autodiff-safe: probes are excluded from backward/AD replay compilations.
  • Thread-safe: coverage field is eagerly allocated on the main thread during qd.init().

Pytest plugin (pytest_plugin.py):

  • Registered via pytest11 entry point. Automatically sets QD_KERNEL_COVERAGE=1 when
    pytest-cov is active. Users can opt out with QD_KERNEL_COVERAGE=0.

CI integration (linux.yml, 4_test.sh, 4_test_cuda.sh):

  • Runs kernel coverage alongside pytest-cov across CPU and CUDA test phases.
  • Posts diff coverage as PR comments with 80% gate on changed lines.
  • Merges kernel and Python coverage data into unified reports.

Coverage report tool (coverage_report.py, 458 lines):

  • Shared by CI and local dev. Supports HTML diff reports, terminal summaries, and PR comments
    with collapsible annotated code sections.

Tests (test_kernel_coverage.py, 488 lines):

  • Covers: basic line/branch coverage, if/else/while/with/try, @qd.func,
    multi-kernel, deduplication, reinit survival, autodiff exclusion, env var control,
    probe capacity overflow.

Docs (kernel_coverage.md):

  • User guide covering setup, pytest-cov integration, autodiff behavior, offline cache
    interaction, and advanced configuration.

Strengths

  • Standard format: outputs .coverage files, so the entire Python coverage ecosystem
    (pytest-cov, diff-cover, coverage.py, IDE integrations) works without any adapter.
  • Zero-cost when off: no import, no AST changes, no field allocation unless explicitly enabled.
  • Automatic with pytest-cov: the pytest plugin means most users get kernel coverage for free
    without learning about QD_KERNEL_COVERAGE.
  • Branch coverage: knows which side of an if/else actually ran on the device, not just
    which lines were compiled.
  • Comprehensive tests: 488 lines of tests covering edge cases (reinit, autodiff, capacity,
    xdist, concurrent kernels).

Weaknesses / known limitations

  • Autodiff validation=True: kernel calls inside qd.ad.Tape(validation=True) are not
    covered, because the AD compilation mode skips probe insertion.
  • Offline cache invalidation: coverage probes change the compiled kernel, so the offline cache
    misses on first run with coverage enabled. Expected but worth knowing.
  • Probe capacity is fixed per-process: defaults to 100k probes. Very large codebases with
    many kernels may need to increase QD_COVERAGE_MAX_PROBES.
  • No inter-line arc data: the probe design knows which lines ran but not the actual
    transitions between them, so arc data uses entry/exit arcs per line rather than true
    control-flow arcs. This is sufficient for coverage report and diff-cover but won't
    give precise branch-pair information.
  • Large commit count: ~80 commits reflecting iterative CI debugging. Could benefit from a
    squash or interactive rebase before merge.

=====================

Here's the PR story:

────────────────────────────────────────

The problem

When you write a quadrants kernel and run your tests with pytest --cov, coverage.py measures which Python lines executed — but it has no visibility into what happens inside
@qd.kernel or @qd.func bodies. Those are compiled to GPU/CPU code and run on the device. From coverage.py's perspective, the kernel call site is one line, and it either ran or
didn't. You can't tell which branches inside the kernel were taken, or whether a particular code path in your kernel is dead.
This PR adds kernel code coverage: tracking which source lines actually execute inside compiled kernels, and feeding that data back into the standard coverage.py ecosystem so
it shows up in coverage report, coverage html, PR comments, etc.

The approach

The key insight is that quadrants already compiles kernels from Python AST → its own IR → device code. We can intercept at the AST stage and insert extra statements — probes —
before each source line. Each probe is just a field store: _qd_cov[probe_id] = 1. The existing compilation pipeline treats this as an ordinary integer write, so it compiles
and runs on the device alongside the user's code with no special C++ support needed.
After the kernel runs, we read the field back to the host. Any probe_id slot that contains 1 tells us that line executed.
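
The probe idea can be tried in plain Python, with no quadrants runtime involved. The sketch below is only an illustration under stated assumptions: `cov` is an ordinary list standing in for the device field, and `instrument` is a hypothetical helper, not the PR's `rewrite_ast`.

```python
import ast
import textwrap

SOURCE = textwrap.dedent("""\
    def f(x):
        y = x + 1
        z = y * 2
        return z
""")


def instrument(source: str, field_name: str = "cov"):
    """Prepend `cov[i] = 1` to every top-level statement of each function body."""
    tree = ast.parse(source)
    probe_lines = {}  # probe_id -> source line number
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            new_body = []
            for stmt in node.body:
                probe_id = len(probe_lines)
                probe_lines[probe_id] = stmt.lineno
                probe = ast.parse(f"{field_name}[{probe_id}] = 1").body[0]
                new_body.append(ast.copy_location(probe, stmt))
                new_body.append(stmt)
            node.body = new_body
    ast.fix_missing_locations(tree)
    return tree, probe_lines


tree, probe_lines = instrument(SOURCE)
cov = [0] * len(probe_lines)  # stand-in for the device-side field (assumption)
namespace = {"cov": cov}
exec(compile(tree, "<instrumented>", "exec"), namespace)
result = namespace["f"](3)
# Read the "field" back: any slot holding 1 means that line executed.
covered = sorted(probe_lines[i] for i, hit in enumerate(cov) if hit)
```

The function still returns its normal result; the probes only add side-effect writes, which is why the real pipeline can compile them as ordinary stores.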

The implementation

Enabling: environment variable gate

Everything is gated on QD_KERNEL_COVERAGE=1. Both _func_base.py and kernel.py read this at module load time into a _KERNEL_COVERAGE bool. When it's False, the coverage module
is never imported and there's zero overhead.

_KERNEL_COVERAGE = os.environ.get("QD_KERNEL_COVERAGE") == "1"

The AST rewriter (_kernel_coverage.py)

The core of the feature is _CoverageASTRewriter, an ast.NodeTransformer. It walks a kernel's AST and, for each statement at a new source line, prepends a probe node:

_qd_cov[<probe_id>] = 1

It has visit_* methods for every compound statement type (if/else, for, while, with, try/except/finally) so that probes land inside each branch body. This means if you have an
if/else, the if-body probe only fires when that branch is taken — giving true runtime branch coverage, not just line coverage.
Each probe gets a unique integer ID, and the rewriter records a mapping: {probe_id: (filepath, lineno)}. A _seen_lines set prevents duplicate probes when two statements share
a source line (e.g. a = 1; b = 2).
There's a configurable capacity limit (QD_COVERAGE_MAX_PROBES, default 100k). If it is exceeded, a warning is emitted once and any further lines are skipped without additional warnings.
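
A reduced version of the rewriter shows how per-branch probes, same-line deduplication, and the capacity limit fit together. This is a hedged sketch: `ProbeInserter` is a hypothetical toy class that handles only functions and if/else, and a plain list again stands in for the field.

```python
import ast
import textwrap


class ProbeInserter(ast.NodeTransformer):
    """Toy rewriter: one probe per new source line, recursing into if/else bodies."""

    def __init__(self, max_probes: int = 100):
        self.max_probes = max_probes
        self.probe_map = {}      # probe_id -> lineno
        self.seen_lines = set()  # lines that already have a probe

    def _instrument(self, stmts):
        out = []
        for stmt in stmts:
            if stmt.lineno not in self.seen_lines:
                self.seen_lines.add(stmt.lineno)
                probe_id = len(self.probe_map)
                if probe_id < self.max_probes:  # over capacity: skip the probe
                    self.probe_map[probe_id] = stmt.lineno
                    probe = ast.parse(f"cov[{probe_id}] = 1").body[0]
                    out.append(ast.copy_location(probe, stmt))
            out.append(self.visit(stmt))
        return out

    def visit_FunctionDef(self, node):
        node.body = self._instrument(node.body)
        return node

    def visit_If(self, node):
        node.body = self._instrument(node.body)
        node.orelse = self._instrument(node.orelse)
        return node


SOURCE = textwrap.dedent("""\
    def f(x):
        a = 1; b = 2
        if x > 0:
            r = a
        else:
            r = b
        return r
""")

rewriter = ProbeInserter()
tree = ast.fix_missing_locations(rewriter.visit(ast.parse(SOURCE)))
cov = [0] * len(rewriter.probe_map)
namespace = {"cov": cov}
exec(compile(tree, "<instrumented>", "exec"), namespace)
namespace["f"](5)  # takes the if-branch only
all_lines = sorted(rewriter.probe_map.values())
hit_lines = sorted(rewriter.probe_map[i] for i, v in enumerate(cov) if v)
```

Line 2 holds two statements but receives a single probe, and line 6 (the else body) has a probe that never fires when `x > 0`.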

Hooking into compilation (_func_base.py)

When a kernel is about to be compiled, FuncBase._compile_function runs. This is where the AST gets transformed before being handed to the quadrants compiler. The new code
checks two conditions:

  1. _KERNEL_COVERAGE is enabled
  2. The kernel is being compiled in AutodiffMode.NONE (i.e., the normal forward compilation — not an AD replay)

If both are true, it calls _kernel_coverage.rewrite_ast(tree, filepath, start_lineno) to inject probes into the AST. It then retrieves the coverage field via get_field() and
adds it to the kernel's global_vars dict under the name _qd_cov, so the compiled probe statements can resolve the field reference.
The AD exclusion is important: the backward pass is an auto-generated transformation of the kernel. Inserting field stores into it would break gradient computation (they'd be
treated as differentiable operations). Since the backward pass replays the same control flow as the forward pass, there are no user-written lines that appear only in the
backward pass, so nothing is lost.
This also applies to @qd.func bodies — they go through the same _compile_function path and receive probes when called from a non-AD kernel.

Allocating the probe field (kernel.py)

Before compilation, Kernel.materialize_layout calls _kernel_coverage.ensure_field_allocated(). This creates a single global qd.field(dtype=qd.i32, shape=(_MAX_PROBES,)) — a flat
integer array on the device. All probes across all kernels in the process write into this one field, indexed by their probe ID.
The function uses double-checked locking: it checks whether the field already exists (and belongs to the current Program instance), and only allocates under a lock if needed.
This handles the case where qd.init() is called again mid-process, creating a new Program — the field must be re-allocated against the new runtime.
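
The double-checked locking pattern described above can be sketched without any runtime. All names here (`ensure_allocated`, `FakeRuntime`, the `allocations` counter) are hypothetical, and a list stands in for the field.

```python
import threading

_lock = threading.Lock()
_field = None        # stand-in for the coverage field
_field_owner = None  # the runtime object the field was allocated against
allocations = 0      # counts real allocations, for demonstration only


def ensure_allocated(current_runtime, size=8):
    """Allocate once per runtime; re-allocate when the runtime object changes."""
    global _field, _field_owner, allocations
    # Fast path: no lock taken when the field already matches this runtime.
    if _field is not None and _field_owner is current_runtime:
        return _field
    with _lock:
        # Re-check under the lock: another thread may have allocated meanwhile.
        if _field is None or _field_owner is not current_runtime:
            _field = [0] * size
            _field_owner = current_runtime
            allocations += 1
        return _field


class FakeRuntime:
    """Hypothetical placeholder for a Program instance."""


rt1, rt2 = FakeRuntime(), FakeRuntime()
first = ensure_allocated(rt1)
same = ensure_allocated(rt1)   # reuses the existing field
fresh = ensure_allocated(rt2)  # new runtime, so a new field
```

Comparing the owner with `is` rather than `==` matters: a re-initialised runtime is a different object even if it is configured identically.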

Surviving runtime resets

Users can call qd.reset() / qd.init() to tear down and recreate the runtime. This destroys all fields, including our coverage field. If we don't harvest the probe data before
that happens, it's lost.
The fix is _install_reset_hook(), which monkey-patches PyQuadrants.clear() (the method that destroys the runtime) to call _harvest_field() first. _harvest_field() reads the field
back to numpy, iterates the probe map, and records which lines fired into _accumulated_lines: dict[str, set[int]] — a durable Python dict that survives across runtime resets.
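
The harvest-before-teardown hook can be illustrated with a toy class. This is a sketch under assumptions: `Runtime`, `harvest`, and the probe map contents are invented for the example, and only the wrapping pattern mirrors the description above.

```python
class Runtime:
    """Hypothetical stand-in for the object whose clear() destroys all fields."""

    def __init__(self):
        self.field = [0, 0, 0]

    def clear(self):
        self.field = None  # data is gone after this


probe_map = {0: ("k.py", 10), 1: ("k.py", 11), 2: ("k.py", 13)}
accumulated = {}  # filepath -> set of line numbers; outlives any one runtime


def harvest(runtime):
    """Copy fired probes out of the field while it still exists."""
    if runtime.field is None:
        return
    for probe_id, (path, lineno) in probe_map.items():
        if runtime.field[probe_id] != 0:
            accumulated.setdefault(path, set()).add(lineno)


_original_clear = Runtime.clear


def _hooked_clear(self):
    harvest(self)          # read first ...
    _original_clear(self)  # ... then let the real teardown run


Runtime.clear = _hooked_clear

rt = Runtime()
rt.field[0] = 1
rt.field[2] = 1
rt.clear()

rt2 = Runtime()  # a "re-initialised" runtime with a fresh field
rt2.field[1] = 1
rt2.clear()
```

After both teardowns, `accumulated` holds the union of lines seen across the two runtimes, which is the property the durable dict is meant to provide.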

Writing output (flush)

flush() is registered with atexit. It calls _harvest_field() one last time (for the current session), then writes _accumulated_lines to a _qd_kcov.<pid> file using coverage.py's
CoverageData API.
It detects whether pytest-cov is using branch (arc) mode by reading the .coverage file. If so, it emits entry/exit arcs (-1, line) and (line, -1) for each covered line. These are
minimal arcs that correctly represent "this line ran" without fabricating transitions between lines that may not reflect actual control flow. If pytest-cov is in line mode, it
writes line data directly.
The PID-based filename ensures parallel pytest-xdist workers don't collide.
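
The arc synthesis is small enough to show on its own. The sketch below follows the entry/exit scheme described above but is detached from coverage.py, so the helper names are hypothetical; in the real flush() the resulting dict is what gets handed to CoverageData.add_arcs.

```python
def entry_exit_arcs(lines_by_file):
    """Map {path: covered lines} to {path: [(-1, line), (line, -1), ...]}."""
    arcs_by_file = {}
    for path, lines in lines_by_file.items():
        arcs = []
        for line in sorted(lines):
            arcs.append((-1, line))  # "entered at this line"
            arcs.append((line, -1))  # "exited from this line"
        arcs_by_file[path] = arcs
    return arcs_by_file


def lines_from_arcs(arcs):
    """Recover the set of covered lines from a list of arcs."""
    return {n for pair in arcs for n in pair if n > 0}


arcs = entry_exit_arcs({"k.py": {12, 10}})
recovered = lines_from_arcs(arcs["k.py"])
```

Note that no (10, 12) arc is produced: the probes cannot tell whether line 12 ran directly after line 10, so no such transition is claimed.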

Pure kernel exemption (ast_transformer_utils.py)

Quadrants has a "pure kernel" concept (@qd.kernel(fastcache=True)) where the compiler verifies the kernel doesn't access mutable global state. Our _qd_cov field is a global, so
without an exemption, every pure kernel would fail to compile with coverage enabled.
The fix generalizes an existing check: any global variable whose name starts with _qd_ is exempted from the purity violation. This is a one-line change from if name != "_qd_cov"
to if not name.startswith("_qd_"), keeping it open for future internal globals.

CI integration (linux.yml, 4_test.sh, 4_test_cuda.sh)

The CI workflow runs tests in three stages:

  1. CPU tests (4_test.sh): runs run_tests.py --coverage (which sets QD_KERNEL_COVERAGE=1 and enables pytest-cov), then calls coverage_report.py --collect-only to combine kernel
     and Python coverage and emit coverage.xml.
  2. CUDA tests (4_test_cuda.sh): same pattern, on a GPU runner using the wheel built in step 1.
  3. Coverage comment (coverage-comment job): downloads both coverage.xml files, runs coverage_report.py --report-only --format markdown to generate a diff coverage report
    against origin/main, and posts it as a PR comment via gh pr comment.

A few existing tests are incompatible with coverage probes (they assert on compiled kernel identity, cache file counts, or thread-safety of add_struct_module). These get
pytest.mark.skipif(QD_KERNEL_COVERAGE) markers and still run in the non-coverage test phases.

Coverage report tool (coverage_report.py)

A standalone script used by both CI and local development. It has two responsibilities:

  1. Collect: rename .coverage to .coverage.pytest, combine with _qd_kcov.* files via coverage combine, generate coverage.xml.
  2. Report: parse coverage.xml for line hits, parse git diff for changed lines, cross-reference to produce a diff coverage report showing which changed lines are covered/missed.
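
The cross-referencing in the report step can be sketched as two small functions. This is an assumption-laden illustration, not the code in coverage_report.py: `changed_lines` and `diff_coverage` are hypothetical names, and `line_hits` stands for per-line hit counts as they might be parsed from coverage.xml.

```python
import re

HUNK_RE = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def changed_lines(diff_text):
    """Return {path: set of added new-file line numbers} from a unified diff."""
    changed = {}
    path = None
    new_lineno = old_left = new_left = 0
    for line in diff_text.splitlines():
        if old_left > 0 or new_left > 0:      # inside a hunk body
            if line.startswith("+"):
                changed[path].add(new_lineno)
                new_lineno += 1
                new_left -= 1
            elif line.startswith("-"):
                old_left -= 1
            elif not line.startswith("\\"):   # context line (skip "\ No newline ...")
                new_lineno += 1
                old_left -= 1
                new_left -= 1
            continue
        if line.startswith("+++ "):
            target = line[4:]
            if target == "/dev/null":         # deleted file: nothing to cover
                path = None
            else:
                path = target[2:] if target.startswith("b/") else target
                changed.setdefault(path, set())
            continue
        match = HUNK_RE.match(line)
        if match and path is not None:
            old_left = int(match.group(2) or 1)
            new_lineno = int(match.group(3))
            new_left = int(match.group(4) or 1)
    return changed


def diff_coverage(changed, line_hits):
    """Count changed lines that are measured, and list those with zero hits."""
    hit = total = 0
    missed = {}
    for path, lines in changed.items():
        hits_for_file = line_hits.get(path, {})
        measured = sorted(n for n in lines if n in hits_for_file)
        uncovered = [n for n in measured if hits_for_file[n] == 0]
        total += len(measured)
        hit += len(measured) - len(uncovered)
        if uncovered:
            missed[path] = uncovered
    return hit, total, missed


DIFF = """\
diff --git a/k.py b/k.py
--- a/k.py
+++ b/k.py
@@ -1,3 +1,5 @@
 def f(x):
+    if x > 0:
+        return 1
     y = x
-    return y
+    return y + 1
"""

changed = changed_lines(DIFF)
line_hits = {"k.py": {1: 1, 2: 1, 3: 0, 4: 1, 5: 1}}  # made-up hit counts
hit, total, missed = diff_coverage(changed, line_hits)
```

Changed lines that do not appear in `line_hits` are treated as non-executable and left out of the total, so comments and blank lines do not count against the percentage.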

Four output formats are supported via a renderer class hierarchy (_TerminalRenderer, _AnnotatedRenderer, _MarkdownRenderer, _HtmlRenderer), all driven by a single loop in
generate_report that calls begin / begin_file / write_line / end_file / finish.
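
One way such a renderer hierarchy could look is sketched below. Only the hook names (begin, begin_file, write_line, end_file, finish) come from the description; the class names, signatures, and output formats are assumptions made for the example.

```python
import html


class Renderer:
    """Base renderer: every hook is a no-op unless a subclass overrides it."""

    def __init__(self):
        self.out = []

    def begin(self):
        pass

    def begin_file(self, path, pct):
        pass

    def write_line(self, lineno, text, covered):
        pass

    def end_file(self):
        pass

    def finish(self):
        return "\n".join(self.out)


class TerminalRenderer(Renderer):
    def begin_file(self, path, pct):
        self.out.append(f"{path} ({pct}%)")

    def write_line(self, lineno, text, covered):
        mark = "+" if covered else "-"
        self.out.append(f"{mark} {lineno:4d}  {text}")


class MarkdownRenderer(Renderer):
    def begin_file(self, path, pct):
        self.out.append(f"<details><summary>{path} ({pct}%)</summary><pre>")

    def write_line(self, lineno, text, covered):
        mark = "🟢" if covered else "🔴"
        self.out.append(f"{mark} {lineno:4d}  {html.escape(text)}")

    def end_file(self):
        self.out.append("</pre></details>")


def generate_report(files, renderer):
    """Single driver loop; `files` maps path -> list of (lineno, text, covered)."""
    renderer.begin()
    for path, lines in files.items():
        covered = sum(1 for _, _, c in lines if c)
        pct = round(100 * covered / len(lines)) if lines else 100
        renderer.begin_file(path, pct)
        for lineno, text, is_covered in lines:
            renderer.write_line(lineno, text, is_covered)
        renderer.end_file()
    return renderer.finish()


files = {"k.py": [(10, "x = 1", True), (11, "y = 2", False)]}
terminal = generate_report(files, TerminalRenderer())
markdown = generate_report(files, MarkdownRenderer())
```

Keeping the loop in one place means a new output format only needs to override the hooks it cares about.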

Documentation (kernel_coverage.md)

A user guide aimed at people using the quadrants library (e.g., Genesis developers), not quadrants framework developers. It covers: enabling coverage, viewing results with
coverage.py and pytest-cov, an example with branch coverage, key properties, probe capacity configuration, autodiff interaction (what is and isn't covered), and a brief "under
the hood" section.

Tests (test_kernel_coverage.py)

The test suite has two layers:

AST rewriter unit tests: parametrized over 6 control-flow constructs (straight-line, if/else, for, while, with, try/except/finally), plus dedicated tests for capacity limiting
and same-line deduplication. These are pure Python and don't need a runtime.

End-to-end tests: run actual kernels on CPU and CUDA and verify probes fired:

  • Branch selectivity (taken branch fires, untaken doesn't)
  • SIMT with block.sync() and subgroup shuffle (GPU only)
  • qd.reset() / qd.init() survival
  • Autodiff: forward pass produces probes, backward doesn't add any
  • @qd.func called from a kernel
  • Multiple kernels accumulating into the same field
  • Pure (fastcache) kernel compiles without errors
  • _harvest_field graceful failure when to_numpy() throws

Enable pytest-cov in Linux CI and post coverage summary as a
PR comment via MishaKav/pytest-coverage-comment. Uses only the
built-in GITHUB_TOKEN; no external service or org permissions.

Use diff-cover to report percentage of changed/added lines
that are covered by tests. Posts a sticky PR comment with
both diff coverage and overall project coverage. No external
services needed: uses only the built-in GITHUB_TOKEN.

Fail the Linux CI if less than 80% of changed/added Python
lines are covered by tests. The coverage PR comment is posted
before the check so numbers are always visible.

Use --format markdown:file.md syntax instead of --format markdown
with stdout redirect.

Code inside @qd.func / @qd.kernel (and @ti. variants) is
JIT-compiled to GPU code, so Python coverage.py can never
trace it. Exclude these blocks to avoid false-negative
coverage on kernel-heavy files.

When QD_KERNEL_COVERAGE=1, rewrite kernel/func Python ASTs to insert
coverage probes (field stores) before each statement. The probes
execute on the GPU and record which source lines were actually
reached, including runtime if/else branches, not just static ones.

At process exit, probe data is written to .coverage.kernel which
can be merged with pytest-cov data via `coverage combine`.

Zero C++ changes. Zero impact on the normal runtime path: the
coverage module is only imported when the env var is set.

Track which Program instance the coverage field belongs to.
Re-allocate after qd.init() destroys the old SNode tree, preventing
dangling field references with garbage dimensions.

start_lineno=10 + relative line 2 - 1 = 11, not 12.

…it()

The old flush() tried to read the field at atexit, but by then the
runtime was already destroyed (test framework calls qd.reset()),
causing to_numpy() to fail silently. Now we harvest probe data into
_accumulated_lines whenever ensure_field_allocated() detects a
program change, preserving results across reinitializations.

…ction

Instead of trying to read the coverage field after the runtime is
destroyed (which hangs on CUDA), install a hook on clear() that
harvests probe data while the field is still alive. This fixes the
hang when switching architectures (e.g. x64 -> cuda) in tests.

run_tests.py --coverage passes --cov-branch to pytest, producing
branch/arc coverage data. Our .coverage.kernel was writing line-only
data, causing "Can't combine branch coverage data with statement data".
Now we detect branch mode and synthesize arcs from covered lines.

_is_branch_coverage() checked config files but --cov-branch is a CLI
flag not in config. Now reads the actual .coverage file written by
pytest-cov to detect arc mode. Also removes stale .coverage.kernel
from previous runs to avoid "no such table: meta" errors.

Uses portable subgroup.shuffle instead of CUDA-specific warp.shfl,
and qd.gpu arch so it runs on both CUDA and Vulkan.

The kernel reads flag[0] (a runtime field value) to choose between
two shuffle paths, verifying that coverage correctly tracks which
branch executed and which didn't.

qd.gpu doesn't match QD_WANTED_ARCHS=cuda filtering in test_utils.

The _qd_cov field is injected into global_vars for coverage
instrumentation. Pure kernels flag all global_vars accesses as
violations, causing compilation errors. Exempt _qd_cov so coverage
works on all kernels including pure ones.

Set QD_KERNEL_COVERAGE=1 in the test script so kernel probes are
actually injected during CI runs, and add a coverage combine step
to merge .coverage.kernel into the main .coverage before generating
reports. Also fix flush() to accumulate kernel data across multiple
test phases instead of overwriting.

Comment thread pyproject.toml
# things, without doing full c++ build
build-backend = "setuptools.build_meta"

[tool.coverage.report]

what does this do?

Opus:

That tells coverage.py to treat lines matching those regex patterns as excluded from coverage reporting — as if they had # pragma: no cover.
Specifically, it excludes the decorator lines @qd.func, @qd.kernel, @ti.func, and @ti.kernel. Those lines are never "executed" in the normal Python sense because the decorated
function body gets JIT-compiled rather than interpreted, so without this exclusion they'd show up as uncovered and drag down the coverage percentage.

this seems superfluous now we removed all the exceptions...

Comment thread python/quadrants/lang/ast/ast_transformer_utils.py Outdated
Comment thread .gitignore
/python/test_env
/CHANGELOG.md
/.coverage
/.coverage.*

👍

# Phase 2: install torch, run only torch tests
pip install torch --index-url https://download.pytorch.org/whl/cpu
python tests/run_tests.py -v -r 3 -m needs_torch
python tests/run_tests.py -v -r 3 -m needs_torch --coverage --cov-append

what does --cov-append do?

opus:

--cov-append tells pytest-cov not to erase the existing .coverage data file before the test run. Instead, the new coverage data is appended/merged into whatever's already
there.
By default, pytest-cov wipes .coverage at the start of each run. With --cov-append, you can run multiple test sessions and accumulate coverage across all of them into a single
combined report.
Typical use case: running tests in separate phases (e.g., field backend then ndarray backend) and merging the results into one coverage report at the end.

The existing CUDA phases disable kernel coverage to avoid the DLPack
byte_offset crash, but this also skips test_kernel_coverage.py entirely
(its pytestmark requires QD_KERNEL_COVERAGE=1). Add a dedicated phase
scoped to that file so GPU-only tests like test_kernel_coverage_simt_e2e
actually run in CI.
@hughperkins

@claude review

@claude claude Bot left a comment

⚠️ Code review skipped — your organization's overage spend limit has been reached.

Code review is billed via overage credits. To resume reviews, an organization admin can raise the monthly limit at claude.ai/admin-settings/claude-code.

Once credits are available, comment @claude review on this pull request to trigger a review.

@hughperkins

@claude review

@github-actions

Coverage Report (49f8dea62)

Metric                                Value
Diff coverage (changed lines only)    84%
Overall project coverage              72%

Total: 444 lines, 71 missing, 84% covered

🟢 python/quadrants/lang/_func_base.py (82%)
🔴    7  import os
     25  
🔴   26  def _kernel_coverage_enabled() -> bool:
🟢   27      return os.environ.get("QD_KERNEL_COVERAGE") == "1"
     28  
     29  
🟢  255          _kcov = None
🟢  256          if _kernel_coverage_enabled() and autodiff_mode == _qd_core.AutodiffMode.NONE:
🟢  257              from . import (  # pylint: disable=import-outside-toplevel
    258                  _kernel_coverage as _kcov,
    259              )
    260  
🟢  261              tree = _kcov.rewrite_ast(tree, function_source_info.filepath, function_source_info.start_lineno)
    262  
🟢  266          if _kcov is not None:
🟢  267              cov_field = _kcov.get_field()
🟢  268              if cov_field is not None:
🟢  269                  global_vars[_kcov.FIELD_VAR_NAME] = cov_field
🟢 python/quadrants/lang/_kernel_coverage.py (80%)
      1  """Kernel code coverage via Python AST rewriting.
      2  
      3  When enabled (QD_KERNEL_COVERAGE=1), this module rewrites kernel and func ASTs to insert coverage probes — field
      4  stores that record which source lines actually execute on the GPU. At process exit, the collected data is written
      5  to a .coverage file compatible with coverage.py / pytest-cov / diff-cover.
      6  
      7  The probes are compiled as ordinary field stores by the existing pipeline, so no C++ changes are needed. When
      8  disabled, this module is never imported and has zero impact on the normal runtime path.
      9  """
     10  
🟢   11  import ast
🟢   12  import atexit
🟢   13  import logging
🟢   14  import os
🟢   15  import threading
🟢   16  import warnings
🟢   17  from typing import TYPE_CHECKING
     18  
🟢   19  from coverage import CoverageData  # type: ignore[import-not-found]
     20  
🟢   21  import quadrants as qd
🟢   22  from quadrants.lang import impl
     23  
     24  if TYPE_CHECKING:
     25      from quadrants.lang.field import ScalarField
     26  
🟢   27  FIELD_VAR_NAME = "_qd_cov"
🟢   28  _MAX_PROBES = int(os.environ.get("QD_COVERAGE_MAX_PROBES", "100000"))
     29  
🟢   30  _lock = threading.Lock()
🟢   31  _cov_field: "ScalarField | None" = None
🟢   32  _cov_field_prog: object | None = None  # tracks which Program instance owns _cov_field
🟢   33  _probe_counter: int = 0
     34  # {probe_id: (filepath, absolute_lineno)}
🟢   35  _probe_map: dict[int, tuple[str, int]] = {}
     36  # Accumulated coverage lines surviving across qd.init() resets
🟢   37  _accumulated_lines: dict[str, set[int]] = {}
🟢   38  _reset_hook_installed: bool = False
     39  # Directory for .coverage and _qd_kcov.* files, captured when coverage is first enabled
🟢   40  _coverage_dir: str | None = None
     41  
     42  
🟢   43  def _harvest_field() -> None:
     44      """Read probe data from the current field into _accumulated_lines.
     45  
     46      Must be called while the runtime is still alive (before clear()).
     47      """
     48      global _cov_field, _cov_field_prog
🟢   49      with _lock:
🟢   50          if _cov_field is None or not _probe_map:
🟢   51              return
🟢   52          field_ref = _cov_field
🟢   53          probe_snapshot = dict(_probe_map)
🟢   54          _cov_field = None
🟢   55          _cov_field_prog = None
🟢   56      try:
🟢   57          arr = field_ref.to_numpy()
🟢   58      except Exception:
🟢   59          logging.warning("Failed to read coverage field, coverage data for this session will be lost", exc_info=True)
🟢   60          return
🟢   61      with _lock:
🟢   62          for probe_id, (filepath, lineno) in probe_snapshot.items():
🟢   63              if probe_id < len(arr) and arr[probe_id] != 0:
🟢   64                  _accumulated_lines.setdefault(filepath, set()).add(lineno)
     65  
     66  
🟢   67  def _install_reset_hook() -> None:
     68      """Monkey-patch PyQuadrants.clear() to harvest probes before destruction."""
     69      global _reset_hook_installed
🟢   70      if _reset_hook_installed:
🟢   71          return
🟢   72      _original_clear = impl.PyQuadrants.clear
     73  
🟢   74      def _hooked_clear(self) -> None:
🟢   75          _harvest_field()
🟢   76          _original_clear(self)
     77  
🟢   78      impl.PyQuadrants.clear = _hooked_clear  # type: ignore[assignment]
🟢   79      _reset_hook_installed = True
     80  
     81  
🟢   82  def ensure_field_allocated() -> None:
     83      """Allocate (or re-allocate after qd.init()) the global coverage field."""
     84      global _cov_field, _cov_field_prog, _coverage_dir
🟢   85      _install_reset_hook()
🟢   86      if _coverage_dir is None:
🟢   87          _coverage_dir = os.getcwd()
🟢   88      current_prog = impl.get_runtime()._prog
🟢   89      if _cov_field is not None and _cov_field_prog is current_prog:
🟢   90          return
🟢   91      with _lock:
🟢   92          current_prog = impl.get_runtime()._prog
🟢   93          if _cov_field is not None and _cov_field_prog is current_prog:
🔴   94              return
🟢   95          _cov_field = qd.field(dtype=qd.i32, shape=(_MAX_PROBES,))  # type: ignore[assignment]
🟢   96          _cov_field_prog = current_prog
     97  
     98  
🟢   99  def get_field() -> "ScalarField | None":
🟢  100      with _lock:
🟢  101          if _cov_field_prog is not impl.get_runtime()._prog:
🔴  102              return None
🟢  103          return _cov_field
    104  
    105  
🟢  106  def rewrite_ast(tree: ast.Module, filepath: str, start_lineno: int) -> ast.Module:
    107      """Rewrite a kernel/func AST to insert coverage probes.
    108  
    109      Each executable statement at a new source line gets a probe: ``_qd_cov[<probe_id>] = 1``.
    110      Probes inside if/else bodies only fire when that branch is taken, giving true runtime branch coverage.
    111      """
    112      global _probe_counter
🟢  113      with _lock:
🟢  114          rewriter = _CoverageASTRewriter(
    115              field_name=FIELD_VAR_NAME,
    116              filepath=filepath,
    117              start_lineno=start_lineno,
    118              probe_id_start=_probe_counter,
    119          )
🟢  120          tree = rewriter.visit(tree)
🟢  121          ast.fix_missing_locations(tree)
🟢  122          _probe_counter = rewriter.next_probe_id
🟢  123          _probe_map.update(rewriter.probe_map)
🟢  124      return tree
    125  
    126  
🟢  127  def _detect_arc_mode() -> bool:
    128      """Detect whether pytest-cov wrote branch (arc) data by reading .coverage.
    129  
    130      Defaults to True (arc mode) when .coverage doesn't exist or is empty, since run_tests.py --coverage always
    131      enables --cov-branch.
    132      """
🔴  133      try:
🔴  134          cov_path = os.path.join(_coverage_dir, ".coverage") if _coverage_dir else ".coverage"
🔴  135          cd = CoverageData(basename=cov_path)
🔴  136          cd.read()
🔴  137          if not cd.measured_files():
🔴  138              return True
🔴  139          return cd.has_arcs()
🔴  140      except Exception:
🔴  141          logging.debug("Failed to detect arc mode from .coverage file, defaulting to arc mode", exc_info=True)
🔴  142          return True
    143  
    144  
🟢  145  def flush() -> None:
    146      """Harvest any remaining field data and write all results to a .coverage file.
    147  
    148      If .coverage.kernel already exists (e.g. from a prior test phase), the new data is merged into it so nothing
    149      is lost across multiple invocations.
    150      """
🔴  151      _harvest_field()
    152  
🔴  153      with _lock:
🔴  154          if not _accumulated_lines:
🔴  155              return
🔴  156          snapshot = {f: set(lines) for f, lines in _accumulated_lines.items()}
    157  
🔴  158      base_dir = _coverage_dir or os.getcwd()
🔴  159      kernel_path = os.path.join(base_dir, f"_qd_kcov.{os.getpid()}")
🔴  160      use_arcs = _detect_arc_mode()
    161  
🔴  162      cov = CoverageData(basename=kernel_path)
🔴  163      if use_arcs:
🔴  164          arcs_by_file: dict[str, list[tuple[int, int]]] = {}
🔴  165          for filepath, lines in snapshot.items():
    166              # Emit only entry/exit arcs per line — we know which lines ran but not the actual transitions
    167              # between them, so we avoid fabricating inter-line arcs that would misrepresent branch coverage.
🔴  168              arcs = []
🔴  169              for line in sorted(lines):
🔴  170                  arcs.append((-1, line))
🔴  171                  arcs.append((line, -1))
🔴  172              arcs_by_file[filepath] = arcs
🔴  173          cov.add_arcs(arcs_by_file)
    174      else:
🔴  175          cov.add_lines({f: sorted(lines) for f, lines in snapshot.items()})
🔴  176      cov.write()
    177  
    178  
🟢  179  _capacity_warning_emitted = False
    180  
    181  
🟢  182  class _CoverageASTRewriter(ast.NodeTransformer):
    183      """Insert coverage probes before each statement at a new source line."""
    184  
🟢  185      def __init__(self, field_name: str, filepath: str, start_lineno: int, probe_id_start: int) -> None:
🟢  186          self._field_name = field_name
🟢  187          self._filepath = filepath
🟢  188          self._start_lineno = start_lineno
🟢  189          self.next_probe_id = probe_id_start
🟢  190          self._seen_lines: set[int] = set()
🟢  191          self.probe_map: dict[int, tuple[str, int]] = {}
    192  
🟢  193      def _make_probe(self, abs_lineno: int, rel_lineno: int, col_offset: int) -> ast.Assign | None:
    194          global _capacity_warning_emitted
🟢  195          probe_id = self.next_probe_id
🟢  196          if probe_id >= _MAX_PROBES:
🟢  197              if not _capacity_warning_emitted:
🟢  198                  warnings.warn(
    199                      f"Kernel coverage probe capacity ({_MAX_PROBES}) exceeded. "
    200                      f"Additional kernel lines will not be tracked. "
    201                      f"Set QD_COVERAGE_MAX_PROBES to a higher value.",
    202                      stacklevel=2,
    203                  )
🟢  204                  _capacity_warning_emitted = True
🟢  205              return None
🟢  206          self.probe_map[probe_id] = (self._filepath, abs_lineno)
🟢  207          self.next_probe_id += 1
🟢  208          node = ast.Assign(
    209              targets=[
    210                  ast.Subscript(
    211                      value=ast.Name(id=self._field_name, ctx=ast.Load()),
    212                      slice=ast.Constant(value=probe_id),
    213                      ctx=ast.Store(),
    214                  )
    215              ],
    216              value=ast.Constant(value=1),
    217              lineno=rel_lineno,
    218              col_offset=col_offset,
    219              end_lineno=rel_lineno,
    220              end_col_offset=col_offset,
    221          )
🟢  222          return node
    223  
🟢  224      def _instrument_body(self, stmts: list[ast.stmt]) -> list[ast.stmt]:
🟢  225          result: list[ast.stmt] = []
🟢  226          for stmt in stmts:
🟢  227              rel_lineno = getattr(stmt, "lineno", None)
🟢  228              if rel_lineno is not None:
🟢  229                  abs_lineno = rel_lineno + self._start_lineno - 1
🟢  230                  if abs_lineno not in self._seen_lines:
🟢  231                      self._seen_lines.add(abs_lineno)
🟢  232                      col = getattr(stmt, "col_offset", 0)
🟢  233                      probe = self._make_probe(abs_lineno, rel_lineno, col)
🟢  234                      if probe is not None:
🟢  235                          result.append(probe)
🟢  236              result.append(self.visit(stmt))
🟢  237          return result
    238  
🟢  239      def visit_FunctionDef(self, node: ast.FunctionDef) -> ast.FunctionDef:
🟢  240          node.body = self._instrument_body(node.body)
🟢  241          return node
    242  
🟢  243      def visit_AsyncFunctionDef(self, node: ast.AsyncFunctionDef) -> ast.AsyncFunctionDef:
🔴  244          node.body = self._instrument_body(node.body)
🔴  245          return node
    246  
🟢  247      def visit_If(self, node: ast.If) -> ast.If:
🟢  248          node.body = self._instrument_body(node.body)
🟢  249          if node.orelse:
🟢  250              node.orelse = self._instrument_body(node.orelse)
🟢  251          return node
    252  
🟢  253      def visit_For(self, node: ast.For) -> ast.For:
🟢  254          node.body = self._instrument_body(node.body)
🟢  255          if node.orelse:
🟢  256              node.orelse = self._instrument_body(node.orelse)
🟢  257          return node
    258  
🟢  259      def visit_While(self, node: ast.While) -> ast.While:
🟢  260          node.body = self._instrument_body(node.body)
🟢  261          if node.orelse:
🟢  262              node.orelse = self._instrument_body(node.orelse)
🟢  263          return node
    264  
🟢  265      def visit_With(self, node: ast.With) -> ast.With:
🟢  266          node.body = self._instrument_body(node.body)
🟢  267          return node
    268  
🟢  269      def visit_Try(self, node: ast.Try) -> ast.Try:
🟢  270          node.body = self._instrument_body(node.body)
🟢  271          for handler in node.handlers:
🟢  272              handler.body = self._instrument_body(handler.body)
🟢  273          if node.orelse:
🟢  274              node.orelse = self._instrument_body(node.orelse)
🟢  275          if node.finalbody:
🟢  276              node.finalbody = self._instrument_body(node.finalbody)
🟢  277          return node
    278  
    279  
🟢  280  atexit.register(flush)
🟢 python/quadrants/lang/ast/ast_transformer_utils.py (100%)
🟢  335              if not name.startswith("_qd_"):
🟢  336                  reason = f"{name} is in global vars, therefore violates pure"
🟢  337                  violates_pure = True
🟢 python/quadrants/lang/kernel.py (80%)
     19  
🔴   20  def _kernel_coverage_enabled() -> bool:
🟢   21      return os.environ.get("QD_KERNEL_COVERAGE") == "1"
     22  
     23  
🟢  382          if _kernel_coverage_enabled():
🟢  383              from . import _kernel_coverage  # pylint: disable=import-outside-toplevel
    384  
🟢  385              _kernel_coverage.ensure_field_allocated()
    386  
🟢 python/quadrants/lang/misc.py (100%)
    496  
🟢  497      if os.environ.get("QD_KERNEL_COVERAGE") == "1":
🟢  498          from . import _kernel_coverage  # pylint: disable=import-outside-toplevel
    499  
🟢  500          _kernel_coverage.ensure_field_allocated()
    501  
🔴 python/quadrants/pytest_plugin.py (50%)
      1  """Pytest plugin that auto-enables kernel coverage when pytest-cov is active.
      2  
      3  Registered via the ``pytest11`` entry point so it loads automatically when quadrants is installed.
      4  Opt out by setting ``QD_KERNEL_COVERAGE=0`` explicitly.
      5  """
      6  
🔴    7  import os
      8  
      9  
🔴   10  def pytest_configure(config):
🟢   11      if config.pluginmanager.hasplugin("_cov"):
🟢   12          os.environ.setdefault("QD_KERNEL_COVERAGE", "1")
     13          # Kernel coverage always writes arc-format data; ensure pytest-cov matches to avoid
     14          # "Can not mix line and arc data" errors during coverage combine.
🟢   15          if not config.option.__dict__.get("cov_branch", False):
🔴   16              config.option.cov_branch = True
🟢 tests/python/quadrants/lang/fast_caching/test_src_ll_cache.py (100%)
     11  
🟢   12  _KERNEL_COVERAGE = os.environ.get("QD_KERNEL_COVERAGE") == "1"
🟢   67  @pytest.mark.skipif(
     68      _KERNEL_COVERAGE,
     69      reason="Coverage probes change LLVM IR addresses after reinit, breaking recompile comparison",
     70  )
🟢 tests/python/quadrants/lang/test_kernel_impl.py (100%)
🟢    1  import os
🟢   11  _KERNEL_COVERAGE = os.environ.get("QD_KERNEL_COVERAGE") == "1"
     13  
🟢   14  @pytest.mark.skipif(
     15      _KERNEL_COVERAGE,
     16      reason="Coverage probes change the kernel AST, preventing FE-LL cache hits after reinit",
     17  )
🟢 tests/python/test_api.py (100%)
🟢  438      actual = sorted([s for s in dir(src) if not s.startswith(("_", "@")) and s != "pytest_plugin"])
🟢 tests/python/test_intrinsics.py (100%)
     55      (thread i does (i+1)*200000). Asserts strict monotonicity across threads and that
🟢   91          assert a[i - 1] < a[i] < a[i + 1]
🟢 tests/python/test_kernel_coverage.py (87%)
      1  """Tests for kernel code coverage instrumentation.
      2  
      3  These tests verify that the AST rewriter correctly inserts coverage probes and that the probes fire when kernel
      4  code executes on the device.
      5  """
      6  
🟢    7  import ast
🟢    8  import os
🟢    9  import textwrap
     10  
🟢   11  import pytest
     12  
🟢   13  import quadrants as qd
     14  
🟢   15  from tests import test_utils
     16  
     17  # These tests only run when QD_KERNEL_COVERAGE=1
🟢   18  pytestmark = pytest.mark.skipif(
     19      os.environ.get("QD_KERNEL_COVERAGE", "") != "1",
     20      reason="QD_KERNEL_COVERAGE=1 not set",
     21  )
     22  
     23  
     24  # ---------------------------------------------------------------------------
     25  # AST rewriter unit tests
     26  # ---------------------------------------------------------------------------
     27  
🟢   28  _AST_REWRITER_CASES = [
     29      pytest.param(
     30          """\
     31          def f():
     32              x = 1
     33              y = 2
     34              return x + y
     35          """,
     36          {11, 12, 13},
     37          10,
     38          id="straight_line",
     39      ),
     40      pytest.param(
     41          """\
     42          def f():
     43              if x > 0:
     44                  a = 1
     45              else:
     46                  b = 2
     47          """,
     48          {2, 3, 5},
     49          1,
     50          id="if_else",
     51      ),
     52      pytest.param(
     53          """\
     54          def f():
     55              for i in range(10):
     56                  x = i
     57          """,
     58          {2, 3},
     59          1,
     60          id="for_loop",
     61      ),
     62      pytest.param(
     63          """\
     64          def f():
     65              while x > 0:
     66                  x = x - 1
     67              else:
     68                  y = 0
     69          """,
     70          {2, 3, 5},
     71          1,
     72          id="while_loop_else",
     73      ),
     74      pytest.param(
     75          """\
     76          def f():
     77              with ctx:
     78                  a = 1
     79                  b = 2
     80          """,
     81          {2, 3, 4},
     82          1,
     83          id="with_statement",
     84      ),
     85      pytest.param(
     86          """\
     87          def f():
     88              try:
     89                  a = 1
     90              except:
     91                  b = 2
     92              else:
     93                  c = 3
     94              finally:
     95                  d = 4
     96          """,
     97          {3, 5, 7, 9},
     98          1,
     99          id="try_except_finally",
    100      ),
    101  ]
    102  
    103  
🟢  104  @pytest.mark.parametrize("src,expected_lines,start_lineno", _AST_REWRITER_CASES)
🟢  105  def test_ast_rewriter(src, expected_lines, start_lineno):
    106      """Verify the AST rewriter inserts probes at the expected source lines."""
🟢  107      from quadrants.lang._kernel_coverage import _CoverageASTRewriter
    108  
🟢  109      tree = ast.parse(textwrap.dedent(src))
🟢  110      rewriter = _CoverageASTRewriter(
    111          field_name="_qd_cov", filepath="test.py", start_lineno=start_lineno, probe_id_start=0
    112      )
🟢  113      rewriter.visit(tree)
    114  
🟢  115      covered_lines = {lineno for _, (_, lineno) in rewriter.probe_map.items()}
🟢  116      assert expected_lines.issubset(covered_lines), f"Expected lines {expected_lines} to be probed, got {covered_lines}"
    117  
    118  
🟢  119  def test_ast_rewriter_capacity_limit():
    120      """Verify that probes stop being inserted when the capacity limit is hit."""
🟢  121      import warnings
    122  
🟢  123      import quadrants.lang._kernel_coverage as kcov
🟢  124      from quadrants.lang._kernel_coverage import _CoverageASTRewriter
    125  
🟢  126      src = textwrap.dedent(
    127          """\
    128          def f():
    129              a = 1
    130              b = 2
    131              c = 3
    132      """
    133      )
🟢  134      tree = ast.parse(src)
🟢  135      old_warning_state = kcov._capacity_warning_emitted
🟢  136      kcov._capacity_warning_emitted = False
🟢  137      try:
🟢  138          with warnings.catch_warnings(record=True) as w:
🟢  139              warnings.simplefilter("always")
🟢  140              rewriter = _CoverageASTRewriter(
    141                  field_name="_qd_cov", filepath="test.py", start_lineno=1, probe_id_start=kcov._MAX_PROBES - 1
    142              )
🟢  143              rewriter.visit(tree)
    144  
🟢  145          assert rewriter.next_probe_id == kcov._MAX_PROBES
🟢  146          assert len(rewriter.probe_map) == 1, f"Only 1 probe should fit, got {len(rewriter.probe_map)}"
🟢  147          assert len(w) == 1
🟢  148          assert "exceeded" in str(w[0].message).lower()
    149      finally:
🟢  150          kcov._capacity_warning_emitted = old_warning_state
    151  
    152  
🟢  153  def test_ast_rewriter_deduplicates_same_line():
    154      """Verify that two statements on the same source line get only one probe."""
🟢  155      from quadrants.lang._kernel_coverage import _CoverageASTRewriter
    156  
🟢  157      src = "def f():\n    a = 1; b = 2\n"
🟢  158      tree = ast.parse(src)
🟢  159      rewriter = _CoverageASTRewriter(field_name="_qd_cov", filepath="test.py", start_lineno=1, probe_id_start=0)
🟢  160      rewriter.visit(tree)
    161  
🟢  162      abs_lines = [lineno for _, (_, lineno) in rewriter.probe_map.items()]
🟢  163      assert abs_lines.count(2) == 1, f"Line 2 should have exactly one probe, got {abs_lines.count(2)}"
    164  
    165  
🟢  166  def test_env_var_max_probes():
    167      """Verify that QD_COVERAGE_MAX_PROBES env var is read at import time."""
🟢  168      import quadrants.lang._kernel_coverage as kcov
    169  
🟢  170      assert kcov._MAX_PROBES == int(os.environ.get("QD_COVERAGE_MAX_PROBES", "100000"))
    171  
    172  
🟢  173  def test_harvest_field_exception_path():
    174      """Verify that _harvest_field handles to_numpy() failure gracefully."""
🟢  175      from unittest.mock import MagicMock
    176  
🟢  177      import quadrants.lang._kernel_coverage as kcov
    178  
🟢  179      old_field = kcov._cov_field
🟢  180      old_prog = kcov._cov_field_prog
🟢  181      old_map = kcov._probe_map.copy()
🟢  182      try:
🟢  183          mock_field = MagicMock()
🟢  184          mock_field.to_numpy.side_effect = RuntimeError("runtime destroyed")
🟢  185          kcov._cov_field = mock_field
🟢  186          kcov._cov_field_prog = object()
🟢  187          kcov._probe_map[999999] = ("fake.py", 1)
    188  
    189          # Should not raise — the exception is caught and logged
🟢  190          kcov._harvest_field()
    191  
🟢  192          assert kcov._cov_field is None, "Field should be cleared after failure"
🟢  193          assert kcov._cov_field_prog is None, "Field prog should be cleared after failure"
    194      finally:
🟢  195          kcov._cov_field = old_field
🟢  196          kcov._cov_field_prog = old_prog
🟢  197          kcov._probe_map = old_map
    198  
    199  
    200  # ---------------------------------------------------------------------------
    201  # End-to-end tests
    202  # ---------------------------------------------------------------------------
    203  
    204  
🟢  205  @test_utils.test(arch=[qd.cpu, qd.cuda])
🟢  206  def test_kernel_coverage_branches_e2e():
    207      """Verify that only the taken branch has its probe fired."""
🟢  208      from quadrants.lang import _kernel_coverage
    209  
🟢  210      _kernel_coverage.ensure_field_allocated()
    211  
🟢  212      probe_count_before = _kernel_coverage._probe_counter
🟢  213      out = qd.field(dtype=qd.i32, shape=(1,))
    214  
🟢  215      @qd.kernel
🟢  216      def branching_kernel():
🟢  217          x = 10
🟢  218          if x > 5:
🟢  219              out[0] = 1
    220          else:
🔴  221              out[0] = 2
    222  
🟢  223      branching_kernel()
    224  
🟢  225      assert out[0] == 1
    226  
🟢  227      cov_field = _kernel_coverage.get_field()
🟢  228      arr = cov_field.to_numpy()
    229  
🟢  230      probes_for_kernel = {pid: loc for pid, loc in _kernel_coverage._probe_map.items() if pid >= probe_count_before}
    231  
🟢  232      taken_probes = {pid for pid, loc in probes_for_kernel.items() if arr[pid] != 0}
🟢  233      not_taken_probes = {pid for pid, loc in probes_for_kernel.items() if arr[pid] == 0}
    234  
🟢  235      assert len(taken_probes) > 0, "At least some probes should have fired"
🟢  236      assert len(not_taken_probes) > 0, "The else branch should not have been reached"
    237  
    238  
🟢  239  @test_utils.test(arch=qd.gpu)
🟢  240  def test_kernel_coverage_simt_e2e():
    241      """Verify coverage probes track branches with block.sync() and subgroup shuffle.
    242  
    243      The if/else is based on a runtime value read from a field, so the compiler cannot constant-fold it away.
    244      Only the taken branch's shuffle probe should fire.
    245      """
🔴  246      from quadrants.lang import _kernel_coverage
🔴  247      from quadrants.lang.simt import subgroup
    248  
🔴  249      _kernel_coverage.ensure_field_allocated()
    250  
🔴  251      N = 64
🔴  252      probe_count_before = _kernel_coverage._probe_counter
🔴  253      flag = qd.field(dtype=qd.i32, shape=(1,))
🔴  254      a = qd.field(dtype=qd.i32, shape=(N,))
🔴  255      out = qd.field(dtype=qd.i32, shape=(N,))
    256  
🔴  257      flag[0] = 1  # runtime value: take the if-branch
    258  
🔴  259      @qd.kernel
🔴  260      def simt_kernel():
🔴  261          qd.loop_config(block_dim=N)
🔴  262          for i in range(N):
🔴  263              a[i] = i + 1
🔴  264              qd.simt.block.sync()
🔴  265              if flag[0] > 0:
🔴  266                  val = subgroup.shuffle(a[i], qd.u32(0))
🔴  267                  out[i] = val
    268              else:
🔴  269                  val = subgroup.shuffle(a[i], qd.u32(1))
🔴  270                  out[i] = val + 100
    271  
🔴  272      simt_kernel()
    273  
🔴  274      for i in range(4):
🔴  275          assert out[i] == 1, f"Expected 1 at index {i}, got {out[i]}"
    276  
🔴  277      cov_field = _kernel_coverage.get_field()
🔴  278      arr = cov_field.to_numpy()
    279  
🔴  280      probes_for_kernel = {pid: loc for pid, loc in _kernel_coverage._probe_map.items() if pid >= probe_count_before}
    281  
🔴  282      fired = {pid for pid in probes_for_kernel if arr[pid] != 0}
🔴  283      not_fired = {pid for pid in probes_for_kernel if arr[pid] == 0}
🔴  284      assert len(fired) >= 4, f"Expected at least 4 probes to fire, got {len(fired)}"
🔴  285      assert len(not_fired) >= 2, "The else branch should not have been reached"
    286  
    287  
🟢  288  @test_utils.test(arch=[qd.cpu, qd.cuda])
🟢  289  def test_kernel_coverage_survives_reinit():
    290      """Verify that coverage data accumulated before qd.init() reset is preserved.
    291  
    292      Runs a kernel, then resets via qd.reset()/qd.init() (which triggers the _hooked_clear harvest), runs another
    293      kernel, harvests again, and checks that _accumulated_lines contains data from both sessions.
    294      """
🟢  295      from quadrants.lang import _kernel_coverage, impl
    296  
🟢  297      current_arch = impl.get_runtime()._arch
🟢  298      _kernel_coverage.ensure_field_allocated()
    299  
🟢  300      probe_count_before = _kernel_coverage._probe_counter
🟢  301      out1 = qd.field(dtype=qd.i32, shape=(1,))
    302  
🟢  303      @qd.kernel
🟢  304      def kernel_before_reset():
🟢  305          out1[0] = 1
    306  
🟢  307      kernel_before_reset()
    308  
🟢  309      cov_field = _kernel_coverage.get_field()
🟢  310      assert cov_field is not None
🟢  311      arr = cov_field.to_numpy()
🟢  312      probes_first = {pid: loc for pid, loc in _kernel_coverage._probe_map.items() if pid >= probe_count_before}
🟢  313      fired_first = {pid for pid in probes_first if arr[pid] != 0}
🟢  314      assert len(fired_first) > 0, "Probes from first kernel should have fired"
    315  
    316      # Don't call _harvest_field() manually — let qd.reset() trigger it via the _hooked_clear hook
🟢  317      qd.reset()
    318  
    319      # Verify the hook harvested data from the first session
🟢  320      files_before = set(_kernel_coverage._accumulated_lines.keys())
🟢  321      assert len(files_before) > 0, "Hook should have harvested data during reset"
🟢  322      lines_before = {}
🟢  323      for f, lines in _kernel_coverage._accumulated_lines.items():
🟢  324          lines_before[f] = set(lines)
    325  
🟢  326      qd.init(arch=current_arch)
    327  
🟢  328      _kernel_coverage.ensure_field_allocated()
    329  
🟢  330      probe_count_mid = _kernel_coverage._probe_counter
🟢  331      out2 = qd.field(dtype=qd.i32, shape=(1,))
    332  
🟢  333      @qd.kernel
🟢  334      def kernel_after_reset():
🟢  335          out2[0] = 2
    336  
🟢  337      kernel_after_reset()
    338  
🟢  339      _kernel_coverage._harvest_field()
    340  
🟢  341      for f in files_before:
🟢  342          assert (
    343              f in _kernel_coverage._accumulated_lines
    344          ), f"File {f} from before reset should still be in _accumulated_lines"
🟢  345          assert lines_before[f].issubset(
    346              _kernel_coverage._accumulated_lines[f]
    347          ), "Lines from before reset should be preserved"
    348  
🟢  349      probes_second = {pid: loc for pid, loc in _kernel_coverage._probe_map.items() if pid >= probe_count_mid}
🟢  350      second_files = {loc[0] for loc in probes_second.values()}
🟢  351      for f in second_files:
🟢  352          assert f in _kernel_coverage._accumulated_lines, f"File {f} from second kernel should be in _accumulated_lines"
    353  
    354  
🟢  355  @test_utils.test(arch=[qd.cpu, qd.cuda])
🟢  356  def test_kernel_coverage_autodiff():
    357      """Verify that autodiff forward pass produces probes but backward does not.
    358  
    359      The forward compilation (AutodiffMode.NONE) should insert probes that fire. The backward compilation
    360      (AutodiffMode.REVERSE) should not add any probes.
    361      """
🟢  362      from quadrants.lang import _kernel_coverage
    363  
🟢  364      _kernel_coverage.ensure_field_allocated()
    365  
🟢  366      x = qd.field(dtype=qd.f32, shape=(), needs_grad=True)
🟢  367      loss = qd.field(dtype=qd.f32, shape=(), needs_grad=True)
    368  
🟢  369      @qd.kernel
🟢  370      def compute():
🟢  371          loss[None] = x[None] * x[None]
    372  
🟢  373      x[None] = 5.0
    374  
🟢  375      probe_count_before = _kernel_coverage._probe_counter
    376  
🟢  377      with qd.ad.Tape(loss):
🟢  378          compute()
    379  
🟢  380      probe_count_after_tape = _kernel_coverage._probe_counter
🟢  381      forward_probes = probe_count_after_tape - probe_count_before
🟢  382      assert forward_probes > 0, "Forward compilation should have inserted probes"
    383  
    384      # Verify forward probes actually fired
🟢  385      cov_field = _kernel_coverage.get_field()
🟢  386      assert cov_field is not None
🟢  387      arr = cov_field.to_numpy()
🟢  388      probes = {pid: loc for pid, loc in _kernel_coverage._probe_map.items() if pid >= probe_count_before}
🟢  389      fired = {pid for pid in probes if arr[pid] != 0}
🟢  390      assert len(fired) > 0, "Forward pass inside Tape should produce fired coverage probes"
    391  
    392      # Verify backward pass computes correct gradients
🟢  393      assert loss[None] == pytest.approx(25.0)
🟢  394      assert x.grad[None] == pytest.approx(10.0)
    395  
    396  
🟢  397  @test_utils.test(arch=[qd.cpu, qd.cuda])
🟢  398  def test_kernel_coverage_qd_func():
    399      """Verify that probes fire inside a @qd.func called from a kernel."""
🟢  400      from quadrants.lang import _kernel_coverage
    401  
🟢  402      _kernel_coverage.ensure_field_allocated()
    403  
🟢  404      probe_count_before = _kernel_coverage._probe_counter
🟢  405      out = qd.field(dtype=qd.i32, shape=(1,))
    406  
🟢  407      @qd.func
🟢  408      def helper():
🟢  409          out[0] = 99
    410  
🟢  411      @qd.kernel
🟢  412      def caller():
🟢  413          helper()
    414  
🟢  415      caller()
    416  
🟢  417      assert out[0] == 99
    418  
🟢  419      cov_field = _kernel_coverage.get_field()
🟢  420      assert cov_field is not None
🟢  421      arr = cov_field.to_numpy()
    422  
🟢  423      probes = {pid: loc for pid, loc in _kernel_coverage._probe_map.items() if pid >= probe_count_before}
🟢  424      fired = {pid for pid in probes if arr[pid] != 0}
    425      # The kernel body has one statement (helper()), and the func body has one (out[0] = 99).
    426      # Both should produce probes that fire.
🟢  427      assert (
    428          len(fired) >= 2
    429      ), f"Expected probes from both kernel and func to fire, got {len(fired)} fired out of {len(probes)}"
    430  
    431  
🟢  432  @test_utils.test(arch=[qd.cpu, qd.cuda])
🟢  433  def test_kernel_coverage_multiple_kernels_same_session():
    434      """Verify that probes from two different kernels both fire in the same session."""
🟢  435      from quadrants.lang import _kernel_coverage
    436  
🟢  437      _kernel_coverage.ensure_field_allocated()
    438  
🟢  439      probe_count_before = _kernel_coverage._probe_counter
🟢  440      a = qd.field(dtype=qd.i32, shape=(1,))
🟢  441      b = qd.field(dtype=qd.i32, shape=(1,))
    442  
🟢  443      @qd.kernel
🟢  444      def kernel_a():
🟢  445          a[0] = 10
    446  
🟢  447      @qd.kernel
🟢  448      def kernel_b():
🟢  449          b[0] = 20
    450  
🟢  451      kernel_a()
🟢  452      probe_count_after_a = _kernel_coverage._probe_counter
🟢  453      kernel_b()
    454  
🟢  455      assert a[0] == 10
🟢  456      assert b[0] == 20
    457  
🟢  458      cov_field = _kernel_coverage.get_field()
🟢  459      arr = cov_field.to_numpy()
    460  
🟢  461      probes_a = {
    462          pid: loc for pid, loc in _kernel_coverage._probe_map.items() if probe_count_before <= pid < probe_count_after_a
    463      }
🟢  464      probes_b = {pid: loc for pid, loc in _kernel_coverage._probe_map.items() if pid >= probe_count_after_a}
    465  
🟢  466      fired_a = {pid for pid in probes_a if arr[pid] != 0}
🟢  467      fired_b = {pid for pid in probes_b if arr[pid] != 0}
    468  
🟢  469      assert len(fired_a) > 0, "Probes from kernel_a should have fired"
🟢  470      assert len(fired_b) > 0, "Probes from kernel_b should have fired"
    471  
    472  
🟢  473  @test_utils.test(arch=[qd.cpu, qd.cuda])
🟢  474  def test_qd_prefix_exemption_pure_kernel():
    475      """Verify that _qd_-prefixed globals don't violate pure kernel checks.
    476  
    477      With kernel coverage enabled, _qd_cov is injected as a global. This test verifies that a pure (fastcache)
    478      kernel still compiles without error. The kernel uses ndarray arguments (not global fields) because pure
    479      kernels prohibit non-_qd_ globals.
    480      """
🟢  481      a = qd.ndarray(qd.i32, (1,))
    482  
🟢  483      @qd.kernel(fastcache=True)
🟢  484      def pure_kernel(arr: qd.types.NDArray) -> None:
🟢  485          arr[0] = 42
    486  
🟢  487      pure_kernel(a)
🟢  488      assert a[0] == 42
🟢 tests/python/test_offline_cache.py (100%)
     16  # Coverage field allocation creates internal fill kernels that change cache file counts.
     17  # CI runs these tests in a separate phase without QD_KERNEL_COVERAGE (see 4_test.sh).
🟢   18  pytestmark = pytest.mark.skipif(
     19      os.environ.get("QD_KERNEL_COVERAGE") == "1",
     20      reason="Kernel coverage adds internal kernels that invalidate cache file count assertions",
     21  )
     22  
🟢 tests/python/test_snode_layout_inspection.py (100%)
🟢    1  import os
      2  
🟢    3  import pytest
      4  
🟢   10  @pytest.mark.skipif(
     11      os.environ.get("QD_KERNEL_COVERAGE") == "1",
     12      reason="Kernel coverage field on root shifts offset assertions",
     13  )

run_tests.py auto-prepends "test_" to filenames that don't start with
it, so passing the full path "tests/python/test_kernel_coverage.py"
was mangled to "test_tests/python/test_kernel_coverage.py".

Made-with: Cursor

Without this, toggling coverage between runs serves a stale cached
kernel: coverage-off cached kernel gives 0% coverage when re-enabled,
and coverage-on cached kernel causes NameError on _qd_cov when disabled.

The previous approach of setting config.option.cov_branch in
pytest_configure was too late -- pytest-cov had already initialized
its Coverage object in an earlier hook. Instead, the plugin now reads
the cov_branch option and sets _QD_KCOV_ARC env var, which
_detect_arc_mode() checks first. This avoids "Can not mix line and
arc data" errors when users run pytest --cov without --cov-branch.

Also change the default from arc to line mode when nothing is known,
since pytest --cov without --cov-branch is more common.
@hughperkins
Collaborator Author

@claude review

Comment on lines +106 to +124
def rewrite_ast(tree: ast.Module, filepath: str, start_lineno: int) -> ast.Module:
    """Rewrite a kernel/func AST to insert coverage probes.

    Each executable statement at a new source line gets a probe: ``_qd_cov[<probe_id>] = 1``.
    Probes inside if/else bodies only fire when that branch is taken, giving true runtime branch coverage.
    """
    global _probe_counter
    with _lock:
        rewriter = _CoverageASTRewriter(
            field_name=FIELD_VAR_NAME,
            filepath=filepath,
            start_lineno=start_lineno,
            probe_id_start=_probe_counter,
        )
        tree = rewriter.visit(tree)
        ast.fix_missing_locations(tree)
        _probe_counter = rewriter.next_probe_id
        _probe_map.update(rewriter.probe_map)
    return tree

🔴 Cached kernel IR includes baked-in _qd_cov[N] = 1 stores where N comes from the per-process monotonic _probe_counter, so probe_ids are not stable across processes. The new kcov cache-key partition in src_hasher.create_cache_key (src_hasher.py:53) correctly separates on/off, but within kcov=on two processes with identical source produce IR whose literal probe_ids reflect the first process's compilation order; when Process B loads Process A's cached IR with @qd.kernel(fastcache=True) + offline_cache=True, launches write to Process B's _probe_map[0..N] slots that may belong to a completely different kernel — producing silent false positives (another kernel's lines marked covered) and false negatives (the running kernel's lines marked uncovered). The same mechanism means QD_COVERAGE_MAX_PROBES is also absent from the cache key, so shrinking the cap between runs can load IR whose baked-in probe_id exceeds the new field size, producing OOB device writes. Fix: derive probe_ids deterministically from hash((filepath, abs_lineno)) with a module-level (filepath, lineno) -> probe_id dedup dict, and append str(_MAX_PROBES) to the hash_iterable_strings tuple. CI misses this because test_offline_cache.py is skipped under QD_KERNEL_COVERAGE=1.

Extended reasoning...

What the bug is and how it manifests

_CoverageASTRewriter._make_probe (python/quadrants/lang/_kernel_coverage.py:193-222) produces an ast.Assign node whose subscript is ast.Constant(value=probe_id). The compiled kernel IR therefore contains the literal probe_id value from whichever process first rewrote the AST. Probe IDs come from the module-level _probe_counter (lines 33, 112-123), which is per-process and monotonically increasing: the same kernel gets different probe_ids depending on which kernels materialized first in this process.

On top of that, src_hasher.create_cache_key (src_hasher.py:44-55) now appends "kcov" when QD_KERNEL_COVERAGE=1, correctly partitioning the cache between coverage-on and coverage-off — but the tuple contains no component reflecting the current probe_counter state, _probe_map contents, or _MAX_PROBES. Two processes with identical source + same kcov flag produce the same fastcache key, so Process B can load an IR that was compiled by Process A with Process A-specific probe_id literals.
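To illustrate the gap, here is a sketch of a key function whose coverage component also reflects the probe capacity. `make_cache_key` and its parameters are hypothetical; the real `create_cache_key` and `hash_iterable_strings` may take different inputs and are not reproduced here.

```python
import hashlib


def make_cache_key(source: str, kcov_enabled: bool, max_probes: int) -> str:
    """Hypothetical cache key: hash the source plus coverage-related settings."""
    parts = [source]
    if kcov_enabled:
        # Include the capacity so IR compiled for a larger field is never
        # loaded into a process that allocated a smaller one.
        parts += ["kcov", str(max_probes)]
    hasher = hashlib.sha256()
    for part in parts:
        data = part.encode("utf-8")
        # Length-prefix each part so ("ab", "c") and ("a", "bc") differ.
        hasher.update(len(data).to_bytes(8, "big"))
        hasher.update(data)
    return hasher.hexdigest()
```

Adding the capacity only guards against the out-of-bounds case. It does not make probe_ids stable, so two processes with the same capacity would still share a key while disagreeing on what each slot means.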

The specific code path that triggers it

  1. Process A compiles kernels K1, K2 in that order with QD_KERNEL_COVERAGE=1. rewrite_ast assigns probe_ids [0, 1, 2] to K1 and [3, 4, 5] to K2. The C++ IR cache (via offline_cache + offline_cache_file_path) stores the compiled kernels keyed by fastcache key under kernel_compilation_manager.cpp:277-280 (MemAndDiskCache mode). Python-side src_hasher.store writes (src_key → frontend_cache_key).
  2. Process B starts with the same sources. A test happens to materialize K2 first (e.g. different pytest-xdist sharding, different test selection). rewrite_ast assigns probe_ids [0, 1, 2] to K2 and [3, 4, 5] to K1 in Process B's _probe_map.
  3. Kernel.materialize() calls _try_load_fastcache (kernel.py:335-373) for K1. create_cache_key produces the same key as Process A (since kcov marker, source, args, config all match). load returns Process A's frontend_cache_key, and prog.load_fast_cache fetches the IR compiled in Process A — with _qd_cov[0] = 1, _qd_cov[1] = 1, _qd_cov[2] = 1 baked in.
  4. materialize still calls get_tree_and_ctx, which at _func_base.py:249-257 unconditionally calls _kcov.rewrite_ast and bumps _probe_counter / _probe_map based on Process B's state. But at kernel.py:420, only_parse_function_def=self.compiled_kernel_data_by_key.get(key) is not None is True on a cache hit, and function_def_transformer.py returns before processing the body — the freshly rewritten AST is discarded and the cached IR is what runs at launch.
  5. At K1's launch in Process B, the cached IR writes _qd_cov[0] = 1, _qd_cov[1] = 1, _qd_cov[2] = 1. In Process B's _probe_map, indices [0, 1, 2] point to K2's source lines, not K1's.
  6. At harvest: arr[0..2] are nonzero → K2's lines marked covered even though K2 may not have run (false positives). arr[3..5] are zero → K1's lines marked uncovered even though K1 actually ran (false negatives).

Step-by-step proof with a concrete example

Suppose the codebase has two kernels, K1 (3 instrumented lines at file.py:10-12) and K2 (3 instrumented lines at file.py:20-22):

| Process | Compile order | probe_ids for K1 | probe_ids for K2 | _probe_map |
|---|---|---|---|---|
| A | K1, K2 | 0, 1, 2 | 3, 4, 5 | 0→L10, 1→L11, 2→L12, 3→L20, 4→L21, 5→L22 |
| B | K2, K1 | 3, 4, 5 | 0, 1, 2 | 0→L20, 1→L21, 2→L22, 3→L10, 4→L11, 5→L12 |

Process A writes the offline cache. Process B opens it, and when K1 is materialized:

  • create_cache_key(K1) → same key as Process A → load returns Process A's frontend_cache_key → load_fast_cache fetches IR with _qd_cov[0] = 1, _qd_cov[1] = 1, _qd_cov[2] = 1 baked in.
  • only_parse_function_def=True at kernel.py:420 discards Process B's rewritten AST (which would have written to 3/4/5).
  • Launching K1 fires slots 0, 1, 2 in Process B's field.
  • Harvest: _probe_map[0..2] → L20, L21, L22 (K2's lines) marked covered. K2 may never have executed in Process B — false positive. _probe_map[3..5] → L10, L11, L12 (K1's lines) are all zero — false negative even though K1 did run.
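
The example above can be replayed without quadrants at all. The following is a minimal sketch in plain Python that only mimics the bookkeeping: assign_probe_ids, field_b and the other names are hypothetical stand-ins for the monotonic counter, the coverage field and the cached IR, not the library's API.

```python
def assign_probe_ids(compile_order, kernel_lines):
    """Mimic the monotonic counter: ids are handed out in compile order."""
    probe_map = {}      # probe_id -> source line
    ids_by_kernel = {}  # kernel name -> list of probe ids
    next_id = 0
    for kernel in compile_order:
        ids = []
        for line in kernel_lines[kernel]:
            probe_map[next_id] = line
            ids.append(next_id)
            next_id += 1
        ids_by_kernel[kernel] = ids
    return probe_map, ids_by_kernel


kernel_lines = {"K1": [10, 11, 12], "K2": [20, 21, 22]}

# Process A compiles K1 first; K1's IR (with these ids baked in) goes to the cache.
_, ids_a = assign_probe_ids(["K1", "K2"], kernel_lines)
cached_k1_probe_ids = ids_a["K1"]

# Process B compiles K2 first, then launches K1 from the cache. K2 never runs.
probe_map_b, _ = assign_probe_ids(["K2", "K1"], kernel_lines)
field_b = [0] * 6
for probe_id in cached_k1_probe_ids:
    field_b[probe_id] = 1  # the cached IR writes Process A's slots

# Harvest in Process B using Process B's own map.
reported_lines = sorted(probe_map_b[i] for i, hit in enumerate(field_b) if hit)
print(reported_lines)  # [20, 21, 22]
```

Only K1 ran, yet the harvest reports K2's lines 20-22 and none of K1's lines 10-12, which is the false positive / false negative pair described above.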

Why existing code does not prevent it

  • The PR added "kcov" to create_cache_key, demonstrating awareness of coverage-related cache partitioning, but only handles the on/off dimension. The probe_id-baked-in problem operates entirely within the kcov=on bucket.
  • rewrite_ast runs unconditionally in _func_base.py:254 — there is no check for whether the subsequent fastcache lookup will hit and discard its output. So _probe_counter advances and _probe_map is populated per-process regardless of what IR actually runs.
  • only_parse_function_def=True is by design: the PR's explicit support for @qd.kernel(fastcache=True) + coverage (via the _qd_ prefix exemption in ast_transformer_utils.py:335) means cached pure kernels are expected to carry _qd_cov writes, but nothing keeps those writes consistent with the loading process's _probe_map.
  • CI does not catch this: tests/python/test_offline_cache.py was skipped under QD_KERNEL_COVERAGE=1 in this same PR, and test_src_ll_cache.py was skipped for similar reasons — so no CI test exercises the kcov + offline_cache + cross-process combination.

What the impact would be

Silent correctness failure of coverage data when a user combines QD_KERNEL_COVERAGE=1, offline_cache=True, a persistent offline_cache_file_path, and @qd.kernel(fastcache=True) across multiple processes (pytest-xdist workers, sequential CI runs, or just two developers running the same test suite). Compilation order differences — which are common due to test parallelism, test discovery order, fixture setup, selective test runs — cause coverage reports with both false positives and false negatives. Nothing warns the user. The documented "automatic with pytest-cov" path explicitly opts into this combination.

Separately but relatedly, because QD_COVERAGE_MAX_PROBES is also not in the cache key, shrinking it between runs can cause a cached IR with baked-in _qd_cov[5000] = 1 to be launched against a freshly allocated (500,)-shaped field — a device-side out-of-bounds write (silent memory corruption on most backends). Narrower scenario, but the fix is the same one-line cache-key addition.

How to fix it

Two complementary changes in _kernel_coverage.py and src_hasher.py:

  1. Make probe_ids deterministic across processes. Replace the monotonic counter with a stable hash:

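    import hashlib

    _probe_id_cache: dict[tuple[str, int], int] = {}  # (filepath, abs_lineno) -> probe_id

    def _stable_hash(filepath: str, abs_lineno: int) -> int:
        # Built-in hash() of a str is salted per process (PYTHONHASHSEED),
        # so derive the id from a content hash instead.
        digest = hashlib.blake2b(f"{filepath}:{abs_lineno}".encode(), digest_size=8).digest()
        return int.from_bytes(digest, "big")

    def _probe_id_for(filepath: str, abs_lineno: int) -> int | None:
        key = (filepath, abs_lineno)
        with _lock:
            if key in _probe_id_cache:
                return _probe_id_cache[key]
            if len(_probe_map) >= _MAX_PROBES:
                return None  # field is full; also guarantees the loop below terminates
            # Derive the slot from the hash; resolve collisions by linear probing.
            # Caveat: when two keys collide, the final slot depends on insertion order.
            candidate = _stable_hash(filepath, abs_lineno) % _MAX_PROBES
            while candidate in _probe_map and _probe_map[candidate] != key:
                candidate = (candidate + 1) % _MAX_PROBES
            _probe_id_cache[key] = candidate
            _probe_map[candidate] = key
            return candidate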

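    Provided the hash is stable across processes (Python's built-in hash() on strings is salted per process unless PYTHONHASHSEED is fixed) and no two probes collide on a slot, this makes the probe ids, and therefore the IR, identical across processes for the same source, so the cached IR's _qd_cov[N] = 1 writes land in slots whose _probe_map entries are the same in every process.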

  2. Include _MAX_PROBES in the fastcache key. In src_hasher.create_cache_key, append str(_MAX_PROBES) (or os.environ.get("QD_COVERAGE_MAX_PROBES", "100000")) to the hash_iterable_strings tuple. This prevents cache reuse across runs with different probe caps.
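
As a rough illustration of the second change, the sketch below shows a cache key that varies with the probe cap only when coverage is on. make_cache_key is a hypothetical stand-in, not the real src_hasher.create_cache_key, and the choice of SHA-256 with length prefixes is an assumption made for the example.

```python
import hashlib
import os


def make_cache_key(kernel_source: str, extra_parts: tuple[str, ...] = ()) -> str:
    """Hypothetical stand-in for create_cache_key: hash source plus coverage settings."""
    coverage_on = os.environ.get("QD_KERNEL_COVERAGE") == "1"
    parts = [
        kernel_source,
        "kcov" if coverage_on else "",
        # Only relevant when probes are compiled in; non-coverage keys stay unchanged.
        os.environ.get("QD_COVERAGE_MAX_PROBES", "100000") if coverage_on else "",
        *extra_parts,
    ]
    digest = hashlib.sha256()
    for part in parts:
        encoded = part.encode("utf-8")
        # Length-prefix each part so ("ab", "c") and ("a", "bc") hash differently.
        digest.update(len(encoded).to_bytes(8, "little"))
        digest.update(encoded)
    return digest.hexdigest()


os.environ["QD_KERNEL_COVERAGE"] = "1"
os.environ["QD_COVERAGE_MAX_PROBES"] = "5000"
key_large = make_cache_key("def k(): ...")
os.environ["QD_COVERAGE_MAX_PROBES"] = "500"
key_small = make_cache_key("def k(): ...")
print(key_large != key_small)
```

With the cap folded into the key, IR compiled under a cap of 5000 is never loaded into a process that allocated a 500-element field.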

Additionally, test_offline_cache.py should have a non-skipped variant (or a dedicated test) that exercises the kcov=on + offline_cache path across two subprocesses with different compile orders, to catch future regressions.
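
A real regression test would have to drive quadrants with the offline cache enabled, which is beyond a short snippet. The sketch below covers only the part that can be checked in isolation: two subprocesses with different hash seeds and different compile orders should derive the same probe id for each (filepath, lineno) pair. The child script, stable_probe_id and probe_ids_in_subprocess are hypothetical names, and collision handling is deliberately left out.

```python
import json
import os
import subprocess
import sys

# Child script: derives probe ids for (filepath, lineno) pairs in the order given on argv.
CHILD = r"""
import hashlib, json, sys
MAX_PROBES = 100000
def stable_probe_id(filepath, lineno):
    digest = hashlib.blake2b(f"{filepath}:{lineno}".encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big") % MAX_PROBES
order = json.loads(sys.argv[1])
print(json.dumps({f"{f}:{l}": stable_probe_id(f, l) for f, l in order}))
"""


def probe_ids_in_subprocess(order, hash_seed):
    """Run the child in a fresh interpreter with its own PYTHONHASHSEED."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD, json.dumps(order)],
        env={**os.environ, "PYTHONHASHSEED": str(hash_seed)},
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)


k1 = [["file.py", 10], ["file.py", 11], ["file.py", 12]]
k2 = [["file.py", 20], ["file.py", 21], ["file.py", 22]]

ids_a = probe_ids_in_subprocess(k1 + k2, hash_seed=1)  # "Process A": K1 then K2
ids_b = probe_ids_in_subprocess(k2 + k1, hash_seed=2)  # "Process B": K2 then K1
print(ids_a == ids_b)
```

Swapping the content hash for the built-in hash() in the child would make the two mappings diverge between seeds, which is the failure mode such a test is meant to catch.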

@github-actions

Coverage Report (6081d31d0)

| Metric | Value |
|---|---|
| Diff coverage (changed lines only) | 90% |
| Overall project coverage | 72% |

Total: 451 lines, 47 missing, 90% covered

🔴 python/quadrants/lang/_fast_caching/src_hasher.py (0%)
🔴    2  import os
     53              "kcov" if os.environ.get("QD_KERNEL_COVERAGE") == "1" else "",
🟢 python/quadrants/lang/_func_base.py (82%)
🔴    7  import os
     25  
🔴   26  def _kernel_coverage_enabled() -> bool:
🟢   27      return os.environ.get("QD_KERNEL_COVERAGE") == "1"
     28  
     29  
🟢  255          _kcov = None
🟢  256          if _kernel_coverage_enabled() and autodiff_mode == _qd_core.AutodiffMode.NONE:
🟢  257              from . import (  # pylint: disable=import-outside-toplevel
    258                  _kernel_coverage as _kcov,
    259              )
    260  
🟢  261              tree = _kcov.rewrite_ast(tree, function_source_info.filepath, function_source_info.start_lineno)
    262  
🟢  266          if _kcov is not None:
🟢  267              cov_field = _kcov.get_field()
🟢  268              if cov_field is not None:
🟢  269                  global_vars[_kcov.FIELD_VAR_NAME] = cov_field
🔴 python/quadrants/lang/_kernel_coverage.py (79%)
      1  """Kernel code coverage via Python AST rewriting.
      2  
      3  When enabled (QD_KERNEL_COVERAGE=1), this module rewrites kernel and func ASTs to insert coverage probes — field
      4  stores that record which source lines actually execute on the GPU. At process exit, the collected data is written
      5  to a .coverage file compatible with coverage.py / pytest-cov / diff-cover.
      6  
      7  The probes are compiled as ordinary field stores by the existing pipeline, so no C++ changes are needed. When
      8  disabled, this module is never imported and has zero impact on the normal runtime path.
      9  """
     10  
🟢   11  import ast
🟢   12  import atexit
🟢   13  import logging
🟢   14  import os
🟢   15  import threading
🟢   16  import warnings
🟢   17  from typing import TYPE_CHECKING
     18  
🟢   19  from coverage import CoverageData  # type: ignore[import-not-found]
     20  
🟢   21  import quadrants as qd
🟢   22  from quadrants.lang import impl
     23  
     24  if TYPE_CHECKING:
     25      from quadrants.lang.field import ScalarField
     26  
🟢   27  FIELD_VAR_NAME = "_qd_cov"
🟢   28  _MAX_PROBES = int(os.environ.get("QD_COVERAGE_MAX_PROBES", "100000"))
     29  
🟢   30  _lock = threading.Lock()
🟢   31  _cov_field: "ScalarField | None" = None
🟢   32  _cov_field_prog: object | None = None  # tracks which Program instance owns _cov_field
🟢   33  _probe_counter: int = 0
     34  # {probe_id: (filepath, absolute_lineno)}
🟢   35  _probe_map: dict[int, tuple[str, int]] = {}
     36  # Accumulated coverage lines surviving across qd.init() resets
🟢   37  _accumulated_lines: dict[str, set[int]] = {}
🟢   38  _reset_hook_installed: bool = False
     39  # Directory for .coverage and _qd_kcov.* files, captured when coverage is first enabled
🟢   40  _coverage_dir: str | None = None
     41  
     42  
🟢   43  def _harvest_field() -> None:
     44      """Read probe data from the current field into _accumulated_lines.
     45  
     46      Must be called while the runtime is still alive (before clear()).
     47      """
     48      global _cov_field, _cov_field_prog
🟢   49      with _lock:
🟢   50          if _cov_field is None or not _probe_map:
🟢   51              return
🟢   52          field_ref = _cov_field
🟢   53          probe_snapshot = dict(_probe_map)
🟢   54          _cov_field = None
🟢   55          _cov_field_prog = None
🟢   56      try:
🟢   57          arr = field_ref.to_numpy()
🟢   58      except Exception:
🟢   59          logging.warning("Failed to read coverage field, coverage data for this session will be lost", exc_info=True)
🟢   60          return
🟢   61      with _lock:
🟢   62          for probe_id, (filepath, lineno) in probe_snapshot.items():
🟢   63              if probe_id < len(arr) and arr[probe_id] != 0:
🟢   64                  _accumulated_lines.setdefault(filepath, set()).add(lineno)
     65  
     66  
🟢   67  def _install_reset_hook() -> None:
     68      """Monkey-patch PyQuadrants.clear() to harvest probes before destruction."""
     69      global _reset_hook_installed
🟢   70      if _reset_hook_installed:
🟢   71          return
🟢   72      _original_clear = impl.PyQuadrants.clear
     73  
🟢   74      def _hooked_clear(self) -> None:
🟢   75          _harvest_field()
🟢   76          _original_clear(self)
     77  
🟢   78      impl.PyQuadrants.clear = _hooked_clear  # type: ignore[assignment]
🟢   79      _reset_hook_installed = True
     80  
     81  
🟢   82  def ensure_field_allocated() -> None:
     83      """Allocate (or re-allocate after qd.init()) the global coverage field."""
     84      global _cov_field, _cov_field_prog, _coverage_dir
🟢   85      _install_reset_hook()
🟢   86      if _coverage_dir is None:
🟢   87          _coverage_dir = os.getcwd()
🟢   88      current_prog = impl.get_runtime()._prog
🟢   89      if _cov_field is not None and _cov_field_prog is current_prog:
🟢   90          return
🟢   91      with _lock:
🟢   92          current_prog = impl.get_runtime()._prog
🟢   93          if _cov_field is not None and _cov_field_prog is current_prog:
🔴   94              return
🟢   95          _cov_field = qd.field(dtype=qd.i32, shape=(_MAX_PROBES,))  # type: ignore[assignment]
🟢   96          _cov_field_prog = current_prog
     97  
     98  
🟢   99  def get_field() -> "ScalarField | None":
🟢  100      with _lock:
🟢  101          if _cov_field_prog is not impl.get_runtime()._prog:
🔴  102              return None
🟢  103          return _cov_field
    104  
    105  
🟢  106  def rewrite_ast(tree: ast.Module, filepath: str, start_lineno: int) -> ast.Module:
    107      """Rewrite a kernel/func AST to insert coverage probes.
    108  
    109      Each executable statement at a new source line gets a probe: ``_qd_cov[<probe_id>] = 1``.
    110      Probes inside if/else bodies only fire when that branch is taken, giving true runtime branch coverage.
    111      """
    112      global _probe_counter
🟢  113      with _lock:
🟢  114          rewriter = _CoverageASTRewriter(
    115              field_name=FIELD_VAR_NAME,
    116              filepath=filepath,
    117              start_lineno=start_lineno,
    118              probe_id_start=_probe_counter,
    119          )
🟢  120          tree = rewriter.visit(tree)
🟢  121          ast.fix_missing_locations(tree)
🟢  122          _probe_counter = rewriter.next_probe_id
🟢  123          _probe_map.update(rewriter.probe_map)
🟢  124      return tree
    125  
    126  
🟢  127  def _detect_arc_mode() -> bool:
    128      """Detect whether pytest-cov is running in branch (arc) mode.
    129  
    130      Checks _QD_KCOV_ARC env var first (set by the pytest plugin), then falls back to reading .coverage.
    131      Defaults to False (line mode) when nothing is known, since ``pytest --cov`` without ``--cov-branch``
    132      is the more common invocation.
    133      """
🔴  134      arc_env = os.environ.get("_QD_KCOV_ARC")
🔴  135      if arc_env is not None:
🔴  136          return arc_env == "1"
🔴  137      try:
🔴  138          cov_path = os.path.join(_coverage_dir, ".coverage") if _coverage_dir else ".coverage"
🔴  139          cd = CoverageData(basename=cov_path)
🔴  140          cd.read()
🔴  141          if not cd.measured_files():
🔴  142              return False
🔴  143          return cd.has_arcs()
🔴  144      except Exception:
🔴  145          logging.debug("Failed to detect arc mode from .coverage file, defaulting to line mode", exc_info=True)
🔴  146          return False
    147  
    148  
🟢  149  def flush() -> None:
    150      """Harvest any remaining field data and write all results to a .coverage file.
    151  
    152      If .coverage.kernel already exists (e.g. from a prior test phase), the new data is merged into it so nothing
    153      is lost across multiple invocations.
    154      """
🔴  155      _harvest_field()
    156  
🔴  157      with _lock:
🔴  158          if not _accumulated_lines:
🔴  159              return
🔴  160          snapshot = {f: set(lines) for f, lines in _accumulated_lines.items()}
    161  
🔴  162      base_dir = _coverage_dir or os.getcwd()
🔴  163      kernel_path = os.path.join(base_dir, f"_qd_kcov.{os.getpid()}")
🔴  164      use_arcs = _detect_arc_mode()
    165  
🔴  166      cov = CoverageData(basename=kernel_path)
🔴  167      if use_arcs:
🔴  168          arcs_by_file: dict[str, list[tuple[int, int]]] = {}
🔴  169          for filepath, lines in snapshot.items():
    170              # Emit only entry/exit arcs per line — we know which lines ran but not the actual transitions
    171              # between them, so we avoid fabricating inter-line arcs that would misrepresent branch coverage.
🔴  172              arcs = []
🔴  173              for line in sorted(lines):
🔴  174                  arcs.append((-1, line))
🔴  175                  arcs.append((line, -1))
🔴  176              arcs_by_file[filepath] = arcs
🔴  177          cov.add_arcs(arcs_by_file)
    178      else:
🔴  179          cov.add_lines({f: sorted(lines) for f, lines in snapshot.items()})
🔴  180      cov.write()
    181  
    182  
🟢  183  _capacity_warning_emitted = False
    184  
    185  
🟢  186  class _CoverageASTRewriter(ast.NodeTransformer):
    187      """Insert coverage probes before each statement at a new source line."""
    188  
🟢  189      def __init__(self, field_name: str, filepath: str, start_lineno: int, probe_id_start: int) -> None:
🟢  190          self._field_name = field_name
🟢  191          self._filepath = filepath
🟢  192          self._start_lineno = start_lineno
🟢  193          self.next_probe_id = probe_id_start
🟢  194          self._seen_lines: set[int] = set()
🟢  195          self.probe_map: dict[int, tuple[str, int]] = {}
    196  
🟢  197      def _make_probe(self, abs_lineno: int, rel_lineno: int, col_offset: int) -> ast.Assign | None:
    198          global _capacity_warning_emitted
🟢  199          probe_id = self.next_probe_id
🟢  200          if probe_id >= _MAX_PROBES:
🟢  201              if not _capacity_warning_emitted:
🟢  202                  warnings.warn(
    203                      f"Kernel coverage probe capacity ({_MAX_PROBES}) exceeded. "
    204                      f"Additional kernel lines will not be tracked. "
    205                      f"Set QD_COVERAGE_MAX_PROBES to a higher value.",
    206                      stacklevel=2,
    207                  )
🟢  208                  _capacity_warning_emitted = True
🟢  209              return None
🟢  210          self.probe_map[probe_id] = (self._filepath, abs_lineno)
🟢  211          self.next_probe_id += 1
🟢  212          node = ast.Assign(
    213              targets=[
    214                  ast.Subscript(
    215                      value=ast.Name(id=self._field_name, ctx=ast.Load()),
    216                      slice=ast.Constant(value=probe_id),
    217                      ctx=ast.Store(),
    218                  )
    219              ],
    220              value=ast.Constant(value=1),
    221              lineno=rel_lineno,
    222              col_offset=col_offset,
    223              end_lineno=rel_lineno,
    224              end_col_offset=col_offset,
    225          )
🟢  226          return node
    227  
🟢  228      def _instrument_body(self, stmts: list[ast.stmt]) -> list[ast.stmt]:
🟢  229          result: list[ast.stmt] = []
🟢  230          for stmt in stmts:
🟢  231              rel_lineno = getattr(stmt, "lineno", None)
🟢  232              if rel_lineno is not None:
🟢  233                  abs_lineno = rel_lineno + self._start_lineno - 1
🟢  234                  if abs_lineno not in self._seen_lines:
🟢  235                      self._seen_lines.add(abs_lineno)
🟢  236                      col = getattr(stmt, "col_offset", 0)
🟢  237                      probe = self._make_probe(abs_lineno, rel_lineno, col)
🟢  238                      if probe is not None:
🟢  239                          result.append(probe)
🟢  240              result.append(self.visit(stmt))
🟢  241          return result
    242  
🟢  243      def visit_FunctionDef(self, node: ast.FunctionDef) -> ast.FunctionDef:
🟢  244          node.body = self._instrument_body(node.body)
🟢  245          return node
    246  
🟢  247      def visit_AsyncFunctionDef(self, node: ast.AsyncFunctionDef) -> ast.AsyncFunctionDef:
🔴  248          node.body = self._instrument_body(node.body)
🔴  249          return node
    250  
🟢  251      def visit_If(self, node: ast.If) -> ast.If:
🟢  252          node.body = self._instrument_body(node.body)
🟢  253          if node.orelse:
🟢  254              node.orelse = self._instrument_body(node.orelse)
🟢  255          return node
    256  
🟢  257      def visit_For(self, node: ast.For) -> ast.For:
🟢  258          node.body = self._instrument_body(node.body)
🟢  259          if node.orelse:
🟢  260              node.orelse = self._instrument_body(node.orelse)
🟢  261          return node
    262  
🟢  263      def visit_While(self, node: ast.While) -> ast.While:
🟢  264          node.body = self._instrument_body(node.body)
🟢  265          if node.orelse:
🟢  266              node.orelse = self._instrument_body(node.orelse)
🟢  267          return node
    268  
🟢  269      def visit_With(self, node: ast.With) -> ast.With:
🟢  270          node.body = self._instrument_body(node.body)
🟢  271          return node
    272  
🟢  273      def visit_Try(self, node: ast.Try) -> ast.Try:
🟢  274          node.body = self._instrument_body(node.body)
🟢  275          for handler in node.handlers:
🟢  276              handler.body = self._instrument_body(handler.body)
🟢  277          if node.orelse:
🟢  278              node.orelse = self._instrument_body(node.orelse)
🟢  279          if node.finalbody:
🟢  280              node.finalbody = self._instrument_body(node.finalbody)
🟢  281          return node
    282  
    283  
🟢  284  atexit.register(flush)
🟢 python/quadrants/lang/ast/ast_transformer_utils.py (100%)
🟢  335              if not name.startswith("_qd_"):
🟢  336                  reason = f"{name} is in global vars, therefore violates pure"
🟢  337                  violates_pure = True
🟢 python/quadrants/lang/kernel.py (80%)
     19  
🔴   20  def _kernel_coverage_enabled() -> bool:
🟢   21      return os.environ.get("QD_KERNEL_COVERAGE") == "1"
     22  
     23  
🟢  382          if _kernel_coverage_enabled():
🟢  383              from . import _kernel_coverage  # pylint: disable=import-outside-toplevel
    384  
🟢  385              _kernel_coverage.ensure_field_allocated()
    386  
🟢 python/quadrants/lang/misc.py (100%)
    496  
🟢  497      if os.environ.get("QD_KERNEL_COVERAGE") == "1":
🟢  498          from . import _kernel_coverage  # pylint: disable=import-outside-toplevel
    499  
🟢  500          _kernel_coverage.ensure_field_allocated()
    501  
🔴 python/quadrants/pytest_plugin.py (67%)
      1  """Pytest plugin that auto-enables kernel coverage when pytest-cov is active.
      2  
      3  Registered via the ``pytest11`` entry point so it loads automatically when quadrants is installed.
      4  Opt out by setting ``QD_KERNEL_COVERAGE=0`` explicitly.
      5  """
      6  
🔴    7  import os
      8  
      9  
🔴   10  def pytest_configure(config):
🟢   11      if not config.pluginmanager.hasplugin("_cov"):
🔴   12          return
🟢   13      os.environ.setdefault("QD_KERNEL_COVERAGE", "1")
🟢   14      if os.environ.get("QD_KERNEL_COVERAGE") != "1":
🟢   15          return
     16      # Tell the kernel coverage module whether pytest-cov is running in branch (arc) mode,
     17      # so it writes the matching format and avoids "Can not mix line and arc data" at combine time.
     18      # We read config.option.cov_branch which pytest-cov has already populated by this point.
🟢   19      cov_branch = getattr(config.option, "cov_branch", False) or False
🟢   20      os.environ["_QD_KCOV_ARC"] = "1" if cov_branch else "0"
🟢 tests/python/quadrants/lang/fast_caching/test_src_ll_cache.py (100%)
     11  
🟢   12  _KERNEL_COVERAGE = os.environ.get("QD_KERNEL_COVERAGE") == "1"
🟢   67  @pytest.mark.skipif(
     68      _KERNEL_COVERAGE,
     69      reason="Coverage probes change LLVM IR addresses after reinit, breaking recompile comparison",
     70  )
🟢 tests/python/quadrants/lang/test_kernel_impl.py (100%)
🟢    1  import os
🟢   11  _KERNEL_COVERAGE = os.environ.get("QD_KERNEL_COVERAGE") == "1"
     13  
🟢   14  @pytest.mark.skipif(
     15      _KERNEL_COVERAGE,
     16      reason="Coverage probes change the kernel AST, preventing FE-LL cache hits after reinit",
     17  )
🟢 tests/python/test_api.py (100%)
🟢  438      actual = sorted([s for s in dir(src) if not s.startswith(("_", "@")) and s != "pytest_plugin"])
🟢 tests/python/test_intrinsics.py (100%)
     55      (thread i does (i+1)*200000). Asserts strict monotonicity across threads and that
🟢   91          assert a[i - 1] < a[i] < a[i + 1]
🟢 tests/python/test_kernel_coverage.py (99%)
      1  """Tests for kernel code coverage instrumentation.
      2  
      3  These tests verify that the AST rewriter correctly inserts coverage probes and that the probes fire when kernel
      4  code executes on the device.
      5  """
      6  
🟢    7  import ast
🟢    8  import os
🟢    9  import textwrap
     10  
🟢   11  import pytest
     12  
🟢   13  import quadrants as qd
     14  
🟢   15  from tests import test_utils
     16  
     17  # These tests only run when QD_KERNEL_COVERAGE=1
🟢   18  pytestmark = pytest.mark.skipif(
     19      os.environ.get("QD_KERNEL_COVERAGE", "") != "1",
     20      reason="QD_KERNEL_COVERAGE=1 not set",
     21  )
     22  
     23  
     24  # ---------------------------------------------------------------------------
     25  # AST rewriter unit tests
     26  # ---------------------------------------------------------------------------
     27  
🟢   28  _AST_REWRITER_CASES = [
     29      pytest.param(
     30          """\
     31          def f():
     32              x = 1
     33              y = 2
     34              return x + y
     35          """,
     36          {11, 12, 13},
     37          10,
     38          id="straight_line",
     39      ),
     40      pytest.param(
     41          """\
     42          def f():
     43              if x > 0:
     44                  a = 1
     45              else:
     46                  b = 2
     47          """,
     48          {2, 3, 5},
     49          1,
     50          id="if_else",
     51      ),
     52      pytest.param(
     53          """\
     54          def f():
     55              for i in range(10):
     56                  x = i
     57          """,
     58          {2, 3},
     59          1,
     60          id="for_loop",
     61      ),
     62      pytest.param(
     63          """\
     64          def f():
     65              while x > 0:
     66                  x = x - 1
     67              else:
     68                  y = 0
     69          """,
     70          {2, 3, 5},
     71          1,
     72          id="while_loop_else",
     73      ),
     74      pytest.param(
     75          """\
     76          def f():
     77              with ctx:
     78                  a = 1
     79                  b = 2
     80          """,
     81          {2, 3, 4},
     82          1,
     83          id="with_statement",
     84      ),
     85      pytest.param(
     86          """\
     87          def f():
     88              try:
     89                  a = 1
     90              except:
     91                  b = 2
     92              else:
     93                  c = 3
     94              finally:
     95                  d = 4
     96          """,
     97          {3, 5, 7, 9},
     98          1,
     99          id="try_except_finally",
    100      ),
    101  ]
    102  
    103  
🟢  104  @pytest.mark.parametrize("src,expected_lines,start_lineno", _AST_REWRITER_CASES)
🟢  105  def test_ast_rewriter(src, expected_lines, start_lineno):
    106      """Verify the AST rewriter inserts probes at the expected source lines."""
🟢  107      from quadrants.lang._kernel_coverage import _CoverageASTRewriter
    108  
🟢  109      tree = ast.parse(textwrap.dedent(src))
🟢  110      rewriter = _CoverageASTRewriter(
    111          field_name="_qd_cov", filepath="test.py", start_lineno=start_lineno, probe_id_start=0
    112      )
🟢  113      rewriter.visit(tree)
    114  
🟢  115      covered_lines = {lineno for _, (_, lineno) in rewriter.probe_map.items()}
🟢  116      assert expected_lines.issubset(covered_lines), f"Expected lines {expected_lines} to be probed, got {covered_lines}"
    117  
    118  
🟢  119  def test_ast_rewriter_capacity_limit():
    120      """Verify that probes stop being inserted when the capacity limit is hit."""
🟢  121      import warnings
    122  
🟢  123      import quadrants.lang._kernel_coverage as kcov
🟢  124      from quadrants.lang._kernel_coverage import _CoverageASTRewriter
    125  
🟢  126      src = textwrap.dedent(
    127          """\
    128          def f():
    129              a = 1
    130              b = 2
    131              c = 3
    132      """
    133      )
🟢  134      tree = ast.parse(src)
🟢  135      old_warning_state = kcov._capacity_warning_emitted
🟢  136      kcov._capacity_warning_emitted = False
🟢  137      try:
🟢  138          with warnings.catch_warnings(record=True) as w:
🟢  139              warnings.simplefilter("always")
🟢  140              rewriter = _CoverageASTRewriter(
    141                  field_name="_qd_cov", filepath="test.py", start_lineno=1, probe_id_start=kcov._MAX_PROBES - 1
    142              )
🟢  143              rewriter.visit(tree)
    144  
🟢  145          assert rewriter.next_probe_id == kcov._MAX_PROBES
🟢  146          assert len(rewriter.probe_map) == 1, f"Only 1 probe should fit, got {len(rewriter.probe_map)}"
🟢  147          assert len(w) == 1
🟢  148          assert "exceeded" in str(w[0].message).lower()
    149      finally:
🟢  150          kcov._capacity_warning_emitted = old_warning_state
    151  
    152  
🟢  153  def test_ast_rewriter_deduplicates_same_line():
    154      """Verify that two statements on the same source line get only one probe."""
🟢  155      from quadrants.lang._kernel_coverage import _CoverageASTRewriter
    156  
🟢  157      src = "def f():\n    a = 1; b = 2\n"
🟢  158      tree = ast.parse(src)
🟢  159      rewriter = _CoverageASTRewriter(field_name="_qd_cov", filepath="test.py", start_lineno=1, probe_id_start=0)
🟢  160      rewriter.visit(tree)
    161  
🟢  162      abs_lines = [lineno for _, (_, lineno) in rewriter.probe_map.items()]
🟢  163      assert abs_lines.count(2) == 1, f"Line 2 should have exactly one probe, got {abs_lines.count(2)}"
    164  
    165  
🟢  166  def test_env_var_max_probes():
    167      """Verify that QD_COVERAGE_MAX_PROBES env var is read at import time."""
🟢  168      import quadrants.lang._kernel_coverage as kcov
    169  
🟢  170      assert kcov._MAX_PROBES == int(os.environ.get("QD_COVERAGE_MAX_PROBES", "100000"))
    171  
    172  
🟢  173  def test_harvest_field_exception_path():
    174      """Verify that _harvest_field handles to_numpy() failure gracefully."""
🟢  175      from unittest.mock import MagicMock
    176  
🟢  177      import quadrants.lang._kernel_coverage as kcov
    178  
🟢  179      old_field = kcov._cov_field
🟢  180      old_prog = kcov._cov_field_prog
🟢  181      old_map = kcov._probe_map.copy()
🟢  182      try:
🟢  183          mock_field = MagicMock()
🟢  184          mock_field.to_numpy.side_effect = RuntimeError("runtime destroyed")
🟢  185          kcov._cov_field = mock_field
🟢  186          kcov._cov_field_prog = object()
🟢  187          kcov._probe_map[999999] = ("fake.py", 1)
    188  
    189          # Should not raise — the exception is caught and logged
🟢  190          kcov._harvest_field()
    191  
🟢  192          assert kcov._cov_field is None, "Field should be cleared after failure"
🟢  193          assert kcov._cov_field_prog is None, "Field prog should be cleared after failure"
    194      finally:
🟢  195          kcov._cov_field = old_field
🟢  196          kcov._cov_field_prog = old_prog
🟢  197          kcov._probe_map = old_map
    198  
    199  
    200  # ---------------------------------------------------------------------------
    201  # End-to-end tests
    202  # ---------------------------------------------------------------------------
    203  
    204  
🟢  205  @test_utils.test(arch=[qd.cpu, qd.cuda])
🟢  206  def test_kernel_coverage_branches_e2e():
    207      """Verify that only the taken branch has its probe fired."""
🟢  208      from quadrants.lang import _kernel_coverage
    209  
🟢  210      _kernel_coverage.ensure_field_allocated()
    211  
🟢  212      probe_count_before = _kernel_coverage._probe_counter
🟢  213      out = qd.field(dtype=qd.i32, shape=(1,))
    214  
🟢  215      @qd.kernel
🟢  216      def branching_kernel():
🟢  217          x = 10
🟢  218          if x > 5:
🟢  219              out[0] = 1
    220          else:
🔴  221              out[0] = 2
    222  
🟢  223      branching_kernel()
    224  
🟢  225      assert out[0] == 1
    226  
🟢  227      cov_field = _kernel_coverage.get_field()
🟢  228      arr = cov_field.to_numpy()
    229  
🟢  230      probes_for_kernel = {pid: loc for pid, loc in _kernel_coverage._probe_map.items() if pid >= probe_count_before}
    231  
🟢  232      taken_probes = {pid for pid, loc in probes_for_kernel.items() if arr[pid] != 0}
🟢  233      not_taken_probes = {pid for pid, loc in probes_for_kernel.items() if arr[pid] == 0}
    234  
🟢  235      assert len(taken_probes) > 0, "At least some probes should have fired"
🟢  236      assert len(not_taken_probes) > 0, "The else branch should not have been reached"
    237  
    238  
🟢  239  @test_utils.test(arch=qd.gpu)
🟢  240  def test_kernel_coverage_simt_e2e():
    241      """Verify coverage probes track branches with block.sync() and subgroup shuffle.
    242  
    243      The if/else is based on a runtime value read from a field, so the compiler cannot constant-fold it away.
    244      Only the taken branch's shuffle probe should fire.
    245      """
🟢  246      from quadrants.lang import _kernel_coverage
🟢  247      from quadrants.lang.simt import subgroup
    248  
🟢  249      _kernel_coverage.ensure_field_allocated()
    250  
🟢  251      N = 64
🟢  252      probe_count_before = _kernel_coverage._probe_counter
🟢  253      flag = qd.field(dtype=qd.i32, shape=(1,))
🟢  254      a = qd.field(dtype=qd.i32, shape=(N,))
🟢  255      out = qd.field(dtype=qd.i32, shape=(N,))
    256  
🟢  257      flag[0] = 1  # runtime value: take the if-branch
    258  
🟢  259      @qd.kernel
🟢  260      def simt_kernel():
🟢  261          qd.loop_config(block_dim=N)
🟢  262          for i in range(N):
🟢  263              a[i] = i + 1
🟢  264              qd.simt.block.sync()
🟢  265              if flag[0] > 0:
🟢  266                  val = subgroup.shuffle(a[i], qd.u32(0))
🟢  267                  out[i] = val
    268              else:
🔴  269                  val = subgroup.shuffle(a[i], qd.u32(1))
🔴  270                  out[i] = val + 100
    271  
🟢  272      simt_kernel()
    273  
🟢  274      for i in range(4):
🟢  275          assert out[i] == 1, f"Expected 1 at index {i}, got {out[i]}"
    276  
🟢  277      cov_field = _kernel_coverage.get_field()
🟢  278      arr = cov_field.to_numpy()
    279  
🟢  280      probes_for_kernel = {pid: loc for pid, loc in _kernel_coverage._probe_map.items() if pid >= probe_count_before}
    281  
🟢  282      fired = {pid for pid in probes_for_kernel if arr[pid] != 0}
🟢  283      not_fired = {pid for pid in probes_for_kernel if arr[pid] == 0}
🟢  284      assert len(fired) >= 4, f"Expected at least 4 probes to fire, got {len(fired)}"
🟢  285      assert len(not_fired) >= 2, "The else branch should not have been reached"
    286  
    287  
🟢  288  @test_utils.test(arch=[qd.cpu, qd.cuda])
🟢  289  def test_kernel_coverage_survives_reinit():
    290      """Verify that coverage data accumulated before qd.init() reset is preserved.
    291  
    292      Runs a kernel, then resets via qd.reset()/qd.init() (which triggers the _hooked_clear harvest), runs another
    293      kernel, harvests again, and checks that _accumulated_lines contains data from both sessions.
    294      """
🟢  295      from quadrants.lang import _kernel_coverage, impl
    296  
🟢  297      current_arch = impl.get_runtime()._arch
🟢  298      _kernel_coverage.ensure_field_allocated()
    299  
🟢  300      probe_count_before = _kernel_coverage._probe_counter
🟢  301      out1 = qd.field(dtype=qd.i32, shape=(1,))
    302  
🟢  303      @qd.kernel
🟢  304      def kernel_before_reset():
🟢  305          out1[0] = 1
    306  
🟢  307      kernel_before_reset()
    308  
🟢  309      cov_field = _kernel_coverage.get_field()
🟢  310      assert cov_field is not None
🟢  311      arr = cov_field.to_numpy()
🟢  312      probes_first = {pid: loc for pid, loc in _kernel_coverage._probe_map.items() if pid >= probe_count_before}
🟢  313      fired_first = {pid for pid in probes_first if arr[pid] != 0}
🟢  314      assert len(fired_first) > 0, "Probes from first kernel should have fired"
    315  
    316      # Don't call _harvest_field() manually — let qd.reset() trigger it via the _hooked_clear hook
🟢  317      qd.reset()
    318  
    319      # Verify the hook harvested data from the first session
🟢  320      files_before = set(_kernel_coverage._accumulated_lines.keys())
🟢  321      assert len(files_before) > 0, "Hook should have harvested data during reset"
🟢  322      lines_before = {}
🟢  323      for f, lines in _kernel_coverage._accumulated_lines.items():
🟢  324          lines_before[f] = set(lines)
    325  
🟢  326      qd.init(arch=current_arch)
    327  
🟢  328      _kernel_coverage.ensure_field_allocated()
    329  
🟢  330      probe_count_mid = _kernel_coverage._probe_counter
🟢  331      out2 = qd.field(dtype=qd.i32, shape=(1,))
    332  
🟢  333      @qd.kernel
🟢  334      def kernel_after_reset():
🟢  335          out2[0] = 2
    336  
🟢  337      kernel_after_reset()
    338  
🟢  339      _kernel_coverage._harvest_field()
    340  
🟢  341      for f in files_before:
🟢  342          assert (
    343              f in _kernel_coverage._accumulated_lines
    344          ), f"File {f} from before reset should still be in _accumulated_lines"
🟢  345          assert lines_before[f].issubset(
    346              _kernel_coverage._accumulated_lines[f]
    347          ), "Lines from before reset should be preserved"
    348  
🟢  349      probes_second = {pid: loc for pid, loc in _kernel_coverage._probe_map.items() if pid >= probe_count_mid}
🟢  350      second_files = {loc[0] for loc in probes_second.values()}
🟢  351      for f in second_files:
🟢  352          assert f in _kernel_coverage._accumulated_lines, f"File {f} from second kernel should be in _accumulated_lines"
    353  
    354  
🟢  355  @test_utils.test(arch=[qd.cpu, qd.cuda])
🟢  356  def test_kernel_coverage_autodiff():
    357      """Verify that autodiff forward pass produces probes but backward does not.
    358  
    359      The forward compilation (AutodiffMode.NONE) should insert probes that fire. The backward compilation
    360      (AutodiffMode.REVERSE) should not add any probes.
    361      """
🟢  362      from quadrants.lang import _kernel_coverage
    363  
🟢  364      _kernel_coverage.ensure_field_allocated()
    365  
🟢  366      x = qd.field(dtype=qd.f32, shape=(), needs_grad=True)
🟢  367      loss = qd.field(dtype=qd.f32, shape=(), needs_grad=True)
    368  
🟢  369      @qd.kernel
🟢  370      def compute():
🟢  371          loss[None] = x[None] * x[None]
    372  
🟢  373      x[None] = 5.0
    374  
🟢  375      probe_count_before = _kernel_coverage._probe_counter
    376  
🟢  377      with qd.ad.Tape(loss):
🟢  378          compute()
    379  
🟢  380      probe_count_after_tape = _kernel_coverage._probe_counter
🟢  381      forward_probes = probe_count_after_tape - probe_count_before
🟢  382      assert forward_probes > 0, "Forward compilation should have inserted probes"
    383  
    384      # Verify forward probes actually fired
🟢  385      cov_field = _kernel_coverage.get_field()
🟢  386      assert cov_field is not None
🟢  387      arr = cov_field.to_numpy()
🟢  388      probes = {pid: loc for pid, loc in _kernel_coverage._probe_map.items() if pid >= probe_count_before}
🟢  389      fired = {pid for pid in probes if arr[pid] != 0}
🟢  390      assert len(fired) > 0, "Forward pass inside Tape should produce fired coverage probes"
    391  
    392      # Verify backward pass computes correct gradients
🟢  393      assert loss[None] == pytest.approx(25.0)
🟢  394      assert x.grad[None] == pytest.approx(10.0)
    395  
    396  
🟢  397  @test_utils.test(arch=[qd.cpu, qd.cuda])
🟢  398  def test_kernel_coverage_qd_func():
    399      """Verify that probes fire inside a @qd.func called from a kernel."""
🟢  400      from quadrants.lang import _kernel_coverage
    401  
🟢  402      _kernel_coverage.ensure_field_allocated()
    403  
🟢  404      probe_count_before = _kernel_coverage._probe_counter
🟢  405      out = qd.field(dtype=qd.i32, shape=(1,))
    406  
🟢  407      @qd.func
🟢  408      def helper():
🟢  409          out[0] = 99
    410  
🟢  411      @qd.kernel
🟢  412      def caller():
🟢  413          helper()
    414  
🟢  415      caller()
    416  
🟢  417      assert out[0] == 99
    418  
🟢  419      cov_field = _kernel_coverage.get_field()
🟢  420      assert cov_field is not None
🟢  421      arr = cov_field.to_numpy()
    422  
🟢  423      probes = {pid: loc for pid, loc in _kernel_coverage._probe_map.items() if pid >= probe_count_before}
🟢  424      fired = {pid for pid in probes if arr[pid] != 0}
    425      # The kernel body has one statement (helper()), and the func body has one (out[0] = 99).
    426      # Both should produce probes that fire.
🟢  427      assert (
    428          len(fired) >= 2
    429      ), f"Expected probes from both kernel and func to fire, got {len(fired)} fired out of {len(probes)}"
    430  
    431  
🟢  432  @test_utils.test(arch=[qd.cpu, qd.cuda])
🟢  433  def test_kernel_coverage_multiple_kernels_same_session():
    434      """Verify that probes from two different kernels both fire in the same session."""
🟢  435      from quadrants.lang import _kernel_coverage
    436  
🟢  437      _kernel_coverage.ensure_field_allocated()
    438  
🟢  439      probe_count_before = _kernel_coverage._probe_counter
🟢  440      a = qd.field(dtype=qd.i32, shape=(1,))
🟢  441      b = qd.field(dtype=qd.i32, shape=(1,))
    442  
🟢  443      @qd.kernel
🟢  444      def kernel_a():
🟢  445          a[0] = 10
    446  
🟢  447      @qd.kernel
🟢  448      def kernel_b():
🟢  449          b[0] = 20
    450  
🟢  451      kernel_a()
🟢  452      probe_count_after_a = _kernel_coverage._probe_counter
🟢  453      kernel_b()
    454  
🟢  455      assert a[0] == 10
🟢  456      assert b[0] == 20
    457  
🟢  458      cov_field = _kernel_coverage.get_field()
🟢  459      arr = cov_field.to_numpy()
    460  
🟢  461      probes_a = {
    462          pid: loc for pid, loc in _kernel_coverage._probe_map.items() if probe_count_before <= pid < probe_count_after_a
    463      }
🟢  464      probes_b = {pid: loc for pid, loc in _kernel_coverage._probe_map.items() if pid >= probe_count_after_a}
    465  
🟢  466      fired_a = {pid for pid in probes_a if arr[pid] != 0}
🟢  467      fired_b = {pid for pid in probes_b if arr[pid] != 0}
    468  
🟢  469      assert len(fired_a) > 0, "Probes from kernel_a should have fired"
🟢  470      assert len(fired_b) > 0, "Probes from kernel_b should have fired"
    471  
    472  
🟢  473  @test_utils.test(arch=[qd.cpu, qd.cuda])
🟢  474  def test_qd_prefix_exemption_pure_kernel():
    475      """Verify that _qd_-prefixed globals don't violate pure kernel checks.
    476  
    477      With kernel coverage enabled, _qd_cov is injected as a global. This test verifies that a pure (fastcache)
    478      kernel still compiles without error. The kernel uses ndarray arguments (not global fields) because pure
    479      kernels prohibit non-_qd_ globals.
    480      """
🟢  481      a = qd.ndarray(qd.i32, (1,))
    482  
🟢  483      @qd.kernel(fastcache=True)
🟢  484      def pure_kernel(arr: qd.types.NDArray) -> None:
🟢  485          arr[0] = 42
    486  
🟢  487      pure_kernel(a)
🟢  488      assert a[0] == 42
🟢 tests/python/test_offline_cache.py (100%)
     16  # Coverage field allocation creates internal fill kernels that change cache file counts.
     17  # CI runs these tests in a separate phase without QD_KERNEL_COVERAGE (see 4_test.sh).
🟢   18  pytestmark = pytest.mark.skipif(
     19      os.environ.get("QD_KERNEL_COVERAGE") == "1",
     20      reason="Kernel coverage adds internal kernels that invalidate cache file count assertions",
     21  )
     22  
🟢 tests/python/test_snode_layout_inspection.py (100%)
🟢    1  import os
      2  
🟢    3  import pytest
      4  
🟢   10  @pytest.mark.skipif(
     11      os.environ.get("QD_KERNEL_COVERAGE") == "1",
     12      reason="Kernel coverage field on root shifts offset assertions",
     13  )

@hughperkins
Collaborator Author

Update from Opus:

One new comment (59 → 60). It's a repeat of earlier comment #38: three e2e tests call to_numpy() on the result of get_field() without first checking for None, while the other
three tests in the same file do have the guard.

This is a minor code quality issue (yellow severity). The tests would give an opaque AttributeError instead of a clear AssertionError if get_field() returned None. Want me to
add the guards?

so basically, no need to do anything, and good to merge now?

Yes, that's my read. The remaining comment is a minor style consistency issue (adding assert cov_field is not None to 3 tests that already work fine). Not a bug, not a
correctness issue -- just a slightly better error message in a failure mode that doesn't happen in practice.

Good to merge.
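
For reference, the difference between the two failure modes can be sketched as follows. `get_field_stub` is a hypothetical stand-in for `_kernel_coverage.get_field()` that always returns None; it is not part of quadrants, and the assertion message is only an example.

```python
def get_field_stub():
    """Hypothetical stand-in for get_field() when no field exists for the current runtime."""
    return None


def read_without_guard():
    cov_field = get_field_stub()
    return cov_field.to_numpy()  # None has no to_numpy(): AttributeError


def read_with_guard():
    cov_field = get_field_stub()
    assert cov_field is not None, "coverage field is not allocated for the current runtime"
    return cov_field.to_numpy()


def failure_type(fn):
    """Return the name of the exception type raised by fn, or None if it succeeds."""
    try:
        fn()
    except Exception as exc:  # broad on purpose: only the type name is of interest
        return type(exc).__name__
    return None


unguarded_error = failure_type(read_without_guard)
guarded_error = failure_type(read_with_guard)
```

Both variants fail the test; the guard only changes which exception, and therefore which message, shows up in the pytest output.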

@github-actions

Coverage Report (50e1f8ad6)

| Metric | Value |
| --- | --- |
| Diff coverage (changed lines only) | 90% |
| Overall project coverage | 72% |

Total: 451 lines, 47 missing, 90% covered
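
The headline figure follows directly from the totals. A small worked check, assuming the report rounds to the nearest whole percent and using the 80% gate on changed lines mentioned in the PR description (`diff_coverage_percent` is an illustrative helper, not part of `coverage_report.py`):

```python
def diff_coverage_percent(total_lines: int, missing_lines: int) -> float:
    """Percentage of changed lines that were executed."""
    if total_lines == 0:
        return 100.0  # assumption: an empty diff counts as fully covered
    return 100.0 * (total_lines - missing_lines) / total_lines


percent = diff_coverage_percent(451, 47)  # 404 of 451 changed lines covered
rounded = round(percent)
passes_gate = percent >= 80.0
```

So 404 covered lines out of 451 is just under 89.6%, which is reported as 90% and clears the gate.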

🔴 python/quadrants/lang/_fast_caching/src_hasher.py (0%)
🔴    2  import os
     53              "kcov" if os.environ.get("QD_KERNEL_COVERAGE") == "1" else "",
🟢 python/quadrants/lang/_func_base.py (82%)
🔴    7  import os
     25  
🔴   26  def _kernel_coverage_enabled() -> bool:
🟢   27      return os.environ.get("QD_KERNEL_COVERAGE") == "1"
     28  
     29  
🟢  255          _kcov = None
🟢  256          if _kernel_coverage_enabled() and autodiff_mode == _qd_core.AutodiffMode.NONE:
🟢  257              from . import (  # pylint: disable=import-outside-toplevel
    258                  _kernel_coverage as _kcov,
    259              )
    260  
🟢  261              tree = _kcov.rewrite_ast(tree, function_source_info.filepath, function_source_info.start_lineno)
    262  
🟢  266          if _kcov is not None:
🟢  267              cov_field = _kcov.get_field()
🟢  268              if cov_field is not None:
🟢  269                  global_vars[_kcov.FIELD_VAR_NAME] = cov_field
🔴 python/quadrants/lang/_kernel_coverage.py (79%)
      1  """Kernel code coverage via Python AST rewriting.
      2  
      3  When enabled (QD_KERNEL_COVERAGE=1), this module rewrites kernel and func ASTs to insert coverage probes — field
      4  stores that record which source lines actually execute on the GPU. At process exit, the collected data is written
      5  to a .coverage file compatible with coverage.py / pytest-cov / diff-cover.
      6  
      7  The probes are compiled as ordinary field stores by the existing pipeline, so no C++ changes are needed. When
      8  disabled, this module is never imported and has zero impact on the normal runtime path.
      9  """
     10  
🟢   11  import ast
🟢   12  import atexit
🟢   13  import logging
🟢   14  import os
🟢   15  import threading
🟢   16  import warnings
🟢   17  from typing import TYPE_CHECKING
     18  
🟢   19  from coverage import CoverageData  # type: ignore[import-not-found]
     20  
🟢   21  import quadrants as qd
🟢   22  from quadrants.lang import impl
     23  
     24  if TYPE_CHECKING:
     25      from quadrants.lang.field import ScalarField
     26  
🟢   27  FIELD_VAR_NAME = "_qd_cov"
🟢   28  _MAX_PROBES = int(os.environ.get("QD_COVERAGE_MAX_PROBES", "100000"))
     29  
🟢   30  _lock = threading.Lock()
🟢   31  _cov_field: "ScalarField | None" = None
🟢   32  _cov_field_prog: object | None = None  # tracks which Program instance owns _cov_field
🟢   33  _probe_counter: int = 0
     34  # {probe_id: (filepath, absolute_lineno)}
🟢   35  _probe_map: dict[int, tuple[str, int]] = {}
     36  # Accumulated coverage lines surviving across qd.init() resets
🟢   37  _accumulated_lines: dict[str, set[int]] = {}
🟢   38  _reset_hook_installed: bool = False
     39  # Directory for .coverage and _qd_kcov.* files, captured when coverage is first enabled
🟢   40  _coverage_dir: str | None = None
     41  
     42  
🟢   43  def _harvest_field() -> None:
     44      """Read probe data from the current field into _accumulated_lines.
     45  
     46      Must be called while the runtime is still alive (before clear()).
     47      """
     48      global _cov_field, _cov_field_prog
🟢   49      with _lock:
🟢   50          if _cov_field is None or not _probe_map:
🟢   51              return
🟢   52          field_ref = _cov_field
🟢   53          probe_snapshot = dict(_probe_map)
🟢   54          _cov_field = None
🟢   55          _cov_field_prog = None
🟢   56      try:
🟢   57          arr = field_ref.to_numpy()
🟢   58      except Exception:
🟢   59          logging.warning("Failed to read coverage field, coverage data for this session will be lost", exc_info=True)
🟢   60          return
🟢   61      with _lock:
🟢   62          for probe_id, (filepath, lineno) in probe_snapshot.items():
🟢   63              if probe_id < len(arr) and arr[probe_id] != 0:
🟢   64                  _accumulated_lines.setdefault(filepath, set()).add(lineno)
     65  
     66  
🟢   67  def _install_reset_hook() -> None:
     68      """Monkey-patch PyQuadrants.clear() to harvest probes before destruction."""
     69      global _reset_hook_installed
🟢   70      if _reset_hook_installed:
🟢   71          return
🟢   72      _original_clear = impl.PyQuadrants.clear
     73  
🟢   74      def _hooked_clear(self) -> None:
🟢   75          _harvest_field()
🟢   76          _original_clear(self)
     77  
🟢   78      impl.PyQuadrants.clear = _hooked_clear  # type: ignore[assignment]
🟢   79      _reset_hook_installed = True
     80  
     81  
🟢   82  def ensure_field_allocated() -> None:
     83      """Allocate (or re-allocate after qd.init()) the global coverage field."""
     84      global _cov_field, _cov_field_prog, _coverage_dir
🟢   85      _install_reset_hook()
🟢   86      if _coverage_dir is None:
🟢   87          _coverage_dir = os.getcwd()
🟢   88      current_prog = impl.get_runtime()._prog
🟢   89      if _cov_field is not None and _cov_field_prog is current_prog:
🟢   90          return
🟢   91      with _lock:
🟢   92          current_prog = impl.get_runtime()._prog
🟢   93          if _cov_field is not None and _cov_field_prog is current_prog:
🔴   94              return
🟢   95          _cov_field = qd.field(dtype=qd.i32, shape=(_MAX_PROBES,))  # type: ignore[assignment]
🟢   96          _cov_field_prog = current_prog
     97  
     98  
🟢   99  def get_field() -> "ScalarField | None":
🟢  100      with _lock:
🟢  101          if _cov_field_prog is not impl.get_runtime()._prog:
🔴  102              return None
🟢  103          return _cov_field
    104  
    105  
🟢  106  def rewrite_ast(tree: ast.Module, filepath: str, start_lineno: int) -> ast.Module:
    107      """Rewrite a kernel/func AST to insert coverage probes.
    108  
    109      Each executable statement at a new source line gets a probe: ``_qd_cov[<probe_id>] = 1``.
    110      Probes inside if/else bodies only fire when that branch is taken, giving true runtime branch coverage.
    111      """
    112      global _probe_counter
🟢  113      with _lock:
🟢  114          rewriter = _CoverageASTRewriter(
    115              field_name=FIELD_VAR_NAME,
    116              filepath=filepath,
    117              start_lineno=start_lineno,
    118              probe_id_start=_probe_counter,
    119          )
🟢  120          tree = rewriter.visit(tree)
🟢  121          ast.fix_missing_locations(tree)
🟢  122          _probe_counter = rewriter.next_probe_id
🟢  123          _probe_map.update(rewriter.probe_map)
🟢  124      return tree
    125  
    126  
🟢  127  def _detect_arc_mode() -> bool:
    128      """Detect whether pytest-cov is running in branch (arc) mode.
    129  
    130      Checks _QD_KCOV_ARC env var first (set by the pytest plugin), then falls back to reading .coverage.
    131      Defaults to False (line mode) when nothing is known, since ``pytest --cov`` without ``--cov-branch``
    132      is the more common invocation.
    133      """
🔴  134      arc_env = os.environ.get("_QD_KCOV_ARC")
🔴  135      if arc_env is not None:
🔴  136          return arc_env == "1"
🔴  137      try:
🔴  138          cov_path = os.path.join(_coverage_dir, ".coverage") if _coverage_dir else ".coverage"
🔴  139          cd = CoverageData(basename=cov_path)
🔴  140          cd.read()
🔴  141          if not cd.measured_files():
🔴  142              return False
🔴  143          return cd.has_arcs()
🔴  144      except Exception:
🔴  145          logging.debug("Failed to detect arc mode from .coverage file, defaulting to line mode", exc_info=True)
🔴  146          return False
    147  
    148  
🟢  149  def flush() -> None:
    150      """Harvest any remaining field data and write all results to a .coverage file.
    151  
    152      If .coverage.kernel already exists (e.g. from a prior test phase), the new data is merged into it so nothing
    153      is lost across multiple invocations.
    154      """
🔴  155      _harvest_field()
    156  
🔴  157      with _lock:
🔴  158          if not _accumulated_lines:
🔴  159              return
🔴  160          snapshot = {f: set(lines) for f, lines in _accumulated_lines.items()}
    161  
🔴  162      base_dir = _coverage_dir or os.getcwd()
🔴  163      kernel_path = os.path.join(base_dir, f"_qd_kcov.{os.getpid()}")
🔴  164      use_arcs = _detect_arc_mode()
    165  
🔴  166      cov = CoverageData(basename=kernel_path)
🔴  167      if use_arcs:
🔴  168          arcs_by_file: dict[str, list[tuple[int, int]]] = {}
🔴  169          for filepath, lines in snapshot.items():
    170              # Emit only entry/exit arcs per line — we know which lines ran but not the actual transitions
    171              # between them, so we avoid fabricating inter-line arcs that would misrepresent branch coverage.
🔴  172              arcs = []
🔴  173              for line in sorted(lines):
🔴  174                  arcs.append((-1, line))
🔴  175                  arcs.append((line, -1))
🔴  176              arcs_by_file[filepath] = arcs
🔴  177          cov.add_arcs(arcs_by_file)
    178      else:
🔴  179          cov.add_lines({f: sorted(lines) for f, lines in snapshot.items()})
🔴  180      cov.write()
    181  
    182  
🟢  183  _capacity_warning_emitted = False
    184  
    185  
🟢  186  class _CoverageASTRewriter(ast.NodeTransformer):
    187      """Insert coverage probes before each statement at a new source line."""
    188  
🟢  189      def __init__(self, field_name: str, filepath: str, start_lineno: int, probe_id_start: int) -> None:
🟢  190          self._field_name = field_name
🟢  191          self._filepath = filepath
🟢  192          self._start_lineno = start_lineno
🟢  193          self.next_probe_id = probe_id_start
🟢  194          self._seen_lines: set[int] = set()
🟢  195          self.probe_map: dict[int, tuple[str, int]] = {}
    196  
🟢  197      def _make_probe(self, abs_lineno: int, rel_lineno: int, col_offset: int) -> ast.Assign | None:
    198          global _capacity_warning_emitted
🟢  199          probe_id = self.next_probe_id
🟢  200          if probe_id >= _MAX_PROBES:
🟢  201              if not _capacity_warning_emitted:
🟢  202                  warnings.warn(
    203                      f"Kernel coverage probe capacity ({_MAX_PROBES}) exceeded. "
    204                      f"Additional kernel lines will not be tracked. "
    205                      f"Set QD_COVERAGE_MAX_PROBES to a higher value.",
    206                      stacklevel=2,
    207                  )
🟢  208                  _capacity_warning_emitted = True
🟢  209              return None
🟢  210          self.probe_map[probe_id] = (self._filepath, abs_lineno)
🟢  211          self.next_probe_id += 1
🟢  212          node = ast.Assign(
    213              targets=[
    214                  ast.Subscript(
    215                      value=ast.Name(id=self._field_name, ctx=ast.Load()),
    216                      slice=ast.Constant(value=probe_id),
    217                      ctx=ast.Store(),
    218                  )
    219              ],
    220              value=ast.Constant(value=1),
    221              lineno=rel_lineno,
    222              col_offset=col_offset,
    223              end_lineno=rel_lineno,
    224              end_col_offset=col_offset,
    225          )
🟢  226          return node
    227  
🟢  228      def _instrument_body(self, stmts: list[ast.stmt]) -> list[ast.stmt]:
🟢  229          result: list[ast.stmt] = []
🟢  230          for stmt in stmts:
🟢  231              rel_lineno = getattr(stmt, "lineno", None)
🟢  232              if rel_lineno is not None:
🟢  233                  abs_lineno = rel_lineno + self._start_lineno - 1
🟢  234                  if abs_lineno not in self._seen_lines:
🟢  235                      self._seen_lines.add(abs_lineno)
🟢  236                      col = getattr(stmt, "col_offset", 0)
🟢  237                      probe = self._make_probe(abs_lineno, rel_lineno, col)
🟢  238                      if probe is not None:
🟢  239                          result.append(probe)
🟢  240              result.append(self.visit(stmt))
🟢  241          return result
    242  
🟢  243      def visit_FunctionDef(self, node: ast.FunctionDef) -> ast.FunctionDef:
🟢  244          node.body = self._instrument_body(node.body)
🟢  245          return node
    246  
🟢  247      def visit_AsyncFunctionDef(self, node: ast.AsyncFunctionDef) -> ast.AsyncFunctionDef:
🔴  248          node.body = self._instrument_body(node.body)
🔴  249          return node
    250  
🟢  251      def visit_If(self, node: ast.If) -> ast.If:
🟢  252          node.body = self._instrument_body(node.body)
🟢  253          if node.orelse:
🟢  254              node.orelse = self._instrument_body(node.orelse)
🟢  255          return node
    256  
🟢  257      def visit_For(self, node: ast.For) -> ast.For:
🟢  258          node.body = self._instrument_body(node.body)
🟢  259          if node.orelse:
🟢  260              node.orelse = self._instrument_body(node.orelse)
🟢  261          return node
    262  
🟢  263      def visit_While(self, node: ast.While) -> ast.While:
🟢  264          node.body = self._instrument_body(node.body)
🟢  265          if node.orelse:
🟢  266              node.orelse = self._instrument_body(node.orelse)
🟢  267          return node
    268  
🟢  269      def visit_With(self, node: ast.With) -> ast.With:
🟢  270          node.body = self._instrument_body(node.body)
🟢  271          return node
    272  
🟢  273      def visit_Try(self, node: ast.Try) -> ast.Try:
🟢  274          node.body = self._instrument_body(node.body)
🟢  275          for handler in node.handlers:
🟢  276              handler.body = self._instrument_body(handler.body)
🟢  277          if node.orelse:
🟢  278              node.orelse = self._instrument_body(node.orelse)
🟢  279          if node.finalbody:
🟢  280              node.finalbody = self._instrument_body(node.finalbody)
🟢  281          return node
    282  
    283  
🟢  284  atexit.register(flush)
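
The probe-insertion idea in the listing above can be tried without quadrants at all. The following is a simplified sketch, not the module's implementation: it uses a plain Python list named `_hits` in place of the coverage field, runs the rewritten function on the host with `exec`, and leaves out line deduplication, absolute line offsets, and the probe capacity limit. `ProbeInserter` and `classify` are names made up for the example.

```python
import ast
import textwrap


class ProbeInserter(ast.NodeTransformer):
    """Toy rewriter: put `_hits[<id>] = 1` before each statement in function and if/else bodies."""

    def __init__(self):
        self.probe_lines = {}  # probe_id -> source line number

    def _probe(self, lineno):
        probe_id = len(self.probe_lines)
        self.probe_lines[probe_id] = lineno
        # Parse the probe from source text instead of building the nodes by hand.
        return ast.parse(f"_hits[{probe_id}] = 1").body[0]

    def _instrument(self, stmts):
        out = []
        for stmt in stmts:
            out.append(self._probe(stmt.lineno))
            out.append(self.visit(stmt))
        return out

    def visit_FunctionDef(self, node):
        node.body = self._instrument(node.body)
        return node

    def visit_If(self, node):
        node.body = self._instrument(node.body)
        node.orelse = self._instrument(node.orelse)
        return node


SOURCE = textwrap.dedent(
    """\
    def classify(x):
        if x > 0:
            label = "positive"
        else:
            label = "non-positive"
        return label
    """
)

inserter = ProbeInserter()
tree = inserter.visit(ast.parse(SOURCE))
ast.fix_missing_locations(tree)

_hits = [0] * len(inserter.probe_lines)
namespace = {"_hits": _hits}
exec(compile(tree, "<sketch>", "exec"), namespace)
result = namespace["classify"](5)

executed_lines = sorted(inserter.probe_lines[i] for i, hit in enumerate(_hits) if hit)
missed_lines = sorted(inserter.probe_lines[i] for i, hit in enumerate(_hits) if not hit)
```

Calling `classify(5)` takes the `if` branch, so the probe in front of the `else` body never fires and its line ends up in `missed_lines`. That is the same mechanism that lets probes inside if/else bodies report which branch actually ran.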
🟢 python/quadrants/lang/ast/ast_transformer_utils.py (100%)
🟢  335              if not name.startswith("_qd_"):
🟢  336                  reason = f"{name} is in global vars, therefore violates pure"
🟢  337                  violates_pure = True
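
Only three lines of this check are visible in the report, so the following is a rough sketch of the prefix exemption in isolation, assuming the check is applied to each global name a kernel references (`pure_violations` is a hypothetical helper, not the function in `ast_transformer_utils.py`):

```python
def pure_violations(global_names):
    """Return the names that would violate purity when `_qd_`-prefixed globals are exempt."""
    return [name for name in global_names if not name.startswith("_qd_")]


only_coverage_field = pure_violations(["_qd_cov"])
with_user_global = pure_violations(["_qd_cov", "my_field"])
```

The injected `_qd_cov` field passes, while an ordinary user global such as `my_field` is still reported.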
🟢 python/quadrants/lang/kernel.py (80%)
     19  
🔴   20  def _kernel_coverage_enabled() -> bool:
🟢   21      return os.environ.get("QD_KERNEL_COVERAGE") == "1"
     22  
     23  
🟢  382          if _kernel_coverage_enabled():
🟢  383              from . import _kernel_coverage  # pylint: disable=import-outside-toplevel
    384  
🟢  385              _kernel_coverage.ensure_field_allocated()
    386  
🟢 python/quadrants/lang/misc.py (100%)
    496  
🟢  497      if os.environ.get("QD_KERNEL_COVERAGE") == "1":
🟢  498          from . import _kernel_coverage  # pylint: disable=import-outside-toplevel
    499  
🟢  500          _kernel_coverage.ensure_field_allocated()
    501  
🔴 python/quadrants/pytest_plugin.py (67%)
      1  """Pytest plugin that auto-enables kernel coverage when pytest-cov is active.
      2  
      3  Registered via the ``pytest11`` entry point so it loads automatically when quadrants is installed.
      4  Opt out by setting ``QD_KERNEL_COVERAGE=0`` explicitly.
      5  """
      6  
🔴    7  import os
      8  
      9  
🔴   10  def pytest_configure(config):
🟢   11      if not config.pluginmanager.hasplugin("_cov"):
🔴   12          return
🟢   13      os.environ.setdefault("QD_KERNEL_COVERAGE", "1")
🟢   14      if os.environ.get("QD_KERNEL_COVERAGE") != "1":
🟢   15          return
     16      # Tell the kernel coverage module whether pytest-cov is running in branch (arc) mode,
     17      # so it writes the matching format and avoids "Can not mix line and arc data" at combine time.
     18      # We read config.option.cov_branch which pytest-cov has already populated by this point.
🟢   19      cov_branch = getattr(config.option, "cov_branch", False) or False
🟢   20      os.environ["_QD_KCOV_ARC"] = "1" if cov_branch else "0"
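
The opt-out behaviour comes from `setdefault`, which only writes the variable when the user has not set it. A sketch of the same decision logic applied to a plain dict rather than `os.environ` (`resolve_kernel_coverage` and its parameters are illustrative, not part of the plugin):

```python
def resolve_kernel_coverage(env, cov_plugin_active, cov_branch):
    """Mirror the plugin's pytest_configure decisions on a dict standing in for os.environ."""
    if not cov_plugin_active:
        return env  # pytest-cov not loaded: leave everything untouched
    env.setdefault("QD_KERNEL_COVERAGE", "1")  # keeps an explicit user value such as "0"
    if env["QD_KERNEL_COVERAGE"] != "1":
        return env  # user opted out
    env["_QD_KCOV_ARC"] = "1" if cov_branch else "0"
    return env


default_on = resolve_kernel_coverage({}, cov_plugin_active=True, cov_branch=True)
opted_out = resolve_kernel_coverage({"QD_KERNEL_COVERAGE": "0"}, cov_plugin_active=True, cov_branch=True)
no_cov = resolve_kernel_coverage({}, cov_plugin_active=False, cov_branch=False)
```

With pytest-cov active and nothing set, coverage is switched on and the arc flag follows the branch setting; with `QD_KERNEL_COVERAGE=0` already present, the dict is returned unchanged.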
🟢 tests/python/quadrants/lang/fast_caching/test_src_ll_cache.py (100%)
     11  
🟢   12  _KERNEL_COVERAGE = os.environ.get("QD_KERNEL_COVERAGE") == "1"
🟢   67  @pytest.mark.skipif(
     68      _KERNEL_COVERAGE,
     69      reason="Coverage probes change LLVM IR addresses after reinit, breaking recompile comparison",
     70  )
🟢 tests/python/quadrants/lang/test_kernel_impl.py (100%)
🟢    1  import os
🟢   11  _KERNEL_COVERAGE = os.environ.get("QD_KERNEL_COVERAGE") == "1"
     13  
🟢   14  @pytest.mark.skipif(
     15      _KERNEL_COVERAGE,
     16      reason="Coverage probes change the kernel AST, preventing FE-LL cache hits after reinit",
     17  )
🟢 tests/python/test_api.py (100%)
🟢  438      actual = sorted([s for s in dir(src) if not s.startswith(("_", "@")) and s != "pytest_plugin"])
🟢 tests/python/test_intrinsics.py (100%)
     55      (thread i does (i+1)*200000). Asserts strict monotonicity across threads and that
🟢   91          assert a[i - 1] < a[i] < a[i + 1]
🟢 tests/python/test_kernel_coverage.py (99%)
      1  """Tests for kernel code coverage instrumentation.
      2  
      3  These tests verify that the AST rewriter correctly inserts coverage probes and that the probes fire when kernel
      4  code executes on the device.
      5  """
      6  
🟢    7  import ast
🟢    8  import os
🟢    9  import textwrap
     10  
🟢   11  import pytest
     12  
🟢   13  import quadrants as qd
     14  
🟢   15  from tests import test_utils
     16  
     17  # These tests only run when QD_KERNEL_COVERAGE=1
🟢   18  pytestmark = pytest.mark.skipif(
     19      os.environ.get("QD_KERNEL_COVERAGE", "") != "1",
     20      reason="QD_KERNEL_COVERAGE=1 not set",
     21  )
     22  
     23  
     24  # ---------------------------------------------------------------------------
     25  # AST rewriter unit tests
     26  # ---------------------------------------------------------------------------
     27  
🟢   28  _AST_REWRITER_CASES = [
     29      pytest.param(
     30          """\
     31          def f():
     32              x = 1
     33              y = 2
     34              return x + y
     35          """,
     36          {11, 12, 13},
     37          10,
     38          id="straight_line",
     39      ),
     40      pytest.param(
     41          """\
     42          def f():
     43              if x > 0:
     44                  a = 1
     45              else:
     46                  b = 2
     47          """,
     48          {2, 3, 5},
     49          1,
     50          id="if_else",
     51      ),
     52      pytest.param(
     53          """\
     54          def f():
     55              for i in range(10):
     56                  x = i
     57          """,
     58          {2, 3},
     59          1,
     60          id="for_loop",
     61      ),
     62      pytest.param(
     63          """\
     64          def f():
     65              while x > 0:
     66                  x = x - 1
     67              else:
     68                  y = 0
     69          """,
     70          {2, 3, 5},
     71          1,
     72          id="while_loop_else",
     73      ),
     74      pytest.param(
     75          """\
     76          def f():
     77              with ctx:
     78                  a = 1
     79                  b = 2
     80          """,
     81          {2, 3, 4},
     82          1,
     83          id="with_statement",
     84      ),
     85      pytest.param(
     86          """\
     87          def f():
     88              try:
     89                  a = 1
     90              except:
     91                  b = 2
     92              else:
     93                  c = 3
     94              finally:
     95                  d = 4
     96          """,
     97          {3, 5, 7, 9},
     98          1,
     99          id="try_except_finally",
    100      ),
    101  ]
    102  
    103  
🟢  104  @pytest.mark.parametrize("src,expected_lines,start_lineno", _AST_REWRITER_CASES)
🟢  105  def test_ast_rewriter(src, expected_lines, start_lineno):
    106      """Verify the AST rewriter inserts probes at the expected source lines."""
🟢  107      from quadrants.lang._kernel_coverage import _CoverageASTRewriter
    108  
🟢  109      tree = ast.parse(textwrap.dedent(src))
🟢  110      rewriter = _CoverageASTRewriter(
    111          field_name="_qd_cov", filepath="test.py", start_lineno=start_lineno, probe_id_start=0
    112      )
🟢  113      rewriter.visit(tree)
    114  
🟢  115      covered_lines = {lineno for _, (_, lineno) in rewriter.probe_map.items()}
🟢  116      assert expected_lines.issubset(covered_lines), f"Expected lines {expected_lines} to be probed, got {covered_lines}"
    117  
    118  
🟢  119  def test_ast_rewriter_capacity_limit():
    120      """Verify that probes stop being inserted when the capacity limit is hit."""
🟢  121      import warnings
    122  
🟢  123      import quadrants.lang._kernel_coverage as kcov
🟢  124      from quadrants.lang._kernel_coverage import _CoverageASTRewriter
    125  
🟢  126      src = textwrap.dedent(
    127          """\
    128          def f():
    129              a = 1
    130              b = 2
    131              c = 3
    132      """
    133      )
🟢  134      tree = ast.parse(src)
🟢  135      old_warning_state = kcov._capacity_warning_emitted
🟢  136      kcov._capacity_warning_emitted = False
🟢  137      try:
🟢  138          with warnings.catch_warnings(record=True) as w:
🟢  139              warnings.simplefilter("always")
🟢  140              rewriter = _CoverageASTRewriter(
    141                  field_name="_qd_cov", filepath="test.py", start_lineno=1, probe_id_start=kcov._MAX_PROBES - 1
    142              )
🟢  143              rewriter.visit(tree)
    144  
🟢  145          assert rewriter.next_probe_id == kcov._MAX_PROBES
🟢  146          assert len(rewriter.probe_map) == 1, f"Only 1 probe should fit, got {len(rewriter.probe_map)}"
🟢  147          assert len(w) == 1
🟢  148          assert "exceeded" in str(w[0].message).lower()
    149      finally:
🟢  150          kcov._capacity_warning_emitted = old_warning_state
    151  
    152  
🟢  153  def test_ast_rewriter_deduplicates_same_line():
    154      """Verify that two statements on the same source line get only one probe."""
🟢  155      from quadrants.lang._kernel_coverage import _CoverageASTRewriter
    156  
🟢  157      src = "def f():\n    a = 1; b = 2\n"
🟢  158      tree = ast.parse(src)
🟢  159      rewriter = _CoverageASTRewriter(field_name="_qd_cov", filepath="test.py", start_lineno=1, probe_id_start=0)
🟢  160      rewriter.visit(tree)
    161  
🟢  162      abs_lines = [lineno for _, (_, lineno) in rewriter.probe_map.items()]
🟢  163      assert abs_lines.count(2) == 1, f"Line 2 should have exactly one probe, got {abs_lines.count(2)}"
    164  
    165  
🟢  166  def test_env_var_max_probes():
    167      """Verify that QD_COVERAGE_MAX_PROBES env var is read at import time."""
🟢  168      import quadrants.lang._kernel_coverage as kcov
    169  
🟢  170      assert kcov._MAX_PROBES == int(os.environ.get("QD_COVERAGE_MAX_PROBES", "100000"))
    171  
    172  
🟢  173  def test_harvest_field_exception_path():
    174      """Verify that _harvest_field handles to_numpy() failure gracefully."""
🟢  175      from unittest.mock import MagicMock
    176  
🟢  177      import quadrants.lang._kernel_coverage as kcov
    178  
🟢  179      old_field = kcov._cov_field
🟢  180      old_prog = kcov._cov_field_prog
🟢  181      old_map = kcov._probe_map.copy()
🟢  182      try:
🟢  183          mock_field = MagicMock()
🟢  184          mock_field.to_numpy.side_effect = RuntimeError("runtime destroyed")
🟢  185          kcov._cov_field = mock_field
🟢  186          kcov._cov_field_prog = object()
🟢  187          kcov._probe_map[999999] = ("fake.py", 1)
    188  
    189          # Should not raise — the exception is caught and logged
🟢  190          kcov._harvest_field()
    191  
🟢  192          assert kcov._cov_field is None, "Field should be cleared after failure"
🟢  193          assert kcov._cov_field_prog is None, "Field prog should be cleared after failure"
    194      finally:
🟢  195          kcov._cov_field = old_field
🟢  196          kcov._cov_field_prog = old_prog
🟢  197          kcov._probe_map = old_map
    198  
    199  
    200  # ---------------------------------------------------------------------------
    201  # End-to-end tests
    202  # ---------------------------------------------------------------------------
    203  
    204  
🟢  205  @test_utils.test(arch=[qd.cpu, qd.cuda])
🟢  206  def test_kernel_coverage_branches_e2e():
    207      """Verify that only the taken branch has its probe fired."""
🟢  208      from quadrants.lang import _kernel_coverage
    209  
🟢  210      _kernel_coverage.ensure_field_allocated()
    211  
🟢  212      probe_count_before = _kernel_coverage._probe_counter
🟢  213      out = qd.field(dtype=qd.i32, shape=(1,))
    214  
🟢  215      @qd.kernel
🟢  216      def branching_kernel():
🟢  217          x = 10
🟢  218          if x > 5:
🟢  219              out[0] = 1
    220          else:
🔴  221              out[0] = 2
    222  
🟢  223      branching_kernel()
    224  
🟢  225      assert out[0] == 1
    226  
🟢  227      cov_field = _kernel_coverage.get_field()
🟢  228      arr = cov_field.to_numpy()
    229  
🟢  230      probes_for_kernel = {pid: loc for pid, loc in _kernel_coverage._probe_map.items() if pid >= probe_count_before}
    231  
🟢  232      taken_probes = {pid for pid, loc in probes_for_kernel.items() if arr[pid] != 0}
🟢  233      not_taken_probes = {pid for pid, loc in probes_for_kernel.items() if arr[pid] == 0}
    234  
🟢  235      assert len(taken_probes) > 0, "At least some probes should have fired"
🟢  236      assert len(not_taken_probes) > 0, "The else branch should not have been reached"
    237  
    238  
🟢  239  @test_utils.test(arch=qd.gpu)
🟢  240  def test_kernel_coverage_simt_e2e():
    241      """Verify coverage probes track branches with block.sync() and subgroup shuffle.
    242  
    243      The if/else is based on a runtime value read from a field, so the compiler cannot constant-fold it away.
    244      Only the taken branch's shuffle probe should fire.
    245      """
🟢  246      from quadrants.lang import _kernel_coverage
🟢  247      from quadrants.lang.simt import subgroup
    248  
🟢  249      _kernel_coverage.ensure_field_allocated()
    250  
🟢  251      N = 64
🟢  252      probe_count_before = _kernel_coverage._probe_counter
🟢  253      flag = qd.field(dtype=qd.i32, shape=(1,))
🟢  254      a = qd.field(dtype=qd.i32, shape=(N,))
🟢  255      out = qd.field(dtype=qd.i32, shape=(N,))
    256  
🟢  257      flag[0] = 1  # runtime value: take the if-branch
    258  
🟢  259      @qd.kernel
🟢  260      def simt_kernel():
🟢  261          qd.loop_config(block_dim=N)
🟢  262          for i in range(N):
🟢  263              a[i] = i + 1
🟢  264              qd.simt.block.sync()
🟢  265              if flag[0] > 0:
🟢  266                  val = subgroup.shuffle(a[i], qd.u32(0))
🟢  267                  out[i] = val
    268              else:
🔴  269                  val = subgroup.shuffle(a[i], qd.u32(1))
🔴  270                  out[i] = val + 100
    271  
🟢  272      simt_kernel()
    273  
🟢  274      for i in range(4):
🟢  275          assert out[i] == 1, f"Expected 1 at index {i}, got {out[i]}"
    276  
🟢  277      cov_field = _kernel_coverage.get_field()
🟢  278      arr = cov_field.to_numpy()
    279  
🟢  280      probes_for_kernel = {pid: loc for pid, loc in _kernel_coverage._probe_map.items() if pid >= probe_count_before}
    281  
🟢  282      fired = {pid for pid in probes_for_kernel if arr[pid] != 0}
🟢  283      not_fired = {pid for pid in probes_for_kernel if arr[pid] == 0}
🟢  284      assert len(fired) >= 4, f"Expected at least 4 probes to fire, got {len(fired)}"
🟢  285      assert len(not_fired) >= 2, "The else branch should not have been reached"
    286  
    287  
🟢  288  @test_utils.test(arch=[qd.cpu, qd.cuda])
🟢  289  def test_kernel_coverage_survives_reinit():
    290      """Verify that coverage data accumulated before qd.init() reset is preserved.
    291  
    292      Runs a kernel, then resets via qd.reset()/qd.init() (which triggers the _hooked_clear harvest), runs another
    293      kernel, harvests again, and checks that _accumulated_lines contains data from both sessions.
    294      """
🟢  295      from quadrants.lang import _kernel_coverage, impl
    296  
🟢  297      current_arch = impl.get_runtime()._arch
🟢  298      _kernel_coverage.ensure_field_allocated()
    299  
🟢  300      probe_count_before = _kernel_coverage._probe_counter
🟢  301      out1 = qd.field(dtype=qd.i32, shape=(1,))
    302  
🟢  303      @qd.kernel
🟢  304      def kernel_before_reset():
🟢  305          out1[0] = 1
    306  
🟢  307      kernel_before_reset()
    308  
🟢  309      cov_field = _kernel_coverage.get_field()
🟢  310      assert cov_field is not None
🟢  311      arr = cov_field.to_numpy()
🟢  312      probes_first = {pid: loc for pid, loc in _kernel_coverage._probe_map.items() if pid >= probe_count_before}
🟢  313      fired_first = {pid for pid in probes_first if arr[pid] != 0}
🟢  314      assert len(fired_first) > 0, "Probes from first kernel should have fired"
    315  
    316      # Don't call _harvest_field() manually — let qd.reset() trigger it via the _hooked_clear hook
🟢  317      qd.reset()
    318  
    319      # Verify the hook harvested data from the first session
🟢  320      files_before = set(_kernel_coverage._accumulated_lines.keys())
🟢  321      assert len(files_before) > 0, "Hook should have harvested data during reset"
🟢  322      lines_before = {}
🟢  323      for f, lines in _kernel_coverage._accumulated_lines.items():
🟢  324          lines_before[f] = set(lines)
    325  
🟢  326      qd.init(arch=current_arch)
    327  
🟢  328      _kernel_coverage.ensure_field_allocated()
    329  
🟢  330      probe_count_mid = _kernel_coverage._probe_counter
🟢  331      out2 = qd.field(dtype=qd.i32, shape=(1,))
    332  
🟢  333      @qd.kernel
🟢  334      def kernel_after_reset():
🟢  335          out2[0] = 2
    336  
🟢  337      kernel_after_reset()
    338  
🟢  339      _kernel_coverage._harvest_field()
    340  
🟢  341      for f in files_before:
🟢  342          assert (
    343              f in _kernel_coverage._accumulated_lines
    344          ), f"File {f} from before reset should still be in _accumulated_lines"
🟢  345          assert lines_before[f].issubset(
    346              _kernel_coverage._accumulated_lines[f]
    347          ), "Lines from before reset should be preserved"
    348  
🟢  349      probes_second = {pid: loc for pid, loc in _kernel_coverage._probe_map.items() if pid >= probe_count_mid}
🟢  350      second_files = {loc[0] for loc in probes_second.values()}
🟢  351      for f in second_files:
🟢  352          assert f in _kernel_coverage._accumulated_lines, f"File {f} from second kernel should be in _accumulated_lines"
    353  
    354  
🟢  355  @test_utils.test(arch=[qd.cpu, qd.cuda])
🟢  356  def test_kernel_coverage_autodiff():
    357      """Verify that autodiff forward pass produces probes but backward does not.
    358  
    359      The forward compilation (AutodiffMode.NONE) should insert probes that fire. The backward compilation
    360      (AutodiffMode.REVERSE) should not add any probes.
    361      """
🟢  362      from quadrants.lang import _kernel_coverage
    363  
🟢  364      _kernel_coverage.ensure_field_allocated()
    365  
🟢  366      x = qd.field(dtype=qd.f32, shape=(), needs_grad=True)
🟢  367      loss = qd.field(dtype=qd.f32, shape=(), needs_grad=True)
    368  
🟢  369      @qd.kernel
🟢  370      def compute():
🟢  371          loss[None] = x[None] * x[None]
    372  
🟢  373      x[None] = 5.0
    374  
🟢  375      probe_count_before = _kernel_coverage._probe_counter
    376  
🟢  377      with qd.ad.Tape(loss):
🟢  378          compute()
    379  
🟢  380      probe_count_after_tape = _kernel_coverage._probe_counter
🟢  381      forward_probes = probe_count_after_tape - probe_count_before
🟢  382      assert forward_probes > 0, "Forward compilation should have inserted probes"
    383  
    384      # Verify forward probes actually fired
🟢  385      cov_field = _kernel_coverage.get_field()
🟢  386      assert cov_field is not None
🟢  387      arr = cov_field.to_numpy()
🟢  388      probes = {pid: loc for pid, loc in _kernel_coverage._probe_map.items() if pid >= probe_count_before}
🟢  389      fired = {pid for pid in probes if arr[pid] != 0}
🟢  390      assert len(fired) > 0, "Forward pass inside Tape should produce fired coverage probes"
    391  
    392      # Verify backward pass computes correct gradients
🟢  393      assert loss[None] == pytest.approx(25.0)
🟢  394      assert x.grad[None] == pytest.approx(10.0)
    395  
    396  
🟢  397  @test_utils.test(arch=[qd.cpu, qd.cuda])
🟢  398  def test_kernel_coverage_qd_func():
    399      """Verify that probes fire inside a @qd.func called from a kernel."""
🟢  400      from quadrants.lang import _kernel_coverage
    401  
🟢  402      _kernel_coverage.ensure_field_allocated()
    403  
🟢  404      probe_count_before = _kernel_coverage._probe_counter
🟢  405      out = qd.field(dtype=qd.i32, shape=(1,))
    406  
🟢  407      @qd.func
🟢  408      def helper():
🟢  409          out[0] = 99
    410  
🟢  411      @qd.kernel
🟢  412      def caller():
🟢  413          helper()
    414  
🟢  415      caller()
    416  
🟢  417      assert out[0] == 99
    418  
🟢  419      cov_field = _kernel_coverage.get_field()
🟢  420      assert cov_field is not None
🟢  421      arr = cov_field.to_numpy()
    422  
🟢  423      probes = {pid: loc for pid, loc in _kernel_coverage._probe_map.items() if pid >= probe_count_before}
🟢  424      fired = {pid for pid in probes if arr[pid] != 0}
    425      # The kernel body has one statement (helper()), and the func body has one (out[0] = 99).
    426      # Both should produce probes that fire.
🟢  427      assert (
    428          len(fired) >= 2
    429      ), f"Expected probes from both kernel and func to fire, got {len(fired)} fired out of {len(probes)}"
    430  
    431  
🟢  432  @test_utils.test(arch=[qd.cpu, qd.cuda])
🟢  433  def test_kernel_coverage_multiple_kernels_same_session():
    434      """Verify that probes from two different kernels both fire in the same session."""
🟢  435      from quadrants.lang import _kernel_coverage
    436  
🟢  437      _kernel_coverage.ensure_field_allocated()
    438  
🟢  439      probe_count_before = _kernel_coverage._probe_counter
🟢  440      a = qd.field(dtype=qd.i32, shape=(1,))
🟢  441      b = qd.field(dtype=qd.i32, shape=(1,))
    442  
🟢  443      @qd.kernel
🟢  444      def kernel_a():
🟢  445          a[0] = 10
    446  
🟢  447      @qd.kernel
🟢  448      def kernel_b():
🟢  449          b[0] = 20
    450  
🟢  451      kernel_a()
🟢  452      probe_count_after_a = _kernel_coverage._probe_counter
🟢  453      kernel_b()
    454  
🟢  455      assert a[0] == 10
🟢  456      assert b[0] == 20
    457  
🟢  458      cov_field = _kernel_coverage.get_field()
🟢  459      arr = cov_field.to_numpy()
    460  
🟢  461      probes_a = {
    462          pid: loc for pid, loc in _kernel_coverage._probe_map.items() if probe_count_before <= pid < probe_count_after_a
    463      }
🟢  464      probes_b = {pid: loc for pid, loc in _kernel_coverage._probe_map.items() if pid >= probe_count_after_a}
    465  
🟢  466      fired_a = {pid for pid in probes_a if arr[pid] != 0}
🟢  467      fired_b = {pid for pid in probes_b if arr[pid] != 0}
    468  
🟢  469      assert len(fired_a) > 0, "Probes from kernel_a should have fired"
🟢  470      assert len(fired_b) > 0, "Probes from kernel_b should have fired"
    471  
    472  
🟢  473  @test_utils.test(arch=[qd.cpu, qd.cuda])
🟢  474  def test_qd_prefix_exemption_pure_kernel():
    475      """Verify that _qd_-prefixed globals don't violate pure kernel checks.
    476  
    477      With kernel coverage enabled, _qd_cov is injected as a global. This test verifies that a pure (fastcache)
    478      kernel still compiles without error. The kernel uses ndarray arguments (not global fields) because pure
    479      kernels prohibit non-_qd_ globals.
    480      """
🟢  481      a = qd.ndarray(qd.i32, (1,))
    482  
🟢  483      @qd.kernel(fastcache=True)
🟢  484      def pure_kernel(arr: qd.types.NDArray) -> None:
🟢  485          arr[0] = 42
    486  
🟢  487      pure_kernel(a)
🟢  488      assert a[0] == 42
🟢 tests/python/test_offline_cache.py (100%)
     16  # Coverage field allocation creates internal fill kernels that change cache file counts.
     17  # CI runs these tests in a separate phase without QD_KERNEL_COVERAGE (see 4_test.sh).
🟢   18  pytestmark = pytest.mark.skipif(
     19      os.environ.get("QD_KERNEL_COVERAGE") == "1",
     20      reason="Kernel coverage adds internal kernels that invalidate cache file count assertions",
     21  )
     22  
🟢 tests/python/test_snode_layout_inspection.py (100%)
🟢    1  import os
      2  
🟢    3  import pytest
      4  
🟢   10  @pytest.mark.skipif(
     11      os.environ.get("QD_KERNEL_COVERAGE") == "1",
     12      reason="Kernel coverage field on root shifts offset assertions",
     13  )
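
The probe-insertion technique that the rewriter tests above exercise can be sketched with the standard library alone. This is a simplified illustration, not the PR's `_CoverageASTRewriter`: the class name `ProbeInserter`, the global name `_cov`, and the plain Python list standing in for the coverage field are all assumptions made for the example, and only function bodies and `if`/`else` bodies are instrumented.

```python
import ast
import textwrap


class ProbeInserter(ast.NodeTransformer):
    """Hypothetical, simplified rewriter: one probe per new source line."""

    def __init__(self, field_name="_cov"):
        self.field_name = field_name
        self.probe_lines = []  # index = probe id, value = source line number
        self._seen = set()

    def _probe(self, lineno):
        # Build the statement `<field_name>[<probe_id>] = 1`.
        probe_id = len(self.probe_lines)
        self.probe_lines.append(lineno)
        target = ast.Subscript(
            value=ast.Name(id=self.field_name, ctx=ast.Load()),
            slice=ast.Constant(value=probe_id),
            ctx=ast.Store(),
        )
        return ast.Assign(targets=[target], value=ast.Constant(value=1))

    def _instrument(self, body):
        out = []
        for stmt in body:
            if stmt.lineno not in self._seen:  # at most one probe per line
                self._seen.add(stmt.lineno)
                out.append(self._probe(stmt.lineno))
            out.append(self.visit(stmt))
        return out

    def visit_FunctionDef(self, node):
        node.body = self._instrument(node.body)
        return node

    def visit_If(self, node):
        node.body = self._instrument(node.body)
        node.orelse = self._instrument(node.orelse)
        return node


src = textwrap.dedent(
    """\
    def f(x):
        if x > 5:
            y = 1
        else:
            y = 2
        return y
    """
)
inserter = ProbeInserter()
tree = inserter.visit(ast.parse(src))
ast.fix_missing_locations(tree)

hits = [0] * len(inserter.probe_lines)
namespace = {"_cov": hits}  # the probe target is injected as a global
exec(compile(tree, "<sketch>", "exec"), namespace)
result = namespace["f"](10)

executed_lines = {inserter.probe_lines[i] for i, hit in enumerate(hits) if hit}
print(result, sorted(executed_lines))
```

Because the probe for the `else` body sits inside that body, it stays at zero when the `if` branch is taken, which is the property the end-to-end branch tests check.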

Collaborator

@erizmr erizmr left a comment

Approved as discussed.

Contributor

Unrelated.

Comment on lines +19 to +20
cov_branch = getattr(config.option, "cov_branch", False) or False
os.environ["_QD_KCOV_ARC"] = "1" if cov_branch else "0"

🔴 The pytest plugin at pytest_plugin.py:19 reads only config.option.cov_branch to decide between arc and line mode, but pytest-cov defines --cov-branch with default=None. When users enable branch coverage via branch = True in .coveragerc or [tool.coverage.run] in pyproject.toml (without passing the CLI flag), cov_branch evaluates to None or False, which is False, and the plugin sets _QD_KCOV_ARC=0. pytest-cov actually runs in arc mode (coverage.py applies the config-file setting), so _qd_kcov.* is written in line format while .coverage is in arc format. coverage combine then fails with "Can not mix line and arc data", and the fallback in tests/coverage_report.py:combine_coverage() silently drops all kernel coverage. Fix: read the authoritative resolved value via config.pluginmanager.get_plugin("_cov").cov_controller.cov.config.branch.

Extended reasoning...

What the bug is and how it manifests

The pytest plugin in python/quadrants/pytest_plugin.py (lines 10-20) tries to inform _kernel_coverage whether pytest-cov is running in branch (arc) mode by setting the _QD_KCOV_ARC env var. The check at line 19 is:

cov_branch = getattr(config.option, "cov_branch", False) or False
os.environ["_QD_KCOV_ARC"] = "1" if cov_branch else "0"

This reflects only the pytest CLI flag --cov-branch, not the branch setting from coverage.py configuration files. pytest-cov registers --cov-branch with default=None and action='store_true' (verified at /usr/local/lib/python3.11/dist-packages/pytest_cov/plugin.py:160-163), so config.option.cov_branch is None whenever the CLI flag is absent. None or False == False, so _QD_KCOV_ARC gets set to '0' (line mode).

But pytest-cov passes branch=options.cov_branch directly to coverage.Coverage(...) (engine.py:61, 239). When None is passed, coverage.config.from_args() skips it (if v is not None, config.py:288-294), so any config-file [run] branch = True setting survives. pytest-cov is therefore in arc mode while the plugin sets _QD_KCOV_ARC=0.
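
The disagreement described here comes down to how a tri-state CLI flag interacts with layered configuration. A minimal sketch, assuming a simplified stand-in for coverage.py's resolution order (the helper names `resolve_branch` and `plugin_guess` are hypothetical and exist only for this illustration):

```python
from typing import Optional


def resolve_branch(
    cli_flag: Optional[bool],
    config_file_value: Optional[bool],
    default: bool = False,
) -> bool:
    """Simplified model: defaults, then config file, then CLI arguments."""
    value = default
    if config_file_value is not None:
        value = config_file_value
    if cli_flag is not None:  # None means "flag not given": keep earlier layer
        value = cli_flag
    return value


def plugin_guess(cli_flag: Optional[bool]) -> bool:
    """Models the check under review: it looks at the CLI flag only."""
    return bool(cli_flag or False)


# --cov-branch absent on the command line, branch = True in the config file.
cli_flag = None
config_file_value = True

effective = resolve_branch(cli_flag, config_file_value)
guessed = plugin_guess(cli_flag)
print(f"effective branch mode: {effective}, plugin's guess: {guessed}")
```

With the flag absent, the layered model keeps the config-file value while the CLI-only check falls back to False, so the two sides end up in different modes.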

The specific code path that triggers it

  1. User has [run] branch = true in .coveragerc (or [tool.coverage.run] branch = true in pyproject.toml) and runs pytest --cov=my_pkg without --cov-branch.
  2. Quadrants pytest plugin runs: config.option.cov_branch is None; None or False = False; sets _QD_KCOV_ARC=0.
  3. _kernel_coverage._detect_arc_mode() reads _QD_KCOV_ARC=0 and returns False (line mode).
  4. flush() writes _qd_kcov.<pid> in line format.
  5. Meanwhile pytest-cov writes .coverage in arc format (because coverage.Coverage(branch=None) reads .coveragerc and applies branch=True).
  6. tests/coverage_report.py:combine_coverage() runs coverage combine .coverage.pytest _qd_kcov.* → DataError: Can not mix line and arc data.
  7. The fallback at lines 50-51 (coverage combine .coverage.pytest alone) silently drops all kernel coverage.
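
To make the two data shapes in this path concrete, here is a small standard-library sketch of what line-format and arc-format payloads look like for the same set of executed lines. The arc construction mirrors the entry/exit encoding used by the arc branch of flush() in this PR; the file name `kernels.py` and the helper name `entry_exit_arcs` are made up for the example.

```python
def entry_exit_arcs(lines):
    """Encode executed lines as (-1, line) entry and (line, -1) exit arcs."""
    arcs = []
    for line in sorted(lines):
        arcs.append((-1, line))
        arcs.append((line, -1))
    return arcs


# Hypothetical harvest result: file path -> set of executed line numbers.
executed = {"kernels.py": {12, 10}}

# Line mode: a sorted list of line numbers per file.
line_payload = {f: sorted(ls) for f, ls in executed.items()}

# Arc mode: a list of (from_line, to_line) pairs per file.
arc_payload = {f: entry_exit_arcs(ls) for f, ls in executed.items()}

print(line_payload)
print(arc_payload)
```

The payloads carry the same information about which lines ran, but a data file holds one shape or the other, which is why the kernel file has to match the mode pytest-cov is actually using.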

Empirical verification

A verifier confirmed empirically with pytest-cov 7.1.0:

# With .coveragerc containing [run]\nbranch = True:
import coverage

cov = coverage.Coverage(source=['.'], branch=None, config_file='.coveragerc')
print(cov.config.branch)  # -> True
# config.option.cov_branch                    -> None
# cov_branch after `or False`                  -> False  (incorrect)
# cov_controller.cov.config.branch            -> True   (authoritative)

This proves the plugin's check disagrees with pytest-cov's actual mode.

Why existing code does not prevent it

The plugin guard getattr(config.option, "cov_branch", False) or False is purely a CLI-flag check. Coverage.py's three-layer config resolution (defaults → config file → from_args) is opaque to the plugin code. There is no fallback path that reads coverage.config after pytest-cov has resolved it. The PR's own combine_coverage() fallback masks the resulting error rather than surfacing it.

Impact

Any downstream user of quadrants who enables branch coverage via .coveragerc or pyproject.toml and runs pytest --cov=my_pkg without explicitly passing --cov-branch will silently lose all kernel coverage. This is a common, idiomatic configuration pattern — many Python projects keep coverage settings in config files rather than command lines. The failure is silent: the coverage comment on the PR shows pytest-cov coverage but no kernel data, with no warning or error visible to the user.

Step-by-step proof

  1. Project has pyproject.toml containing [tool.coverage.run] with branch = true.
  2. Developer runs pytest --cov=my_pkg tests/ (no --cov-branch).
  3. pytest-cov initializes: options.cov_branch = None. Coverage(branch=None, ...) reads pyproject.toml, sets config.branch = True. Tracer starts in arc mode.
  4. quadrants pytest plugin's pytest_configure fires: config.option.cov_branch = None; None or False = False; _QD_KCOV_ARC = "0".
  5. Tests run. Kernels execute. Probes accumulate.
  6. atexit triggers flush(): _detect_arc_mode() reads _QD_KCOV_ARC=0 → False → add_lines path → _qd_kcov.<pid> in line format.
  7. CI runs combine_coverage(): coverage combine .coverage.pytest _qd_kcov.* → DataError: Can not mix line and arc data → fallback drops kernel files → final .coverage contains only pytest-cov data.
  8. Coverage report shows 0% kernel coverage.

How to fix it

Query the authoritative source — pytest-cov's already-constructed Coverage instance — instead of the CLI option:

def pytest_configure(config):
    if not config.pluginmanager.hasplugin("_cov"):
        return
    os.environ.setdefault("QD_KERNEL_COVERAGE", "1")
    if os.environ.get("QD_KERNEL_COVERAGE") != "1":
        return
    branch = False
    cov_plugin = config.pluginmanager.get_plugin("_cov")
    if cov_plugin is not None:
        ctrl = getattr(cov_plugin, "cov_controller", None)
        if ctrl is not None and getattr(ctrl, "cov", None) is not None:
            branch = bool(ctrl.cov.config.branch)
    os.environ["_QD_KCOV_ARC"] = "1" if branch else "0"

pytest-cov constructs cov_controller in pytest_load_initial_conftests (with tryfirst=True), which runs strictly before pytest_configure, so cov_controller.cov.config.branch is already authoritative and queryable by the time this hook fires. This reflects the merged result of CLI flags, config files, and defaults — matching pytest-cov's actual runtime behavior.
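
One way to guard the proposed lookup against regressions is to factor it into a small helper and exercise it with stand-in objects, so no real pytest-cov session is needed. This is a sketch under assumptions: `resolved_branch` and `fake_config` are hypothetical names, and the fakes only imitate the attribute chain used in the fix above (`get_plugin("_cov")`, `cov_controller`, `cov.config.branch`).

```python
from types import SimpleNamespace


def resolved_branch(config) -> bool:
    """Return the branch setting of pytest-cov's Coverage object, if any."""
    cov_plugin = config.pluginmanager.get_plugin("_cov")
    ctrl = getattr(cov_plugin, "cov_controller", None)
    cov = getattr(ctrl, "cov", None)
    if cov is None:
        return False  # no controller yet: fall back to line mode
    return bool(cov.config.branch)


def fake_config(cli_flag, effective_branch, with_controller=True):
    """Build a stand-in for pytest's config object (illustration only)."""
    cov = SimpleNamespace(config=SimpleNamespace(branch=effective_branch))
    ctrl = SimpleNamespace(cov=cov) if with_controller else None
    plugin = SimpleNamespace(cov_controller=ctrl)
    pluginmanager = SimpleNamespace(
        get_plugin=lambda name: plugin if name == "_cov" else None
    )
    option = SimpleNamespace(cov_branch=cli_flag)
    return SimpleNamespace(option=option, pluginmanager=pluginmanager)


# The failing scenario: no CLI flag, branch enabled through a config file.
config = fake_config(cli_flag=None, effective_branch=True)
old_check = bool(getattr(config.option, "cov_branch", False) or False)
new_check = resolved_branch(config)
print(f"old check: {old_check}, new check: {new_check}")
```

The helper deliberately returns False when the controller is missing, matching the default in the fix above, so a plugin-ordering surprise degrades to line mode rather than raising.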

@github-actions

Coverage Report (40a0ca011)

Metric Value
Diff coverage (changed lines only) 90%
Overall project coverage 72%

Total: 450 lines, 47 missing, 90% covered

🔴 python/quadrants/lang/_fast_caching/src_hasher.py (0%)
🔴    2  import os
     53              "kcov" if os.environ.get("QD_KERNEL_COVERAGE") == "1" else "",
🟢 python/quadrants/lang/_func_base.py (82%)
🔴    7  import os
     25  
🔴   26  def _kernel_coverage_enabled() -> bool:
🟢   27      return os.environ.get("QD_KERNEL_COVERAGE") == "1"
     28  
     29  
🟢  255          _kcov = None
🟢  256          if _kernel_coverage_enabled() and autodiff_mode == _qd_core.AutodiffMode.NONE:
🟢  257              from . import (  # pylint: disable=import-outside-toplevel
    258                  _kernel_coverage as _kcov,
    259              )
    260  
🟢  261              tree = _kcov.rewrite_ast(tree, function_source_info.filepath, function_source_info.start_lineno)
    262  
🟢  266          if _kcov is not None:
🟢  267              cov_field = _kcov.get_field()
🟢  268              if cov_field is not None:
🟢  269                  global_vars[_kcov.FIELD_VAR_NAME] = cov_field
🔴 python/quadrants/lang/_kernel_coverage.py (79%)
      1  """Kernel code coverage via Python AST rewriting.
      2  
      3  When enabled (QD_KERNEL_COVERAGE=1), this module rewrites kernel and func ASTs to insert coverage probes — field
      4  stores that record which source lines actually execute on the GPU. At process exit, the collected data is written
      5  to a .coverage file compatible with coverage.py / pytest-cov / diff-cover.
      6  
      7  The probes are compiled as ordinary field stores by the existing pipeline, so no C++ changes are needed. When
      8  disabled, this module is never imported and has zero impact on the normal runtime path.
      9  """
     10  
🟢   11  import ast
🟢   12  import atexit
🟢   13  import logging
🟢   14  import os
🟢   15  import threading
🟢   16  import warnings
🟢   17  from typing import TYPE_CHECKING
     18  
🟢   19  from coverage import CoverageData  # type: ignore[import-not-found]
     20  
🟢   21  import quadrants as qd
🟢   22  from quadrants.lang import impl
     23  
     24  if TYPE_CHECKING:
     25      from quadrants.lang.field import ScalarField
     26  
🟢   27  FIELD_VAR_NAME = "_qd_cov"
🟢   28  _MAX_PROBES = int(os.environ.get("QD_COVERAGE_MAX_PROBES", "100000"))
     29  
🟢   30  _lock = threading.Lock()
🟢   31  _cov_field: "ScalarField | None" = None
🟢   32  _cov_field_prog: object | None = None  # tracks which Program instance owns _cov_field
🟢   33  _probe_counter: int = 0
     34  # {probe_id: (filepath, absolute_lineno)}
🟢   35  _probe_map: dict[int, tuple[str, int]] = {}
     36  # Accumulated coverage lines surviving across qd.init() resets
🟢   37  _accumulated_lines: dict[str, set[int]] = {}
🟢   38  _reset_hook_installed: bool = False
     39  # Directory for .coverage and _qd_kcov.* files, captured when coverage is first enabled
🟢   40  _coverage_dir: str | None = None
     41  
     42  
🟢   43  def _harvest_field() -> None:
     44      """Read probe data from the current field into _accumulated_lines.
     45  
     46      Must be called while the runtime is still alive (before clear()).
     47      """
     48      global _cov_field, _cov_field_prog
🟢   49      with _lock:
🟢   50          if _cov_field is None or not _probe_map:
🟢   51              return
🟢   52          field_ref = _cov_field
🟢   53          probe_snapshot = dict(_probe_map)
🟢   54          _cov_field = None
🟢   55          _cov_field_prog = None
🟢   56      try:
🟢   57          arr = field_ref.to_numpy()
🟢   58      except Exception:
🟢   59          logging.warning("Failed to read coverage field, coverage data for this session will be lost", exc_info=True)
🟢   60          return
🟢   61      with _lock:
🟢   62          for probe_id, (filepath, lineno) in probe_snapshot.items():
🟢   63              if probe_id < len(arr) and arr[probe_id] != 0:
🟢   64                  _accumulated_lines.setdefault(filepath, set()).add(lineno)
     65  
     66  
🟢   67  def _install_reset_hook() -> None:
     68      """Monkey-patch PyQuadrants.clear() to harvest probes before destruction."""
     69      global _reset_hook_installed
🟢   70      if _reset_hook_installed:
🟢   71          return
🟢   72      _original_clear = impl.PyQuadrants.clear
     73  
🟢   74      def _hooked_clear(self) -> None:
🟢   75          _harvest_field()
🟢   76          _original_clear(self)
     77  
🟢   78      impl.PyQuadrants.clear = _hooked_clear  # type: ignore[assignment]
🟢   79      _reset_hook_installed = True
     80  
     81  
🟢   82  def ensure_field_allocated() -> None:
     83      """Allocate (or re-allocate after qd.init()) the global coverage field."""
     84      global _cov_field, _cov_field_prog, _coverage_dir
🟢   85      _install_reset_hook()
🟢   86      if _coverage_dir is None:
🟢   87          _coverage_dir = os.getcwd()
🟢   88      current_prog = impl.get_runtime()._prog
🟢   89      if _cov_field is not None and _cov_field_prog is current_prog:
🟢   90          return
🟢   91      with _lock:
🟢   92          current_prog = impl.get_runtime()._prog
🟢   93          if _cov_field is not None and _cov_field_prog is current_prog:
🔴   94              return
🟢   95          _cov_field = qd.field(dtype=qd.i32, shape=(_MAX_PROBES,))  # type: ignore[assignment]
🟢   96          _cov_field_prog = current_prog
     97  
     98  
🟢   99  def get_field() -> "ScalarField | None":
🟢  100      with _lock:
🟢  101          if _cov_field_prog is not impl.get_runtime()._prog:
🔴  102              return None
🟢  103          return _cov_field
    104  
    105  
🟢  106  def rewrite_ast(tree: ast.Module, filepath: str, start_lineno: int) -> ast.Module:
    107      """Rewrite a kernel/func AST to insert coverage probes.
    108  
    109      Each executable statement at a new source line gets a probe: ``_qd_cov[<probe_id>] = 1``.
    110      Probes inside if/else bodies only fire when that branch is taken, giving true runtime branch coverage.
    111      """
    112      global _probe_counter
🟢  113      with _lock:
🟢  114          rewriter = _CoverageASTRewriter(
    115              field_name=FIELD_VAR_NAME,
    116              filepath=filepath,
    117              start_lineno=start_lineno,
    118              probe_id_start=_probe_counter,
    119          )
🟢  120          tree = rewriter.visit(tree)
🟢  121          ast.fix_missing_locations(tree)
🟢  122          _probe_counter = rewriter.next_probe_id
🟢  123          _probe_map.update(rewriter.probe_map)
🟢  124      return tree
    125  
    126  
🟢  127  def _detect_arc_mode() -> bool:
    128      """Detect whether pytest-cov is running in branch (arc) mode.
    129  
    130      Checks _QD_KCOV_ARC env var first (set by the pytest plugin), then falls back to reading .coverage.
    131      Defaults to False (line mode) when nothing is known, since ``pytest --cov`` without ``--cov-branch``
    132      is the more common invocation.
    133      """
🔴  134      arc_env = os.environ.get("_QD_KCOV_ARC")
🔴  135      if arc_env is not None:
🔴  136          return arc_env == "1"
🔴  137      try:
🔴  138          cov_path = os.path.join(_coverage_dir, ".coverage") if _coverage_dir else ".coverage"
🔴  139          cd = CoverageData(basename=cov_path)
🔴  140          cd.read()
🔴  141          if not cd.measured_files():
🔴  142              return False
🔴  143          return cd.has_arcs()
🔴  144      except Exception:
🔴  145          logging.debug("Failed to detect arc mode from .coverage file, defaulting to line mode", exc_info=True)
🔴  146          return False
    147  
    148  
🟢  149  def flush() -> None:
    150      """Harvest any remaining field data and write all results to a .coverage file.
    151  
    152      If .coverage.kernel already exists (e.g. from a prior test phase), the new data is merged into it so nothing
    153      is lost across multiple invocations.
    154      """
🔴  155      _harvest_field()
    156  
🔴  157      with _lock:
🔴  158          if not _accumulated_lines:
🔴  159              return
🔴  160          snapshot = {f: set(lines) for f, lines in _accumulated_lines.items()}
    161  
🔴  162      base_dir = _coverage_dir or os.getcwd()
🔴  163      kernel_path = os.path.join(base_dir, f"_qd_kcov.{os.getpid()}")
🔴  164      use_arcs = _detect_arc_mode()
    165  
🔴  166      cov = CoverageData(basename=kernel_path)
🔴  167      if use_arcs:
🔴  168          arcs_by_file: dict[str, list[tuple[int, int]]] = {}
🔴  169          for filepath, lines in snapshot.items():
    170              # Emit only entry/exit arcs per line — we know which lines ran but not the actual transitions
    171              # between them, so we avoid fabricating inter-line arcs that would misrepresent branch coverage.
🔴  172              arcs = []
🔴  173              for line in sorted(lines):
🔴  174                  arcs.append((-1, line))
🔴  175                  arcs.append((line, -1))
🔴  176              arcs_by_file[filepath] = arcs
🔴  177          cov.add_arcs(arcs_by_file)
    178      else:
🔴  179          cov.add_lines({f: sorted(lines) for f, lines in snapshot.items()})
🔴  180      cov.write()
    181  
    182  
🟢  183  _capacity_warning_emitted = False
    184  
    185  
🟢  186  class _CoverageASTRewriter(ast.NodeTransformer):
    187      """Insert coverage probes before each statement at a new source line."""
    188  
🟢  189      def __init__(self, field_name: str, filepath: str, start_lineno: int, probe_id_start: int) -> None:
🟢  190          self._field_name = field_name
🟢  191          self._filepath = filepath
🟢  192          self._start_lineno = start_lineno
🟢  193          self.next_probe_id = probe_id_start
🟢  194          self._seen_lines: set[int] = set()
🟢  195          self.probe_map: dict[int, tuple[str, int]] = {}
    196  
🟢  197      def _make_probe(self, abs_lineno: int, rel_lineno: int, col_offset: int) -> ast.Assign | None:
    198          global _capacity_warning_emitted
🟢  199          probe_id = self.next_probe_id
🟢  200          if probe_id >= _MAX_PROBES:
🟢  201              if not _capacity_warning_emitted:
🟢  202                  warnings.warn(
    203                      f"Kernel coverage probe capacity ({_MAX_PROBES}) exceeded. "
    204                      f"Additional kernel lines will not be tracked. "
    205                      f"Set QD_COVERAGE_MAX_PROBES to a higher value.",
    206                      stacklevel=2,
    207                  )
🟢  208                  _capacity_warning_emitted = True
🟢  209              return None
🟢  210          self.probe_map[probe_id] = (self._filepath, abs_lineno)
🟢  211          self.next_probe_id += 1
🟢  212          node = ast.Assign(
    213              targets=[
    214                  ast.Subscript(
    215                      value=ast.Name(id=self._field_name, ctx=ast.Load()),
    216                      slice=ast.Constant(value=probe_id),
    217                      ctx=ast.Store(),
    218                  )
    219              ],
    220              value=ast.Constant(value=1),
    221              lineno=rel_lineno,
    222              col_offset=col_offset,
    223              end_lineno=rel_lineno,
    224              end_col_offset=col_offset,
    225          )
🟢  226          return node
    227  
🟢  228      def _instrument_body(self, stmts: list[ast.stmt]) -> list[ast.stmt]:
🟢  229          result: list[ast.stmt] = []
🟢  230          for stmt in stmts:
🟢  231              rel_lineno = getattr(stmt, "lineno", None)
🟢  232              if rel_lineno is not None:
🟢  233                  abs_lineno = rel_lineno + self._start_lineno - 1
🟢  234                  if abs_lineno not in self._seen_lines:
🟢  235                      self._seen_lines.add(abs_lineno)
🟢  236                      col = getattr(stmt, "col_offset", 0)
🟢  237                      probe = self._make_probe(abs_lineno, rel_lineno, col)
🟢  238                      if probe is not None:
🟢  239                          result.append(probe)
🟢  240              result.append(self.visit(stmt))
🟢  241          return result
    242  
🟢  243      def visit_FunctionDef(self, node: ast.FunctionDef) -> ast.FunctionDef:
🟢  244          node.body = self._instrument_body(node.body)
🟢  245          return node
    246  
🟢  247      def visit_AsyncFunctionDef(self, node: ast.AsyncFunctionDef) -> ast.AsyncFunctionDef:
🔴  248          node.body = self._instrument_body(node.body)
🔴  249          return node
    250  
🟢  251      def visit_If(self, node: ast.If) -> ast.If:
🟢  252          node.body = self._instrument_body(node.body)
🟢  253          if node.orelse:
🟢  254              node.orelse = self._instrument_body(node.orelse)
🟢  255          return node
    256  
🟢  257      def visit_For(self, node: ast.For) -> ast.For:
🟢  258          node.body = self._instrument_body(node.body)
🟢  259          if node.orelse:
🟢  260              node.orelse = self._instrument_body(node.orelse)
🟢  261          return node
    262  
🟢  263      def visit_While(self, node: ast.While) -> ast.While:
🟢  264          node.body = self._instrument_body(node.body)
🟢  265          if node.orelse:
🟢  266              node.orelse = self._instrument_body(node.orelse)
🟢  267          return node
    268  
🟢  269      def visit_With(self, node: ast.With) -> ast.With:
🟢  270          node.body = self._instrument_body(node.body)
🟢  271          return node
    272  
🟢  273      def visit_Try(self, node: ast.Try) -> ast.Try:
🟢  274          node.body = self._instrument_body(node.body)
🟢  275          for handler in node.handlers:
🟢  276              handler.body = self._instrument_body(handler.body)
🟢  277          if node.orelse:
🟢  278              node.orelse = self._instrument_body(node.orelse)
🟢  279          if node.finalbody:
🟢  280              node.finalbody = self._instrument_body(node.finalbody)
🟢  281          return node
    282  
    283  
🟢  284  atexit.register(flush)
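
The probe-insertion technique used by `_CoverageASTRewriter` can be tried in isolation with the standard library alone. What follows is a minimal sketch under stated assumptions, not the PR's implementation: `ProbeInserter`, `hits`, and `demo` are hypothetical names, a plain Python list stands in for the device-side coverage field, and only function and `if` bodies are instrumented.

```python
import ast
import textwrap


class ProbeInserter(ast.NodeTransformer):
    """Hypothetical stand-in: insert `hits[probe_id] = 1` before each statement on a new line."""

    def __init__(self, field_name: str) -> None:
        self.field_name = field_name
        self.probe_map: dict[int, int] = {}  # probe id -> source line
        self._seen: set[int] = set()

    def _probe(self, lineno: int) -> ast.Assign:
        probe_id = len(self.probe_map)
        self.probe_map[probe_id] = lineno
        target = ast.Subscript(
            value=ast.Name(id=self.field_name, ctx=ast.Load()),
            slice=ast.Constant(value=probe_id),
            ctx=ast.Store(),
        )
        return ast.Assign(targets=[target], value=ast.Constant(value=1))

    def _instrument(self, stmts):
        out = []
        for stmt in stmts:
            # One probe per source line, placed before the first statement on that line.
            if stmt.lineno not in self._seen:
                self._seen.add(stmt.lineno)
                out.append(self._probe(stmt.lineno))
            out.append(self.visit(stmt))
        return out

    def visit_FunctionDef(self, node):
        node.body = self._instrument(node.body)
        return node

    def visit_If(self, node):
        node.body = self._instrument(node.body)
        node.orelse = self._instrument(node.orelse)
        return node


src = textwrap.dedent(
    """\
    def demo(x):
        if x > 5:
            return 1
        else:
            return 2
    """
)
inserter = ProbeInserter("hits")
# The new Assign nodes have no positions yet; fix_missing_locations fills them in.
tree = ast.fix_missing_locations(inserter.visit(ast.parse(src)))
hits = [0] * len(inserter.probe_map)
namespace = {"hits": hits}
exec(compile(tree, "<demo>", "exec"), namespace)
result = namespace["demo"](10)
covered = sorted(line for pid, line in inserter.probe_map.items() if hits[pid])
```

Calling `demo(10)` sets the probes for lines 2 and 3 and leaves the probe for the `else` body at zero, which is the same taken/not-taken signal the end-to-end tests in this PR check for.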
🟢 python/quadrants/lang/ast/ast_transformer_utils.py (100%)
🟢  335              if not name.startswith("_qd_"):
🟢  336                  reason = f"{name} is in global vars, therefore violates pure"
🟢  337                  violates_pure = True
🟢 python/quadrants/lang/kernel.py (80%)
     19  
🔴   20  def _kernel_coverage_enabled() -> bool:
🟢   21      return os.environ.get("QD_KERNEL_COVERAGE") == "1"
     22  
     23  
🟢  382          if _kernel_coverage_enabled():
🟢  383              from . import _kernel_coverage  # pylint: disable=import-outside-toplevel
    384  
🟢  385              _kernel_coverage.ensure_field_allocated()
    386  
🟢 python/quadrants/lang/misc.py (100%)
    496  
🟢  497      if os.environ.get("QD_KERNEL_COVERAGE") == "1":
🟢  498          from . import _kernel_coverage  # pylint: disable=import-outside-toplevel
    499  
🟢  500          _kernel_coverage.ensure_field_allocated()
    501  
🔴 python/quadrants/pytest_plugin.py (67%)
      1  """Pytest plugin that auto-enables kernel coverage when pytest-cov is active.
      2  
      3  Registered via the ``pytest11`` entry point so it loads automatically when quadrants is installed.
      4  Opt out by setting ``QD_KERNEL_COVERAGE=0`` explicitly.
      5  """
      6  
🔴    7  import os
      8  
      9  
🔴   10  def pytest_configure(config):
🟢   11      if not config.pluginmanager.hasplugin("_cov"):
🔴   12          return
🟢   13      os.environ.setdefault("QD_KERNEL_COVERAGE", "1")
🟢   14      if os.environ.get("QD_KERNEL_COVERAGE") != "1":
🟢   15          return
     16      # Tell the kernel coverage module whether pytest-cov is running in branch (arc) mode,
     17      # so it writes the matching format and avoids "Can not mix line and arc data" at combine time.
     18      # We read config.option.cov_branch which pytest-cov has already populated by this point.
🟢   19      cov_branch = getattr(config.option, "cov_branch", False) or False
🟢   20      os.environ["_QD_KCOV_ARC"] = "1" if cov_branch else "0"
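
The opt-out behaviour of `pytest_configure` hinges on `setdefault`: an explicit `QD_KERNEL_COVERAGE=0` is left alone, while an unset variable becomes `1`. The sketch below restates that decision as a pure function so it can be checked without pytest. `resolve_kernel_coverage` and its parameters are hypothetical names introduced for illustration; the real plugin mutates `os.environ` directly.

```python
def resolve_kernel_coverage(env: dict[str, str], cov_active: bool, cov_branch: bool) -> dict[str, str]:
    """Hypothetical pure-function version of the plugin's decision, working on a copy of env."""
    env = dict(env)
    if not cov_active:
        # pytest-cov is not loaded: leave the environment untouched.
        return env
    # Enable kernel coverage unless the user already set the variable.
    env.setdefault("QD_KERNEL_COVERAGE", "1")
    if env["QD_KERNEL_COVERAGE"] != "1":
        # Explicit opt-out (for example "0"): do not set the arc-mode hint.
        return env
    env["_QD_KCOV_ARC"] = "1" if cov_branch else "0"
    return env


default_on = resolve_kernel_coverage({}, cov_active=True, cov_branch=True)
opted_out = resolve_kernel_coverage({"QD_KERNEL_COVERAGE": "0"}, cov_active=True, cov_branch=True)
no_cov = resolve_kernel_coverage({}, cov_active=False, cov_branch=False)
```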
🟢 tests/python/quadrants/lang/fast_caching/test_src_ll_cache.py (100%)
     11  
🟢   12  _KERNEL_COVERAGE = os.environ.get("QD_KERNEL_COVERAGE") == "1"
🟢   67  @pytest.mark.skipif(
     68      _KERNEL_COVERAGE,
     69      reason="Coverage probes change LLVM IR addresses after reinit, breaking recompile comparison",
     70  )
🟢 tests/python/quadrants/lang/test_kernel_impl.py (100%)
🟢    1  import os
🟢   11  _KERNEL_COVERAGE = os.environ.get("QD_KERNEL_COVERAGE") == "1"
     13  
🟢   14  @pytest.mark.skipif(
     15      _KERNEL_COVERAGE,
     16      reason="Coverage probes change the kernel AST, preventing FE-LL cache hits after reinit",
     17  )
🟢 tests/python/test_api.py (100%)
🟢  438      actual = sorted([s for s in dir(src) if not s.startswith(("_", "@")) and s != "pytest_plugin"])
🟢 tests/python/test_kernel_coverage.py (99%)
      1  """Tests for kernel code coverage instrumentation.
      2  
      3  These tests verify that the AST rewriter correctly inserts coverage probes and that the probes fire when kernel
      4  code executes on the device.
      5  """
      6  
🟢    7  import ast
🟢    8  import os
🟢    9  import textwrap
     10  
🟢   11  import pytest
     12  
🟢   13  import quadrants as qd
     14  
🟢   15  from tests import test_utils
     16  
     17  # These tests only run when QD_KERNEL_COVERAGE=1
🟢   18  pytestmark = pytest.mark.skipif(
     19      os.environ.get("QD_KERNEL_COVERAGE", "") != "1",
     20      reason="QD_KERNEL_COVERAGE=1 not set",
     21  )
     22  
     23  
     24  # ---------------------------------------------------------------------------
     25  # AST rewriter unit tests
     26  # ---------------------------------------------------------------------------
     27  
🟢   28  _AST_REWRITER_CASES = [
     29      pytest.param(
     30          """\
     31          def f():
     32              x = 1
     33              y = 2
     34              return x + y
     35          """,
     36          {11, 12, 13},
     37          10,
     38          id="straight_line",
     39      ),
     40      pytest.param(
     41          """\
     42          def f():
     43              if x > 0:
     44                  a = 1
     45              else:
     46                  b = 2
     47          """,
     48          {2, 3, 5},
     49          1,
     50          id="if_else",
     51      ),
     52      pytest.param(
     53          """\
     54          def f():
     55              for i in range(10):
     56                  x = i
     57          """,
     58          {2, 3},
     59          1,
     60          id="for_loop",
     61      ),
     62      pytest.param(
     63          """\
     64          def f():
     65              while x > 0:
     66                  x = x - 1
     67              else:
     68                  y = 0
     69          """,
     70          {2, 3, 5},
     71          1,
     72          id="while_loop_else",
     73      ),
     74      pytest.param(
     75          """\
     76          def f():
     77              with ctx:
     78                  a = 1
     79                  b = 2
     80          """,
     81          {2, 3, 4},
     82          1,
     83          id="with_statement",
     84      ),
     85      pytest.param(
     86          """\
     87          def f():
     88              try:
     89                  a = 1
     90              except:
     91                  b = 2
     92              else:
     93                  c = 3
     94              finally:
     95                  d = 4
     96          """,
     97          {3, 5, 7, 9},
     98          1,
     99          id="try_except_finally",
    100      ),
    101  ]
    102  
    103  
🟢  104  @pytest.mark.parametrize("src,expected_lines,start_lineno", _AST_REWRITER_CASES)
🟢  105  def test_ast_rewriter(src, expected_lines, start_lineno):
    106      """Verify the AST rewriter inserts probes at the expected source lines."""
🟢  107      from quadrants.lang._kernel_coverage import _CoverageASTRewriter
    108  
🟢  109      tree = ast.parse(textwrap.dedent(src))
🟢  110      rewriter = _CoverageASTRewriter(
    111          field_name="_qd_cov", filepath="test.py", start_lineno=start_lineno, probe_id_start=0
    112      )
🟢  113      rewriter.visit(tree)
    114  
🟢  115      covered_lines = {lineno for _, (_, lineno) in rewriter.probe_map.items()}
🟢  116      assert expected_lines.issubset(covered_lines), f"Expected lines {expected_lines} to be probed, got {covered_lines}"
    117  
    118  
🟢  119  def test_ast_rewriter_capacity_limit():
    120      """Verify that probes stop being inserted when the capacity limit is hit."""
🟢  121      import warnings
    122  
🟢  123      import quadrants.lang._kernel_coverage as kcov
🟢  124      from quadrants.lang._kernel_coverage import _CoverageASTRewriter
    125  
🟢  126      src = textwrap.dedent(
    127          """\
    128          def f():
    129              a = 1
    130              b = 2
    131              c = 3
    132      """
    133      )
🟢  134      tree = ast.parse(src)
🟢  135      old_warning_state = kcov._capacity_warning_emitted
🟢  136      kcov._capacity_warning_emitted = False
🟢  137      try:
🟢  138          with warnings.catch_warnings(record=True) as w:
🟢  139              warnings.simplefilter("always")
🟢  140              rewriter = _CoverageASTRewriter(
    141                  field_name="_qd_cov", filepath="test.py", start_lineno=1, probe_id_start=kcov._MAX_PROBES - 1
    142              )
🟢  143              rewriter.visit(tree)
    144  
🟢  145          assert rewriter.next_probe_id == kcov._MAX_PROBES
🟢  146          assert len(rewriter.probe_map) == 1, f"Only 1 probe should fit, got {len(rewriter.probe_map)}"
🟢  147          assert len(w) == 1
🟢  148          assert "exceeded" in str(w[0].message).lower()
    149      finally:
🟢  150          kcov._capacity_warning_emitted = old_warning_state
    151  
    152  
🟢  153  def test_ast_rewriter_deduplicates_same_line():
    154      """Verify that two statements on the same source line get only one probe."""
🟢  155      from quadrants.lang._kernel_coverage import _CoverageASTRewriter
    156  
🟢  157      src = "def f():\n    a = 1; b = 2\n"
🟢  158      tree = ast.parse(src)
🟢  159      rewriter = _CoverageASTRewriter(field_name="_qd_cov", filepath="test.py", start_lineno=1, probe_id_start=0)
🟢  160      rewriter.visit(tree)
    161  
🟢  162      abs_lines = [lineno for _, (_, lineno) in rewriter.probe_map.items()]
🟢  163      assert abs_lines.count(2) == 1, f"Line 2 should have exactly one probe, got {abs_lines.count(2)}"
    164  
    165  
🟢  166  def test_env_var_max_probes():
    167      """Verify that QD_COVERAGE_MAX_PROBES env var is read at import time."""
🟢  168      import quadrants.lang._kernel_coverage as kcov
    169  
🟢  170      assert kcov._MAX_PROBES == int(os.environ.get("QD_COVERAGE_MAX_PROBES", "100000"))
    171  
    172  
🟢  173  def test_harvest_field_exception_path():
    174      """Verify that _harvest_field handles to_numpy() failure gracefully."""
🟢  175      from unittest.mock import MagicMock
    176  
🟢  177      import quadrants.lang._kernel_coverage as kcov
    178  
🟢  179      old_field = kcov._cov_field
🟢  180      old_prog = kcov._cov_field_prog
🟢  181      old_map = kcov._probe_map.copy()
🟢  182      try:
🟢  183          mock_field = MagicMock()
🟢  184          mock_field.to_numpy.side_effect = RuntimeError("runtime destroyed")
🟢  185          kcov._cov_field = mock_field
🟢  186          kcov._cov_field_prog = object()
🟢  187          kcov._probe_map[999999] = ("fake.py", 1)
    188  
    189          # Should not raise — the exception is caught and logged
🟢  190          kcov._harvest_field()
    191  
🟢  192          assert kcov._cov_field is None, "Field should be cleared after failure"
🟢  193          assert kcov._cov_field_prog is None, "Field prog should be cleared after failure"
    194      finally:
🟢  195          kcov._cov_field = old_field
🟢  196          kcov._cov_field_prog = old_prog
🟢  197          kcov._probe_map = old_map
    198  
    199  
    200  # ---------------------------------------------------------------------------
    201  # End-to-end tests
    202  # ---------------------------------------------------------------------------
    203  
    204  
🟢  205  @test_utils.test(arch=[qd.cpu, qd.cuda])
🟢  206  def test_kernel_coverage_branches_e2e():
    207      """Verify that only the taken branch has its probe fired."""
🟢  208      from quadrants.lang import _kernel_coverage
    209  
🟢  210      _kernel_coverage.ensure_field_allocated()
    211  
🟢  212      probe_count_before = _kernel_coverage._probe_counter
🟢  213      out = qd.field(dtype=qd.i32, shape=(1,))
    214  
🟢  215      @qd.kernel
🟢  216      def branching_kernel():
🟢  217          x = 10
🟢  218          if x > 5:
🟢  219              out[0] = 1
    220          else:
🔴  221              out[0] = 2
    222  
🟢  223      branching_kernel()
    224  
🟢  225      assert out[0] == 1
    226  
🟢  227      cov_field = _kernel_coverage.get_field()
🟢  228      arr = cov_field.to_numpy()
    229  
🟢  230      probes_for_kernel = {pid: loc for pid, loc in _kernel_coverage._probe_map.items() if pid >= probe_count_before}
    231  
🟢  232      taken_probes = {pid for pid, loc in probes_for_kernel.items() if arr[pid] != 0}
🟢  233      not_taken_probes = {pid for pid, loc in probes_for_kernel.items() if arr[pid] == 0}
    234  
🟢  235      assert len(taken_probes) > 0, "At least some probes should have fired"
🟢  236      assert len(not_taken_probes) > 0, "The else branch should not have been reached"
    237  
    238  
🟢  239  @test_utils.test(arch=qd.gpu)
🟢  240  def test_kernel_coverage_simt_e2e():
    241      """Verify coverage probes track branches with block.sync() and subgroup shuffle.
    242  
    243      The if/else is based on a runtime value read from a field, so the compiler cannot constant-fold it away.
    244      Only the taken branch's shuffle probe should fire.
    245      """
🟢  246      from quadrants.lang import _kernel_coverage
🟢  247      from quadrants.lang.simt import subgroup
    248  
🟢  249      _kernel_coverage.ensure_field_allocated()
    250  
🟢  251      N = 64
🟢  252      probe_count_before = _kernel_coverage._probe_counter
🟢  253      flag = qd.field(dtype=qd.i32, shape=(1,))
🟢  254      a = qd.field(dtype=qd.i32, shape=(N,))
🟢  255      out = qd.field(dtype=qd.i32, shape=(N,))
    256  
🟢  257      flag[0] = 1  # runtime value: take the if-branch
    258  
🟢  259      @qd.kernel
🟢  260      def simt_kernel():
🟢  261          qd.loop_config(block_dim=N)
🟢  262          for i in range(N):
🟢  263              a[i] = i + 1
🟢  264              qd.simt.block.sync()
🟢  265              if flag[0] > 0:
🟢  266                  val = subgroup.shuffle(a[i], qd.u32(0))
🟢  267                  out[i] = val
    268              else:
🔴  269                  val = subgroup.shuffle(a[i], qd.u32(1))
🔴  270                  out[i] = val + 100
    271  
🟢  272      simt_kernel()
    273  
🟢  274      for i in range(4):
🟢  275          assert out[i] == 1, f"Expected 1 at index {i}, got {out[i]}"
    276  
🟢  277      cov_field = _kernel_coverage.get_field()
🟢  278      arr = cov_field.to_numpy()
    279  
🟢  280      probes_for_kernel = {pid: loc for pid, loc in _kernel_coverage._probe_map.items() if pid >= probe_count_before}
    281  
🟢  282      fired = {pid for pid in probes_for_kernel if arr[pid] != 0}
🟢  283      not_fired = {pid for pid in probes_for_kernel if arr[pid] == 0}
🟢  284      assert len(fired) >= 4, f"Expected at least 4 probes to fire, got {len(fired)}"
🟢  285      assert len(not_fired) >= 2, "The else branch should not have been reached"
    286  
    287  
🟢  288  @test_utils.test(arch=[qd.cpu, qd.cuda])
🟢  289  def test_kernel_coverage_survives_reinit():
    290      """Verify that coverage data accumulated before qd.init() reset is preserved.
    291  
    292      Runs a kernel, then resets via qd.reset()/qd.init() (which triggers the _hooked_clear harvest), runs another
    293      kernel, harvests again, and checks that _accumulated_lines contains data from both sessions.
    294      """
🟢  295      from quadrants.lang import _kernel_coverage, impl
    296  
🟢  297      current_arch = impl.get_runtime()._arch
🟢  298      _kernel_coverage.ensure_field_allocated()
    299  
🟢  300      probe_count_before = _kernel_coverage._probe_counter
🟢  301      out1 = qd.field(dtype=qd.i32, shape=(1,))
    302  
🟢  303      @qd.kernel
🟢  304      def kernel_before_reset():
🟢  305          out1[0] = 1
    306  
🟢  307      kernel_before_reset()
    308  
🟢  309      cov_field = _kernel_coverage.get_field()
🟢  310      assert cov_field is not None
🟢  311      arr = cov_field.to_numpy()
🟢  312      probes_first = {pid: loc for pid, loc in _kernel_coverage._probe_map.items() if pid >= probe_count_before}
🟢  313      fired_first = {pid for pid in probes_first if arr[pid] != 0}
🟢  314      assert len(fired_first) > 0, "Probes from first kernel should have fired"
    315  
    316      # Don't call _harvest_field() manually — let qd.reset() trigger it via the _hooked_clear hook
🟢  317      qd.reset()
    318  
    319      # Verify the hook harvested data from the first session
🟢  320      files_before = set(_kernel_coverage._accumulated_lines.keys())
🟢  321      assert len(files_before) > 0, "Hook should have harvested data during reset"
🟢  322      lines_before = {}
🟢  323      for f, lines in _kernel_coverage._accumulated_lines.items():
🟢  324          lines_before[f] = set(lines)
    325  
🟢  326      qd.init(arch=current_arch)
    327  
🟢  328      _kernel_coverage.ensure_field_allocated()
    329  
🟢  330      probe_count_mid = _kernel_coverage._probe_counter
🟢  331      out2 = qd.field(dtype=qd.i32, shape=(1,))
    332  
🟢  333      @qd.kernel
🟢  334      def kernel_after_reset():
🟢  335          out2[0] = 2
    336  
🟢  337      kernel_after_reset()
    338  
🟢  339      _kernel_coverage._harvest_field()
    340  
🟢  341      for f in files_before:
🟢  342          assert (
    343              f in _kernel_coverage._accumulated_lines
    344          ), f"File {f} from before reset should still be in _accumulated_lines"
🟢  345          assert lines_before[f].issubset(
    346              _kernel_coverage._accumulated_lines[f]
    347          ), "Lines from before reset should be preserved"
    348  
🟢  349      probes_second = {pid: loc for pid, loc in _kernel_coverage._probe_map.items() if pid >= probe_count_mid}
🟢  350      second_files = {loc[0] for loc in probes_second.values()}
🟢  351      for f in second_files:
🟢  352          assert f in _kernel_coverage._accumulated_lines, f"File {f} from second kernel should be in _accumulated_lines"
    353  
    354  
🟢  355  @test_utils.test(arch=[qd.cpu, qd.cuda])
🟢  356  def test_kernel_coverage_autodiff():
    357      """Verify that autodiff forward pass produces probes but backward does not.
    358  
    359      The forward compilation (AutodiffMode.NONE) should insert probes that fire. The backward compilation
    360      (AutodiffMode.REVERSE) should not add any probes.
    361      """
🟢  362      from quadrants.lang import _kernel_coverage
    363  
🟢  364      _kernel_coverage.ensure_field_allocated()
    365  
🟢  366      x = qd.field(dtype=qd.f32, shape=(), needs_grad=True)
🟢  367      loss = qd.field(dtype=qd.f32, shape=(), needs_grad=True)
    368  
🟢  369      @qd.kernel
🟢  370      def compute():
🟢  371          loss[None] = x[None] * x[None]
    372  
🟢  373      x[None] = 5.0
    374  
🟢  375      probe_count_before = _kernel_coverage._probe_counter
    376  
🟢  377      with qd.ad.Tape(loss):
🟢  378          compute()
    379  
🟢  380      probe_count_after_tape = _kernel_coverage._probe_counter
🟢  381      forward_probes = probe_count_after_tape - probe_count_before
🟢  382      assert forward_probes > 0, "Forward compilation should have inserted probes"
    383  
    384      # Verify forward probes actually fired
🟢  385      cov_field = _kernel_coverage.get_field()
🟢  386      assert cov_field is not None
🟢  387      arr = cov_field.to_numpy()
🟢  388      probes = {pid: loc for pid, loc in _kernel_coverage._probe_map.items() if pid >= probe_count_before}
🟢  389      fired = {pid for pid in probes if arr[pid] != 0}
🟢  390      assert len(fired) > 0, "Forward pass inside Tape should produce fired coverage probes"
    391  
    392      # Verify backward pass computes correct gradients
🟢  393      assert loss[None] == pytest.approx(25.0)
🟢  394      assert x.grad[None] == pytest.approx(10.0)
    395  
    396  
🟢  397  @test_utils.test(arch=[qd.cpu, qd.cuda])
🟢  398  def test_kernel_coverage_qd_func():
    399      """Verify that probes fire inside a @qd.func called from a kernel."""
🟢  400      from quadrants.lang import _kernel_coverage
    401  
🟢  402      _kernel_coverage.ensure_field_allocated()
    403  
🟢  404      probe_count_before = _kernel_coverage._probe_counter
🟢  405      out = qd.field(dtype=qd.i32, shape=(1,))
    406  
🟢  407      @qd.func
🟢  408      def helper():
🟢  409          out[0] = 99
    410  
🟢  411      @qd.kernel
🟢  412      def caller():
🟢  413          helper()
    414  
🟢  415      caller()
    416  
🟢  417      assert out[0] == 99
    418  
🟢  419      cov_field = _kernel_coverage.get_field()
🟢  420      assert cov_field is not None
🟢  421      arr = cov_field.to_numpy()
    422  
🟢  423      probes = {pid: loc for pid, loc in _kernel_coverage._probe_map.items() if pid >= probe_count_before}
🟢  424      fired = {pid for pid in probes if arr[pid] != 0}
    425      # The kernel body has one statement (helper()), and the func body has one (out[0] = 99).
    426      # Both should produce probes that fire.
🟢  427      assert (
    428          len(fired) >= 2
    429      ), f"Expected probes from both kernel and func to fire, got {len(fired)} fired out of {len(probes)}"
    430  
    431  
🟢  432  @test_utils.test(arch=[qd.cpu, qd.cuda])
🟢  433  def test_kernel_coverage_multiple_kernels_same_session():
    434      """Verify that probes from two different kernels both fire in the same session."""
🟢  435      from quadrants.lang import _kernel_coverage
    436  
🟢  437      _kernel_coverage.ensure_field_allocated()
    438  
🟢  439      probe_count_before = _kernel_coverage._probe_counter
🟢  440      a = qd.field(dtype=qd.i32, shape=(1,))
🟢  441      b = qd.field(dtype=qd.i32, shape=(1,))
    442  
🟢  443      @qd.kernel
🟢  444      def kernel_a():
🟢  445          a[0] = 10
    446  
🟢  447      @qd.kernel
🟢  448      def kernel_b():
🟢  449          b[0] = 20
    450  
🟢  451      kernel_a()
🟢  452      probe_count_after_a = _kernel_coverage._probe_counter
🟢  453      kernel_b()
    454  
🟢  455      assert a[0] == 10
🟢  456      assert b[0] == 20
    457  
🟢  458      cov_field = _kernel_coverage.get_field()
🟢  459      arr = cov_field.to_numpy()
    460  
🟢  461      probes_a = {
    462          pid: loc for pid, loc in _kernel_coverage._probe_map.items() if probe_count_before <= pid < probe_count_after_a
    463      }
🟢  464      probes_b = {pid: loc for pid, loc in _kernel_coverage._probe_map.items() if pid >= probe_count_after_a}
    465  
🟢  466      fired_a = {pid for pid in probes_a if arr[pid] != 0}
🟢  467      fired_b = {pid for pid in probes_b if arr[pid] != 0}
    468  
🟢  469      assert len(fired_a) > 0, "Probes from kernel_a should have fired"
🟢  470      assert len(fired_b) > 0, "Probes from kernel_b should have fired"
    471  
    472  
🟢  473  @test_utils.test(arch=[qd.cpu, qd.cuda])
🟢  474  def test_qd_prefix_exemption_pure_kernel():
    475      """Verify that _qd_-prefixed globals don't violate pure kernel checks.
    476  
    477      With kernel coverage enabled, _qd_cov is injected as a global. This test verifies that a pure (fastcache)
    478      kernel still compiles without error. The kernel uses ndarray arguments (not global fields) because pure
    479      kernels prohibit non-_qd_ globals.
    480      """
🟢  481      a = qd.ndarray(qd.i32, (1,))
    482  
🟢  483      @qd.kernel(fastcache=True)
🟢  484      def pure_kernel(arr: qd.types.NDArray) -> None:
🟢  485          arr[0] = 42
    486  
🟢  487      pure_kernel(a)
🟢  488      assert a[0] == 42
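
Most of the end-to-end tests above repeat one pattern: snapshot the probe counter, run a kernel, then split the probes allocated since the snapshot into fired and not fired. A minimal sketch of that pattern on plain Python data follows. `split_probes` is a hypothetical helper, and the probe map and values are made-up stand-ins for `_kernel_coverage._probe_map` and `cov_field.to_numpy()`.

```python
def split_probes(probe_map, values, start, stop=None):
    """Return (fired, not_fired) probe ids in [start, stop), based on nonzero field values."""
    stop = len(values) if stop is None else stop
    in_range = [pid for pid in probe_map if start <= pid < stop]
    fired = {pid for pid in in_range if values[pid] != 0}
    not_fired = set(in_range) - fired
    return fired, not_fired


# Made-up data: probes 0-1 belong to an earlier kernel, probes 2-4 to the kernel under test.
probe_map = {0: ("k.py", 3), 1: ("k.py", 4), 2: ("k.py", 10), 3: ("k.py", 11), 4: ("k.py", 13)}
values = [1, 1, 1, 1, 0]  # stand-in for the coverage field contents
counter_before = 2  # probe counter snapshot taken before the kernel under test was compiled
fired, not_fired = split_probes(probe_map, values, counter_before)
```

Filtering by probe id range keeps probes from earlier kernels in the same session out of the assertion, which is why the tests record the counter before each kernel runs.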
🟢 tests/python/test_offline_cache.py (100%)
     16  # Coverage field allocation creates internal fill kernels that change cache file counts.
     17  # CI runs these tests in a separate phase without QD_KERNEL_COVERAGE (see 4_test.sh).
🟢   18  pytestmark = pytest.mark.skipif(
     19      os.environ.get("QD_KERNEL_COVERAGE") == "1",
     20      reason="Kernel coverage adds internal kernels that invalidate cache file count assertions",
     21  )
     22  
🟢 tests/python/test_snode_layout_inspection.py (100%)
🟢    1  import os
      2  
🟢    3  import pytest
      4  
🟢   10  @pytest.mark.skipif(
     11      os.environ.get("QD_KERNEL_COVERAGE") == "1",
     12      reason="Kernel coverage field on root shifts offset assertions",
     13  )

@hughperkins hughperkins merged commit 2aae667 into main Apr 25, 2026
21 of 22 checks passed
@hughperkins hughperkins deleted the hp/pr-coverage-w-kernels branch April 25, 2026 19:55
@github-actions

Coverage Report (2b7d0561d)

| Metric | Value |
| --- | --- |
| Diff coverage (changed lines only) | 0% |
| Overall project coverage | 73% |

Total: 0 lines, 0 missing, 0% covered
