Skip to content

geotiff: GPU + dask+GPU coverage for float16 read (#1941)#1947

Merged
brendancol merged 2 commits into
xarray-contrib:mainfrom
brendancol:deep-sweep-test-coverage-geotiff-2026-05-15-1778858256-01
May 15, 2026
Merged

geotiff: GPU + dask+GPU coverage for float16 read (#1941)#1947
brendancol merged 2 commits into
xarray-contrib:mainfrom
brendancol:deep-sweep-test-coverage-geotiff-2026-05-15-1778858256-01

Conversation

@brendancol
Copy link
Copy Markdown
Contributor

Summary

Closes a Cat 1 HIGH backend-parity gap for issue #1941 (float16 read auto-promotion). The eager and dask paths were covered by test_float16_read_1941.py; the cupy and dask+cupy paths had no targeted tests.

A regression that:

  • dropped the bps_mismatch stripped/odd-bps fallback at _backends/gpu.py:357 would route float16 reads through the tiled GPU decoder and mis-decode half-precision samples
  • dropped the (bps=16, sample_format=float) early-out in _gds_chunk_path_available would send tiled float16 chunked reads down the kvikIO GDS path and mis-stride the buffer

would have shipped silently. The new tests pin both code paths plus cross-backend parity.

13 new tests, all passing on a CUDA host:

  • read_geotiff_gpu on stripped + tiled (deflate, uncompressed) float16
  • open_geotiff(gpu=True) dispatcher thread-through
  • Windowed GPU reads on stripped + tiled float16
  • open_geotiff(chunks=, gpu=True) and read_geotiff_gpu(chunks=)
  • Structural pin: _gds_chunk_path_available returns False for (bps=16, sf=3); float32 tiled files still pass
  • Cross-backend pixel-exact parity (numpy vs GPU, numpy vs dask+GPU, dask+numpy vs dask+GPU)
  • predictor=3 + float16 GPU round trip

Mutation against bps_mismatch flipped 5 tests red; mutation against the GDS float16 gate flipped the structural test red.

This is a deep-sweep test-coverage PR. No source files were modified.

Test plan

  • pytest xrspatial/geotiff/tests/test_float16_read_gpu_1941.py (13 passed, 1 skipped due to kvikio import)
  • pytest xrspatial/geotiff/tests/test_float16_read_1941.py xrspatial/geotiff/tests/test_float16_read_gpu_1941.py runs side by side (22 passed, 1 skipped)
  • Mutation: drop bps_mismatch from the stripped/odd-bps guard -> 5 tests fail
  • Mutation: drop float16 gate from _gds_chunk_path_available -> structural test fails

Copilot AI review requested due to automatic review settings May 15, 2026 15:23
@github-actions github-actions Bot added the performance PR touches performance-sensitive code label May 15, 2026
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

@brendancol
Copy link
Copy Markdown
Contributor Author

PR Review: geotiff: GPU + dask+GPU coverage for float16 read (#1941)

Blockers (must fix before merge)

None.

Suggestions (should fix, not blocking)

  • test_gds_path_gated_off_for_float16 (xrspatial/geotiff/tests/test_float16_read_gpu_1941.py:184) doesn't pytest.importorskip("kvikio"). When kvikio is missing, _gds_chunk_path_available returns False at the kvikio probe (_backends/gpu.py:768), several lines before the float16 gate at line 791. The test asserts result is False regardless, so it passes vacuously, and the mutation claim "drop the float16 gate, test fails" only holds when kvikio is installed. The sibling test_gds_path_allowed_for_float32_tiled already does pytest.importorskip("kvikio") on line 226; adding the same here would close the gap.

  • All three tile fixtures (xrspatial/geotiff/tests/test_float16_read_gpu_1941.py:80, 96) write a 16x16 image with a 16x16 tile, which produces a one-tile file. The tile-grid stride and inter-tile reassembly logic in the GPU decoder is never exercised. A 32x32 image with 16x16 tiles (4 tiles) would also exercise the multi-tile stitching path.

Nits (optional improvements)

  • In test_gds_path_gated_off_for_float16 (xrspatial/geotiff/tests/test_float16_read_gpu_1941.py:206-208), the test extracts bps = ifd.bits_per_sample; if isinstance(bps, tuple): bps = bps[0] without the empty-tuple guard the function uses (bps_first[0] if bps_first else 0). A valid TIFF can't produce (), so this is purely cosmetic.

What looks good

  • Both reach paths are tested: the dispatcher (open_geotiff(..., gpu=True)) and the direct call (read_geotiff_gpu).
  • Pixel-exact backend parity is checked three ways: numpy vs GPU, numpy vs dask+GPU, dask+numpy vs dask+GPU (lines 251-296).
  • Windowed reads exercise the bps_mismatch CPU fallback on both stripped and tiled layouts.
  • Predictor=3 + float16 round-trip is included (lines 303-328).
  • Uses .data.get() for cupy access, not .values (the convention in this module).
  • tmp_path gives each test its own filesystem prefix.
  • Module-level pytestmark skips the file when cupy/CUDA is absent.
  • Mutation evidence is in the PR body: dropping bps_mismatch flips 5 tests red; dropping the GDS float16 gate flips the structural test red (subject to the kvikio caveat above).
  • Single file, 333 additions, no source changes.

Checklist

  • Algorithm matches reference/paper — N/A (test-only PR)
  • All implemented backends produce consistent results — pinned by parity tests
  • NaN handling is correct — N/A (float16 fixtures contain no NaN)
  • Edge cases are covered by tests — stripped, tiled, windowed, chunked, predictor=3
  • Dask chunk boundaries handled correctly — chunks=8 on 16x16 yields 4 blocks
  • No premature materialization or unnecessary copies — one .compute() per test
  • Benchmark exists or is not needed — not needed (test-only)
  • README feature matrix updated — N/A (no public surface change)
  • Docstrings present and accurate — every class and most tests have one

…trib#1941)

Issue xarray-contrib#1941 added float16 auto-promotion on read and gated the GPU
GDS path off for (bps=16, sf=float). The eager numpy and dask paths
are covered by test_float16_read_1941.py; the cupy and dask+cupy
paths had no targeted tests. A regression dropping the bps_mismatch
fallback at _backends/gpu.py:357 or the float16 gate in
_gds_chunk_path_available would silently mis-decode half-precision
tiles and ship under existing CI.

Adds 13 tests, all passing on a CUDA host:
- read_geotiff_gpu on stripped + tiled (deflate, uncompressed) float16
- open_geotiff(gpu=True) dispatcher thread-through
- windowed GPU reads on stripped + tiled float16
- open_geotiff(chunks=, gpu=True) and read_geotiff_gpu(chunks=)
- _gds_chunk_path_available structural pin for (bps=16, sf=3) -> False
  plus a sanity check that float32 tiled files still allow GDS
- cross-backend pixel-exact parity (numpy vs GPU, numpy vs dask+GPU,
  dask+numpy vs dask+GPU)
- predictor=3 + float16 GPU round trip

Mutation against bps_mismatch flipped 5 tests red; mutation against
the GDS float16 gate flipped the structural test red.
@brendancol brendancol force-pushed the deep-sweep-test-coverage-geotiff-2026-05-15-1778858256-01 branch from 9ecd552 to c485361 Compare May 15, 2026 17:34
@brendancol
Copy link
Copy Markdown
Contributor Author

PR Review: geotiff: GPU + dask+GPU coverage for float16 read (#1941) (re-review after rebase)

Blockers (must fix before merge)

  • None

Suggestions (should fix, not blocking)

  • None

Nits (optional improvements)

  • xrspatial/geotiff/tests/test_float16_read_gpu_1941.py:30 imports os but the module never uses it; safe to drop.
  • xrspatial/geotiff/tests/test_float16_read_gpu_1941.py:258 calls pytest.importorskip("kvikio") which trips a PytestDeprecationWarning in environments where kvikio is present but libkvikio.so is missing (the shared lib raises ImportError at import time, and pytest 9.1 will treat this as an error). Pass exc_type=ImportError to silence it, or guard with a find_spec + try/except like _gpu_available does at line 37.

What looks good

  • Module-level skip via _gpu_available() keeps the suite green on CPU-only builds.
  • Every fixture uses pytest.importorskip("tifffile") and the predictor=3 case also gates on imagecodecs, so optional deps stay optional.
  • Coverage hits all three regression vectors named in the docstring (stripped bps_mismatch fallback, tiled GDS gate at gpu.py:791, dtype map entry at (16, FLOAT) -> float32) plus the float32 sanity test at line 248 that pins the guard to float16 only.
  • Backend parity class compares numpy / GPU / dask+GPU pixel-exact, which is the right bar for a lossless promotion.

CI status

  • Windows 3.12 failure addressed by rebase onto 9ce0e60 (Windows handle fix from main). New head on fork: c485361.
  • readthedocs failure: infra, not content. The sphinx build step (python -m sphinx -T -b html ...) on build 32707706 has exit_code: null and empty output, meaning RTD killed it before it produced any log. The PR only touches a test file, so a content regression is implausible; the same symptom appears on other unrelated PRs.

Checklist

  • Algorithm matches reference/paper (test-only; pins behaviour from geotiff: float16 GeoTIFFs cannot be read despite writer auto-promotion #1941)
  • All implemented backends produce consistent results (TestBackendParityFloat16)
  • NaN handling is correct (not covered; float16 NaN round-trip would be a useful follow-up but is out of scope for this PR)
  • Edge cases are covered by tests
  • Dask chunk boundaries handled correctly (chunks=8 on 16x16 input exercises a 2x2 chunk grid)
  • No premature materialization or unnecessary copies
  • Benchmark exists or is not needed (not needed; test-only)
  • README feature matrix updated (if applicable) (n/a)
  • Docstrings present and accurate

@brendancol
Copy link
Copy Markdown
Contributor Author

Addressed the review in e18236c:

  • Added pytest.importorskip("kvikio", exc_type=ImportError) to test_gds_path_gated_off_for_float16 so the mutation-kill assertion no longer passes vacuously when kvikio is absent.
  • Switched float16_tiled_tif to a 32x32 image with 16x16 tiles (2x2 tile grid) so inter-tile reassembly is exercised. The window slice (arr[:8,:8]), chunks=8 dask reads, and full-array parity assertions all still hold against the larger fixture.
  • Mirrored the production empty-tuple guard (bps_first[0] if bps_first else 0) for the bps unpack in the GDS-gate test.
  • Added exc_type=ImportError to the existing kvikio importorskip in the float32 sibling test.
  • Dropped the unused import os.

Local run: 12 passed, 2 skipped (kvikio-gated) on a CPU+GPU host without kvikio.

@brendancol brendancol merged commit 31c8f77 into xarray-contrib:main May 15, 2026
4 of 5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance PR touches performance-sensitive code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants