geotiff: cover sparse-strip parallel decode in _read_strips and HTTP path (#2100) #2109
Merged
brendancol merged 2 commits intoMay 19, 2026
…path (xarray-contrib#2100)

The strip-decode parallelisation in xarray-contrib#2100 / xarray-contrib#2104 added a collect-decode-place pipeline in both _read_strips and _fetch_decode_cog_http_strips. The job-collection loop filters sparse strips (byte_counts[idx] == 0) before they reach the ThreadPoolExecutor.

The existing test_parallel_strip_decode_2100.py covers parallel/serial parity, the pool-engaged branch, the single-strip serial short-circuit, windowed strip reads, and planar=2 multi-band, but every fixture is fully populated. The 128x128 sparse fixture in test_sparse_cog.py is below the 64K-pixel parallel gate, so the sparse-strip filter inside the parallel branch is untested.

A regression that lost the byte_counts == 0 guard would silently ship: the decoder would receive an empty data slice and either raise 'Decompressed tile/strip size mismatch' or return corrupt pixels.

Seven tests cover:
- local-strip full-image parallel/serial parity with sparse strips
- parallel-pool-engaged on a multi-strip sparse image
- windowed read across the sparse boundary
- all-sparse degenerate (zero filled rows -> empty job list -> short-circuit gate)
- planar=2 sparse parity (dedicated samples>1 branch)
- HTTP windowed read on a non-sparse strict subset
- HTTP windowed read across the sparse boundary

Mutation against the strip-job collection sparse guard (delete the byte_counts == 0 continue) flips 5 of 5 local tests red with 'Decompressed tile/strip size mismatch: expected ... got 0'; mutation against the HTTP path sparse guard at line 2646 flips the boundary HTTP test red. Source untouched; clean restore verified via md5sum.

deep-sweep-test-coverage pass 18; categories 3 + 4 (HIGH).
Module-level ``import rasterio`` raised ``ModuleNotFoundError`` during pytest collection on CI hosts without rasterio, marking the whole run red even though every other test was fine. Use ``pytest.importorskip`` at the top of the module so the file skips cleanly on those hosts. The fixtures genuinely need rasterio (they exercise the GDAL ``SPARSE_OK`` writer option); the reader code under test does not. Mirrors the pattern other golden-corpus tests already use.
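For reference, a minimal sketch of the module-level skip pattern described above. The stdlib ``json`` module stands in for ``rasterio`` so the snippet runs on any host with pytest installed; the variable and test names are hypothetical and not taken from the actual test file.

```python
import pytest

# Module-level guard: if the import fails, pytest skips every test in this
# file at collection time instead of reporting a collection error.
# ASSUMPTION: "json" is only a stand-in for the real optional dependency.
optional_dep = pytest.importorskip("json")


def test_uses_optional_dependency():
    # Only runs when the guarded import above succeeded.
    assert optional_dep.loads("[1, 2]") == [1, 2]
```

With a missing module name, the same call raises pytest's skip outcome rather than ``ModuleNotFoundError``, which is what keeps the run green on hosts without the dependency.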
Summary
The strip-decode parallelisation in #2100 / #2104 added a collect-decode-place pipeline in both _read_strips and _fetch_decode_cog_http_strips. The job-collection loop filters sparse strips (byte_counts[idx] == 0) before they reach the ThreadPoolExecutor.

test_parallel_strip_decode_2100.py covers parallel/serial parity, the pool-engaged branch, single-strip short-circuit, windowed reads, and planar=2, but every fixture is fully populated. The 128x128 sparse fixture in test_sparse_cog.py is below the 64K-pixel parallel gate, so the sparse-strip filter inside the parallel branch is untested.

A regression that lost the byte_counts == 0 guard would silently ship: the decoder would receive an empty data slice and either raise Decompressed tile/strip size mismatch or return corrupt pixels.

Seven new tests cover the gap:
- local-strip full-image parallel/serial parity with sparse strips
- parallel-pool-engaged on a multi-strip sparse image
- windowed read across the sparse boundary
- all-sparse degenerate (zero filled rows -> empty job list -> short-circuit gate)
- planar=2 sparse parity (dedicated samples>1 branch)
- HTTP windowed read on a non-sparse strict subset
- HTTP windowed read across the sparse boundary

Mutation against the strip-job collection sparse guard (delete the byte_counts == 0 continue) flips 5 of 5 local tests red with Decompressed tile/strip size mismatch: expected ... got 0; mutation against the HTTP path sparse guard at line 2646 flips the boundary HTTP test red. Source untouched.

Test plan
- pytest xrspatial/geotiff/tests/test_parallel_strip_decode_sparse_2100.py -v (7 / 7 pass locally)
- pytest xrspatial/geotiff/tests/test_parallel_strip_decode_2100.py xrspatial/geotiff/tests/test_parallel_strip_decode_sparse_2100.py (16 / 16 pass)
- _read_strips, confirm new tests fail

deep-sweep-test-coverage pass 18; HIGH category 3 + 4.
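To illustrate what the sparse guard protects, here is a stdlib-only sketch of a collect-decode-place pipeline. It is not the code in _read_strips: the function names, the toy strip layout, the zlib codec, and the guard_sparse switch are all assumptions made for illustration. The switch only exists to mimic the mutation (deleting the byte_counts == 0 continue) from outside.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

ROW_BYTES = 4        # bytes per image row in this toy layout (assumed)
ROWS_PER_STRIP = 2   # rows stored in each strip (assumed)


def read_strips(blob, offsets, byte_counts, n_strips, guard_sparse=True, workers=4):
    """Collect non-sparse strip jobs, decode them in a pool, place the results."""
    strip_bytes = ROW_BYTES * ROWS_PER_STRIP
    out = bytearray(strip_bytes * n_strips)  # zero-filled: sparse strips read as 0

    # Collect: skip sparse strips before anything reaches the pool.
    jobs = []
    for idx in range(n_strips):
        if guard_sparse and byte_counts[idx] == 0:
            continue
        start = offsets[idx]
        jobs.append((idx, blob[start:start + byte_counts[idx]]))

    if not jobs:  # all-sparse image: nothing to decode
        return bytes(out)

    # Decode: serial for a single job, pooled otherwise.
    if len(jobs) == 1:
        decoded = [zlib.decompress(jobs[0][1])]
    else:
        with ThreadPoolExecutor(max_workers=workers) as pool:
            decoded = list(pool.map(zlib.decompress, [data for _, data in jobs]))

    # Place: copy each decoded strip into its slot in the output buffer.
    for (idx, _), chunk in zip(jobs, decoded):
        if len(chunk) != strip_bytes:
            raise ValueError(
                f"strip size mismatch: expected {strip_bytes} got {len(chunk)}"
            )
        out[idx * strip_bytes:(idx + 1) * strip_bytes] = chunk
    return bytes(out)


def build_sparse_file(strips):
    """Pack strips into one blob; a None entry becomes a sparse strip (byte count 0)."""
    blob, offsets, byte_counts = bytearray(), [], []
    for raw in strips:
        if raw is None:
            offsets.append(0)
            byte_counts.append(0)
            continue
        packed = zlib.compress(raw)
        offsets.append(len(blob))
        byte_counts.append(len(packed))
        blob += packed
    return bytes(blob), offsets, byte_counts


# Two populated strips interleaved with two sparse ones.
strips = [b"\x01" * 8, None, b"\x03" * 8, None]
blob, offsets, byte_counts = build_sparse_file(strips)
image = read_strips(blob, offsets, byte_counts, n_strips=4)
```

With the guard in place the sparse strips come back zero-filled. Calling read_strips(..., guard_sparse=False) hands an empty slice to the decoder, which in this sketch surfaces as a zlib.error rather than the size-mismatch message the real reader reports.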