geotiff: cover sparse-strip parallel decode in _read_strips and HTTP path (#2100) #2109

Merged
brendancol merged 2 commits into xarray-contrib:main from brendancol:deep-sweep-test-coverage-geotiff-2026-05-18-1779164339
May 19, 2026

@brendancol
Contributor

Summary

The strip-decode parallelisation in #2100 / #2104 added a collect-decode-place pipeline in both _read_strips and _fetch_decode_cog_http_strips. The job-collection loop filters sparse strips (byte_counts[idx] == 0) before they reach the ThreadPoolExecutor. test_parallel_strip_decode_2100.py covers parallel/serial parity, the pool-engaged branch, single-strip short-circuit, windowed reads, and planar=2, but every fixture is fully populated. The 128x128 sparse fixture in test_sparse_cog.py is below the 64K-pixel parallel gate, so the sparse-strip filter inside the parallel branch is untested.

A regression that lost the byte_counts == 0 guard would silently ship: the decoder would receive an empty data slice and either raise "Decompressed tile/strip size mismatch" or return corrupt pixels.
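
The collect-decode-place shape and the role of the sparse guard can be sketched roughly as follows. This is a minimal illustration, not the code in _read_strips: the names `decode_strip` and `read_strips_sketch`, the zlib codec, the zero fill for sparse strips, and the omission of the 64K-pixel gate are all assumptions made for the example.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def decode_strip(data, expected_nbytes):
    """Decompress one strip and verify its size (hypothetical helper)."""
    raw = zlib.decompress(data)
    if len(raw) != expected_nbytes:
        raise ValueError(
            f"strip size mismatch: expected {expected_nbytes}, got {len(raw)}"
        )
    return raw


def read_strips_sketch(strips, byte_counts, strip_nbytes, parallel=True):
    """Collect-decode-place over compressed strips (illustrative only)."""
    # Assumption: sparse strips read back as zeros; a real reader may
    # use a nodata fill instead.
    out = bytearray(strip_nbytes * len(strips))

    # Collect: drop sparse strips before anything reaches the pool.
    jobs = []
    for idx, count in enumerate(byte_counts):
        if count == 0:  # the guard the new tests protect
            continue
        jobs.append((idx, strips[idx]))

    def run(job):
        return decode_strip(job[1], strip_nbytes)

    # Decode: engage the pool only when there is more than one job.
    if parallel and len(jobs) > 1:
        with ThreadPoolExecutor() as pool:
            decoded = list(pool.map(run, jobs))
    else:
        decoded = [run(job) for job in jobs]

    # Place: copy each decoded strip into its slot in the output buffer.
    for (idx, _), raw in zip(jobs, decoded):
        out[idx * strip_nbytes:(idx + 1) * strip_nbytes] = raw
    return bytes(out)


# Four 8-byte strips; strips 1 and 2 are sparse (no bytes on disk).
strips = [zlib.compress(bytes([i + 1]) * 8) for i in range(4)]
strips[1] = strips[2] = b""
byte_counts = [len(s) for s in strips]

parallel_out = read_strips_sketch(strips, byte_counts, 8, parallel=True)
serial_out = read_strips_sketch(strips, byte_counts, 8, parallel=False)
```

In this sketch the two sparse strips never reach `decode_strip`, so both code paths return the same buffer with zeros in the sparse rows. Deleting the `continue` would hand the decoder an empty slice, which is the failure mode described above.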

Seven new tests cover the gap:

  • local-strip full-image parallel/serial parity with sparse strips
  • parallel-pool-engaged on a multi-strip sparse image
  • windowed read across the sparse boundary
  • all-sparse degenerate input
  • planar=2 multi-band sparse parity
  • HTTP windowed read on a non-sparse strict subset
  • HTTP windowed read across the sparse boundary

Mutation against the strip-job collection sparse guard (delete the byte_counts == 0 continue) flips 5 of 5 local tests red with "Decompressed tile/strip size mismatch: expected ... got 0"; mutation against the HTTP path sparse guard at line 2646 flips the boundary HTTP test red. Source untouched.

Test plan

  • pytest xrspatial/geotiff/tests/test_parallel_strip_decode_sparse_2100.py -v (7 / 7 pass locally)
  • pytest xrspatial/geotiff/tests/test_parallel_strip_decode_2100.py xrspatial/geotiff/tests/test_parallel_strip_decode_sparse_2100.py (16 / 16 pass)
  • Mutation: drop sparse guards in _read_strips, confirm new tests fail
  • Source restored to baseline (md5sum matches pre-mutation copy)
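
The mutate, run, restore, verify loop in the plan above could be scripted along these lines. This is only a sketch of one way to do it: `md5_of` and `run_with_mutation` are hypothetical helper names, the stub file stands in for the real reader module, and the `check` callback is a placeholder for whatever actually runs the tests (for example a pytest subprocess).

```python
import hashlib
import tempfile
from pathlib import Path


def md5_of(path):
    """Hex MD5 digest of a file's bytes, comparable to md5sum output."""
    return hashlib.md5(Path(path).read_bytes()).hexdigest()


def run_with_mutation(source, old, new, check):
    """Apply a one-off textual mutation, run check(), then restore.

    Hypothetical helper: raises if the mutation target is missing or if
    the restored file does not hash to the pre-mutation baseline.
    """
    source = Path(source)
    baseline = md5_of(source)
    backup = source.with_name(source.name + ".orig")
    backup.write_bytes(source.read_bytes())
    try:
        text = source.read_text()
        if old not in text:
            raise ValueError("mutation target not found in source")
        source.write_text(text.replace(old, new, 1))
        result = check()
    finally:
        # Restore from the backup even if check() raised.
        backup.replace(source)
    if md5_of(source) != baseline:
        raise RuntimeError("source was not restored to baseline")
    return result


# Demo on a stub file; the real target would be the reader module.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "reader_stub.py"
    src.write_text("if byte_counts[idx] == 0:\n    continue\n")
    before = md5_of(src)
    # Here check() just returns the mutated text so it can be inspected.
    mutated_text = run_with_mutation(src, "continue", "pass", src.read_text)
    restored = md5_of(src) == before
```

Restoring from a byte-for-byte backup, rather than re-applying the inverse text edit, is what makes the hash comparison meaningful.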

deep-sweep-test-coverage pass 18; HIGH category 3 + 4.

…path (xarray-contrib#2100)

The strip-decode parallelisation in xarray-contrib#2100 / xarray-contrib#2104 added a
collect-decode-place pipeline in both _read_strips and
_fetch_decode_cog_http_strips. The job-collection loop filters sparse
strips (byte_counts[idx] == 0) before they reach the
ThreadPoolExecutor. The existing test_parallel_strip_decode_2100.py
covers parallel/serial parity, the pool-engaged branch, the
single-strip serial short-circuit, windowed strip reads, and planar=2
multi-band, but every fixture is fully populated. The 128x128 sparse
fixture in test_sparse_cog.py is below the 64K-pixel parallel gate, so
the sparse-strip filter inside the parallel branch is untested.

A regression that lost the byte_counts==0 guard would silently ship:
the decoder would receive an empty data slice and either raise
'Decompressed tile/strip size mismatch' or return corrupt pixels.

Seven tests cover:
- local-strip full-image parallel/serial parity with sparse strips
- parallel-pool-engaged on a multi-strip sparse image
- windowed read across the sparse boundary
- all-sparse degenerate (zero filled rows -> empty job list ->
  short-circuit gate)
- planar=2 sparse parity (dedicated samples>1 branch)
- HTTP windowed read on a non-sparse strict subset
- HTTP windowed read across the sparse boundary

Mutation against the strip-job collection sparse guard (delete the
byte_counts == 0 continue) flips 5 of 5 local tests red with
'Decompressed tile/strip size mismatch: expected ... got 0'; mutation
against the HTTP path sparse guard at line 2646 flips the boundary
HTTP test red. Source untouched; clean restore verified via md5sum.

deep-sweep-test-coverage pass 18; categories 3 + 4 (HIGH).
@github-actions (bot) added the performance label (PR touches performance-sensitive code) on May 19, 2026

Module-level ``import rasterio`` raised ``ModuleNotFoundError`` during
pytest collection on CI hosts without rasterio, marking the whole run
red even though every other test was fine. Use ``pytest.importorskip``
at the top of the module so the file skips cleanly on those hosts. The
fixtures genuinely need rasterio (they exercise the GDAL ``SPARSE_OK``
writer option); the reader code under test does not.

Mirrors the pattern other golden-corpus tests already use.
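
For illustration, the skip mechanism looks roughly like this. In the test module itself the call would sit unguarded at the top, as `rasterio = pytest.importorskip("rasterio")`; the wrapper below (`import_or_skip`, a hypothetical name) exists only so the behaviour is visible when run as a plain script outside pytest.

```python
import pytest


def import_or_skip(name):
    """Return (module, None), or (None, reason) if pytest would skip.

    Hypothetical wrapper for demonstration. Under pytest the skip
    exception is left to propagate so the whole module is skipped.
    """
    try:
        return pytest.importorskip(name), None
    except pytest.skip.Exception as exc:
        return None, str(exc)


# A module that is always present imports normally.
json_mod, json_reason = import_or_skip("json")

# A missing module produces a skip outcome, not ModuleNotFoundError.
# The module name here is made up so the import is guaranteed to fail.
missing_mod, missing_reason = import_or_skip("no_such_module_for_pr_2109")
```

Because the skip is raised during import of the test file, collection on hosts without rasterio reports the module as skipped instead of erroring.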
@brendancol brendancol merged commit 180d5f5 into xarray-contrib:main May 19, 2026
5 checks passed