Summary
Two strip-decode paths run their per-strip codec calls in a Python for-loop while the matching tile paths use a ThreadPoolExecutor gated on _PARALLEL_DECODE_PIXEL_THRESHOLD (64K pixels). Codec decode for deflate, zstd and LZW releases the GIL, so the tile paths overlap C-level work across cores; the strip paths leave that parallelism on the table.
Pattern matches issue #1980 (the previous audit fixed the HTTP tile path in #1981).
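As a rough illustration of the GIL point (not the reader's actual code), the sketch below decompresses several deflate buffers with the standard-library zlib module, once in a plain loop and once through ThreadPoolExecutor.map. The buffer contents, sizes and variable names are made up for the example.

```python
import os
import zlib
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for compressed strips: 8 buffers of 1 MiB each.
raw_strips = [bytes([i]) * (1 << 20) for i in range(8)]
compressed = [zlib.compress(s) for s in raw_strips]

# Sequential decode, one strip after another.
sequential = [zlib.decompress(c) for c in compressed]

# Threaded decode. zlib.decompress releases the GIL while it runs,
# so the calls can overlap across cores. map() preserves input order.
workers = min(len(compressed), os.cpu_count() or 4)
with ThreadPoolExecutor(max_workers=workers) as pool:
    threaded = list(pool.map(zlib.decompress, compressed))
```

Both lists hold the same bytes in the same order; only the wall-clock time differs, and by how much depends on the codec, the payload size and the number of cores.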
Locations
xrspatial/geotiff/_reader.py _read_strips at ~L1972 -- local-file strip decode. Tile counterpart _read_tiles at ~L2146 already parallelises.
xrspatial/geotiff/_reader.py _fetch_decode_cog_http_strips at ~L2670 -- HTTP COG strip decode. Tile counterpart _fetch_decode_cog_http_tiles at ~L2898 already parallelises (fixed in #1981).
Proposed fix
Mirror the tile-path gate in both strip paths:
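Compute strip_pixels = width * rps.
When n_strips > 1 and strip_pixels >= _PARALLEL_DECODE_PIXEL_THRESHOLD, run the _decode_strip_or_tile calls via a ThreadPoolExecutor with min(n_strips, os.cpu_count() or 4) workers.
Keep the placement loop sequential to avoid contending writes into the output buffer.
Why MEDIUM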
Most real-world strip layouts (width >= 1024, rows per strip >= 64) meet the 64K-pixel gate per strip, so the speedup applies to most multi-strip reads. Codec choice matters: deflate, zstd and LZW release the GIL during decompression; uncompressed strips still pay a numpy frombuffer + copy cost per strip, which the threaded path overlaps.
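The proposed gate could look roughly like the sketch below. It is a minimal stand-alone version using only the standard library, under several assumptions: decode_one is a hypothetical stand-in for _decode_strip_or_tile (whose real signature is not shown in this issue), the threshold is taken to be exactly 64 * 1024, and the image is 8-bit single-band so one pixel is one byte.

```python
import os
import zlib
from concurrent.futures import ThreadPoolExecutor

# Assumed value: the issue only says "64K pixels".
PARALLEL_DECODE_PIXEL_THRESHOLD = 64 * 1024


def decode_one(payload: bytes) -> bytes:
    """Hypothetical stand-in for _decode_strip_or_tile."""
    return zlib.decompress(payload)


def read_strips(payloads, width, rps, height):
    """Decode 8-bit single-band strips into one output buffer."""
    n_strips = len(payloads)
    strip_pixels = width * rps

    # Same gate as the tile paths: only thread when there is enough work.
    if n_strips > 1 and strip_pixels >= PARALLEL_DECODE_PIXEL_THRESHOLD:
        workers = min(n_strips, os.cpu_count() or 4)
        with ThreadPoolExecutor(max_workers=workers) as pool:
            decoded = list(pool.map(decode_one, payloads))
    else:
        decoded = [decode_one(p) for p in payloads]

    # Placement stays sequential, so only one thread writes to the buffer.
    out = bytearray(width * height)
    for i, strip in enumerate(decoded):
        start = i * strip_pixels
        out[start:start + len(strip)] = strip  # last strip may be short
    return bytes(out)


# Usage: 1024 x 200 image, 64 rows per strip -> 4 strips, the last one short.
width, rps, height = 1024, 64, 200
image = bytes(range(256)) * (width * height // 256)
step = width * rps
payloads = [zlib.compress(image[i:i + step]) for i in range(0, len(image), step)]
result = read_strips(payloads, width, rps, height)
```

With width = 1024 and rps = 64, strip_pixels is 65536, which just meets the assumed threshold, so this example takes the threaded branch. Collecting the decoded strips through map() keeps them in strip order, which is what lets the placement loop stay a simple sequential copy.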