geotiff: forward max_pixels, window, band on GPU stripped fallback (#1732)#1738
Conversation
|
@copilot resolve the merge conflicts in this pull request |
Resolved in df44959. The only conflict was in |
There was a problem hiding this comment.
Pull request overview
This PR fixes read_geotiff_gpu behavior for stripped (non-tiled) TIFFs so its CPU-fallback path honors the same max_pixels, window, and band arguments as the tiled GPU path, and adds regression tests for issue #1732.
Changes:
- Forward
max_pixels,window, andbandtoread_to_array()in the stripped-TIFF GPU fallback. - Replace post-decode
_gpu_apply_window_bandslicing in the stripped fallback with coord-only computation (since the array is now pre-sliced byread_to_array). - Add new GPU regression tests covering
max_pixelsenforcement, windowed reads, band selection, and window+band composition.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
xrspatial/geotiff/__init__.py |
Forward missing kwargs to read_to_array in stripped fallback and adjust coordinate generation accordingly. |
xrspatial/geotiff/tests/test_gpu_stripped_forwarding_1732.py |
New regression tests ensuring stripped fallback honors max_pixels, window, and band. |
Comments suppressed due to low confidence (2)
xrspatial/geotiff/init.py:2531
- The comment says
read_to_arrayonly updatesgeo_info.transformfor orientations 5–8, butread_to_arraycurrently updates the transform for orientations 2–4 as well (whengeo_info.has_georefis true). This is misleading for future maintainers; please update or remove the outdated note (and the reference to a sibling PR) so it matches the actual behavior.
through to the next stage (CPU mmap re-decode for the first
failure, full CPU decode + GPU transfer for the second). This
preserves backward-compatible behaviour while making GPU
regressions visible.
- ``'strict'``: re-raise the original exception from either stage
xrspatial/geotiff/init.py:2578
- The windowed coord computation here checks
t is None, butGeoInfo.transformis always aGeoTransform(even whenhas_georef=False), so this branch is effectively dead and non-georeferenced windowed reads will incorrectly compute float coords from the default identity transform. Consider keying offgeo_info.has_georef(like_geo_to_coords) and returning integer pixel coords for non-georeferenced files, plus dropping the unreachablet is Nonebranch.
on_gpu_failure = gpu
elif not new_passed:
on_gpu_failure = 'auto'
gpu = on_gpu_failure
if gpu not in ('auto', 'strict'):
raise ValueError(
f"on_gpu_failure must be 'auto' or 'strict', got {gpu!r}")
try:
import cupy
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| win = (8, 16, 40, 80) # 32x64 window | ||
| out = read_geotiff_gpu(p, window=win) | ||
| assert out.shape == (32, 64) | ||
| np.testing.assert_array_equal(out.data.get(), data[8:40, 16:80]) |
There was a problem hiding this comment.
Addressed in f087212: added explicit assertions on out.coords['y'], out.coords['x'], and out.attrs['transform'] so regressions in the coord-only path are caught.
…1732) The stripped-TIFF branch of read_geotiff_gpu called read_to_array(source, overview_level=overview_level), dropping the caller's max_pixels, window, and band kwargs. Three consequences: - max_pixels safety cap bypassed (default 1B pixel limit applied instead of the smaller value the caller set). - Windowed reads decoded the entire image before slicing on the GPU. - Single-band reads on multi-band sources decoded every band. Forward all three. The post-decode _gpu_apply_window_band call is replaced with coord-only computation since read_to_array now produces the pre-sliced array. Adds regression coverage in test_gpu_stripped_forwarding_1732.py.
The windowed test only checked array values and shape. Since this PR swapped the post-decode `_gpu_apply_window_band` call for an inline coord-only computation, a regression in that new branch would not show up. Give the source fixture explicit coords so the file has a real georef, then assert `out.coords['y']`, `out.coords['x']`, and `out.attrs['transform']` against both the CPU eager path and the expected windowed tuple. The CPU comparison pins backend parity; the literal tuple catches drift on either side. Per Copilot review comment 3229433881.
The GPU stripped fallback path in read_geotiff_gpu (added in PR #1738 so the GPU path forwards max_pixels / window / band to the CPU helper) returned float64 coords synthesised from the default unit GeoTransform for non-georef windowed reads, while the eager numpy, dask, and tiled GPU paths all returned integer pixel coords. The pre-fix output ``y=[-0.5, -1.5, ...]`` regressed the #1710 fix on the stripped GPU fallback only. The branch at __init__.py 2724-2747 only guarded ``t is None``, which is unreachable because _extract_transform always returns a default GeoTransform() instance with has_georef=False. The default transform has pixel_width=1.0, pixel_height=-1.0, so the PixelIsArea sub-branch synthesised the half-pixel-shifted negative-y output instead of falling through to the integer-coord shortcut. dask+cupy stripped windowed reads shared the same code path so the bug fired there too. Fix: route ``not getattr(geo_info, 'has_georef', True)`` through the integer-coord branch alongside the existing ``t is None`` check. Mirror the eager numpy ``open_geotiff`` body at __init__.py:834-836 and the dask path at read_geotiff_dask which both already check has_georef. Also corrected the latent off-by-one in the integer-coord branch (was arange(r1-r0); should be arange(r0, r1) so an offset window like (2,3,6,7) returns [2,3,4,5]/[3,4,5,6] not [0,1,2,3]). 14 regression tests in test_gpu_stripped_no_georef_window_1753.py: - TestStrippedGpuWindowedNoGeoref: origin window, offset window, dask+cupy variants of both, uint16 dtype, no transform attr - TestStrippedGpuWindowedBackendParity: dtype-parity (3 windows) and value-parity (3 windows) across eager / dask / GPU / dask+cupy - TestStrippedGpuWindowedGeorefStillWorks: PixelIsArea origin and offset windows still get float64 coords with the correct half-pixel shift (regression guard against the fix mis-firing on georef files) All 2005 non-stale geotiff tests still pass. 10 pre-existing failures in test_predictor2_big_endian_gpu_1517.py are stale test infrastructure from PR #1708 (which renamed read_to_array to _read_to_array in the public module namespace) and unrelated to this fix. Categories: Cat 3 (off-by-one boundary handling, latent on the previously unreachable branch) + Cat 5 (backend inconsistency: CPU eager and dask returned int64 integer coords while stripped GPU returned float64 half-pixel-shifted coords on the same file). Closes #1753
Summary
read_geotiff_gpu's stripped-TIFF fallback atxrspatial/geotiff/__init__.py:2533calledread_to_array(source, overview_level=overview_level)and threw awaymax_pixels,window, andband.read_to_array. The post-decode_gpu_apply_window_bandis replaced with coord-only computation since the array is now pre-sliced.Closes #1732.
Test plan
test_gpu_stripped_forwarding_1732.pycover themax_pixelscap, windowed reads, band selection, and the window+band composition.test_gpu_stripped_multiband.py,test_orientation_gpu.py, predictor + nodata coverage).