Skip to content

geotiff: forward max_pixels, window, band on GPU stripped fallback (#1732)#1738

Merged
brendancol merged 2 commits into
mainfrom
issue-1732
May 12, 2026
Merged

geotiff: forward max_pixels, window, band on GPU stripped fallback (#1732)#1738
brendancol merged 2 commits into
mainfrom
issue-1732

Conversation

@brendancol
Copy link
Copy Markdown
Contributor

Summary

  • read_geotiff_gpu's stripped-TIFF fallback at xrspatial/geotiff/__init__.py:2533 called read_to_array(source, overview_level=overview_level) and threw away max_pixels, window, and band.
  • Forward all three to read_to_array. The post-decode _gpu_apply_window_band is replaced with coord-only computation since the array is now pre-sliced.

Closes #1732.

Test plan

  • New regression tests in test_gpu_stripped_forwarding_1732.py cover the max_pixels cap, windowed reads, band selection, and the window+band composition.
  • Existing stripped-GPU tests still pass (test_gpu_stripped_multiband.py, test_orientation_gpu.py, predictor + nodata coverage).

@github-actions github-actions Bot added the performance PR touches performance-sensitive code label May 12, 2026
@brendancol brendancol requested a review from Copilot May 12, 2026 20:01
@brendancol
Copy link
Copy Markdown
Contributor Author

@copilot resolve the merge conflicts in this pull request

Copy link
Copy Markdown
Contributor

Copilot AI commented May 12, 2026

@copilot resolve the merge conflicts in this pull request

Resolved in df44959. The only conflict was in xrspatial/geotiff/__init__.py: main renamed the import to _read_to_array, so I updated the stripped-fallback call to use that alias while keeping the forwarded window, band, and max_pixels arguments.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes read_geotiff_gpu behavior for stripped (non-tiled) TIFFs so its CPU-fallback path honors the same max_pixels, window, and band arguments as the tiled GPU path, and adds regression tests for issue #1732.

Changes:

  • Forward max_pixels, window, and band to read_to_array() in the stripped-TIFF GPU fallback.
  • Replace post-decode _gpu_apply_window_band slicing in the stripped fallback with coord-only computation (since the array is now pre-sliced by read_to_array).
  • Add new GPU regression tests covering max_pixels enforcement, windowed reads, band selection, and window+band composition.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
xrspatial/geotiff/__init__.py Forward missing kwargs to read_to_array in stripped fallback and adjust coordinate generation accordingly.
xrspatial/geotiff/tests/test_gpu_stripped_forwarding_1732.py New regression tests ensuring stripped fallback honors max_pixels, window, and band.
Comments suppressed due to low confidence (2)

xrspatial/geotiff/init.py:2531

  • The comment says read_to_array only updates geo_info.transform for orientations 5–8, but read_to_array currently updates the transform for orientations 2–4 as well (when geo_info.has_georef is true). This is misleading for future maintainers; please update or remove the outdated note (and the reference to a sibling PR) so it matches the actual behavior.
          through to the next stage (CPU mmap re-decode for the first
          failure, full CPU decode + GPU transfer for the second). This
          preserves backward-compatible behaviour while making GPU
          regressions visible.
        - ``'strict'``: re-raise the original exception from either stage

xrspatial/geotiff/init.py:2578

  • The windowed coord computation here checks t is None, but GeoInfo.transform is always a GeoTransform (even when has_georef=False), so this branch is effectively dead and non-georeferenced windowed reads will incorrectly compute float coords from the default identity transform. Consider keying off geo_info.has_georef (like _geo_to_coords) and returning integer pixel coords for non-georeferenced files, plus dropping the unreachable t is None branch.
        on_gpu_failure = gpu
    elif not new_passed:
        on_gpu_failure = 'auto'
    gpu = on_gpu_failure
    if gpu not in ('auto', 'strict'):
        raise ValueError(
            f"on_gpu_failure must be 'auto' or 'strict', got {gpu!r}")
    try:
        import cupy

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

win = (8, 16, 40, 80) # 32x64 window
out = read_geotiff_gpu(p, window=win)
assert out.shape == (32, 64)
np.testing.assert_array_equal(out.data.get(), data[8:40, 16:80])
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in f087212: added explicit assertions on out.coords['y'], out.coords['x'], and out.attrs['transform'] so regressions in the coord-only path are caught.

…1732)

The stripped-TIFF branch of read_geotiff_gpu called
read_to_array(source, overview_level=overview_level), dropping the
caller's max_pixels, window, and band kwargs. Three consequences:

- max_pixels safety cap bypassed (default 1B pixel limit applied
  instead of the smaller value the caller set).
- Windowed reads decoded the entire image before slicing on the GPU.
- Single-band reads on multi-band sources decoded every band.

Forward all three. The post-decode _gpu_apply_window_band call is
replaced with coord-only computation since read_to_array now produces
the pre-sliced array.

Adds regression coverage in test_gpu_stripped_forwarding_1732.py.
The windowed test only checked array values and shape. Since this PR
swapped the post-decode `_gpu_apply_window_band` call for an inline
coord-only computation, a regression in that new branch would not show
up.

Give the source fixture explicit coords so the file has a real georef,
then assert `out.coords['y']`, `out.coords['x']`, and
`out.attrs['transform']` against both the CPU eager path and the
expected windowed tuple. The CPU comparison pins backend parity; the
literal tuple catches drift on either side.

Per Copilot review comment 3229433881.
@brendancol brendancol merged commit 5af16e1 into main May 12, 2026
11 checks passed
brendancol added a commit that referenced this pull request May 13, 2026
The GPU stripped fallback path in read_geotiff_gpu (added in PR #1738
so the GPU path forwards max_pixels / window / band to the CPU helper)
returned float64 coords synthesised from the default unit GeoTransform
for non-georef windowed reads, while the eager numpy, dask, and tiled
GPU paths all returned integer pixel coords. The pre-fix output
``y=[-0.5, -1.5, ...]`` regressed the #1710 fix on the stripped GPU
fallback only.

The branch at __init__.py 2724-2747 only guarded ``t is None``, which
is unreachable because _extract_transform always returns a default
GeoTransform() instance with has_georef=False. The default transform
has pixel_width=1.0, pixel_height=-1.0, so the PixelIsArea sub-branch
synthesised the half-pixel-shifted negative-y output instead of
falling through to the integer-coord shortcut. dask+cupy stripped
windowed reads shared the same code path so the bug fired there too.

Fix: route ``not getattr(geo_info, 'has_georef', True)`` through the
integer-coord branch alongside the existing ``t is None`` check.
Mirror the eager numpy ``open_geotiff`` body at __init__.py:834-836
and the dask path at read_geotiff_dask which both already check
has_georef. Also corrected the latent off-by-one in the integer-coord
branch (was arange(r1-r0); should be arange(r0, r1) so an offset
window like (2,3,6,7) returns [2,3,4,5]/[3,4,5,6] not [0,1,2,3]).

14 regression tests in test_gpu_stripped_no_georef_window_1753.py:
- TestStrippedGpuWindowedNoGeoref: origin window, offset window,
  dask+cupy variants of both, uint16 dtype, no transform attr
- TestStrippedGpuWindowedBackendParity: dtype-parity (3 windows) and
  value-parity (3 windows) across eager / dask / GPU / dask+cupy
- TestStrippedGpuWindowedGeorefStillWorks: PixelIsArea origin and
  offset windows still get float64 coords with the correct half-pixel
  shift (regression guard against the fix mis-firing on georef files)

All 2005 non-stale geotiff tests still pass. 10 pre-existing failures
in test_predictor2_big_endian_gpu_1517.py are stale test infrastructure
from PR #1708 (which renamed read_to_array to _read_to_array in the
public module namespace) and unrelated to this fix.

Categories: Cat 3 (off-by-one boundary handling, latent on the
previously unreachable branch) + Cat 5 (backend inconsistency:
CPU eager and dask returned int64 integer coords while stripped GPU
returned float64 half-pixel-shifted coords on the same file).

Closes #1753
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance PR touches performance-sensitive code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

geotiff: GPU stripped-TIFF fallback bypasses max_pixels, window, and band

3 participants