Skip to content

geotiff: GPU stripped windowed coords honour has_georef=False (#1753)#1763

Open
brendancol wants to merge 1 commit into
mainfrom
deep-sweep-accuracy-geotiff-2026-05-12-1778629014
Open

geotiff: GPU stripped windowed coords honour has_georef=False (#1753)#1763
brendancol wants to merge 1 commit into
mainfrom
deep-sweep-accuracy-geotiff-2026-05-12-1778629014

Conversation

@brendancol
Copy link
Copy Markdown
Contributor

Summary

Fix backend inconsistency on open_geotiff(path, gpu=True, window=...) for non-georef stripped TIFFs. PR #1738 (the GPU stripped fallback that forwards max_pixels / window / band to the CPU helper) regressed the #1710 fix for the GPU stripped fallback path: a non-georef file read with gpu=True, window=... returned float64 coords synthesised from the default unit GeoTransform (e.g. y=[-0.5, -1.5, -2.5, -3.5]), while the eager numpy, dask, and tiled GPU paths returned integer pixel coords (np.arange(c0, c1, dtype=int64)).

The fix routes the has_georef=False case through the integer-coord branch on the GPU stripped fallback, mirroring the eager and dask paths.

Root cause

The windowed coord branch added in #1738 only checks if t is None:. That is unreachable because _extract_transform always returns a default GeoTransform() instance with has_georef=False. The default transform has pixel_width=1.0, pixel_height=-1.0, so the PixelIsArea sub-branch produced (c0 + 0.5) * 1.0 + 0 and (r0 + 0.5) * -1.0 + 0 -> [-0.5, -1.5, ...].

dask+cupy shares the same code path, so the bug fired on every gpu=True windowed read of a non-georef stripped TIFF regardless of chunking.

Fix

  • Route not getattr(geo_info, 'has_georef', True) through the integer-coord branch alongside the existing t is None check at __init__.py:2724-2747.
  • Mirror the eager numpy path's windowed coord shortcut at open_geotiff lines 834-836.
  • Also fix a latent off-by-one in the integer-coord branch: was arange(r1-r0) instead of arange(r0, r1), so an offset window like (2, 3, 6, 7) would have returned [0, 1, 2, 3] instead of [2, 3, 4, 5]. The branch was unreachable so the off-by-one was latent.

Tests

14 new regression tests in test_gpu_stripped_no_georef_window_1753.py:

  • TestStrippedGpuWindowedNoGeoref: origin window, offset window, dask+cupy variants, uint16 dtype, no transform attr
  • TestStrippedGpuWindowedBackendParity: dtype-parity (3 windows) and value-parity (3 windows) across all four backends
  • TestStrippedGpuWindowedGeorefStillWorks: PixelIsArea origin and offset windows still get float64 coords with the correct half-pixel shift

Test plan

  • All 14 new regression tests pass.
  • All 15 existing tests in test_no_georef_windowed_coords_1710.py pass (3 previously failed in TestGpuWindowedCoords and TestBackendParity).
  • All 2005 non-stale geotiff tests still pass.
  • 10 pre-existing failures in test_predictor2_big_endian_gpu_1517.py are unrelated stale test-infrastructure from PR geotiff: read_to_array leaks into public namespace but is not in __all__ or docs #1708 (renamed read_to_array to _read_to_array).

Closes #1753

The GPU stripped fallback path in read_geotiff_gpu (added in PR #1738
so the GPU path forwards max_pixels / window / band to the CPU helper)
returned float64 coords synthesised from the default unit GeoTransform
for non-georef windowed reads, while the eager numpy, dask, and tiled
GPU paths all returned integer pixel coords. The pre-fix output
``y=[-0.5, -1.5, ...]`` regressed the #1710 fix on the stripped GPU
fallback only.

The branch at __init__.py 2724-2747 only guarded ``t is None``, which
is unreachable because _extract_transform always returns a default
GeoTransform() instance with has_georef=False. The default transform
has pixel_width=1.0, pixel_height=-1.0, so the PixelIsArea sub-branch
synthesised the half-pixel-shifted negative-y output instead of
falling through to the integer-coord shortcut. dask+cupy stripped
windowed reads shared the same code path so the bug fired there too.

Fix: route ``not getattr(geo_info, 'has_georef', True)`` through the
integer-coord branch alongside the existing ``t is None`` check.
Mirror the eager numpy ``open_geotiff`` body at __init__.py:834-836
and the dask path at read_geotiff_dask which both already check
has_georef. Also corrected the latent off-by-one in the integer-coord
branch (was arange(r1-r0); should be arange(r0, r1) so an offset
window like (2,3,6,7) returns [2,3,4,5]/[3,4,5,6] not [0,1,2,3]).

14 regression tests in test_gpu_stripped_no_georef_window_1753.py:
- TestStrippedGpuWindowedNoGeoref: origin window, offset window,
  dask+cupy variants of both, uint16 dtype, no transform attr
- TestStrippedGpuWindowedBackendParity: dtype-parity (3 windows) and
  value-parity (3 windows) across eager / dask / GPU / dask+cupy
- TestStrippedGpuWindowedGeorefStillWorks: PixelIsArea origin and
  offset windows still get float64 coords with the correct half-pixel
  shift (regression guard against the fix mis-firing on georef files)

All 2005 non-stale geotiff tests still pass. 10 pre-existing failures
in test_predictor2_big_endian_gpu_1517.py are stale test infrastructure
from PR #1708 (which renamed read_to_array to _read_to_array in the
public module namespace) and unrelated to this fix.

Categories: Cat 3 (off-by-one boundary handling, latent on the
previously unreachable branch) + Cat 5 (backend inconsistency:
CPU eager and dask returned int64 integer coords while stripped GPU
returned float64 half-pixel-shifted coords on the same file).

Closes #1753
@github-actions github-actions Bot added the performance PR touches performance-sensitive code label May 12, 2026
@brendancol brendancol requested a review from Copilot May 13, 2026 10:19
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes a backend-parity regression in xrspatial.geotiff.open_geotiff(..., gpu=True, window=...) for non-georeferenced, stripped TIFFs, where windowed reads incorrectly produced float64, half-pixel-shifted coordinates derived from a placeholder unit GeoTransform instead of integer pixel coordinates.

Changes:

  • Update the GPU stripped fallback windowed-coordinate logic to treat has_georef=False like the existing “no transform” case and emit integer pixel coords; also fixes an offset-window off-by-one in that integer-coord branch.
  • Add GPU regression tests covering origin/offset windows, dask+cupy variants, dtype independence, backend parity, and a georef non-regression check.
  • Record the sweep/accuracy state update for issue #1753.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File Description
xrspatial/geotiff/__init__.py Fix GPU stripped fallback windowed coord generation for has_georef=False and correct offset-window integer coord generation.
xrspatial/geotiff/tests/test_gpu_stripped_no_georef_window_1753.py Add regression + parity tests for stripped-GPU windowed reads on non-georef TIFFs.
.claude/sweep-accuracy-state.csv Log sweep outcome/state for #1753.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

gpu_dk = open_geotiff(no_georef_stripped_path_1753, gpu=True,
chunks=4, window=win)
assert eager.y.dtype == dask.y.dtype == gpu.y.dtype == gpu_dk.y.dtype
assert eager.y.dtype == np.int64
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance PR touches performance-sensitive code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

geotiff: GPU stripped windowed read of non-georef TIFF emits float64 coords (regression of #1710)

2 participants