Skip to content

Fix open_geotiff eager out-of-bounds window error (#1634)#1636

Merged
brendancol merged 2 commits into
mainfrom
deep-sweep-accuracy-geotiff-2026-05-11-run3
May 11, 2026
Merged

Fix open_geotiff eager out-of-bounds window error (#1634)#1636
brendancol merged 2 commits into
mainfrom
deep-sweep-accuracy-geotiff-2026-05-11-run3

Conversation

@brendancol
Copy link
Copy Markdown
Contributor

Summary

  • open_geotiff(path, window=...) on the eager (numpy) path produced an opaque CoordinateValidationError when the window extended past the source extent. The dask path already rejected the same input with a clear ValueError, so the two backends diverged on the contract.
  • Fix: validate window up front in the eager branch via _read_geo_info (metadata-only read, no extra pixel cost) using the exact condition the dask path applies, raising the same message format.
  • 12 regression tests in test_window_out_of_bounds_1634.py lock the eager path to the dask path's contract.

Closes #1634.

Test plan

  • pytest xrspatial/geotiff/tests/test_window_out_of_bounds_1634.py -- 12 new tests, all pass post-fix and verified to fail pre-fix.
  • pytest xrspatial/geotiff/tests/ excluding two HTTP-network tests and one pre-existing matplotlib failure unrelated to this change -- 1286 passed, 7 skipped (env-conditional).
  • Backend parity: eager and dask paths now raise ValueError with the same message format on the same bad window.
  • In-bounds windows still return the correct subset with matching coord and data lengths.

`open_geotiff(path, window=...)` on the eager (numpy) path produced an
opaque `CoordinateValidationError` when the window extended past the
source extent. `read_to_array` clamped the bad window to file bounds
and returned a smaller array, but the eager code in `open_geotiff`
used the unclamped window indices for y/x coord generation and the
windowed transform shift in `_populate_attrs_from_geo_info`, so the
coord array length differed from the data and xarray refused to
construct the DataArray.

The dask path (`read_geotiff_dask`) already validated the window up
front since #1561, raising a clear `ValueError` with the format
`window=... is outside the source extent (HxW) or has non-positive
size.` The two backends therefore disagreed on the contract.

Fix: validate `window` up front in `open_geotiff`'s eager branch via
`_read_geo_info` (metadata-only read, O(1) memory, no extra pixel
cost) using the exact same condition the dask path applies, raising
the same `ValueError` message format. Out-of-bounds windows now fail
with one consistent error regardless of which backend the user
requests.

12 regression tests in `test_window_out_of_bounds_1634.py` cover:
negative-start, past-right-edge, past-bottom-edge, past-both-edges,
zero-size, inverted-window, full-extent in-bounds, interior subset,
edge-aligned, cross-backend parity (eager == dask), message-format
parity, and the issue's exact reproducer. All 1286 existing
non-network geotiff tests still pass.
@github-actions github-actions Bot added the performance PR touches performance-sensitive code label May 11, 2026
@brendancol brendancol requested a review from Copilot May 11, 2026 21:37
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes a backend inconsistency in xrspatial.geotiff.open_geotiff(..., window=...): the eager (NumPy) path now rejects out-of-bounds windows with the same clear ValueError message format as the dask path, instead of surfacing an opaque xarray CoordinateValidationError caused by coord/data length mismatches.

Changes:

  • Added eager-path window extent validation in open_geotiff to match read_geotiff_dask’s contract and error message.
  • Added a regression test suite covering multiple out-of-bounds window shapes plus eager-vs-dask parity.
  • Updated the internal sweep tracking CSV entry for the geotiff accuracy pass.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File Description
xrspatial/geotiff/__init__.py Adds eager-path pre-validation for window to align error behavior with the dask backend.
xrspatial/geotiff/tests/test_window_out_of_bounds_1634.py New regression tests ensuring eager and dask paths reject identical invalid windows consistently.
.claude/sweep-accuracy-state.csv Records the issue #1634 sweep/audit status and summary.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread xrspatial/geotiff/tests/test_window_out_of_bounds_1634.py Outdated
Comment thread xrspatial/geotiff/__init__.py Outdated
- Remove unused tempfile import flagged by F401 in
  test_window_out_of_bounds_1634.py.
- Move the eager-path window-bounds check into read_to_array. The
  previous fix called _read_geo_info() up front in open_geotiff,
  which for file-like sources read the entire buffer before
  read_to_array read it again, doubling I/O. read_to_array already
  parses the IFD and has ifd.height / ifd.width before invoking
  _read_tiles / _read_strips, so the check piggybacks on the
  existing parse. Same ValueError message; one metadata pass for
  both backends. For path-based sources the previous code was
  cheap (mmap is lazy) but for BytesIO / fsspec sources this drops
  one full read of the buffer per windowed eager call.
@brendancol brendancol merged commit 69a75da into main May 11, 2026
11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance PR touches performance-sensitive code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

open_geotiff eager path windowed read coord/data shape mismatch on out-of-bounds windows

2 participants