geotiff: reject ambiguous 3D writer inputs (#1812) #1820
Merged
brendancol merged 3 commits on May 13, 2026
to_geotiff and write_geotiff_gpu used to silently mishandle 3D
DataArrays whose leading dim was not in _BAND_DIM_NAMES = ('band',
'bands', 'channel'). The moveaxis that puts (band, y, x) into the
on-disk (y, x, band) layout was skipped, the writer kept the leading
axis as the spatial y axis, and the round-trip produced a TIFF with
silently swapped axes -- on read-back, out[:, :, 0].sum() !=
arr[0].sum().
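The mismatch described above can be reproduced without any GeoTIFF machinery. The following is a minimal pure-Python sketch, not the writer's actual code: the shapes, values, and function names are hypothetical, and nested lists stand in for the DataArray. It compares the sum of "band 0" when the leading axis is misread as y against the sum obtained once the band axis is moved correctly.

```python
# Hypothetical data: 2 leading slices (e.g. time steps) on a 4x5 grid, values 0..39.
T, H, W = 2, 4, 5
arr = [[[t * H * W + y * W + x for x in range(W)] for y in range(H)]
       for t in range(T)]


def band0_sum_misread(a):
    # Behaviour described for the pre-fix writer: no moveaxis, so axis 0 is
    # taken as y, axis 1 as x, axis 2 as band. "Band 0" is then a[i][j][0].
    return sum(a[i][j][0] for i in range(len(a)) for j in range(len(a[0])))


def band0_sum_moved(a):
    # With the band axis moved last, out[y][x][b] == a[b][y][x],
    # so band 0 is exactly the first leading slice a[0].
    return sum(a[0][y][x] for y in range(len(a[0])) for x in range(len(a[0][0])))


expected = sum(sum(row) for row in arr[0])  # arr[0].sum() in array terms
print(band0_sum_misread(arr), band0_sum_moved(arr), expected)
```

Only the second function agrees with the sum of the first slice, which is the round-trip property the regression tests check.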
Reject ambiguous 3D layouts at all three writer entry points (eager
to_geotiff, dask streaming, write_geotiff_gpu) via the shared
_validate_3d_writer_dims helper. Accepted layouts: (band, y, x) or
(y, x, band) with band-name aliases bands/channel and spatial-name
aliases lat/lon/latitude/longitude/row/col. Anything else raises
ValueError with an actionable message (rename the non-spatial dim
or transpose).
Surfaced by the 2026-05-13 metadata propagation sweep.
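The acceptance rule can be illustrated with a small sketch. This is not the real _validate_3d_writer_dims: the function name, return values, and message wording are hypothetical, and the split of the spatial aliases into y-like and x-like names is an assumption. The description does not settle whether the real helper also accepts other trailing names after two spatial dims, so the sketch takes the stricter reading of "(band, y, x) or (y, x, band)".

```python
# _BAND_DIM_NAMES is quoted in the PR text. The y/x split of the spatial
# aliases below is an assumption made for this sketch.
_BAND_DIM_NAMES = ("band", "bands", "channel")
_Y_DIM_NAMES = ("y", "lat", "latitude", "row")
_X_DIM_NAMES = ("x", "lon", "longitude", "col")


def validate_3d_dims_sketch(dims):
    """Return 'band_first' or 'band_last' for an accepted layout, else raise."""
    if len(dims) != 3:
        raise ValueError(f"expected 3 dims, got {dims!r}")
    a, b, c = dims
    if a in _BAND_DIM_NAMES and b in _Y_DIM_NAMES and c in _X_DIM_NAMES:
        return "band_first"  # needs moving to (y, x, band) before writing
    if a in _Y_DIM_NAMES and b in _X_DIM_NAMES and c in _BAND_DIM_NAMES:
        return "band_last"   # already in the on-disk layout
    raise ValueError(
        f"ambiguous 3D layout {dims!r}: rename the non-spatial dim to one of "
        f"{_BAND_DIM_NAMES} or transpose to (y, x, band)"
    )


print(validate_3d_dims_sketch(("band", "lat", "lon")))
print(validate_3d_dims_sketch(("row", "col", "channel")))
```

Returning the detected layout rather than a boolean lets each writer path decide whether to apply the axis move, while every rejected input fails in one place with the same message.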
Pull request overview
Closes #1812: the GeoTIFF writers (to_geotiff eager, to_geotiff dask-streaming, and write_geotiff_gpu) used to silently corrupt 3D DataArrays whose leading dim was not in _BAND_DIM_NAMES because the moveaxis to (y, x, band) was skipped, leaving the leading axis to be treated as y on disk. The PR introduces a shared _validate_3d_writer_dims helper that rejects any 3D layout that is not (band, y, x) or (y, x, band) (with accepted aliases) and raises ValueError with an actionable remediation message.
Changes:
- Add _validate_3d_writer_dims helper plus _Y_DIM_NAMES / _X_DIM_NAMES alias lists in xrspatial/geotiff/__init__.py, and call it from all three 3D writer entry points (eager, dask streaming, GPU).
- Update docstrings of to_geotiff and write_geotiff_gpu with a Raises entry describing the new validation.
- Add a regression test module covering the original repro, the three writer paths, accepted layouts (including aliases), 2D pass-through, GPU happy paths, and the error-message contract.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| xrspatial/geotiff/__init__.py | Adds the _validate_3d_writer_dims gate and wires it into the eager, dask-streaming, and GPU writer entry points; updates docstrings. |
| xrspatial/geotiff/tests/test_to_geotiff_3d_dim_validation_1812.py | New regression tests for ambiguous-layout rejection, happy-path round-trips, 2D pass-through, and error-message content (CPU + GPU). |
| .claude/sweep-metadata-state.csv | Sweep state row updated to record the 2026-05-13 audit and the new HIGH finding (#1812). |
Comment on lines +42 to +51

    # Inputs that *must* raise. Each tuple is (dims, shape).
    _AMBIGUOUS_3D_INPUTS = [
        pytest.param(("time", "y", "x"), (2, 4, 5), id="time-y-x"),
        pytest.param(("z", "y", "x"), (2, 4, 5), id="z-y-x"),
        pytest.param(("band", "lat", "lon"), (2, 4, 5), id="band-lat-lon"),  # ok via alias
        pytest.param(("y", "x", "depth"), (4, 5, 2), id="y-x-depth"),  # accepted: spatial-first
        pytest.param(("foo", "bar", "baz"), (2, 4, 5), id="foo-bar-baz"),
    ]

PR #1803 forwarded the caller's max_pixels to read_to_array inside read_vrt's source loop so a tiny VRT output cannot force a huge source decode (#1796). The output-window check at the source read enforces that correctly. A separate per-tile dimension check at the same call sites also consumed the caller's max_pixels, so a caller setting max_pixels as an output budget (e.g. 10_000) failed the per-tile sanity check on any normal source whose default tile size is 256x256 (= 65_536 pixels). Use MAX_PIXELS_DEFAULT for the per-tile dim check at the two call sites in _read_tiles (local) and _read_tiles_cog_http (HTTP). The output-window check at the same functions continues to enforce the user-supplied max_pixels, preserving the #1796 protection.
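The distinction between the two checks can be sketched as follows. This is an illustrative model only: the function names, the boolean switch, and the numeric value given to MAX_PIXELS_DEFAULT are assumptions, not the library's code or its actual default.

```python
MAX_PIXELS_DEFAULT = 1_000_000_000  # placeholder value, assumed for the sketch


def check_pixels(width, height, limit, what):
    # Shared guard: reject anything whose pixel count exceeds the limit.
    if width * height > limit:
        raise ValueError(f"{what} {width}x{height} exceeds {limit} pixels")


def read_tiles_sketch(window, tile, max_pixels, per_tile_uses_caller_budget):
    # Output-window check: always enforces the caller's max_pixels,
    # which is the protection attributed to #1796.
    check_pixels(window[0], window[1], max_pixels, "output window")
    # Per-tile sanity check: the change described above uses the default
    # here instead of the caller's output budget.
    tile_limit = max_pixels if per_tile_uses_caller_budget else MAX_PIXELS_DEFAULT
    check_pixels(tile[0], tile[1], tile_limit, "tile")
    return "ok"


# A 100x100 window fits a 10_000-pixel budget; the source tiles are 256x256.
print(read_tiles_sketch((100, 100), (256, 256), 10_000, False))
```

With per_tile_uses_caller_budget=True the same call raises on the tile check, because 256 x 256 = 65_536 exceeds the 10_000-pixel output budget; an oversized output window is rejected in both modes.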
Summary
- to_geotiff and write_geotiff_gpu used to silently corrupt data when a 3D DataArray's leading dim was not in _BAND_DIM_NAMES = ('band', 'bands', 'channel'). The moveaxis that puts (band, y, x) into the on-disk (y, x, band) layout was skipped, the writer kept the leading axis as the spatial y axis, and the round-trip produced a TIFF with silently swapped axes.
- Reject ambiguous 3D layouts at all three writer entry points (eager to_geotiff, dask streaming, write_geotiff_gpu) via a shared _validate_3d_writer_dims helper. Accepted layouts: (band, y, x) or (y, x, band) plus band aliases bands/channel and spatial aliases lat/lon/latitude/longitude/row/col. Anything else raises ValueError with an actionable fix message.
Repro (pre-fix)
After the fix the same call raises ValueError with a message naming the offending dims and pointing to either transpose('y', 'x', 'time') or rename to one of _BAND_DIM_NAMES.
Test plan
- xrspatial/geotiff/tests/test_to_geotiff_3d_dim_validation_1812.py (19 cases)
- Ambiguous layouts raise ValueError
- (band, y, x), (bands, y, x), (channel, y, x), (y, x, band), (lat, lon, band), (row, col, channel), (band, lat, lon) round-trip with correct per-band sums
- 2D (y, x) still works untouched
- xrspatial/geotiff/tests/ suite passes (2311 passed, 5 skipped; 11 deselected pre-existing failures unrelated to this change -- matplotlib path deepcopy recursion on Python 3.14 and test_predictor2_big_endian_gpu_1517 / test_size_param_validation_gpu_vrt_1776::test_tile_size_positive_works / test_vrt_dstrect_resample_cap_1737::test_dstrect_at_cap_succeeds baseline failures, all confirmed pre-existing via git stash).