You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
The geotiff round-trip used to detect "this file has no GeoTIFF transform tags" by looking at coord shape: int64 ascending step-1 on both y/x axes meant "no georef" and the writer skipped transform synthesis. That heuristic was tightened in #2087 and then replaced in #2120 (commit 34cc415) with an explicit attr attrs['_xrspatial_no_georef'] = True.
This issue tracks the rest of the migration: one validator still infers no-georef from coord dtype, the shape helper is now test-only, and the round-trip test matrix could pin a couple more edge cases.
Added _NO_GEOREF_KEY = '_xrspatial_no_georef' in xrspatial/geotiff/_geotags.py.
Stamped the marker on every read path that emits placeholder coords:
CPU reader: _populate_attrs_from_geo_info in xrspatial/geotiff/_attrs.py (else branch when geo_info.transform is None).
VRT reader: both eager and chunked no-transform branches in xrspatial/geotiff/_backends/vrt.py.
GPU / dask readers route through _populate_attrs_from_geo_info so they pick the marker up for free.
Switched the writer-side detection to _has_no_georef_marker(da) in two places in xrspatial/geotiff/_coords.py:
coords_to_transform (return None instead of synthesising a fake unit transform).
require_transform_for_georeferenced (skip the fail-closed guard).
Kept _is_no_georef_sentinel in the module as a diagnostic but wired it out of the writer path.
Added test_no_georef_marker_2120.py and updated test_int_coord_sentinel_2087.py and test_no_georef_writer_round_trip_1949.py to pin the new contract.
What is still fragile
_check_write_non_uniform_coords in _validation.py
The validator at line 755 skips its uniformity check whenever a coord has integer dtype, with the comment "those coords are the 0..N-1 fallback the no-georef reader emits". After #2120 that exemption is wrong for the same reason the writer's old shape check was wrong: a user-authored int-coord grid (the exact case #2087 fixed) now bypasses uniformity validation. A user with int coords on a non-uniform grid will silently write a misrepresented transform.
The validator should gate the exemption on _has_no_georef_marker(da), not on dtype.kind. Float coords with the marker set should also be exempt for symmetry with the writer.
_is_no_georef_sentinel is dead on the production path
It only survives because test_int_coord_sentinel_2087.py still references it as a predicate. Two options:
Drop the helper and rewrite the tests to check _has_no_georef_marker instead.
Move the helper into the test file as a local fixture.
I lean toward option 1. The helper's docstring already says it is test-only, and keeping public-module dead code makes it easier for someone to plug it back into a write path by accident.
Attribute name and the georef_status proposal
A separate issue is being filed to propose a richer attrs['georef_status'] enum ('georef' | 'no_georef' | 'rotated' | ...) that would replace ad-hoc booleans across the geotiff module. If that lands first, the implementation here should consume georef_status == 'no_georef' rather than the dedicated _xrspatial_no_georef flag, and _has_no_georef_marker should be the single point of indirection.
If this issue lands first, keep _xrspatial_no_georef as an underscore-prefixed internal contract and migrate it under the georef_status umbrella later. Either way, callers should never read _xrspatial_no_georef directly outside the geotiff module.
Read paths to audit
Each path that emits placeholder y/x coords must stamp the marker. 34cc415 covered the three known paths; double-check that no new reader has appeared since:
xrspatial/geotiff/_backends/vrt.py: read_vrt and _read_vrt_chunked no-transform branches.
GPU (_backends/gpu.py) and dask (_backends/dask.py) readers both call _populate_attrs_from_geo_info, so they inherit the marker. A unit test that opens a no-georef fixture through each backend and asserts the marker is present would catch any future regression.
Write paths to audit
Each writer must consult the marker only, never coord shape:
xrspatial/geotiff/_writers/eager.py: to_geotiff (line 565-568) and the VRT per-tile write (line 946-948). Both call _coords_to_transform(data), which already checks the marker.
xrspatial/geotiff/_writers/gpu.py: write_geotiff_gpu (line 425-427). Same path.
xrspatial/geotiff/_writer.py low-level write accepts a pre-computed geo_transform, so the marker decision happens upstream.
The validator fix above is the missing piece.
Round-trip tests to add
A user grid with int coords on a non-uniform spacing should raise NonUniformCoordsError even though the dtype is int. Today it silently writes a misrepresented transform via the validator exemption.
A no-georef file opened through each of the four backends (CPU eager, CPU dask, GPU eager, GPU dask) carries the marker.
Marker survives a da.copy() / da.assign_attrs(...) chain. xarray attr-propagation defaults sometimes drop underscore-prefixed keys, so this is worth pinning explicitly.
PR #2127 threaded the masked_nodata decision through _set_nodata_attrs so the writer reads an explicit attr instead of inferring intent from dtype. Same pattern applies here: the no-georef decision lives on attrs[_NO_GEOREF_KEY], the writer reads that attr only, and the validator should follow suit.
Both attrs share an invariant worth documenting in docs/source/user_guide/attrs_contract.rst: they are read-side-stamped, internal, and is True identity-checked so a stray third-party stamp cannot flip behaviour.
Acceptance criteria
_check_write_non_uniform_coords checks the marker, not dtype.
_is_no_georef_sentinel removed from _coords.py; tests rewritten to use _has_no_georef_marker or a local predicate.
New test: user int-coord grid with non-uniform spacing raises NonUniformCoordsError.
New test: marker is present after reading a no-georef fixture through each backend.
Decision recorded on the georef_status issue about which attr is canonical long-term.
Background
The geotiff round-trip used to detect "this file has no GeoTIFF transform tags" by looking at coord shape:
int64ascending step-1 on both y/x axes meant "no georef" and the writer skipped transform synthesis. That heuristic was tightened in #2087 and then replaced in #2120 (commit 34cc415) with an explicit attrattrs['_xrspatial_no_georef'] = True.This issue tracks the rest of the migration: one validator still infers no-georef from coord dtype, the shape helper is now test-only, and the round-trip test matrix could pin a couple more edge cases.
What 34cc415 already did
_NO_GEOREF_KEY = '_xrspatial_no_georef'inxrspatial/geotiff/_geotags.py._populate_attrs_from_geo_infoinxrspatial/geotiff/_attrs.py(else branch whengeo_info.transform is None).xrspatial/geotiff/_backends/vrt.py._populate_attrs_from_geo_infoso they pick the marker up for free._has_no_georef_marker(da)in two places inxrspatial/geotiff/_coords.py:coords_to_transform(returnNoneinstead of synthesising a fake unit transform).require_transform_for_georeferenced(skip the fail-closed guard)._is_no_georef_sentinelin the module as a diagnostic but wired it out of the writer path.test_no_georef_marker_2120.pyand updatedtest_int_coord_sentinel_2087.pyandtest_no_georef_writer_round_trip_1949.pyto pin the new contract.What is still fragile
_check_write_non_uniform_coordsin_validation.pyThe validator at line 755 skips its uniformity check whenever a coord has integer dtype, with the comment "those coords are the 0..N-1 fallback the no-georef reader emits". After #2120 that exemption is wrong for the same reason the writer's old shape check was wrong: a user-authored int-coord grid (the exact case #2087 fixed) now bypasses uniformity validation. A user with int coords on a non-uniform grid will silently write a misrepresented transform.
The validator should gate the exemption on
_has_no_georef_marker(da), not ondtype.kind. Float coords with the marker set should also be exempt for symmetry with the writer._is_no_georef_sentinelis dead on the production pathIt only survives because
test_int_coord_sentinel_2087.pystill references it as a predicate. Two options:_has_no_georef_markerinstead.I lean toward option 1. The helper's docstring already says it is test-only, and keeping public-module dead code makes it easier for someone to plug it back into a write path by accident.
Attribute name and the
georef_statusproposalA separate issue is being filed to propose a richer
attrs['georef_status']enum ('georef' | 'no_georef' | 'rotated' | ...) that would replace ad-hoc booleans across the geotiff module. If that lands first, the implementation here should consumegeoref_status == 'no_georef'rather than the dedicated_xrspatial_no_georefflag, and_has_no_georef_markershould be the single point of indirection.If this issue lands first, keep
_xrspatial_no_georefas an underscore-prefixed internal contract and migrate it under thegeoref_statusumbrella later. Either way, callers should never read_xrspatial_no_georefdirectly outside the geotiff module.Read paths to audit
Each path that emits placeholder y/x coords must stamp the marker. 34cc415 covered the three known paths; double-check that no new reader has appeared since:
xrspatial/geotiff/_attrs.py:_populate_attrs_from_geo_infoelse branch.xrspatial/geotiff/_backends/vrt.py:read_vrtand_read_vrt_chunkedno-transform branches.GPU (
_backends/gpu.py) and dask (_backends/dask.py) readers both call_populate_attrs_from_geo_info, so they inherit the marker. A unit test that opens a no-georef fixture through each backend and asserts the marker is present would catch any future regression.Write paths to audit
Each writer must consult the marker only, never coord shape:
xrspatial/geotiff/_writers/eager.py:to_geotiff(line 565-568) and the VRT per-tile write (line 946-948). Both call_coords_to_transform(data), which already checks the marker.xrspatial/geotiff/_writers/gpu.py:write_geotiff_gpu(line 425-427). Same path.xrspatial/geotiff/_writer.pylow-levelwriteaccepts a pre-computedgeo_transform, so the marker decision happens upstream.The validator fix above is the missing piece.
Round-trip tests to add
NonUniformCoordsErroreven though the dtype is int. Today it silently writes a misrepresented transform via the validator exemption.da.copy()/da.assign_attrs(...)chain. xarray attr-propagation defaults sometimes drop underscore-prefixed keys, so this is worth pinning explicitly.to_geotiff(da, path)followed byopen_geotiff(path)on a hand-built no-georef array (already covered bytest_marker_on_user_grid_skips_transform_synthesisin geotiff: to_geotiff silently strips georef on int64 step-1 user coords #2120; add a corresponding 3D test).Parallel with PR #2127 / commit 478dbd5
PR #2127 threaded the
masked_nodatadecision through_set_nodata_attrsso the writer reads an explicit attr instead of inferring intent from dtype. Same pattern applies here: the no-georef decision lives onattrs[_NO_GEOREF_KEY], the writer reads that attr only, and the validator should follow suit.Both attrs share an invariant worth documenting in
docs/source/user_guide/attrs_contract.rst: they are read-side-stamped, internal, andis Trueidentity-checked so a stray third-party stamp cannot flip behaviour.Acceptance criteria
_check_write_non_uniform_coordschecks the marker, not dtype._is_no_georef_sentinelremoved from_coords.py; tests rewritten to use_has_no_georef_markeror a local predicate.NonUniformCoordsError.georef_statusissue about which attr is canonical long-term.Follow-up to #2120.