Describe the bug
A DataArray with integer-spaced x/y coords (e.g. x=[100,101,102], y=[200,199]) writes through to_geotiff with no transform tags and reads back as pixel coords x=[0,1,2], y=[0,1] with no georef. The georeferencing is silently lost.
The cause is coords_to_transform at xrspatial/geotiff/_coords.py:353, which returns None for any integer x or y dtype. The fail-closed validator at _coords.py:272 mirrors that, so the writer doesn't raise either. Both branches were intended as a no-georef sentinel for files round-tripped through open_geotiff (which emits np.arange(N, dtype=int64) for x/y when the source file has no GeoTIFF transform tags, #1710 / #1753 / #1949). The sentinel is too broad: integer dtype on its own catches both the read-side arange placeholder and any user-authored projected grid that happens to use integer-spaced coords.
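To illustrate why a dtype-only test is too broad, here is a minimal sketch. The helper `dtype_only_sentinel` is a hypothetical stand-in for the current behavior described above, not the actual code in `_coords.py`.

```python
import numpy as np


def dtype_only_sentinel(coord):
    """Hypothetical stand-in: treat any integer dtype as "no georef"."""
    return np.issubdtype(np.asarray(coord).dtype, np.integer)


# What open_geotiff emits for a file with no transform tags.
reader_placeholder = np.arange(3, dtype=np.int64)
# A user-authored projected grid that happens to use integer coords.
user_grid = np.array([100, 101, 102])

# Both arrays are flagged, so the two cases are indistinguishable.
print(dtype_only_sentinel(reader_placeholder))
print(dtype_only_sentinel(user_grid))
```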
Repro
import numpy as np
import xarray as xr
from xrspatial.geotiff import to_geotiff, open_geotiff

da = xr.DataArray(
    np.zeros((2, 3), dtype=np.float32),
    coords={'y': np.array([200, 199]), 'x': np.array([100, 101, 102])},
    dims=('y', 'x'),
)
to_geotiff(da, 'out.tif')

out = open_geotiff('out.tif')
print(out.coords['x'].values)        # [0 1 2] -- georef lost
print(out.coords['y'].values)        # [0 1]
print(out.attrs.get('transform'))    # None
Expected behavior
Either the projected grid round-trips with its coords intact, or the write raises so the caller knows the georef is being dropped. Silent loss of the transform is the worst outcome.
Fix
Tighten the sentinel to match exactly what the reader emits: dtype is int64 AND np.array_equal(coord, np.arange(len(coord), dtype=int64)). Only that pattern is treated as no-georef. Anything else (user-authored [100,101,102], subset [2,3,4], subsampled [0,2,4], non-uniform [1,2,5]) falls through to the existing float-coord path, which either synthesizes a real transform from the spacing or raises NonUniformCoordsError.
Two _coords.py sites need the change: the validator at :272 and coords_to_transform at :353.
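The tightened check could look roughly like the sketch below. The function name `is_reader_placeholder` and the 1-D guard are assumptions for illustration; the real change would live inside the validator and `coords_to_transform`.

```python
import numpy as np


def is_reader_placeholder(coord):
    """Return True only for the exact pattern the reader emits.

    Hypothetical helper: int64 dtype AND values equal to 0..N-1.
    """
    coord = np.asarray(coord)
    if coord.ndim != 1 or coord.dtype != np.int64:
        return False
    return np.array_equal(coord, np.arange(len(coord), dtype=np.int64))


cases = {
    "reader placeholder": np.arange(3, dtype=np.int64),
    "user-authored": np.array([100, 101, 102], dtype=np.int64),
    "subset": np.array([2, 3, 4], dtype=np.int64),
    "subsampled": np.array([0, 2, 4], dtype=np.int64),
    "non-uniform": np.array([1, 2, 5], dtype=np.int64),
}
for name, coord in cases.items():
    # Only the first case should be treated as no-georef.
    print(name, is_reader_placeholder(coord))
```

Everything that returns False here would then continue to the existing float-coord path.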
Trade-off
Users who subset or subsample a no-georef DataArray and write it will get a file with a real transform where they previously got a no-georef file. Coord values round-trip exactly; only the dtype flips int→float on subsequent reads. That's a behavior change, but a more honest one: the subset has a well-defined origin offset that should be preserved.
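A small NumPy-only sketch of the affected cases, assuming the tightened pattern from the Fix section. `matches_reader_pattern` is a hypothetical name; the sketch only shows that sliced placeholders stop matching the sentinel while their values survive an int→float cast unchanged.

```python
import numpy as np


def matches_reader_pattern(coord):
    """Hypothetical check: int64 dtype and values equal to 0..N-1."""
    return coord.dtype == np.int64 and np.array_equal(
        coord, np.arange(len(coord), dtype=np.int64)
    )


# Coords as the reader emits them for a no-georef file of width 6.
placeholder = np.arange(6, dtype=np.int64)
subset = placeholder[2:5]      # values 2, 3, 4
subsampled = placeholder[::2]  # values 0, 2, 4

for name, coord in (("subset", subset), ("subsampled", subsampled)):
    as_float = coord.astype(np.float64)
    print(
        name,
        "matches sentinel:", matches_reader_pattern(coord),
        "first coord:", coord[0],
        "spacing:", np.diff(coord),
        "values unchanged as float:", np.array_equal(as_float, coord),
    )
```

The first coordinate and the uniform spacing are the information a synthesized transform would need to carry for these arrays.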
Related