Summary
open_geotiff(path, allow_rotated=True) still surfaces attrs['crs'] and attrs['crs_wkt'] on the returned DataArray, contradicting the public docstring that states these attrs are dropped on rotated reads.
Docstring contract
xrspatial/geotiff/__init__.py (lines 379-391) says:
allow_rotated=True reads the pixel grid without the geospatial assumption: the result has integer pixel coords on x / y and attrs['crs'] is dropped.
The promise is part of the public read-only contract for the allow_rotated opt-in added in issue #2115.
Reproduction
import os
import tempfile

import numpy as np
import tifffile

from xrspatial.geotiff import open_geotiff

arr = np.arange(64, dtype=np.uint8).reshape(8, 8)
path = os.path.join(tempfile.mkdtemp(), 'rotated.tif')

# Rotated ModelTransformationTag (b = 0.1 forces rotation).
M = (1.0, 0.1, 0.0, 100.0,
     0.0, -1.0, 0.0, 200.0,
     0.0, 0.0, 0.0, 0.0,
     0.0, 0.0, 0.0, 1.0)

# GeoKeyDirectoryTag: declare EPSG:4326.
geo_keys = (1, 1, 0, 3,
            1024, 0, 1, 2,
            1025, 0, 1, 1,
            2048, 0, 1, 4326)

# 34264 = ModelTransformationTag (DOUBLE), 34735 = GeoKeyDirectoryTag (SHORT).
extratags = [(34264, 12, 16, M),
             (34735, 3, len(geo_keys), geo_keys)]
tifffile.imwrite(path, arr, photometric='minisblack',
                 extratags=extratags)

da = open_geotiff(path, allow_rotated=True)
print(sorted(da.attrs.keys()))
print('crs:', da.attrs.get('crs'))
print('crs_wkt present:', 'crs_wkt' in da.attrs)
print('transform:', da.attrs.get('transform'))
print('y dtype:', da.coords['y'].dtype)
Output:
['_xrspatial_geotiff_contract', 'crs', 'crs_wkt', 'extra_tags', ...]
crs: 4326
crs_wkt present: True
transform: None
y dtype: int64
transform is correctly absent and y/x coords are integer pixel indices, but crs and crs_wkt leaked through.
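For context on why this file counts as rotated, here is a minimal sketch of how the 4x4 ModelTransformationTag from the reproduction maps pixel positions to world coordinates. The helper names pixel_to_world and is_axis_aligned are illustrative only and are not part of xrspatial. Strictly speaking, a non-zero b term with a zero d term is a shear rather than a pure rotation, but either way the grid is no longer axis-aligned.

```python
# Row-major 4x4 ModelTransformationTag copied from the reproduction.
M = (1.0, 0.1, 0.0, 100.0,
     0.0, -1.0, 0.0, 200.0,
     0.0, 0.0, 0.0, 0.0,
     0.0, 0.0, 0.0, 1.0)


def pixel_to_world(m, col, row):
    """Apply the 2D part of a row-major 4x4 transform to (col, row)."""
    x = m[0] * col + m[1] * row + m[3]
    y = m[4] * col + m[5] * row + m[7]
    return x, y


def is_axis_aligned(m):
    """True when both off-diagonal 2D terms (m[1] and m[4]) are zero."""
    return m[1] == 0.0 and m[4] == 0.0


print(pixel_to_world(M, 0, 0))  # (100.0, 200.0)
print(is_axis_aligned(M))       # False
```

Because x depends on the row index as well as the column index, no pair of 1D x / y coordinate arrays can describe this grid, which is why the rotated read falls back to integer pixel coords.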
Impact
Downstream code that uses 'crs' in da.attrs (or da.attrs.get('crs')) as the "this raster is georeferenced" signal will treat the no-georef pixel grid as a georeferenced raster and apply CRS-aware math (reprojection, distance calculations, etc.) to integer pixel indices, silently producing wrong distances and coordinates.
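To make the failure mode concrete, here is a small sketch of the kind of downstream check that gets fooled. Both function names are hypothetical, the attrs dict is a hand-written stand-in shaped like the reproduction output, and the crs_wkt value is a placeholder rather than real WKT.

```python
def looks_georeferenced_naive(attrs):
    """Fragile check: treats any CRS attr as proof of georeferencing."""
    return attrs.get('crs') is not None


def looks_georeferenced_strict(attrs):
    """Stricter check: requires both a CRS and an affine transform."""
    return (attrs.get('crs') is not None
            and attrs.get('transform') is not None)


# Stand-in for da.attrs after the rotated read (no 'transform' key).
leaked_attrs = {'crs': 4326, 'crs_wkt': '<placeholder WKT>'}

print(looks_georeferenced_naive(leaked_attrs))   # True
print(looks_georeferenced_strict(leaked_attrs))  # False
```

The stricter check is only a defensive workaround for callers; it does not replace fixing the leak at the source.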
Root cause
xrspatial/geotiff/_attrs.py::_populate_attrs_from_geo_info emits crs and crs_wkt unconditionally when geo_info.crs_epsg / geo_info.crs_wkt are populated. Only transform is gated on has_georef. For rotated reads, has_georef is False (the rotated 6-tuple is stashed on geo_info.transform.rotated_affine) but the GeoKey parser still populates crs_epsg and crs_wkt.
The VRT backend (_backends/vrt.py) has parallel inline if vrt.crs_wkt: blocks that suffer the same flaw on the eager and chunked paths.
Affected backends
- numpy (eager): confirmed via reproduction above.
- dask: confirmed via reproduction with chunks=4.
- cupy / dask+cupy: separate issue (the GPU CPU-fallback paths drop allow_rotated and raise; tracked elsewhere). Once the GPU plumbing is fixed, the same attr-leak would apply.
- VRT eager and VRT chunked: same crs_wkt block emits unconditionally.
Proposed fix
Gate crs / crs_wkt emission on has_georef in _populate_attrs_from_geo_info. Mirror the same gate in the VRT eager and chunked paths so the contract is uniform across all read paths.
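A minimal sketch of what the gate could look like follows. GeoInfoStub and populate_crs_attrs are hypothetical stand-ins, not the real xrspatial objects: the actual _populate_attrs_from_geo_info signature is not shown in this issue, and placing has_georef on the geo_info object is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GeoInfoStub:
    """Hypothetical stand-in for the real geo_info object."""
    crs_epsg: Optional[int] = None
    crs_wkt: Optional[str] = None
    has_georef: bool = False  # assumed location of the flag


def populate_crs_attrs(attrs, geo_info):
    """Emit crs / crs_wkt only when the read is georeferenced."""
    if not geo_info.has_georef:
        # Rotated (allow_rotated=True) reads land here: no CRS attrs.
        return attrs
    if geo_info.crs_epsg is not None:
        attrs['crs'] = geo_info.crs_epsg
    if geo_info.crs_wkt:
        attrs['crs_wkt'] = geo_info.crs_wkt
    return attrs


rotated = GeoInfoStub(crs_epsg=4326, crs_wkt='<placeholder WKT>')
print(populate_crs_attrs({}, rotated))  # {}
```

The same early return would apply to the inline crs_wkt blocks in the VRT paths, so every backend agrees on when the CRS attrs are present.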
Regression test: extend xrspatial/geotiff/tests/test_allow_rotated_geotiff_2115.py with a fixture that writes both a rotated ModelTransformationTag and an EPSG:4326 GeoKeyDirectoryTag, then assert neither crs nor crs_wkt is on da.attrs for the numpy and dask backends. Cupy paths can be covered when the host has CUDA.
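The fixture and assertion could be factored roughly as below. This is a sketch only: rotated_geotiff_extratags and assert_crs_attrs_dropped are hypothetical helper names, and the tag values are copied from the reproduction above rather than from the existing test module.

```python
def rotated_geotiff_extratags(b=0.1):
    """Build tifffile extratags for a rotated grid that declares EPSG:4326."""
    transform = (1.0, b, 0.0, 100.0,
                 0.0, -1.0, 0.0, 200.0,
                 0.0, 0.0, 0.0, 0.0,
                 0.0, 0.0, 0.0, 1.0)
    geo_keys = (1, 1, 0, 3,
                1024, 0, 1, 2,
                1025, 0, 1, 1,
                2048, 0, 1, 4326)
    return [(34264, 12, 16, transform),            # ModelTransformationTag
            (34735, 3, len(geo_keys), geo_keys)]   # GeoKeyDirectoryTag


def assert_crs_attrs_dropped(attrs):
    """Fail if a rotated read leaked either CRS attr."""
    leaked = [key for key in ('crs', 'crs_wkt') if key in attrs]
    assert not leaked, f"rotated read leaked CRS attrs: {leaked}"


tags = rotated_geotiff_extratags()
print([tag[0] for tag in tags])  # [34264, 34735]
```

In the real test, the extratags would be passed to tifffile.imwrite as in the reproduction, and assert_crs_attrs_dropped(da.attrs) would run on the result of open_geotiff(path, allow_rotated=True), once without chunks and once with chunks=4.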
Related
- allow_rotated opt-in.
- allow_rotated to the rotated ModelTransformationTag.