What happened
open_geotiff reads a GeoTIFF whose CRS is user-defined (WKT stored in the GeoTIFF citation, GeoKey GEOKEY_PROJECTED_CS_TYPE == 32767) and emits the WKT under attrs['crs_name'] only. Neither attrs['crs'] nor attrs['crs_wkt'] is populated. to_geotiff ignores attrs['crs_name'], so a read -> write round-trip silently drops the CRS.
Users who only read files see the WKT survive (as crs_name). Pipelines that read, process, and write a user-defined CRS GeoTIFF lose the projection on the output. Files with a recognized EPSG (e.g. EPSG:4326) are unaffected because the EPSG resolves to WKT via pyproj and the read path emits both attrs['crs'] and attrs['crs_wkt'].
Repro
import numpy as np
import xarray as xr
import tifffile
from xrspatial.geotiff import open_geotiff, to_geotiff
WKT = (
'PROJCS["User defined LCC",GEOGCS["NAD83",DATUM["North American Datum 1983",'
'SPHEROID["GRS 1980",6378137,298.257222101]],PRIMEM["Greenwich",0],'
'UNIT["degree",0.0174532925199433]],PROJECTION["Lambert Conformal Conic 2SP"],'
'PARAMETER["central_meridian",-100],PARAMETER["latitude_of_origin",40],'
'PARAMETER["standard_parallel_1",30],PARAMETER["standard_parallel_2",50],'
'UNIT["metre",1]]'
)
arr = np.ones((4, 4), dtype=np.float32)
da = xr.DataArray(arr, dims=['y', 'x'],
coords={'y': np.linspace(50, 47, 4),
'x': np.linspace(10, 13, 4)},
attrs={'crs_wkt': WKT})
to_geotiff(da, "src.tif", compression="none")
rd = open_geotiff("src.tif")
print("crs:", rd.attrs.get("crs")) # None
print("crs_wkt:", rd.attrs.get("crs_wkt")) # None
print("crs_name:", rd.attrs.get("crs_name")) # WKT survives here
to_geotiff(rd, "dst.tif", compression="none")
with tifffile.TiffFile("dst.tif") as tif:
print(tif.pages[0].tags.get(34735).value) # GeoKeyDirectory: no CRS keys
print(tif.pages[0].tags.get(34737)) # GeoAsciiParams: None
What should happen
When geo_info.crs_epsg is None but the citation looks like WKT (which is how GDAL stores user-defined CRSes), the read path should expose that WKT as attrs['crs_wkt'] and keep attrs['crs_name'] for back-compat. to_geotiff already accepts attrs['crs_wkt'] and re-emits it via GEOKEY_*_CS_TYPE = 32767 plus the citation.
Severity
HIGH. Read -> write pipelines silently lose the projection on user-defined CRS files. All four read backends (numpy, dask, cupy, dask+cupy) drop the same attr.
Suggested fix
In xrspatial/geotiff/_geotags.py::extract_geo_info, when no EPSG is resolved but the citation parses as WKT (starts with PROJCS[, GEOGCS[, PROJCRS[, GEOGCRS[, BOUNDCRS[, COMPOUNDCRS[, VERTCRS[, or LOCAL_CS[), set geo_info.crs_wkt = crs_name. _populate_attrs_from_geo_info then propagates it correctly to every backend.
What happened
open_geotiffreads a GeoTIFF whose CRS is user-defined (WKT stored in the GeoTIFF citation, GeoKeyGEOKEY_PROJECTED_CS_TYPE == 32767) and emits the WKT underattrs['crs_name']only. Neitherattrs['crs']norattrs['crs_wkt']is populated.to_geotiffignoresattrs['crs_name'], so a read -> write round-trip silently drops the CRS.Users who only read files see the WKT survive (as
crs_name). Pipelines that read, process, and write a user-defined CRS GeoTIFF lose the projection on the output. Files with a recognized EPSG (e.g. EPSG:4326) are unaffected because the EPSG resolves to WKT via pyproj and the read path emits bothattrs['crs']andattrs['crs_wkt'].Repro
What should happen
When
geo_info.crs_epsg is Nonebut the citation looks like WKT (which is how GDAL stores user-defined CRSes), the read path should expose that WKT asattrs['crs_wkt']and keepattrs['crs_name']for back-compat.to_geotiffalready acceptsattrs['crs_wkt']and re-emits it viaGEOKEY_*_CS_TYPE = 32767plus the citation.Severity
HIGH. Read -> write pipelines silently lose the projection on user-defined CRS files. All four read backends (numpy, dask, cupy, dask+cupy) drop the same attr.
Suggested fix
In
xrspatial/geotiff/_geotags.py::extract_geo_info, when no EPSG is resolved but the citation parses as WKT (starts withPROJCS[,GEOGCS[,PROJCRS[,GEOGCRS[,BOUNDCRS[,COMPOUNDCRS[,VERTCRS[, orLOCAL_CS[), setgeo_info.crs_wkt = crs_name._populate_attrs_from_geo_infothen propagates it correctly to every backend.