geotiff: add canonical attrs['georef_status'] (#2136) #2145
Open
brendancol wants to merge 4 commits into
Read paths emit one canonical attr that encodes the five distinct states the reader can land in when CRS and transform tags are combined: full, transform_only, crs_only, none, rotated_dropped. Without this attr, downstream code has to reconcile crs / crs_wkt / transform / _xrspatial_no_georef by hand, and two distinct on-disk situations (rotated-with-allow_rotated and truly-no-transform) end up indistinguishable via the public attrs.

* Stamped via _populate_attrs_from_geo_info, covering eager numpy, dask, and the GPU read paths.
* Stamped inline in both VRT branches (eager and chunked) through a shared _compute_georef_status_from_parts helper so all six call sites share one decision rule.
* Bumps _ATTRS_CONTRACT_VERSION 2 -> 3.
* Additive: crs, crs_wkt, transform, and the _xrspatial_no_georef marker keep their pre-v3 semantics so existing consumers keep working.
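The decision rule described above can be sketched roughly as follows. This is an illustrative reconstruction, not the code in the PR: the function name `georef_status_from_parts`, its three boolean parameters, and the choice to test the rotated case first are assumptions. Only the five status strings come from the description.

```python
# Illustrative sketch only. The function name, parameters, and the
# precedence of the rotated check are assumptions, not the PR's code.

def georef_status_from_parts(
    has_crs: bool,
    has_transform: bool,
    rotated_dropped: bool = False,
) -> str:
    """Map a CRS/transform combination to one of the five status strings."""
    if rotated_dropped:
        # Assumed precedence: a dropped rotated transform is reported
        # as its own state rather than falling into "none" or "crs_only".
        return "rotated_dropped"
    if has_crs and has_transform:
        return "full"
    if has_transform:
        return "transform_only"
    if has_crs:
        return "crs_only"
    return "none"


if __name__ == "__main__":
    for parts in [(True, True), (False, True), (True, False), (False, False)]:
        print(parts, "->", georef_status_from_parts(*parts))
    print("rotated ->", georef_status_from_parts(False, False, rotated_dropped=True))
```

Keeping the rule in one pure function of booleans is what lets every read path share it: each path only has to work out the inputs.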
brendancol
commented
May 19, 2026
Contributor
Author
brendancol
left a comment
PR Review: canonical attrs['georef_status'] (#2136)
No blockers. A few small things worth addressing in this PR rather than a follow-up.
Suggestions
- xrspatial/geotiff/tests/test_georef_status_2136.py:50 -- `_ROTATED_M` is imported but never referenced. Drop it from the import (or actually use it to pin the matrix the test exercises).
- GEOREF_STATUS_* constants in `_attrs.py` -- The issue frames `georef_status` as a downstream-consumer signal, but the five constants live in private `_attrs`. If you want callers to branch on them rather than the string literals, re-export from `xrspatial.geotiff`. Otherwise document that comparing against the literal strings is the expected API.
Nits
- xrspatial/geotiff/_attrs.py:391 -- Docstring says "The four backends (eager, dask, GPU)" but there are three distinct GPU call sites in `gpu.py` (chunked, eager, tile) on top of eager numpy + dask. Either drop the count or list the five paths.
- `_GEOREF_STATUS_VALUES` frozenset -- Declared but unused. If it stays, expose it so downstream code can validate; otherwise drop it.
- `_backends/vrt.py:286, 717` -- `bool(vrt.crs_wkt)` is currently safe because `_text` returns `None` (never `""`) for missing/empty SRS, but `_compute_georef_status` uses `crs_wkt is not None`. Align the two checks so a future change to `_text` cannot drift the decision.
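The drift risk in the last nit comes down to how the two checks treat an empty string. A minimal, standalone illustration in plain Python (the helper names `has_crs_truthy` and `has_crs_not_none` are made up for this example and do not exist in the codebase):

```python
# Standalone illustration; both helper names are hypothetical.

def has_crs_truthy(crs_wkt):
    """The check the VRT branches used: empty string counts as 'no CRS'."""
    return bool(crs_wkt)


def has_crs_not_none(crs_wkt):
    """The check the GeoInfo helper uses: only None counts as 'no CRS'."""
    return crs_wkt is not None


for value in (None, "", 'GEOGCS["WGS 84"]'):
    print(repr(value), has_crs_truthy(value), has_crs_not_none(value))
```

The two checks agree for `None` and for a non-empty WKT string. They disagree only for `""`, which is the case that would appear if the parser ever started returning an empty string for a missing SRS.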
What looks good
- One decision function plus a booleans-only twin for the VRT branches that don't have a `GeoInfo`. Clean separation.
- Rotated-dropped detection keys off `transform.rotated_affine`, not `has_georef`, so a future change to how `has_georef` is set cannot leak a true-no-transform file into the rotated bucket.
- Pre-branch stamping means changes to which optional attrs get emitted can't shift the status value.
- Tests cover the unit decision, the five reader states across numpy / dask / GPU / VRT-eager / VRT-chunked, and round-trip stability for the four writable states. The `rotated_dropped` write-side asymmetry is documented inline.
- Contract version assertions now track `_ATTRS_CONTRACT_VERSION` instead of literals; the next bump only touches the constant.
Checklist
- Decision table matches the issue
- All read paths funnel through one of the two helpers
- Edge cases: empty `crs_wkt` handled (via `_text` returning None)
- No materialization concerns (attrs work only)
- Docstrings accurate modulo the count typo
- Expose GEOREF_STATUS_* constants and GEOREF_STATUS_VALUES through xrspatial.geotiff. Downstream code can now branch on the constants via the public package surface; previously the only path was an import from the private _attrs module.
- Align the VRT branches with _compute_georef_status by checking ``vrt.crs_wkt is not None`` rather than ``bool(vrt.crs_wkt)``. The VRT XML parser returns None (not "") for missing/empty SRS today; pinning the rule defends the alignment if the parser ever changes.
- Drop the unused _ROTATED_M import in the test module.
- Fix the _compute_georef_status docstring to enumerate the read call sites instead of an out-of-date count.
brendancol
commented
May 19, 2026
Contributor
Author
brendancol
left a comment
Follow-up: review findings addressed
All suggestions and nits from the previous review pass are addressed in 89283ce.
Fixed
- Public constants exposed. `GEOREF_STATUS_FULL`, `GEOREF_STATUS_TRANSFORM_ONLY`, `GEOREF_STATUS_CRS_ONLY`, `GEOREF_STATUS_NONE`, `GEOREF_STATUS_ROTATED_DROPPED`, and the new `GEOREF_STATUS_VALUES` frozenset are now re-exported from `xrspatial.geotiff` and added to `__all__`. Downstream code can `from xrspatial.geotiff import GEOREF_STATUS_FULL` instead of reaching into the private `_attrs` module. The `test_features.py` public-API frozen list grew by six names to match.
- VRT `bool(vrt.crs_wkt)` -> `vrt.crs_wkt is not None` in both VRT branches. Matches the `_compute_georef_status` GeoInfo helper exactly; the comment notes why the alignment matters even though `_text` returns None today.
- Unused `_ROTATED_M` import dropped from the test module.
- `_compute_georef_status` docstring rewritten to enumerate the actual call sites (eager numpy, dask, three GPU paths, two VRT helpers) instead of the stale "four backends" count.
- `_GEOREF_STATUS_VALUES` renamed to `GEOREF_STATUS_VALUES` since it is now part of the public surface, with a docstring comment.
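For downstream code, branching on the re-exported constants might look something like the sketch below. To stay self-contained it defines local stand-ins instead of importing from `xrspatial.geotiff`. The string values assigned here are assumed to match the five status strings, and `describe_georef` is a hypothetical consumer function, not part of the library.

```python
# Local stand-ins so the sketch runs without xrspatial installed.
# In real code these would come from:
#   from xrspatial.geotiff import GEOREF_STATUS_FULL, ...
# The string values below are assumed to equal the five status strings.
GEOREF_STATUS_FULL = "full"
GEOREF_STATUS_TRANSFORM_ONLY = "transform_only"
GEOREF_STATUS_CRS_ONLY = "crs_only"
GEOREF_STATUS_NONE = "none"
GEOREF_STATUS_ROTATED_DROPPED = "rotated_dropped"
GEOREF_STATUS_VALUES = frozenset({
    GEOREF_STATUS_FULL,
    GEOREF_STATUS_TRANSFORM_ONLY,
    GEOREF_STATUS_CRS_ONLY,
    GEOREF_STATUS_NONE,
    GEOREF_STATUS_ROTATED_DROPPED,
})


def describe_georef(attrs: dict) -> str:
    """Hypothetical consumer: branch on the single status attr."""
    status = attrs.get("georef_status")
    if status not in GEOREF_STATUS_VALUES:
        # Covers a missing key as well as an unknown value.
        raise ValueError(f"unexpected georef_status: {status!r}")
    if status == GEOREF_STATUS_FULL:
        return "CRS and transform available"
    if status == GEOREF_STATUS_TRANSFORM_ONLY:
        return "transform available, CRS missing"
    if status == GEOREF_STATUS_CRS_ONLY:
        return "CRS available, transform missing"
    if status == GEOREF_STATUS_ROTATED_DROPPED:
        return "rotated transform was dropped on read"
    return "no georeferencing"


if __name__ == "__main__":
    print(describe_georef({"georef_status": GEOREF_STATUS_FULL}))
```

Validating against `GEOREF_STATUS_VALUES` first is one way to use the frozenset the review asked to expose; whether to raise or fall back when the attr is missing is a choice for the consumer.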
Added
- `test_public_constants_reexported`: pins the public re-exports so a future refactor cannot quietly drop one of the six names.
Verified
Full geotiff test suite: 4277 passed, 25 skipped.
…087373c6751 # Conflicts: # xrspatial/geotiff/_attrs.py
…087373c6751 # Conflicts: # xrspatial/geotiff/_attrs.py # xrspatial/geotiff/_backends/vrt.py
Summary
One canonical `attrs['georef_status']` so downstream code can branch on a single value instead of reconciling `crs`, `crs_wkt`, `transform`, and `_xrspatial_no_georef` by hand. Five values cover what the reader can actually see:
- `full` -- CRS resolved plus axis-aligned transform.
- `transform_only` -- transform present, no CRS.
- `crs_only` -- CRS present, no transform tags.
- `none` -- neither.
- `rotated_dropped` -- rotated `ModelTransformationTag` dropped under `allow_rotated=True`. Previously indistinguishable from `none` via public attrs.
Changes:
- `_attrs.py`: `_compute_georef_status` (takes a `GeoInfo`) and `_compute_georef_status_from_parts` (for the VRT branches that don't build a `GeoInfo`). All read sites -- eager, dask, GPU, VRT-eager, VRT-chunked -- funnel through one decision rule.
- `_ATTRS_CONTRACT_VERSION` goes 2 to 3. The contract docstring at the top of `_attrs.py` documents the new key.
- `crs` / `crs_wkt` / `transform` / `_xrspatial_no_georef` keep their pre-v3 semantics so existing consumers don't change.
Backend coverage
- Eager numpy, dask, GPU: stamped (via `_populate_attrs_from_geo_info`)
- VRT eager, VRT chunked: stamped (via `_compute_georef_status_from_parts`)
Test plan
- `test_georef_status_2136.py`: per-state reads across numpy, dask, GPU (skipped without CUDA), and both VRT branches, plus a parametrised round-trip test for the four writable states. `rotated_dropped` stays read-only because `to_geotiff` doesn't emit rotated `ModelTransformationTag`; the issue calls this out under Out of Scope.
- `test_attrs_contract_canonical_1984.py`: `georef_status` added to the canonical key list, pinned to `full` for the canonical fixture.
- `test_attrs_contract_version_1984.py` and `test_attrs_contract_passthrough_1984.py`: switched to tracking `_ATTRS_CONTRACT_VERSION` rather than the literal `2`, so the next bump only touches the constant.
Closes #2136.