geotiff: honour allow_rotated for rotated ModelTransformationTag#2116
Conversation
The validator at _validation.py:893 documents allow_rotated=True as an opt-out for the rotated-grid check, and the VRT path honours it. The GeoTIFF path did not: the geotag parser at _geotags.py:509 raised NotImplementedError before the validator ever saw the file. Thread allow_rotated through _extract_transform, extract_geo_info, and extract_geo_info_with_overview_inheritance. When the caller opts in, the parser returns an identity GeoTransform with has_georef=False so the reader produces a pixel-coord array, and stashes the rotated 6-tuple on transform.rotated_affine for downstream consumers that want the matrix. Dask reads chunk through read_to_array and the per-chunk allow_rotated kwarg threads via _delayed_read_window so the rotation check does not fire per-chunk. GPU and HTTP/cloud paths also accept allow_rotated, but the HTTP metadata parser still needs the kwarg threaded if a user wants to read a rotated COG over HTTP -- noted for follow-up.
brendancol
left a comment
There was a problem hiding this comment.
Self-review of PR #2116.
Coverage
_extract_transformacceptsallow_rotated. When True and rotation/skew/z-coupling is present, returns an identityGeoTransformwithhas_georef=Falseand stashes the rotated 6-tuple ontransform.rotated_affinefor downstream consumers.- Threaded through
extract_geo_info,extract_geo_info_with_overview_inheritance,read_to_array,_read_geo_info, and the dask_delayed_read_windowso per-chunk dask reads do not re-trip the check. - Default behaviour is unchanged:
allow_rotated=Falsekeeps the existingNotImplementedErrorsafety net.
Risk areas worth a closer look
- HTTP / cloud metadata parser (
_parse_cog_http_metaand_read_cog_http) still does not threadallow_rotated. A rotated COG over HTTP still raises; noted in the commit body as follow-up. - Existing monkeypatch tests (
test_overview_pixel_is_point_1642,test_overview_nodata_inheritance_1739) needed to accept the new keyword; that is a signature-leak symptom of the threading and worth keeping in mind if you would rather passallow_rotatedvia context. - The
rotated_affineattribute is set viaobject.__setattr__so it works whetherGeoTransformstays a plain dataclass or becomes frozen later. The attrs round-trip code does not yet emit it on read; if you want the rotated matrix to survive a read / write cycle, that is a separate change.
Validation
- The existing
_check_read_rotated_transformvalidator short-circuits whenallow_rotated=Trueor when the transform tuple is None, so the no-georef path the parser produces does not trigger the rotated-grid error a second time.
brendancol
left a comment
There was a problem hiding this comment.
Follow-up to my self-review.
HTTP / cloud allow_rotated -> deferred.
_parse_cog_http_meta and _read_cog_http use the same extract_geo_info_with_overview_inheritance call but build their context without an allow_rotated arg. Threading it through adds the kwarg on _parse_cog_http_meta, _read_cog_http, and the HTTP path inside read_to_array (the early return for http:// URIs at line 3219). All callsites are mechanical, but the HTTP path is also the one without test coverage in this branch's test, so I would rather land it in a follow-up that ships a corresponding HTTP-rotated test than amend now without a verification target.
Monkeypatch tests -> already updated.
test_overview_pixel_is_point_1642 and test_overview_nodata_inheritance_1739 both define local fake_extract shims for extract_geo_info. Adding the new keyword to those local definitions is unavoidable signature leak; the alternative would be passing allow_rotated through a context dict, which is structurally heavier than the existing kwarg flow.
rotated_affine attrs roundtrip -> deferred.
The parser stashes the 6-tuple on transform.rotated_affine, but _populate_attrs_from_geo_info does not emit it. So a to_geotiff round-trip on a allow_rotated=True read would write an identity-affine output and lose the original rotation. The flagged user path is "read pixel grid without geo assumption", not "preserve rotation through round-trip", so this is the right default for now. If round-trip preservation matters, a separate PR can teach the attrs writer to emit rotated_affine and the writer's ModelTransformationTag emitter to consume it.
Self-review on #2116 deferred the HTTP allow_rotated wiring. Address it. ``_parse_cog_http_meta`` and ``_read_cog_http`` now accept ``allow_rotated`` and forward it to ``extract_geo_info_with_overview_inheritance`` the same way the local file and dask paths do. The dispatcher in ``read_to_array`` passes the kwarg into the HTTP branch, and the metadata helper ``_read_geo_info`` forwards it on the fsspec / HTTP COG metadata read used by the dask graph builder. Tests use a loopback HTTP server (with the private hosts env var enabled) to verify both the default raise and the opt-in pixel read on a rotated HTTP source. The remaining deferred item from the self-review (rotated_affine attrs roundtrip) is a writer feature -- the parser already stashes the rotated 6-tuple on transform.rotated_affine but the attrs writer does not emit it. Round-trip preservation is a separate feature that belongs with the writer changes for ModelTransformationTag emission.
brendancol
left a comment
There was a problem hiding this comment.
Re-review after the HTTP allow_rotated thread-through (1f02a0a).
Suggestions (should fix, not blocking)
-
Per-chunk dask reads re-encounter the rotation check. The dask graph builder passes
allow_rotatedthrough_delayed_read_windowso each chunk'sread_to_arrayre-parses the IFD and re-runs_extract_transform. Withallow_rotated=Truethe cost is just the parsing, not a raise, but it scales linearly with chunk count. For a 100-chunk dask read of a rotated file you pay 100 IFD parses. Threading metadata down once (the existing pattern for some kwargs) would amortise it; this PR matches the existing dask shape, so it's not a regression -- just worth tracking. -
The opt-in
GeoTransform()is an identity transform withrotated_affineattached. Downstream code that readstransform.origin_x/transform.pixel_widthfromhas_georef=Falseresults will get0.0and1.0. That is the documented contract -- "ungeoreferenced pixel grid" -- but a future reader of the code may be tempted to use the diagonal ofrotated_affineas a fallback whenhas_georefis False. The doc on_extract_transformcould be more explicit that the transform is intentionally inert and that consumers wanting the rotated mapping must readrotated_affinedirectly. -
rotated_affineis set viaobject.__setattr__to survive a future frozen dataclass conversion. If a future PR usesdataclasses.replaceon aGeoTransform, the attribute drops becausereplacedoes not copy non-field attributes. Adding arotated_affine: tuple | None = Nonefield to the dataclass would survive that without changing today's behaviour.
Nits (optional)
-
Existing monkeypatch tests still leak the signature.
test_overview_pixel_is_point_1642andtest_overview_nodata_inheritance_1739redefineextract_geo_infolocally; the kwarg add forces a corresponding local-signature change. Not a regression, but if a future refactor moves to a registry / dispatch object the local shims would be drift-prone. -
rotated_affineis never emitted on write. A read-write round-trip silently loses the rotation. The kwarg contract is read-only, so this is by design; if anyone hits this, the writer needs a correspondingModelTransformationTagemit path. Worth a doc note onallow_rotatedso the limitation is discoverable. -
HTTP loopback server tests need
XRSPATIAL_GEOTIFF_ALLOW_PRIVATE_HOSTS=1. Documented in-test viamonkeypatch.setenv; same pattern used elsewhere in the suite.
What looks good
- The opt-in is now uniformly available across local files, dask, fsspec / HTTP COG, and the dispatcher; tests cover both raise-by-default and read-on-opt-in on each.
rotated_affinepreserves the original matrix for downstream consumers (validator + future writer); the validator's short-circuit onallow_rotated=Truecontinues to mean the no-georef path doesn't double-trip the rotated-grid error.- HTTP-loopback tests use a real server with the private-hosts env so the threading is exercised end-to-end rather than mocked.
# Conflicts: # xrspatial/geotiff/_geotags.py
Make rotated_affine a real GeoTransform dataclass field instead of an object.__setattr__ side-channel, so dataclasses.replace and similar helpers preserve it. The non-rotated case sets the field to None; the opt-in path passes the 6-tuple via the constructor. Expand the _extract_transform docstring to spell out that the returned identity GeoTransform on the allow_rotated=True path is intentionally inert -- callers must read transform.rotated_affine for the real mapping instead of falling back to origin_x / pixel_width on a has_georef=False result. Document the read-only contract on both _extract_transform and the public open_geotiff allow_rotated kwarg: to_geotiff does not emit rotated_affine, so a read-then-write round-trip writes an identity-affine output and silently drops the rotation. The kwarg was previously undocumented on open_geotiff entirely. Update the test that asserted no rotated_affine attribute existed on the axis-aligned path; the field is always present now, just None.
Closes #2115.
Summary
_extract_transformnow acceptsallow_rotated. When True and the file declares rotation/skew/z-coupling, it returns an identityGeoTransformwithhas_georef=False(pixel grid) and stashes the rotated 6-tuple ontransform.rotated_affinefor downstream consumers.extract_geo_infoandextract_geo_info_with_overview_inheritancepassallow_rotatedthrough.read_to_array,_read_geo_info, and the dask_delayed_read_windowacceptallow_rotatedso eager numpy and dask reads both honour the opt-in.open_geotiff(..., allow_rotated=True)now does what the validator's error message promised: reads the pixel grid without raising.Backend coverage
_delayed_read_window).Test plan
test_allow_rotated_geotiff_2115.py: unit tests on_extract_transform(default raises, opt-in returns no-georef, axis-aligned unchanged) plus end-to-end open_geotiff with a synthetic rotated TIFF on both eager and dask paths.test_overview_pixel_is_point_1642,test_overview_nodata_inheritance_1739) updated to accept the new keyword.xrspatial/geotiff/tests/suite: 4025 passed, 25 skipped.