Skip to content

geotiff: honour allow_rotated for rotated ModelTransformationTag#2116

Merged
brendancol merged 4 commits into
mainfrom
issue-2115
May 19, 2026
Merged

geotiff: honour allow_rotated for rotated ModelTransformationTag#2116
brendancol merged 4 commits into
mainfrom
issue-2115

Conversation

@brendancol
Copy link
Copy Markdown
Contributor

Closes #2115.

Summary

  • _extract_transform now accepts allow_rotated. When True and the file declares rotation/skew/z-coupling, it returns an identity GeoTransform with has_georef=False (pixel grid) and stashes the rotated 6-tuple on transform.rotated_affine for downstream consumers.
  • extract_geo_info and extract_geo_info_with_overview_inheritance pass allow_rotated through.
  • read_to_array, _read_geo_info, and the dask _delayed_read_window accept allow_rotated so eager numpy and dask reads both honour the opt-in.
  • open_geotiff(..., allow_rotated=True) now does what the validator's error message promised: reads the pixel grid without raising.
  • HTTP / cloud metadata parser is unchanged; the kwarg there is follow-up.

Backend coverage

  • numpy: covered.
  • cupy: parameter threaded; behaviour preserved (existing path already accepted the kwarg).
  • dask+numpy: covered (per-chunk threaded via _delayed_read_window).
  • dask+cupy: parameter threaded.

Test plan

  • test_allow_rotated_geotiff_2115.py: unit tests on _extract_transform (default raises, opt-in returns no-georef, axis-aligned unchanged) plus end-to-end open_geotiff with a synthetic rotated TIFF on both eager and dask paths.
  • Existing monkeypatch tests (test_overview_pixel_is_point_1642, test_overview_nodata_inheritance_1739) updated to accept the new keyword.
  • Full xrspatial/geotiff/tests/ suite: 4025 passed, 25 skipped.

The validator at _validation.py:893 documents allow_rotated=True as an
opt-out for the rotated-grid check, and the VRT path honours it. The
GeoTIFF path did not: the geotag parser at _geotags.py:509 raised
NotImplementedError before the validator ever saw the file.

Thread allow_rotated through _extract_transform, extract_geo_info, and
extract_geo_info_with_overview_inheritance. When the caller opts in,
the parser returns an identity GeoTransform with has_georef=False so
the reader produces a pixel-coord array, and stashes the rotated
6-tuple on transform.rotated_affine for downstream consumers that want
the matrix. Dask reads chunk through read_to_array and the per-chunk
allow_rotated kwarg threads via _delayed_read_window so the rotation
check does not fire per-chunk.

GPU and HTTP/cloud paths also accept allow_rotated, but the HTTP
metadata parser still needs the kwarg threaded if a user wants to read
a rotated COG over HTTP -- noted for follow-up.
@github-actions github-actions Bot added the performance PR touches performance-sensitive code label May 19, 2026
Copy link
Copy Markdown
Contributor Author

@brendancol brendancol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Self-review of PR #2116.

Coverage

  • _extract_transform accepts allow_rotated. When True and rotation/skew/z-coupling is present, returns an identity GeoTransform with has_georef=False and stashes the rotated 6-tuple on transform.rotated_affine for downstream consumers.
  • Threaded through extract_geo_info, extract_geo_info_with_overview_inheritance, read_to_array, _read_geo_info, and the dask _delayed_read_window so per-chunk dask reads do not re-trip the check.
  • Default behaviour is unchanged: allow_rotated=False keeps the existing NotImplementedError safety net.

Risk areas worth a closer look

  • HTTP / cloud metadata parser (_parse_cog_http_meta and _read_cog_http) still does not thread allow_rotated. A rotated COG over HTTP still raises; noted in the commit body as follow-up.
  • Existing monkeypatch tests (test_overview_pixel_is_point_1642, test_overview_nodata_inheritance_1739) needed to accept the new keyword; that is a signature-leak symptom of the threading and worth keeping in mind if you would rather pass allow_rotated via context.
  • The rotated_affine attribute is set via object.__setattr__ so it works whether GeoTransform stays a plain dataclass or becomes frozen later. The attrs round-trip code does not yet emit it on read; if you want the rotated matrix to survive a read / write cycle, that is a separate change.

Validation

  • The existing _check_read_rotated_transform validator short-circuits when allow_rotated=True or when the transform tuple is None, so the no-georef path the parser produces does not trigger the rotated-grid error a second time.

Copy link
Copy Markdown
Contributor Author

@brendancol brendancol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Follow-up to my self-review.

HTTP / cloud allow_rotated -> deferred.
_parse_cog_http_meta and _read_cog_http use the same extract_geo_info_with_overview_inheritance call but build their context without an allow_rotated arg. Threading it through adds the kwarg on _parse_cog_http_meta, _read_cog_http, and the HTTP path inside read_to_array (the early return for http:// URIs at line 3219). All callsites are mechanical, but the HTTP path is also the one without test coverage in this branch's test, so I would rather land it in a follow-up that ships a corresponding HTTP-rotated test than amend now without a verification target.

Monkeypatch tests -> already updated.
test_overview_pixel_is_point_1642 and test_overview_nodata_inheritance_1739 both define local fake_extract shims for extract_geo_info. Adding the new keyword to those local definitions is unavoidable signature leak; the alternative would be passing allow_rotated through a context dict, which is structurally heavier than the existing kwarg flow.

rotated_affine attrs roundtrip -> deferred.
The parser stashes the 6-tuple on transform.rotated_affine, but _populate_attrs_from_geo_info does not emit it. So a to_geotiff round-trip on a allow_rotated=True read would write an identity-affine output and lose the original rotation. The flagged user path is "read pixel grid without geo assumption", not "preserve rotation through round-trip", so this is the right default for now. If round-trip preservation matters, a separate PR can teach the attrs writer to emit rotated_affine and the writer's ModelTransformationTag emitter to consume it.

Self-review on #2116 deferred the HTTP allow_rotated wiring. Address it.

``_parse_cog_http_meta`` and ``_read_cog_http`` now accept
``allow_rotated`` and forward it to
``extract_geo_info_with_overview_inheritance`` the same way the local
file and dask paths do. The dispatcher in ``read_to_array`` passes the
kwarg into the HTTP branch, and the metadata helper ``_read_geo_info``
forwards it on the fsspec / HTTP COG metadata read used by the dask
graph builder. Tests use a loopback HTTP server (with the private
hosts env var enabled) to verify both the default raise and the
opt-in pixel read on a rotated HTTP source.

The remaining deferred item from the self-review (rotated_affine attrs
roundtrip) is a writer feature -- the parser already stashes the
rotated 6-tuple on transform.rotated_affine but the attrs writer does
not emit it. Round-trip preservation is a separate feature that
belongs with the writer changes for ModelTransformationTag emission.
Copy link
Copy Markdown
Contributor Author

@brendancol brendancol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-review after the HTTP allow_rotated thread-through (1f02a0a).

Suggestions (should fix, not blocking)

  • Per-chunk dask reads re-encounter the rotation check. The dask graph builder passes allow_rotated through _delayed_read_window so each chunk's read_to_array re-parses the IFD and re-runs _extract_transform. With allow_rotated=True the cost is just the parsing, not a raise, but it scales linearly with chunk count. For a 100-chunk dask read of a rotated file you pay 100 IFD parses. Threading metadata down once (the existing pattern for some kwargs) would amortise it; this PR matches the existing dask shape, so it's not a regression -- just worth tracking.

  • The opt-in GeoTransform() is an identity transform with rotated_affine attached. Downstream code that reads transform.origin_x / transform.pixel_width from has_georef=False results will get 0.0 and 1.0. That is the documented contract -- "ungeoreferenced pixel grid" -- but a future reader of the code may be tempted to use the diagonal of rotated_affine as a fallback when has_georef is False. The doc on _extract_transform could be more explicit that the transform is intentionally inert and that consumers wanting the rotated mapping must read rotated_affine directly.

  • rotated_affine is set via object.__setattr__ to survive a future frozen dataclass conversion. If a future PR uses dataclasses.replace on a GeoTransform, the attribute drops because replace does not copy non-field attributes. Adding a rotated_affine: tuple | None = None field to the dataclass would survive that without changing today's behaviour.

Nits (optional)

  • Existing monkeypatch tests still leak the signature. test_overview_pixel_is_point_1642 and test_overview_nodata_inheritance_1739 redefine extract_geo_info locally; the kwarg add forces a corresponding local-signature change. Not a regression, but if a future refactor moves to a registry / dispatch object the local shims would be drift-prone.

  • rotated_affine is never emitted on write. A read-write round-trip silently loses the rotation. The kwarg contract is read-only, so this is by design; if anyone hits this, the writer needs a corresponding ModelTransformationTag emit path. Worth a doc note on allow_rotated so the limitation is discoverable.

  • HTTP loopback server tests need XRSPATIAL_GEOTIFF_ALLOW_PRIVATE_HOSTS=1. Documented in-test via monkeypatch.setenv; same pattern used elsewhere in the suite.

What looks good

  • The opt-in is now uniformly available across local files, dask, fsspec / HTTP COG, and the dispatcher; tests cover both raise-by-default and read-on-opt-in on each.
  • rotated_affine preserves the original matrix for downstream consumers (validator + future writer); the validator's short-circuit on allow_rotated=True continues to mean the no-georef path doesn't double-trip the rotated-grid error.
  • HTTP-loopback tests use a real server with the private-hosts env so the threading is exercised end-to-end rather than mocked.

# Conflicts:
#	xrspatial/geotiff/_geotags.py
Make rotated_affine a real GeoTransform dataclass field instead of an
object.__setattr__ side-channel, so dataclasses.replace and similar
helpers preserve it. The non-rotated case sets the field to None; the
opt-in path passes the 6-tuple via the constructor.

Expand the _extract_transform docstring to spell out that the returned
identity GeoTransform on the allow_rotated=True path is intentionally
inert -- callers must read transform.rotated_affine for the real mapping
instead of falling back to origin_x / pixel_width on a has_georef=False
result.

Document the read-only contract on both _extract_transform and the
public open_geotiff allow_rotated kwarg: to_geotiff does not emit
rotated_affine, so a read-then-write round-trip writes an
identity-affine output and silently drops the rotation. The kwarg was
previously undocumented on open_geotiff entirely.

Update the test that asserted no rotated_affine attribute existed on
the axis-aligned path; the field is always present now, just None.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance PR touches performance-sensitive code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

allow_rotated does not work for rotated GeoTIFFs

1 participant