Skip to content

geotiff: move no-georef signal off coord shape onto attrs marker#2124

Merged
brendancol merged 2 commits into
mainfrom
issue-2120
May 19, 2026
Merged

geotiff: move no-georef signal off coord shape onto attrs marker#2124
brendancol merged 2 commits into
mainfrom
issue-2120

Conversation

@brendancol
Copy link
Copy Markdown
Contributor

Closes #2120

Summary

  • The writer used to detect the read-side no-georef placeholder by coord shape (int64 ascending step-1 on both axes). User-authored grids matching that pattern lost their georef on round-trip with no warning. The reproducer x=[500,501,502], y=[1000,1001] came back as [0,1,2] with no transform.
  • Move the signal off shape onto attrs['_xrspatial_no_georef']. The reader stamps the marker together with the placeholder coords; the writer checks the marker and ignores coord shape.
  • A user-authored int64 step-1 grid without the marker now falls through to coords_to_transform, which synthesises a unit transform and preserves CRS attrs.

Backend coverage

  • numpy, dask+numpy, GPU (cupy + dask+cupy): all go through _populate_attrs_from_geo_info, which now stamps the marker.
  • VRT: emits empty coords on the no-georef path so the marker is not needed there.

Test plan

  • New regression tests in test_no_georef_marker_2120.py covering: user grid keeps CRS and gains a transform; explicit-marker writes skip transform synthesis; round-trip via a real no-georef file preserves the marker.
  • Updated test_int_coord_sentinel_2087.py and test_no_georef_writer_round_trip_1949.py to pin the new contract (shape alone is no longer the signal).
  • Full xrspatial/geotiff/tests suite: 4222 passed, 25 skipped.
  • Full non-geotiff suite: 3672 passed, 15 skipped.

The writer detected the read-side no-georef placeholder by coord
shape: int64 ascending step-1 on both axes meant "no georef" and
skipped transform synthesis. Real user grids that happened to match
the same pattern (e.g. ``x=[500,501,502], y=[1000,1001]``) lost their
georef on round-trip with no warning.

Move the signal off shape onto ``attrs['_xrspatial_no_georef']``.
The reader stamps the marker whenever it emits placeholder coords,
the writer checks the marker (only), and a user-authored int64 grid
without the marker falls through to ``coords_to_transform`` and
keeps its georef.

The shape helper ``_is_no_georef_sentinel`` stays in the module for
its existing diagnostic tests but is no longer wired into the writer
path. ``test_int_coord_sentinel_2087.py`` and the corresponding 3D
helper test in ``test_no_georef_writer_round_trip_1949.py`` are
updated to pin the new contract.

Closes #2120
@github-actions github-actions Bot added the performance PR touches performance-sensitive code label May 19, 2026
Copy link
Copy Markdown
Contributor Author

@brendancol brendancol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review: geotiff: move no-georef signal off coord shape onto attrs marker

Blockers (must fix before merge)

None.

Suggestions (should fix, not blocking)

  • xrspatial/geotiff/_attrs.py:173 — the mid-file from ._coords import _NO_GEOREF_KEY with # noqa: E402,F401 works but reads oddly. _geotags.py is imported by both _attrs and _coords, so parking the constant there would let both files pick it up at the top with no noqa.
  • xrspatial/geotiff/_backends/vrt.py:268 — VRT's no-transform branch leaves coords = {} and skips the marker. Safe today (no x/y coords for the writer to misinterpret) but if a future change adds placeholder coords here without the marker, the silent-strip is back. Setting attrs[_NO_GEOREF_KEY] = True defensively in that else would prevent that.

Nits (optional improvements)

  • xrspatial/geotiff/_coords.py:64_is_no_georef_sentinel is now only called from test_int_coord_sentinel_2087.py. The docstring already says so, but a one-line comment near the definition ("kept for diagnostic / test pinning only") would help anyone greping for callers.
  • xrspatial/geotiff/_coords.py:61_has_no_georef_marker uses is True (identity, not truthiness). That's deliberate -- a stray attrs['_xrspatial_no_georef'] = 'yes' shouldn't flip the writer -- but an inline note would save the next reader a moment.

What looks good

  • The marker resolves a genuine ambiguity. The old shape-based check was forced to guess between two cases that share a shape; the new check just asks.
  • All four read paths funnel through _populate_attrs_from_geo_info, so the marker is stamped once and propagates uniformly.
  • isel and the other in-memory ops preserve da.attrs, so the marker survives the usual transformations between read and write.
  • New tests in test_no_georef_marker_2120.py cover both directions: user grid keeps georef, explicit marker writes no-georef.
  • The two existing test files that pinned the old behaviour were updated to pin the new contract rather than deleted.

Checklist

  • Algorithm matches reference/paper -- N/A, contract change.
  • All implemented backends produce consistent results -- eager / dask / GPU share _populate_attrs_from_geo_info.
  • NaN handling is correct -- unchanged by this PR.
  • Edge cases covered by tests -- user grid w/o marker, real no-georef read, slicing, explicit opt-in.
  • Dask chunk boundaries handled correctly -- N/A, attrs-only change.
  • No premature materialization or unnecessary copies -- N/A.
  • Benchmark exists or is not needed -- not needed.
  • README feature matrix updated (if applicable) -- N/A.
  • Docstrings present and accurate -- _has_no_georef_marker, _is_no_georef_sentinel, coords_to_transform, and require_transform_for_georeferenced all rewritten for the new role.

- Move ``_NO_GEOREF_KEY`` to ``_geotags`` so both ``_attrs`` and
  ``_coords`` import it from a shared module without the mid-file
  ``# noqa: E402,F401`` workaround.
- Stamp the marker defensively in both VRT no-transform branches
  (eager + chunked). The current code leaves coords empty so the
  writer has nothing to misinterpret, but the marker keeps the
  contract uniform across read paths.
- Document the ``is True`` identity check on
  ``_has_no_georef_marker``: only the exact boolean flips the
  writer, so stray third-party stamps like ``'yes'`` cannot
  silently drop a transform.
- Add a heading comment on ``_is_no_georef_sentinel`` flagging it
  as test-only after #2120 -- the writer no longer calls it.
Copy link
Copy Markdown
Contributor Author

@brendancol brendancol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Follow-up: all review items addressed

Suggestions

  • Fixed_NO_GEOREF_KEY moved to _geotags.py. Both _attrs and _coords now import it at the top of the file; the mid-file noqa import is gone.
  • Fixed — VRT no-transform branches (eager + chunked) now stamp attrs[_NO_GEOREF_KEY] = True defensively, matching the contract from _populate_attrs_from_geo_info.

Nits

  • Fixed — Added a heading comment on _is_no_georef_sentinel flagging it as kept for tests only.
  • Fixed — Added an inline note on _has_no_georef_marker documenting that the is True identity check is deliberate (stray non-boolean stamps should not flip the writer).

Full geotiff suite still green: 4222 passed / 25 skipped.

@brendancol brendancol merged commit 34cc415 into main May 19, 2026
4 of 5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance PR touches performance-sensitive code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

geotiff: to_geotiff silently strips georef on int64 step-1 user coords

1 participant