geotiff: tier the public surface, gate experimental codecs (#2137)#2150
Conversation
Add SUPPORTED_FEATURES module constant that enumerates every public feature in xrspatial.geotiff with one of four tiers: stable, advanced, experimental, internal_only. The mapping is the source of truth that the docstrings, the writer gates, and the user-guide notebook all read from. Add allow_experimental_codecs=False kwarg to to_geotiff and write_geotiff_gpu. Setting compression= to a Tier 3 codec (lerc, jpeg2000 / j2k, lz4) without the flag raises ValueError naming the flag; with the flag the writer emits GeoTIFFFallbackWarning and proceeds. Modelled on the existing allow_internal_only_jpeg opt-in for the Tier 4 jpeg codec (#1845); internal-only is a stricter tier than experimental and keeps its own dedicated flag. Docstrings on to_geotiff, write_geotiff_gpu, and open_geotiff carry a tier marker block after the summary, and per-parameter Advanced / Experimental markers on the kwargs that cross tier boundaries. User-guide notebook 39_GeoTIFF_IO.ipynb renders SUPPORTED_FEATURES as a markdown table so the documentation cannot drift from the code. Existing in-tree tests that wrote Tier 3 codecs are updated to pass allow_experimental_codecs=True so they exercise the encode path rather than the rejection gate.
brendancol
left a comment
There was a problem hiding this comment.
PR Review: geotiff: tier the public surface, gate experimental codecs (#2137)
Blockers (must fix before merge)
None.
Suggestions (should fix, not blocking)
-
xrspatial/geotiff/_writers/eager.py:423-- the experimental-codec warning at the CPU dispatcher fires unconditionally before the GPU dispatch decision. Soto_geotiff(gpu=True, compression='lerc', allow_experimental_codecs=True)warns here, then warns again inwrite_geotiff_gpuat_writers/gpu.py:321. The JPEG opt-in ateager.py:471-483avoids the double-warn by gating onnot use_gpu(theuse_gpulocal is computed at line 470 for exactly this reason). The comment ateager.py:417-422says this "mirrorsallow_internal_only_jpeg" but the JPEG path does the opposite. Either gate the new warning the same way, or fix the comment and the docstring's "once per call" claim to acknowledge the double-warn. -
Acceptance criterion 4 in #2137 names seven entry points that need a tier marker:
open_geotiff,to_geotiff,read_geotiff_gpu,write_geotiff_gpu,read_vrt,write_vrt, andread_geotiff_dask. The PR coversto_geotiff,write_geotiff_gpu, andopen_geotiff. The other four (_backends/gpu.py:92,_backends/dask.py:34,_backends/vrt.py:33,_writers/vrt.py:21) still have no tier marker block at the top of their docstrings. -
test_supported_features_tiers_2137.py:216-222-- the docstring promises "emitsGeoTIFFFallbackWarningonce per call" but the assertion only checkslen(fallback) >= 1. Pin the count:assert len(fallback) == 1would have caught the double-warn the next time someone touched the dispatcher.
Nits (optional improvements)
-
_writers/eager.py:407and_writers/gpu.py:308import_EXPERIMENTAL_CODECSinside the function body. The constant only depends onSUPPORTED_FEATURES, so there is no circular-import risk for hoisting the import to module scope; it would save one lookup per write call.
What looks good
_EXPERIMENTAL_CODECSis derived fromSUPPORTED_FEATURESrather than written out twice. The gate cannot drift from the documentation.- The new test walks
SUPPORTED_FEATURESinstead of hard-coding the codec list. Adding a new Tier 3 entry grows the test matrix automatically. - The notebook renders the table from the live constant.
__all__carries the newSUPPORTED_FEATURESexport and the existingtest_all_lists_supported_functionsparity check was updated in the same commit.- The PR resists the temptation to fold
compression='jpeg'into the new flag. Internal-only is the stricter tier and keeps its own gate, matching acceptance criterion #3. - The in-tree tests that wrote Tier 3 codecs were updated to pass the new flag, so the suite stays green rather than silently breaking 41 tests.
…arkers (#2137) - to_geotiff: gate the Tier 3 opt-in warning on ``not use_gpu`` so the CPU dispatcher emits exactly once and the GPU path's own warning fires alone. Mirrors how the JPEG opt-in is wired and lets the ``"once per call"`` docstring claim hold under the auto-dispatch ``gpu=True`` path. - Move SUPPORTED_FEATURES and _EXPERIMENTAL_CODECS from the package __init__ to _attrs.py so the writers can import them at module scope. The package __init__ re-exports SUPPORTED_FEATURES so the public API at xrspatial.geotiff.SUPPORTED_FEATURES is unchanged. - Add tier marker blocks to read_geotiff_gpu, read_geotiff_dask, read_vrt, and write_vrt docstrings so every public entry point listed in acceptance criterion #4 carries one. - Pin the warning count in test_supported_features_tiers_2137.py at exactly one per call so a future double-warn regression fails the test rather than passing the loose ``>= 1`` check.
brendancol
left a comment
There was a problem hiding this comment.
Follow-up: addressed the three Suggestions and the Nit
Disposition
-
Suggestion 1 (double-warn on GPU dispatch path) -- fixed in
a187138. The CPU dispatcher's experimental warning is now gated onnot use_gpu, mirroring the JPEG opt-in ateager.py:471-483. Thewrite_geotiff_gpuwarning still fires on the GPU path. The rejection check at line 411 still runs unconditionally so a default call sees the sameValueErroreither way. -
Suggestion 2 (missing tier markers on four entry points) -- fixed in
a187138. Added tier marker blocks to:read_geotiff_gpu(_backends/gpu.py) -- Experimentalread_geotiff_dask(_backends/dask.py) -- Stable for the Tier 1 codec set, Advanced for theallow_*opt-insread_vrt(_backends/vrt.py) -- Advanced, with the specific failure modes namedwrite_vrt(_writers/vrt.py) -- Advanced
All seven entry points listed in acceptance criterion 4 now carry the marker.
-
Suggestion 3 (loose warning count assertion) -- fixed in
a187138. The opt-in test now assertslen(fallback) == 1and includes a comment explaining what the count would catch. With Suggestion 1 fixed the assertion holds; without it, the GPU dispatch path would fail this test. -
Nit 4 (lazy import) -- fixed in
a187138. MovedSUPPORTED_FEATURESand_EXPERIMENTAL_CODECSfrom the package__init__to_attrs.pyso the writers can import them at module scope. The package__init__re-exportsSUPPORTED_FEATURES, so the public API atxrspatial.geotiff.SUPPORTED_FEATURESis unchanged. This breaks the circular dependency that forced the lazy import.
Verification
- Full geotiff test suite: 4262 passed, 25 skipped on the latest commit.
- The new
test_experimental_codec_opt_in_emits_warningparametrize set still passes with the tighter== 1assertion. - No other test files needed updates for the
_attrs.pymove; the public re-export keeps the existing import paths working.
#2146 (nodata lifecycle) and #2150 (tier surface) had landed cleanly on main without touching _populate_attrs_from_geo_info, but #2125's "drop attrs['crs'] on rotated reads" introduced a _vrt_is_rotated gate on the VRT branches that our pre-merge dataclass-based VRT code did not honour. Resolved by: * _attrs.py: kept the GeoTIFFMetadata shim. The dataclass already produces the same attrs surface as the legacy field-by-field path. * _backends/vrt.py: kept the dataclass build on both VRT branches and threaded _vrt_is_rotated through to drop crs/crs_wkt and set has_georef=False on rotated reads (so metadata_to_attrs stamps the no-georef marker). Verified test_allow_rotated_no_crs_2122, test_geotiff_metadata_2139, test_nodata_lifecycle_attrs_2135, test_supported_features_tiers_2137, test_backend_parity_matrix, and the broader VRT/attrs/nodata suite (1309 tests) all pass.
Closes #2137.
Summary
xrspatial.geotiff.SUPPORTED_FEATURES, a module-level mapping from feature name to tier label (stable/advanced/experimental/internal_only). Single source of truth for the docstrings, the writer gates, and the user-guide notebook.allow_experimental_codecs=Falsetoto_geotiffandwrite_geotiff_gpu. Tier 3 codecs (lerc,jpeg2000/j2k,lz4) raiseValueErrornaming the flag without it; with it, the writer emitsGeoTIFFFallbackWarningand proceeds. Modelled on the existingallow_internal_only_jpegopt-in for the Tier 4jpegcodec (write_geotiff_gpu emits JPEG TIFFs that other readers reject #1845); the two flags do not collapse into one switch.to_geotiff,write_geotiff_gpu, andopen_geotiffcarry a tier marker block after the summary, plus per-parameterAdvanced:/Experimental:markers on kwargs that cross tier boundaries.39_GeoTIFF_IO.ipynbrendersSUPPORTED_FEATURESas a markdown table so the documentation cannot drift from the code.Out of scope
Breaking change shape
Existing callers of
to_geotiff(compression='lerc')(or'jpeg2000'/'j2k'/'lz4') see aValueErrorthe first time they hit the new gate. The message namesallow_experimental_codecs=Trueso the caller can fix the call site in one line. This is the same break shapeallow_internal_only_jpegalready has, and the issue accepted it there.In-tree tests that wrote Tier 3 codecs are updated to pass the flag so they exercise the encode path rather than the rejection gate.
Backend coverage
CPU (
to_geotiff) and GPU (write_geotiff_gpu) both enforce the gate; the auto-dispatch path throughto_geotiff(gpu=True, ...)forwards the kwarg. Read paths are unchanged (the codec set sits on writers, not readers).Test plan
test_supported_features_tiers_2137.pywalksSUPPORTED_FEATURES, asserts every Tier 3 codec rejects by default and accepts the opt-in with aGeoTIFFFallbackWarning, and pins thatallow_experimental_codecsdoes not unlock the Tier 4jpegcodec.__all__parity test updated to include the newSUPPORTED_FEATURESexport.