Merged
49 changes: 46 additions & 3 deletions examples/user_guide/39_GeoTIFF_IO.ipynb
@@ -261,7 +261,7 @@
"}\n",
"\n",
".xr-group-name::before {\n",
" content: \"📁\";\n",
" content: \"\ud83d\udcc1\";\n",
" padding-right: 0.3em;\n",
"}\n",
"\n",
@@ -324,7 +324,7 @@
"\n",
".xr-section-summary-in + label:before {\n",
" display: inline-block;\n",
" content: \"►\";\n",
" content: \"\u25ba\";\n",
" font-size: 11px;\n",
" width: 15px;\n",
" text-align: center;\n",
@@ -335,7 +335,7 @@
"}\n",
"\n",
".xr-section-summary-in:checked + label:before {\n",
" content: \"▼\";\n",
" content: \"\u25bc\";\n",
"}\n",
"\n",
".xr-section-summary-in:checked + label > span {\n",
@@ -970,6 +970,49 @@
"The VRT is a few hundred bytes of XML. `open_geotiff` assembles the tiles when you read it."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Supported features by tier\n",
"\n",
"`xrspatial.geotiff.SUPPORTED_FEATURES` is the source of truth for which features sit in the stable core, which are advanced but supported, which are experimental, and which require an opt-in flag because they do not round-trip through external readers. The table below is built from that constant so the documentation cannot drift from the code (issue #2137).\n",
"\n",
"Tiers:\n",
"\n",
"- **stable** -- the path a new user should be on. Local file in, local file out, lossless codec, axis-aligned grid.\n",
"- **advanced** -- works and is tested, but the caller should know the failure modes (cloud cost, partial VRT mosaics, rotated transforms dropping on write, BigTIFF promotion, etc.).\n",
"- **experimental** -- no claim about cross-backend numerical parity or external interop. Tier 3 codecs (`lerc`, `jpeg2000` / `j2k`, `lz4`) require `allow_experimental_codecs=True` on `to_geotiff` and `write_geotiff_gpu`; the GPU read/write paths use `gpu=True` as their explicit opt-in.\n",
"- **internal_only** -- the strictest tier. `compression='jpeg'` writes self-contained JFIF tiles without the TIFF JPEGTables tag, so the output decodes through xrspatial but not libtiff / GDAL / rasterio. Requires the dedicated `allow_internal_only_jpeg=True` flag (issue #1845); `allow_experimental_codecs` does not cover it."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from xrspatial.geotiff import SUPPORTED_FEATURES\n",
"\n",
"# Render SUPPORTED_FEATURES as a markdown table grouped by tier.\n",
"# This cell renders from the live constant so the table tracks any\n",
"# future additions to the feature inventory.\n",
"from collections import defaultdict\n",
"from IPython.display import Markdown\n",
"\n",
"_TIER_ORDER = ['stable', 'advanced', 'experimental', 'internal_only']\n",
"_by_tier = defaultdict(list)\n",
"for name, tier in SUPPORTED_FEATURES.items():\n",
"    _by_tier[tier].append(name)\n",
"\n",
"lines = ['| Feature | Tier |', '| --- | --- |']\n",
"for tier in _TIER_ORDER:\n",
"    for name in sorted(_by_tier[tier]):\n",
"        lines.append(f'| `{name}` | {tier} |')\n",
"\n",
"Markdown('\\n'.join(lines))"
]
},
{
"cell_type": "code",
"execution_count": 8,
63 changes: 62 additions & 1 deletion xrspatial/geotiff/__init__.py
@@ -128,6 +128,7 @@
'MixedBandMetadataError',
'NonUniformCoordsError',
'RotatedTransformError',
'SUPPORTED_FEATURES',
'UnparseableCRSError',
'UnsafeURLError',
'open_geotiff',
@@ -140,6 +141,44 @@
]


# ``SUPPORTED_FEATURES`` and its derived ``_EXPERIMENTAL_CODECS`` set
# live in ``_attrs.py`` so the writers can import them at module scope
# without a circular dependency (this ``__init__`` already imports the
# writers, so the writers cannot import from ``..`` at module scope).
# The names are re-exported below to keep the public API at
# ``xrspatial.geotiff.SUPPORTED_FEATURES``.
#
# Tier semantics
# --------------
# - ``"stable"`` -- the path a new user should be on. Local file in,
# local file out, lossless codec, axis-aligned grid. Covered by the
# cross-backend parity matrix.
# - ``"advanced"`` -- works and is tested, but the caller should know
# what they are signing up for (cloud cost, partial VRT mosaics,
# rotated transforms dropping on write, BigTIFF promotion, etc.). No
# kwarg gate; the docstring carries an ``Advanced:`` marker.
# - ``"experimental"`` -- works in our tests, no claim about external
# interop or numerical parity across backends. Tier 3 codecs
# (``lerc``, ``jpeg2000`` / ``j2k``, ``lz4``) require
# ``allow_experimental_codecs=True`` on the writers; the GPU paths
# use ``gpu=True`` as the explicit opt-in.
# - ``"internal_only"`` -- the strictest tier. Already gated behind
# its own dedicated flag because the output does not round-trip
# through libtiff / GDAL / rasterio. ``codec.jpeg`` requires
# ``allow_internal_only_jpeg=True`` (issue #1845);
# ``allow_experimental_codecs`` does NOT cover it.
#
# Tests in ``xrspatial/geotiff/tests/test_supported_features_tiers_2137.py``
# walk the mapping and assert that every Tier 3 codec is rejected without
# the opt-in flag and every Tier 4 codec is rejected without its own
# dedicated flag. The user-guide notebook
# (``examples/user_guide/39_GeoTIFF_IO.ipynb``) renders the same
# mapping as a table so the documentation cannot drift from the code.
#
# See issue #2137.
from ._attrs import SUPPORTED_FEATURES # noqa: E402


def _read_geo_info(source, *, overview_level: int | None = None,
allow_rotated: bool = False):
"""Read only the geographic metadata and image dimensions from a GeoTIFF.
@@ -287,6 +326,15 @@ def open_geotiff(source: str | BinaryIO, *,
) -> xr.DataArray:
"""Read a GeoTIFF, COG, or VRT file into an xarray.DataArray.

Tier: Stable for local-file reads on axis-aligned grids with an
EPSG CRS in ``attrs['crs']``. Cloud / fsspec URIs, HTTP range
reads, ``.vrt`` mosaics, external ``.tif.ovr`` sidecars,
``allow_rotated=True``, and ``allow_unparseable_crs=True`` are
Advanced (work, but each carries a specific failure mode named on
the parameter doc). ``gpu=True`` is Experimental. See
:data:`xrspatial.geotiff.SUPPORTED_FEATURES` for the full tier
map (issue #2137).

Automatically dispatches to the best backend:
- ``gpu=True``: GPU-accelerated read via nvCOMP (returns CuPy)
- ``chunks=N``: Dask lazy read via windowed chunks
@@ -319,12 +367,18 @@ def open_geotiff(source: str | BinaryIO, *,
chunks : int, tuple, or None
Chunk size for Dask lazy reading.
gpu : bool
Use GPU-accelerated decompression (requires cupy + nvCOMP).
Use GPU-accelerated decompression. Experimental: requires cupy +
nvCOMP for the codec the file carries; the reader falls back to
CPU when the optional libraries are unavailable, unless
``on_gpu_failure='strict'`` is also set.
max_pixels : int or None
Maximum allowed pixel count (width * height * samples). None
uses the default (~1 billion). Raise to read legitimately
large files.
max_cloud_bytes : int or None, optional
Advanced: fsspec cloud reads can run up cost on large objects;
the budget defends against accidental large downloads but the
eager path still pulls the full object once the budget allows.
Byte ceiling for eager reads from fsspec sources (``s3://``,
``gs://``, ``az://``, ``abfs://``, ``memory://``, ...). The
compressed object size is checked against this budget before
@@ -345,6 +399,10 @@ def open_geotiff(source: str | BinaryIO, *,
because the policy only applies to the GPU pipeline. See
``read_geotiff_gpu`` for the full description.
missing_sources : {'raise', 'warn'}, optional
Advanced: VRT mosaics can return partial output under
``missing_sources='warn'`` when a backing source is unreadable;
the ``attrs['vrt_holes']`` entry records which sources were
skipped so downstream code can detect the partial mosaic.
Forwarded to ``read_vrt`` when the source is a ``.vrt`` file.
When the caller does not pass this kwarg, the public
``read_vrt`` default applies (``'raise'`` since #1860).
@@ -377,6 +435,9 @@ def open_geotiff(source: str | BinaryIO, *,
pixel, and ``dtype=<integer>`` then raises ``ValueError`` on the
float-to-int cast.
allow_rotated : bool, default False
Advanced: read-only opt-in; ``to_geotiff`` does not currently
emit ``rotated_affine`` so a read-then-write round-trip writes
an identity-affine output and silently drops the rotation.
Read-side opt-in for rotated / sheared ``ModelTransformationTag``
files. By default the reader raises ``NotImplementedError``
because the rest of xrspatial assumes an axis-aligned grid.
55 changes: 55 additions & 0 deletions xrspatial/geotiff/_attrs.py
@@ -160,6 +160,61 @@
)


# Tiered feature inventory for the public geotiff surface (issue #2137).
# Defined in ``_attrs.py`` (not the package ``__init__.py``) so the writers
# can import it at module scope without a circular dependency: the package
# ``__init__`` already imports the writers. The package re-exports
# ``SUPPORTED_FEATURES`` so the public API stays
# ``xrspatial.geotiff.SUPPORTED_FEATURES``.
#
# See ``xrspatial/geotiff/__init__.py`` for the per-tier semantics; the
# inline comments here track the codec/reader/writer split used by the
# user-guide notebook table.
SUPPORTED_FEATURES = {
    # Codecs. Tier 1 lossless integer + float byte-for-byte round-trip.
    'codec.none': 'stable',
    'codec.deflate': 'stable',
    'codec.lzw': 'stable',
    'codec.packbits': 'stable',
    'codec.zstd': 'stable',
    # Tier 3 codecs: require ``allow_experimental_codecs=True``.
    'codec.lerc': 'experimental',
    'codec.jpeg2000': 'experimental',
    'codec.j2k': 'experimental',
    'codec.lz4': 'experimental',
    # Tier 4 codec: requires the dedicated ``allow_internal_only_jpeg``
    # opt-in (issue #1845). Not covered by ``allow_experimental_codecs``.
    'codec.jpeg': 'internal_only',
    # Read paths.
    'reader.local_file': 'stable',
    'reader.fsspec': 'advanced',
    'reader.http': 'advanced',
    'reader.vrt': 'advanced',
    'reader.sidecar_ovr': 'advanced',
    'reader.allow_rotated': 'advanced',
    'reader.allow_unparseable_crs': 'advanced',
    'reader.gpu': 'experimental',
    # Write paths.
    'writer.local_file': 'stable',
    'writer.cog': 'advanced',
    'writer.overviews': 'advanced',
    'writer.bigtiff': 'advanced',
    'writer.gpu': 'experimental',
    'writer.gdal_metadata_xml': 'experimental',
    'writer.extra_tags': 'experimental',
}


# Tier 3 codec names (lower-cased) gated behind
# ``allow_experimental_codecs`` on the writers. Derived from
# ``SUPPORTED_FEATURES`` so the gate cannot drift from the docs.
_EXPERIMENTAL_CODECS = frozenset(
    name.split('.', 1)[1].lower()
    for name, tier in SUPPORTED_FEATURES.items()
    if name.startswith('codec.') and tier == 'experimental'
)


# TIFF type ids needed when synthesizing extra_tags entries from attrs.
_TIFF_BYTE = 1
_TIFF_ASCII = 2
7 changes: 7 additions & 0 deletions xrspatial/geotiff/_backends/dask.py
@@ -45,6 +45,13 @@ def read_geotiff_dask(source: str, *,
mask_nodata: bool = True) -> xr.DataArray:
"""Read a GeoTIFF as a dask-backed DataArray for out-of-core processing.

Tier: Stable for local-file reads on axis-aligned grids with the
Tier 1 codec set. ``allow_rotated`` / ``allow_unparseable_crs``
are Advanced (read-only opt-ins; round-trip semantics are listed
on the parameter docs). See
:data:`xrspatial.geotiff.SUPPORTED_FEATURES` for the full tier map
(issue #2137).

Each chunk is loaded lazily via windowed reads.

Parameters
6 changes: 6 additions & 0 deletions xrspatial/geotiff/_backends/gpu.py
@@ -91,6 +91,12 @@ def read_geotiff_gpu(source: str, *,
) -> xr.DataArray:
"""Read a GeoTIFF with GPU-accelerated decompression via Numba CUDA.

Tier: Experimental (issue #2137). Requires cupy + numba CUDA plus
optional nvCOMP / nvJPEG / nvJPEG2K libraries for codec-specific
acceleration; cross-backend numerical parity with the CPU reader
is tested for the Tier 1 codec set only. See
:data:`xrspatial.geotiff.SUPPORTED_FEATURES` for the full tier map.

Decompresses all tiles in parallel on the GPU and returns a
CuPy-backed DataArray that stays on device memory. No CPU->GPU
transfer needed for downstream xrspatial GPU operations.
7 changes: 7 additions & 0 deletions xrspatial/geotiff/_backends/vrt.py
@@ -45,6 +45,13 @@ def read_vrt(source: str, *,
mask_nodata: bool = True) -> xr.DataArray:
"""Read a GDAL Virtual Raster Table (.vrt) into an xarray.DataArray.

Tier: Advanced (issue #2137). VRT mosaics work and are tested, but
the caller should know the failure modes: cross-source nodata can
disagree (gated by ``band_nodata``), backing files can be missing
(gated by ``missing_sources``), and per-band metadata mismatch
raises a typed error rather than silently flattening. See
:data:`xrspatial.geotiff.SUPPORTED_FEATURES` for the full tier map.

The VRT's source GeoTIFFs are read via windowed reads and assembled
into a single array.
