Skip to content

geotiff: annotate open_geotiff(max_cloud_bytes=...) (#2106)#2108

Merged
brendancol merged 2 commits into
mainfrom
deep-sweep-api-consistency-geotiff-2026-05-18-1779164255
May 19, 2026
Merged

geotiff: annotate open_geotiff(max_cloud_bytes=...) (#2106)#2108
brendancol merged 2 commits into
mainfrom
deep-sweep-api-consistency-geotiff-2026-05-18-1779164255

Conversation

@brendancol
Copy link
Copy Markdown
Contributor

Closes #2106.

Summary

  • open_geotiff(max_cloud_bytes=...) was the only kwarg on the public read/write surface without a Python type annotation; the docstring already declared int or None so the surface and the docs disagreed
  • Adds int | None to the annotation. Default stays the module-internal _MAX_CLOUD_BYTES_SENTINEL so behaviour is unchanged
  • Regression test pins the immediate gap and parametrises over every public reader/writer entry point (open_geotiff, read_geotiff_gpu, read_geotiff_dask, read_vrt, to_geotiff, write_geotiff_gpu, write_vrt) so a future addition without an annotation surfaces in CI

Why this matters

  • inspect.signature is the source of truth for IDE autocomplete, Sphinx output, and mypy --strict. A bare parameter blocks the static-typing path for the only fsspec-related kwarg on the public read entry point
  • Cat 3 (Type hints and docstrings), severity MEDIUM under the api-consistency sweep template. Not a behaviour change

Out of scope

  • No rename. Sentinel default and validation rules unchanged
  • Other geotiff entry points already had annotations on every kwarg, so no follow-up rows to address

Test plan

  • pytest xrspatial/geotiff/tests/test_open_geotiff_max_cloud_bytes_annot_2106.py -v (8 passed)
  • pytest xrspatial/geotiff/tests/test_reader_kwarg_order_1935.py xrspatial/geotiff/tests/test_writer_kwarg_order_1922.py -v (existing parity tests still pass)
  • Local smoke test: open_geotiff round-trip with max_cloud_bytes=..., max_cloud_bytes=None, and the existing gpu=True + max_cloud_bytes=... rejection guard all still work
  • CUDA smoke test on a GPU host (read_geotiff_gpu returns CuPy-backed DataArray, gpu+max_cloud_bytes rejection still fires)

Filed by deep-sweep api-consistency on 2026-05-18.

The docstring already declared ``int or None``, but the function
signature carried a bare parameter -- the only kwarg on the public
read/write surface without a type annotation. ``inspect.signature``,
IDE autocomplete, Sphinx output, and ``mypy --strict`` all saw the
gap.

Adds ``int | None`` to the annotation; default stays the
module-internal ``_MAX_CLOUD_BYTES_SENTINEL`` so behaviour is
unchanged. Regression test pins the immediate fix and parametrises
over every public reader/writer entry point so a future kwarg
addition without an annotation surfaces in CI.

Also updates ``.claude/sweep-api-consistency-state.csv`` with the
2026-05-18 sweep results.
@github-actions github-actions Bot added the performance PR touches performance-sensitive code label May 19, 2026
@brendancol
Copy link
Copy Markdown
Contributor Author

PR Review: geotiff: annotate open_geotiff(max_cloud_bytes=...) (#2106)

Blockers

None.

Suggestions

None.

Nits

  • xrspatial/geotiff/__init__.py:252: the annotation is int | None, but the default _MAX_CLOUD_BYTES_SENTINEL is an object(). mypy --strict flags Incompatible default for argument because of that mismatch. The same pattern already exists a few lines down for on_gpu_failure: str = _ON_GPU_FAILURE_SENTINEL and missing_sources: str = _MISSING_SOURCES_SENTINEL, so the new line is consistent with what's already there. A typed sentinel class would clear mypy fully, but that's a bigger refactor and not in scope.
  • test_open_geotiff_max_cloud_bytes_annot_2106.py:46: the assertion checks that the annotation string contains both "int" and "None" as substrings. Something like Mapping[int, None] would pass while not being the intended type. The parametrized test below catches the more interesting failure mode (missing annotation), so this is fine as a pin.

What looks good

  • Scope is tight: one annotation, one regression test file.
  • The test does two useful things: pins the named kwarg, and parametrizes over every public reader/writer so the next missing annotation surfaces in CI rather than IDE autocomplete.
  • Lazy-eval annotations are handled correctly. The file uses from __future__ import annotations, so the runtime annotation is the string 'int | None', and the test's str(...) wrapping is a no-op for that case while still working if the import is ever removed.
  • Docstring at __init__.py:299 already declared int or None; the signature now matches.
  • Verified locally: the 8 new tests pass, and the existing kwarg-order parity tests in test_reader_kwarg_order_1935.py and test_writer_kwarg_order_1922.py still pass. Every other public entry point already had all kwargs annotated, so the new parametrized assertion is green on day one and won't fire false positives.

Checklist

  • Algorithm matches reference/paper: N/A, signature-only change
  • All implemented backends produce consistent results: N/A
  • NaN handling is correct: N/A
  • Edge cases are covered by tests
  • Dask chunk boundaries handled correctly: N/A
  • No premature materialization or unnecessary copies: N/A
  • Benchmark exists or is not needed: not needed
  • README feature matrix updated: N/A
  • Docstrings present and accurate

Two follow-ups from review on #2108:

1. ``open_geotiff(max_cloud_bytes=...)`` annotation: ``int | None`` is
   the caller-facing contract, but the default ``_MAX_CLOUD_BYTES_SENTINEL``
   is an ``object()``. ``mypy --strict`` flagged ``Incompatible default for
   argument`` ([assignment]). Add ``# type: ignore[assignment]`` so the
   line is clean under strict mode while keeping the user-facing
   annotation untouched. The pre-existing ``on_gpu_failure`` and
   ``missing_sources`` sentinels follow the same pattern and remain
   out of scope.

2. ``test_open_geotiff_max_cloud_bytes_has_type_annotation``: the
   substring check ``"int" in s and "None" in s`` would also pass on
   ``Mapping[int, None]``. Pin the exact ``int | None`` form. The
   parametrized test below still acts as a presence-only check for
   the rest of the public reader/writer surface.
@brendancol brendancol merged commit 8917c26 into main May 19, 2026
4 of 5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance PR touches performance-sensitive code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

geotiff: open_geotiff(max_cloud_bytes=...) missing type annotation

1 participant