geotiff: read_to_array leaks into public namespace but is not in __all__ or docs #1708

@brendancol

Summary

xrspatial/geotiff/__init__.py imports read_to_array from ._reader at module level (line 46) so that the eager open_geotiff body can call it. As a side effect, the name becomes importable from the xrspatial.geotiff namespace, even though read_to_array is not in __all__ and is not listed in the module-level "Public API" docstring.

# xrspatial/geotiff/__init__.py
from ._reader import UnsafeURLError, read_to_array  # line 46

__all__ = [
    'GeoTIFFFallbackWarning',
    'UnsafeURLError',
    'open_geotiff',
    'read_geotiff_gpu',
    'read_geotiff_dask',
    'read_vrt',
    'to_geotiff',
    'write_geotiff_gpu',
    'write_vrt',
]

from xrspatial.geotiff import read_to_array succeeds today. The library's own test in xrspatial/geotiff/tests/test_band_validation_1673.py relies on that import seven times (lines 49, 58, 68, 77, 86, 95, 113), which shows the leaked name is already being used as if it were public. External users who pattern-match against the test may also import it from the public namespace. If we later make read_to_array private, either by adding an underscore or by removing the top-level re-export, those users break with no deprecation path.

This is the orphan-API pattern flagged by the api-consistency sweep (Cat 5): a function name that walks like a public API (importable from the package's namespace) but does not appear in __all__, the docstring's Public API list, or the user docs.
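As a rough illustration of what such a check might look like, the sketch below lists names that are bound in a module's namespace, look public (no leading underscore), and are absent from __all__. The helper name find_orphan_names and the stand-in demo module are hypothetical; this is not the sweep's actual implementation.

```python
import types


def find_orphan_names(module):
    """Return public-looking names bound in `module` that are missing from `__all__`."""
    declared = set(getattr(module, "__all__", ()))
    orphans = []
    for name, value in vars(module).items():
        if name.startswith("_"):
            continue  # underscore names are private by convention
        if isinstance(value, types.ModuleType):
            continue  # imported modules (e.g. `import numpy as np`) are skipped
        if name not in declared:
            orphans.append(name)
    return sorted(orphans)


# Stand-in module that mimics the shape of the problem (not the real package).
demo = types.ModuleType("demo_geotiff")
demo.__all__ = ["open_geotiff"]
demo.open_geotiff = lambda path: path
demo.read_to_array = lambda path: path  # leaked: public-looking, not in __all__
demo._helper = lambda: None             # private: ignored by the check

print(find_orphan_names(demo))
```

Run against a real package, a check like this would likely also flag other helpers imported at module level, so its output is a list of candidates to review rather than a list of confirmed leaks.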

Severity

MEDIUM (Cat 5, public API surface drift). Not a correctness bug; it's a hidden surface that constrains future cleanup.

Proposed fix

Hide the leak. The internal usage in __init__.py keeps working, but the name disappears from the public namespace:

# xrspatial/geotiff/__init__.py
from ._reader import UnsafeURLError
from ._reader import read_to_array as _read_to_array

Then update each internal reference (lines around 686, 2533) to _read_to_array, and update the single test file test_band_validation_1673.py to import from xrspatial.geotiff._reader directly:

from xrspatial.geotiff._reader import read_to_array

That makes the internal-only status explicit, removes the orphan-API surface, and keeps the test surface working through the canonical module path. The change is an internal rename only and touches no documented API, because the name was never in __all__ or the docs. Any external code that imports the leaked top-level name anyway would have to switch to xrspatial.geotiff._reader.
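The effect of the aliased import can be checked on a throwaway package. The sketch below builds two tiny packages in a temporary directory (leaky_pkg and hidden_pkg, both hypothetical stand-ins that only mirror the import shape described above) and compares what each exposes at the top level.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path


def build_package(root, name, init_source):
    """Write a tiny package with a `_reader` submodule, then import it."""
    pkg = Path(root) / name
    pkg.mkdir()
    (pkg / "_reader.py").write_text("def read_to_array(path):\n    return [path]\n")
    (pkg / "__init__.py").write_text(textwrap.dedent(init_source))
    importlib.invalidate_caches()  # the files were created after import machinery started
    return importlib.import_module(name)


# Current shape: the plain import binds a public-looking name in the package.
LEAKY = """
    from ._reader import read_to_array

    __all__ = ['open_geotiff']

    def open_geotiff(path):
        return read_to_array(path)
"""

# Proposed shape: the alias binds only an underscore-prefixed name.
HIDDEN = """
    from ._reader import read_to_array as _read_to_array

    __all__ = ['open_geotiff']

    def open_geotiff(path):
        return _read_to_array(path)
"""

with tempfile.TemporaryDirectory() as tmp:
    sys.path.insert(0, tmp)
    try:
        leaky = build_package(tmp, "leaky_pkg", LEAKY)
        hidden = build_package(tmp, "hidden_pkg", HIDDEN)
        leaky_has_name = hasattr(leaky, "read_to_array")
        hidden_has_name = hasattr(hidden, "read_to_array")
        hidden_result = hidden.open_geotiff("a.tif")  # internal call still works
        reader = importlib.import_module("hidden_pkg._reader")
        reader_result = reader.read_to_array("a.tif")  # canonical path for tests
    finally:
        sys.path.remove(tmp)

print(leaky_has_name, hidden_has_name, hidden_result, reader_result)
```

In this sketch the aliased package no longer exposes read_to_array at the top level, while open_geotiff and the direct import from the _reader submodule keep working.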

If a maintainer wants to instead promote read_to_array to public API, the alternative fix is to add it to __all__, document its signature and return type in the module docstring, and write a Sphinx entry for it. Hiding is the cheaper option; promotion is a larger commitment.

Discovered by

/sweep-api-consistency against the geotiff module on 2026-05-12.
