Describe the bug
Eager reads from cloud storage via fsspec pull the entire object into memory before any TIFF header parse or max_pixels guard runs.
read_to_array() at xrspatial/geotiff/_reader.py:2925 constructs a _CloudSource for any non-HTTP :// source and immediately calls src.read_all(). _CloudSource.read_all() at xrspatial/geotiff/_reader.py:1339 does an unbounded f.read() with no size check, so a large or hostile remote TIFF on s3://, gs://, az://, or abfs:// can exhaust memory and bandwidth before the dimensions are checked.
The HTTP path already reads only what it needs via _parse_cog_http_meta and range fetches against _HTTPSource. The dask backend also bounds its metadata reads. Eager cloud is the gap.
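For orientation, a minimal sketch of the eager cloud path described above; the class body is a simplification for illustration, not the actual code at the cited line numbers.

```python
import fsspec

# Hypothetical simplification of the eager path this issue describes; the real
# classes live in xrspatial/geotiff/_reader.py.
class _CloudSource:
    def __init__(self, url):
        self.fs, self.path = fsspec.core.url_to_fs(url)
        self.size = self.fs.size(self.path)  # object size is already known here

    def read_all(self):
        with self.fs.open(self.path, "rb") as f:
            return f.read()  # unbounded: pulls the whole object into memory

def read_to_array(source):
    src = _CloudSource(source)
    data = src.read_all()  # full download happens here...
    # ...TIFF header parsing and the max_pixels guard only run afterwards
    return data
```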
Expected behavior
Eager cloud reads should either:
- Refuse objects larger than a caller-configurable byte budget, raising before any data is downloaded. _CloudSource.__init__ already knows the object size from fsspec.size(), so the check is cheap.
- Or (deeper fix) reuse the range-reader path, since _CloudSource already exposes read_range / read_ranges mirroring _HTTPSource (sketched below).
The byte-budget fix is the smaller change and closes the safety hole. The range-based refactor can land separately as backend parity work.
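As a rough sketch of that deeper fix, assuming read_range(offset, length) on _CloudSource mirrors the _HTTPSource signature; the helper name and the 64 KiB header budget are illustrative only.

```python
# Hypothetical helper for the range-based alternative; read_range(offset, length)
# is assumed to mirror _HTTPSource, and the 64 KiB budget is a placeholder.
def _read_cloud_header(src, header_budget=64 * 1024):
    # Fetch only a bounded prefix of the object so IFD/dimension parsing and the
    # max_pixels guard can run before any pixel data is requested.
    return src.read_range(0, min(header_budget, src.size))
```

Subsequent tile reads would then follow as bounded read_ranges calls, matching the HTTP backend.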
Proposed fix
- Add a MAX_CLOUD_BYTES_DEFAULT constant (256 MiB) in _reader.py, with an XRSPATIAL_GEOTIFF_MAX_CLOUD_BYTES env override.
- Plumb a max_cloud_bytes kwarg through read_to_array and open_geotiff. None opts out of the size check entirely.
- Before src.read_all() in the fsspec branch, compare src.size against the budget and refuse oversized objects with a clear error (see the sketch below).
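A minimal sketch of the guard, using the constant and env-var names above; the helper name, error type, and message wording are placeholders rather than the final implementation.

```python
import os

# Proposed default budget; the env var overrides it (names from this issue).
MAX_CLOUD_BYTES_DEFAULT = int(
    os.environ.get("XRSPATIAL_GEOTIFF_MAX_CLOUD_BYTES", 256 * 1024 * 1024)
)

def _check_cloud_size(src, max_cloud_bytes=MAX_CLOUD_BYTES_DEFAULT):
    """Refuse oversized remote objects before any bytes are downloaded.

    None opts out of the check entirely, per the proposal above.
    """
    if max_cloud_bytes is not None and src.size > max_cloud_bytes:
        raise ValueError(
            f"Remote GeoTIFF is {src.size} bytes, above the "
            f"{max_cloud_bytes}-byte eager-read budget; pass "
            "max_cloud_bytes=None to opt out or use the dask backend."
        )
```

read_to_array would call this in its fsspec branch immediately before src.read_all(), with open_geotiff passing max_cloud_bytes through unchanged.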
Additional context
Companion to PR #1873 (HTTP / loopback test gating) and to the bounded-metadata path on _HTTPSource. Surfaced by an external review of the geotiff module.