xarray-contrib · brendancol · May 19, 2026
diff --git a/.claude/sweep-security-state.csv b/.claude/sweep-security-state.csv
@@ -18,7 +18,7 @@ fire,2026-04-25,,,,,"Clean. Despite the module's size hint, fire.py is purely pe
 flood,2026-05-03,1437,MEDIUM,3,,Re-audit 2026-05-03. MEDIUM Cat 3 fixed in PR #1438 (travel_time and flood_depth_vegetation now validate mannings_n DataArray values are finite and strictly positive via _validate_mannings_n_dataarray helper). No remaining unfixed findings. Other categories clean: every allocation is same-shape as input; no flat index math; NaN propagation explicit in every backend; tan_slope clamped by _TAN_MIN; no CUDA kernels; no file I/O; every public API calls _validate_raster on DataArray inputs.
 focal,2026-04-27,1284,HIGH,1,,"HIGH (fixed PR #1286): apply(), focal_stats(), and hotspots() accepted unbounded user-supplied kernels via custom_kernel(), which only checks shape parity. The kernel-size guard from #1241 (_check_kernel_memory) only ran inside circle_kernel/annulus_kernel, so a (50001, 50001) custom kernel on a 10x10 raster allocated ~10 GB on the kernel itself plus a much larger padded raster before any work -- same shape as the bilateral DoS in #1236. Fixed by adding _check_kernel_vs_raster_memory in focal.py and wiring it into apply(), focal_stats(), and hotspots() after custom_kernel() validation. All 134 focal tests + 19 bilateral tests pass. No other findings: 10 CUDA kernels all have proper bounds + stencil guards; _validate_raster called on every public entry point; hotspots already raises ZeroDivisionError on constant-value rasters; _focal_variety_cuda uses a fixed-size local buffer (silent truncation but bounded); _focal_std_cuda/_focal_var_cuda clamp the catastrophic-cancellation case via if var < 0.0: var = 0.0; no file I/O."
 geodesic,2026-04-27,1283,HIGH,1,,"HIGH (fixed PR #1285): slope(method='geodesic') and aspect(method='geodesic') stack a (3, H, W) float64 array (data, lat, lon) before dispatch with no memory check. A large lat/lon-tagged raster passed to either function would OOM. Fixed by adding _check_geodesic_memory(rows, cols) in xrspatial/geodesic.py (mirrors morphology._check_kernel_memory): budgets 56 bytes/cell (24 stacked float64 + 4 float32 output + 24 padded copy + slack) and raises MemoryError when > 50% of available RAM; called from slope.py and aspect.py inside the geodesic branch before dispatch. No other findings: 6 CUDA kernels all have bounds guards (e.g. _run_gpu_geodesic_aspect at geodesic.py:395), custom 16x16 thread blocks avoid register spill, no shared memory, _validate_raster runs upstream in slope/aspect, all backends cast to float32, slope_mag < 1e-7 flat threshold prevents arctan2 NaN propagation, curvature correction uses hardcoded WGS84 R."
-geotiff,2026-05-18,,MEDIUM,1,,"Re-audit pass 18 2026-05-18 (deep-sweep p1). MEDIUM Cat 1 fixed in deep-sweep-security-geotiff-2026-05-18-p1: read_geotiff_gpu eager path (_backends/gpu.py) now applies the same _max_tile_bytes_from_env() per-tile cap that _read_tiles and _fetch_decode_cog_http_tiles enforce. The CPU and GPU readers now agree on the per-tile budget; a malformed local TIFF with TileByteCounts pointing into a large file region is rejected before GPU decode rather than relying on _check_gpu_memory's aggregate-sum guard. Test: tests/test_gpu_tile_byte_cap_2026_05_18.py. Other categories verified clean: JPEG bomb cap (#1792), HTTP read_all byte budget (#2057), VRT XML cap, DOCTYPE rejection, path containment, SSRF, validate_tile_layout, dimension caps, IFD entry caps, MAX_IFDS, MAX_PIXEL_ARRAY_COUNT, GPU bounds guards, atomic writes, realpath canonicalization, dtype validation."
+geotiff,2026-05-19,2121,HIGH,1,,"Re-audit pass 19 2026-05-19 (deep-sweep p1). HIGH Cat 1 found in _sidecar.py load_sidecar: HTTP and fsspec sidecar downloads bypassed max_cloud_bytes set on the base file, so a hostile server could OOM the reader via a multi-GB .tif.ovr beside a tiny base TIFF (issue #2121). Fixed in deep-sweep-security-geotiff-2026-05-19-01 (PR #2123) by threading max_cloud_bytes through load_sidecar and applying it on both transports (HTTP via _HTTPSource.read_all max_bytes streaming cap, fsspec via fs.size() pre-check raising CloudSizeLimitError). Test: tests/test_sidecar_max_cloud_bytes_2121.py. All other categories verified clean against new commits 68574fe (.tif.ovr sidecar), 6b88cea (allow_rotated rotated MTT), f2e191d (multi-ModelTiepoint GCP rejection), 1e9c432 (GPU per-tile byte cap). Carries forward: JPEG bomb cap (#1792), HTTP read_all byte budget (#2057), VRT XML cap, DOCTYPE rejection, path containment, SSRF, validate_tile_layout, dimension caps, IFD entry caps, MAX_IFDS, MAX_PIXEL_ARRAY_COUNT, GPU bounds guards, atomic writes, realpath canonicalization, dtype validation."
 glcm,2026-04-24,1257,HIGH,1,,"HIGH (fixed #1257): glcm_texture() validated window_size only as >= 3 and distance only as >= 1, with no upper bound on either. _glcm_numba_kernel iterates range(r-half, r+half+1) for every pixel, so window_size=1_000_001 on a 10x10 raster ran ~10^14 loop iterations with all neighbors failing the interior bounds check (CPU DoS). On the dask backends depth = window_size // 2 + distance drove map_overlap padding, so a huge window also caused oversize per-chunk allocations (memory DoS). Fixed by adding max_val caps in the public entrypoint: window_size <= max(3, min(rows, cols)) and distance <= max(1, window_size // 2). One cap covers every backend because cupy and dask+cupy call through to the CPU kernel after cupy.asnumpy. No other HIGH findings: levels is already capped at 256 so the per-pixel np.zeros((levels, levels)) matrix in the kernel is bounded to 512 KB. No CUDA kernels. No file I/O. Quantization clips to [0, levels-1] before the kernel and NaN maps to -1 which the kernel filters with i_val >= 0. Entropy log(p) and correlation p / (std_i * std_j) are both guarded. All four backends use _validate_raster and cast to float64 before quantizing. MEDIUM (unfixed, Cat 1): the per-pixel np.zeros((levels, levels)) allocation inside the hot loop is a perf issue (levels=256 -> 512 KB alloc+free per pixel) but not a security issue because levels is bounded. Could be hoisted out of the loop or replaced with an in-place clear, but that is an efficiency concern, not security."
 gpu_rtx,2026-04-29,1308,HIGH,1,,"HIGH (fixed #1308 / PR #1310): hillshade_rtx (gpu_rtx/hillshade.py:184) and viewshed_gpu (gpu_rtx/viewshed.py:269) allocated cupy device buffers sized by raster shape with no memory check. create_triangulation (mesh_utils.py:23-24) adds verts (12 B/px) + triangles (24 B/px) = 36 B/px; hillshade_rtx adds d_rays(32) + d_hits(16) + d_aux(12) + d_output(4) = 64 B/px (100 B/px total); viewshed_gpu adds d_rays(32) + d_hits(16) + d_visgrid(4) + d_vsrays(32) = 84 B/px (120 B/px total). A 30000x30000 raster asked for 90-108 GB of VRAM before cupy surfaced an opaque allocator error. Fixed by adding gpu_rtx/_memory.py with _available_gpu_memory_bytes() and _check_gpu_memory(func_name, h, w) helpers (cost_distance #1262 / sky_view_factor #1299 pattern, 120 B/px budget covers worst case, raises MemoryError when required > 50% of free VRAM, skips silently when memGetInfo() unavailable). Wired into both entry points after the cupy.ndarray type check and before create_triangulation. 9 new tests in test_gpu_rtx_memory.py (5 helper-unit + 4 end-to-end gated on has_rtx). All 81 existing hillshade/viewshed tests still pass. Cat 4 clean: all CUDA kernels (hillshade.py:25/62/106, viewshed.py:32/74/116, mesh_utils.py:50) have bounds guards; no shared memory, no syncthreads needed. MEDIUM not fixed (Cat 6): hillshade_rtx and viewshed_gpu do not call _validate_raster directly but parent hillshade() (hillshade.py:252) and viewshed() (viewshed.py:1707) already validate, so input validation runs before the gpu_rtx entry point - defense-in-depth, not exploitable. MEDIUM not fixed (Cat 2): mesh_utils.py:64-68 cast mesh_map_index to int32 in the triangle index buffer; overflows at H*W > 2.1B vertices (~46341x46341+) but the new memory guard rejects rasters that large first - documentation/clarity item rather than exploitable. MEDIUM not fixed (Cat 3): mesh_utils.py:19 scale = maxDim / maxH divides by zero on an all-zero raster, propagating inf/NaN into mesh vertex z-coords; separate follow-up. LOW not fixed (Cat 5): mesh_utils.write() opens user-supplied path without canonicalization but its only call site (mesh_utils.py:38-39) sits behind if False: in create_triangulation, not reachable in production."
 hillshade,2026-04-27,,,,,"Clean. Cat 1: only allocation is the output np.empty(data.shape) at line 32 (cupy at line 165) and a _pad_array with hardcoded depth=1 (line 62) -- bounded by caller, no user-controlled amplifier. Azimuth/altitude are scalars and don't drive size. Cat 2: numba kernel uses range(1, rows-1) with simple (y, x) indexing; numba range loops promote to int64. Cat 3: math.sqrt(1.0 + xx_plus_yy) is always >= 1.0 (no neg sqrt, no div-by-zero); NaN elevation propagates correctly through dz_dx/dz_dy -> shaded -> output (the shaded < 0.0 / shaded > 1.0 clamps don't fire on NaN). Azimuth validated to [0, 360], altitude to [0, 90]. Cat 4: _gpu_calc_numba (line 107) guards both grid bounds and 3x3 stencil reads via i > 0 and i < shape[0]-1 and j > 0 and j < shape[1]-1; no shared memory. Cat 5: no file I/O. Cat 6: hillshade() calls _validate_raster (line 252) and _validate_scalar for both azimuth (253) and angle_altitude (254); all four backend paths cast to float32; tests parametrize int32/int64/float32/float64."

diff --git a/xrspatial/geotiff/_reader.py b/xrspatial/geotiff/_reader.py
@@ -3229,14 +3229,19 @@ def read_to_array(source, *, window=None, overview_level: int | None = None,
                               allow_rotated=allow_rotated)
 
     # Local file, cloud storage, or file-like buffer: read all bytes then parse
+    # Resolve the cloud byte budget once so both the base-file ``_CloudSource``
+    # size guard and the sidecar download below see the same effective cap.
+    # ``_resolve_max_cloud_bytes`` honours the kwarg, the env var, and the
+    # default in that order; the result is ``None`` only when the caller
+    # explicitly passed ``max_cloud_bytes=None``.
+    cloud_budget = _resolve_max_cloud_bytes(max_cloud_bytes)
     if _is_file_like(source):
         src = _BytesIOSource(source)
     elif _is_fsspec_uri(source):
         src = _CloudSource(source)
         # Check the compressed object size before any bytes are
         # downloaded. ``_CloudSource.__init__`` already fetched the size
         # via ``fsspec.size()``, so this is free. See issue #1928.
-        cloud_budget = _resolve_max_cloud_bytes(max_cloud_bytes)
         if cloud_budget is not None:
             size = src.size
             if size is None:
@@ -3273,18 +3278,21 @@ def read_to_array(source, *, window=None, overview_level: int | None = None,
         # External `.tif.ovr` sidecar (issue #2112). GDAL/rasterio write
         # overview pyramids to a sibling file when the source is not a
         # COG; the sidecar's IFDs are the continuation of the base
-        # file's pyramid. Discovery only fires for local file paths;
-        # cloud / HTTP / file-like sources skip the lookup and keep the
-        # base-file-only behaviour. The sidecar must be loaded before
-        # IFD selection so ``overview_level`` can index into a unified
+        # file's pyramid. Discovery fires for local files, HTTP, and
+        # fsspec sources; file-like buffers skip the lookup.
+        # ``max_cloud_bytes`` propagates to ``load_sidecar`` so the
+        # sidecar fetch inherits the same byte budget the base file
+        # enforces (#2121). The sidecar must be loaded before IFD
+        # selection so ``overview_level`` indexes into a unified
         # pyramid list.
         from ._sidecar import (
             attach_sidecar_origin, find_sidecar, load_sidecar,
         )
         sidecar_origin: dict[int, tuple] = {}
         sidecar_path = find_sidecar(source)
         if sidecar_path is not None:
-            sidecar = load_sidecar(sidecar_path)
+            sidecar = load_sidecar(sidecar_path,
+                                   max_cloud_bytes=cloud_budget)
             sidecar_origin = attach_sidecar_origin(
                 sidecar.ifds, sidecar.data, sidecar.header)
             ifds = ifds + sidecar.ifds

diff --git a/xrspatial/geotiff/_sidecar.py b/xrspatial/geotiff/_sidecar.py
@@ -129,7 +129,10 @@ def _probe_fsspec(uri: str) -> str | None:
         return None
 
 
-def load_sidecar(path: str) -> SidecarOverviews:
+def load_sidecar(path: str,
+                 *,
+                 max_cloud_bytes: int | None = None,
+                 ) -> SidecarOverviews:
     """Open and parse a sidecar ``.ovr`` file.
 
     Accepts local file paths, HTTP / HTTPS URLs, and fsspec URIs.
@@ -141,9 +144,36 @@ def load_sidecar(path: str) -> SidecarOverviews:
     full-resolution IFD, level 2 when the base file already carries
     one internal overview, and so on).
 
+    Parameters
+    ----------
+    path
+        Sidecar path or URL returned by :func:`find_sidecar`.
+    max_cloud_bytes
+        Byte ceiling applied to HTTP and fsspec downloads. Mirrors the
+        base-file ``max_cloud_bytes`` budget that ``read_to_array`` and
+        ``_CloudSource`` enforce so a hostile or malformed sidecar
+        cannot bypass the cap a caller already set on the source. ``None``
+        (the default) means unbounded -- matches the base-file semantics
+        when the caller passes ``max_cloud_bytes=None`` explicitly.
+        Ignored on the local-file path because mmap does not allocate
+        the file. Issue #2121.
+
     The returned ``data`` is either an ``mmap`` (local) or ``bytes``
     (remote). Callers should close the mmap variant via
     ``data.close()`` when present; the bytes case is no-op.
+
+    Raises
+    ------
+    CloudSizeLimitError
+        If the sidecar's size exceeds ``max_cloud_bytes`` on either
+        transport. fsspec checks the declared size up front via
+        ``fs.size()``; HTTP catches the ``OSError`` that
+        :meth:`_HTTPSource.read_all` raises from its
+        ``Content-Length`` pre-check or streaming overshoot detector
+        and re-raises it as ``CloudSizeLimitError`` so callers see a
+        single exception type for "sidecar too big". Non-budget HTTP
+        failures (connection reset, DNS error, etc.) pass through as
+        ``OSError`` unchanged.
     """
     if "://" not in path:
         f = open(path, "rb")
@@ -156,12 +186,60 @@ def load_sidecar(path: str) -> SidecarOverviews:
         # inherits SSRF validation, IP pinning, the shared urllib3
         # PoolManager, and manual redirect re-validation. See
         # ``_probe_http`` for the threat model the indirection closes.
-        from ._reader import _HTTPSource
-        data = _HTTPSource(path).read_all()
+        # ``max_bytes`` here closes a separate gap: without it, the
+        # sidecar fetch ignores the ``max_cloud_bytes`` budget the
+        # caller set on the base file and a hostile server can serve a
+        # multi-GB ``.ovr`` to OOM the process. Issue #2121.
+        from ._reader import CloudSizeLimitError, _HTTPSource
+        try:
+            data = _HTTPSource(path).read_all(max_bytes=max_cloud_bytes)
+        except OSError as e:
+            # ``_HTTPSource.read_all(max_bytes=...)`` raises ``OSError``
+            # with "byte budget" in the message for both the
+            # Content-Length pre-check and the streaming overshoot probe
+            # (see ``_HTTPSource._check_content_length`` and
+            # ``_read_capped``). Translate to ``CloudSizeLimitError`` so
+            # the HTTP and fsspec branches surface the same exception
+            # type for the same failure mode. Other ``OSError``
+            # subtypes (connection reset, DNS, etc.) pass through
+            # untouched.
+            if max_cloud_bytes is not None and "byte budget" in str(e):
+                raise CloudSizeLimitError(
+                    f"Sidecar {path!r} exceeds "
+                    f"max_cloud_bytes={max_cloud_bytes:,}. Raise "
+                    f"max_cloud_bytes (or set "
+                    f"XRSPATIAL_GEOTIFF_MAX_CLOUD_BYTES) if the sidecar "
+                    f"is legitimate, or pass max_cloud_bytes=None on the "
+                    f"source to disable the check."
+                ) from e
+            raise
     else:
-        # fsspec URI
+        # fsspec URI. Stat the sidecar first so an oversized object is
+        # rejected before any bytes hit memory, mirroring the
+        # ``_CloudSource`` size guard at ``_reader.py:3239-3260``.
+        # Issue #2121.
         import fsspec
-        with fsspec.open(path, "rb") as f:
+        fs, fs_path = fsspec.core.url_to_fs(path)
+        if max_cloud_bytes is not None:
+            size = fs.size(fs_path)
+            from ._reader import CloudSizeLimitError
+            if size is None:
+                raise CloudSizeLimitError(
+                    f"Sidecar {path!r} reports unknown size; refusing "
+                    f"to download to avoid an unbounded read. Pass "
+                    f"max_cloud_bytes=None on the source to disable the "
+                    f"check for this sidecar."
+                )
+            if size > max_cloud_bytes:
+                raise CloudSizeLimitError(
+                    f"Sidecar {path!r} is {size:,} bytes, which exceeds "
+                    f"max_cloud_bytes={max_cloud_bytes:,}. Raise "
+                    f"max_cloud_bytes (or set "
+                    f"XRSPATIAL_GEOTIFF_MAX_CLOUD_BYTES) if the sidecar "
+                    f"is legitimate, or pass max_cloud_bytes=None on the "
+                    f"source to disable the check."
+                )
+        with fs.open(fs_path, "rb") as f:
             data = f.read()
     header = parse_header(data)
     ifds = parse_all_ifds(data, header)
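
Note on the HTTP branch above: it relies on a capped streaming read that raises ``OSError`` with "byte budget" in the message, which ``load_sidecar`` then translates into ``CloudSizeLimitError``. The sketch below shows that pattern in isolation using only the standard library. ``read_capped``, ``load_remote``, and ``SizeLimitError`` are hypothetical stand-ins, not the actual ``_HTTPSource`` internals, and an in-memory stream takes the place of the HTTP response.

```python
import io


class SizeLimitError(Exception):
    """Hypothetical stand-in for ``CloudSizeLimitError``."""


def read_capped(stream, max_bytes, chunk_size=64 * 1024):
    """Read ``stream`` fully, raising OSError once it exceeds ``max_bytes``."""
    if max_bytes is None:
        return stream.read()            # no cap requested: unbounded read
    chunks = []
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        total += len(chunk)
        if total > max_bytes:
            # Stop before buffering more than max_bytes + chunk_size bytes.
            raise OSError(f"byte budget of {max_bytes:,} bytes exceeded")
        chunks.append(chunk)
    return b"".join(chunks)


def load_remote(stream, max_bytes):
    """Translate the budget failure into one exception type; re-raise the rest."""
    try:
        return read_capped(stream, max_bytes)
    except OSError as e:
        if max_bytes is not None and "byte budget" in str(e):
            raise SizeLimitError(
                f"payload exceeds max_bytes={max_bytes:,}") from e
        raise
```

Matching on the message text works only as long as the reader keeps that wording. If the wording might change, a dedicated exception raised directly by the reader would be a sturdier contract.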