diff --git a/.claude/sweep-security-state.csv b/.claude/sweep-security-state.csv
index 88aee018..2a6a4bc5 100644
--- a/.claude/sweep-security-state.csv
+++ b/.claude/sweep-security-state.csv
@@ -22,6 +22,7 @@ geotiff,2026-04-17,1215,HIGH,1;4,1219;1220,Follow-up #1220: GPU predictor=2 kern
 glcm,2026-04-24,1257,HIGH,1,,"HIGH (fixed #1257): glcm_texture() validated window_size only as >= 3 and distance only as >= 1, with no upper bound on either. _glcm_numba_kernel iterates range(r-half, r+half+1) for every pixel, so window_size=1_000_001 on a 10x10 raster ran ~10^14 loop iterations with all neighbors failing the interior bounds check (CPU DoS). On the dask backends depth = window_size // 2 + distance drove map_overlap padding, so a huge window also caused oversize per-chunk allocations (memory DoS). Fixed by adding max_val caps in the public entrypoint: window_size <= max(3, min(rows, cols)) and distance <= max(1, window_size // 2). One cap covers every backend because cupy and dask+cupy call through to the CPU kernel after cupy.asnumpy. No other HIGH findings: levels is already capped at 256 so the per-pixel np.zeros((levels, levels)) matrix in the kernel is bounded to 512 KB. No CUDA kernels. No file I/O. Quantization clips to [0, levels-1] before the kernel and NaN maps to -1 which the kernel filters with i_val >= 0. Entropy log(p) and correlation p / (std_i * std_j) are both guarded. All four backends use _validate_raster and cast to float64 before quantizing. MEDIUM (unfixed, Cat 1): the per-pixel np.zeros((levels, levels)) allocation inside the hot loop is a perf issue (levels=256 -> 512 KB alloc+free per pixel) but not a security issue because levels is bounded. Could be hoisted out of the loop or replaced with an in-place clear, but that is an efficiency concern, not security."
 hillshade,2026-04-27,,,,,"Clean.
 Cat 1: only allocation is the output np.empty(data.shape) at line 32 (cupy at line 165) and a _pad_array with hardcoded depth=1 (line 62) -- bounded by caller, no user-controlled amplifier. Azimuth/altitude are scalars and don't drive size. Cat 2: numba kernel uses range(1, rows-1) with simple (y, x) indexing; numba range loops promote to int64. Cat 3: math.sqrt(1.0 + xx_plus_yy) is always >= 1.0 (no neg sqrt, no div-by-zero); NaN elevation propagates correctly through dz_dx/dz_dy -> shaded -> output (the shaded < 0.0 / shaded > 1.0 clamps don't fire on NaN). Azimuth validated to [0, 360], altitude to [0, 90]. Cat 4: _gpu_calc_numba (line 107) guards both grid bounds and 3x3 stencil reads via i > 0 and i < shape[0]-1 and j > 0 and j < shape[1]-1; no shared memory. Cat 5: no file I/O. Cat 6: hillshade() calls _validate_raster (line 252) and _validate_scalar for both azimuth (253) and angle_altitude (254); all four backend paths cast to float32; tests parametrize int32/int64/float32/float64."
 hydro,2026-04-17,,MEDIUM,1;3;6,,
+kde,2026-04-27,1287,HIGH,1,,"HIGH (fixed #1287): kde() and line_density() accepted user-controlled width/height with no upper bound. The eager numpy and cupy backends allocated np.zeros((height, width), dtype=float64) (or cupy.zeros) up front (kde.py: _run_kde_numpy line 308, _run_kde_cupy line 314, line_density inline at line 706). width=1_000_000, height=1_000_000 requested ~8 TB of float64 (or VRAM on the GPU path) before any check ran. Fixed by adding local _available_memory_bytes() helper (mirrors convolution/morphology/bump pattern) and _check_grid_memory(rows, cols) that raises MemoryError when rows*cols*8 exceeds 50% of available RAM. Wired into kde() (skipped for dask paths since _run_kde_dask_numpy/_run_kde_dask_cupy build per-tile via da.from_delayed and are bounded by chunk size) and line_density() (single numpy backend, always guarded). Error message names width/height so the caller knows which knob to turn. No other HIGH findings: Cat 2 (no int32 flat-index math, numba range loops are int64), Cat 3 (bandwidth <= 0 rejected, Silverman fallback returns 1.0 when sigma==0, NaN coords clamp to empty range via min/max), Cat 4 (_kde_cuda has 'if r >= rows or c >= cols: return' bounds guard at line 254, no shared memory, each thread writes own pixel), Cat 5 (no file I/O), Cat 6 (template only used for shape/coords, output dtype forced to float64). MEDIUM (unfixed, Cat 6): _validate_template only checks DataArray + ndim; does not call _validate_raster, but template dtype does not affect compute correctness here."
+mahalanobis,2026-04-27,1288,HIGH,1,,"HIGH (fixed #1288): mahalanobis() had no memory guard. Both _compute_stats_numpy/_compute_stats_cupy and _mahalanobis_pixel_numpy/_mahalanobis_pixel_cupy materialise float64 buffers of shape (n_bands, H*W) -- the np.stack at line 45/80, the reshape+transpose at line 184 (which forces a contiguous BLAS copy), the centered diff, and the diff @ inv_cov result are all live at peak. A 100kx100k 5-band raster projected to ~400 GB of host memory just for the stack. Fixed by adding _available_memory_bytes()/_available_gpu_memory_bytes() (mirroring cost_distance.py:261-292) plus _check_memory/_check_gpu_memory at 32 bytes/cell/band budget, and wiring them into the public mahalanobis() entry point before any np.stack runs. Eager paths (numpy, cupy) are guarded; dask paths skip the check because chunks are bounded by user-supplied chunksize. MEDIUM (unfixed, Cat 6): mahalanobis() does not call _validate_raster on each band -- validate_arrays only enforces matching shape and array-type, so boolean / non-numeric DataArrays silently coerce. Deferred to a separate PR per the security-sweep one-fix-per-PR policy. No other HIGH findings: Cat 2 (no int32 indexing, numpy default int64), Cat 3 (singular covariance raises a clean ValueError, dist_sq is clamped to 0 before sqrt to absorb numerical noise, NaN mask propagates correctly), Cat 4 (no CUDA kernels), Cat 5 (no file I/O beyond /proc/meminfo)."
 morphology,2026-04-24,1256,HIGH,1,,"HIGH (fixed #1256): morph_erode/morph_dilate/morph_opening/morph_closing/morph_gradient/morph_white_tophat/morph_black_tophat accepted a user-supplied kernel with only shape/dtype/odd-size validation. Kernel dimensions drove np.pad/cp.pad on every backend and map_overlap depth on dask paths; a 99999x99999 kernel on a 1000x1000 raster would try to allocate ~80 GB of padded float64 memory with no warning. Fixed by adding local _available_memory_bytes() helper and _check_kernel_memory(rows, cols, ky, kx) that raises MemoryError before allocation when padded size exceeds 50% of available RAM; wired into _dispatch() so every public API entry point is guarded across all four backends. Mirrors bilateral #1236, convolution #1241, bump #1231. No other HIGH findings: Cat 2 (loop indices are Python ints, numba promotes to int64), Cat 3 (NaN propagation explicit via v!=v in both numpy and CUDA paths, tests verify), Cat 4 (GPU kernels _erode_gpu/_dilate_gpu have if i ~2B elements in the numpy path. MEDIUM (unfixed): hypsometric_integral() skips _validate_raster on zones/values; _regions_numpy has no memory guard (numpy-only path, bounded by caller-allocated input). MEDIUM (unfixed): _stats_numpy return_type='xarray.DataArray' allocates np.full((n_stats, values.size)) with no guard."
-kde,2026-04-27,1287,HIGH,1,,"HIGH (fixed #1287): kde() and line_density() accepted user-controlled width/height with no upper bound. The eager numpy and cupy backends allocated np.zeros((height, width), dtype=float64) (or cupy.zeros) up front (kde.py: _run_kde_numpy line 308, _run_kde_cupy line 314, line_density inline at line 706). width=1_000_000, height=1_000_000 requested ~8 TB of float64 (or VRAM on the GPU path) before any check ran. Fixed by adding local _available_memory_bytes() helper (mirrors convolution/morphology/bump pattern) and _check_grid_memory(rows, cols) that raises MemoryError when rows*cols*8 exceeds 50% of available RAM. Wired into kde() (skipped for dask paths since _run_kde_dask_numpy/_run_kde_dask_cupy build per-tile via da.from_delayed and are bounded by chunk size) and line_density() (single numpy backend, always guarded). Error message names width/height so the caller knows which knob to turn. No other HIGH findings: Cat 2 (no int32 flat-index math, numba range loops are int64), Cat 3 (bandwidth <= 0 rejected, Silverman fallback returns 1.0 when sigma==0, NaN coords clamp to empty range via min/max), Cat 4 (_kde_cuda has 'if r >= rows or c >= cols: return' bounds guard at line 254, no shared memory, each thread writes own pixel), Cat 5 (no file I/O), Cat 6 (template only used for shape/coords, output dtype forced to float64). MEDIUM (unfixed, Cat 6): _validate_template only checks DataArray + ndim; does not call _validate_raster, but template dtype does not affect compute correctness here."
diff --git a/xrspatial/sieve.py b/xrspatial/sieve.py
index ff35a226..0f34a0a1 100644
--- a/xrspatial/sieve.py
+++ b/xrspatial/sieve.py
@@ -223,6 +223,93 @@ def _collect(a, b):
     return adjacency
 
 
+# ---------------------------------------------------------------------------
+# Memory guards
+# ---------------------------------------------------------------------------
+
+# Peak working set for the union-find pass:
+#   result copy                8 bytes (float64)
+#   parent                     4 bytes (int32)
+#   rank / root_to_id          4 bytes (int32, reused)
+#   region_map_flat            4 bytes (int32)
+#   slack for region_val,
+#     region_size, valid mask  8 bytes
+# Total ~28 bytes/pixel. Matches the budget the dask paths already use.
+_BYTES_PER_PIXEL = 28
+
+
+def _available_memory_bytes():
+    """Best-effort estimate of available memory in bytes."""
+    try:
+        with open("/proc/meminfo", "r") as f:
+            for line in f:
+                if line.startswith("MemAvailable:"):
+                    return int(line.split()[1]) * 1024
+    except (OSError, ValueError, IndexError):
+        pass
+    try:
+        import psutil
+
+        return psutil.virtual_memory().available
+    except (ImportError, AttributeError):
+        pass
+    return 2 * 1024**3
+
+
+def _available_gpu_memory_bytes():
+    """Best-effort estimate of free GPU memory in bytes.
+
+    Returns 0 when CuPy / CUDA is unavailable or the query fails -- callers
+    treat that as "no GPU info, skip the guard".
+    """
+    try:
+        import cupy as _cp
+
+        free, _total = _cp.cuda.runtime.memGetInfo()
+        return int(free)
+    except Exception:
+        return 0
+
+
+def _check_memory(rows, cols):
+    """Raise MemoryError if the union-find pass would exceed 50% of RAM."""
+    required = int(rows) * int(cols) * _BYTES_PER_PIXEL
+    available = _available_memory_bytes()
+    if required > 0.5 * available:
+        raise MemoryError(
+            f"sieve() on a {rows}x{cols} raster needs "
+            f"~{required / 1e9:.1f} GB of working memory but only "
+            f"~{available / 1e9:.1f} GB is available. "
+            f"Connected-component labeling is a global operation that "
+            f"cannot be chunked. Consider downsampling or tiling the "
+            f"input manually."
+        )
+
+
+def _check_gpu_memory(rows, cols):
+    """Raise MemoryError when the CuPy round-trip would not fit.
+
+    The CuPy backend transfers to host and runs the CPU sieve, so the
+    host budget still applies; we also check free GPU RAM so a user
+    with little VRAM gets a clear error before ``data.get()`` runs.
+    Skips silently when the GPU memory query fails.
+    """
+    _check_memory(rows, cols)
+    available = _available_gpu_memory_bytes()
+    if available <= 0:
+        return
+    # Round-trip needs the float64 input on device plus a float64 result.
+    required = int(rows) * int(cols) * 16
+    if required > 0.5 * available:
+        raise MemoryError(
+            f"sieve() on a {rows}x{cols} cupy raster needs "
+            f"~{required / 1e9:.1f} GB of GPU memory for the round-trip "
+            f"but only ~{available / 1e9:.1f} GB is free on the active "
+            f"device. Use a dask+cupy DataArray for out-of-core "
+            f"processing or downsample the input."
+        )
+
+
 # ---------------------------------------------------------------------------
 # numpy backend
 # ---------------------------------------------------------------------------
@@ -237,6 +324,7 @@ def _sieve_numpy(data, threshold, neighborhood, skip_values):
     so that earlier merges can grow a neighbor above threshold for later
     ones within the same pass.
     """
+    _check_memory(data.shape[0], data.shape[1])
     result = data.astype(np.float64, copy=True)
     is_float = np.issubdtype(data.dtype, np.floating)
     valid = ~np.isnan(result) if is_float else np.ones(result.shape, dtype=bool)
@@ -309,6 +397,7 @@ def _sieve_cupy(data, threshold, neighborhood, skip_values):
     """CuPy backend: transfer to CPU, sieve, transfer back."""
     import cupy as cp
 
+    _check_gpu_memory(data.shape[0], data.shape[1])
     np_result = _sieve_numpy(
         data.get(), threshold, neighborhood, skip_values
     )
@@ -320,24 +409,6 @@
 # ---------------------------------------------------------------------------
 
 
-def _available_memory_bytes():
-    """Best-effort estimate of available memory in bytes."""
-    try:
-        with open("/proc/meminfo", "r") as f:
-            for line in f:
-                if line.startswith("MemAvailable:"):
-                    return int(line.split()[1]) * 1024
-    except (OSError, ValueError, IndexError):
-        pass
-    try:
-        import psutil
-
-        return psutil.virtual_memory().available
-    except (ImportError, AttributeError):
-        pass
-    return 2 * 1024**3
-
-
 def _sieve_dask(data, threshold, neighborhood, skip_values):
     """Dask+numpy backend: compute to numpy, sieve, wrap back."""
     avail = _available_memory_bytes()
diff --git a/xrspatial/tests/test_sieve.py b/xrspatial/tests/test_sieve.py
index 1d56a1cc..efb69da9 100644
--- a/xrspatial/tests/test_sieve.py
+++ b/xrspatial/tests/test_sieve.py
@@ -396,6 +396,36 @@ def test_sieve_dask_memory_guard():
         _sieve_dask(huge, 10, 4, None)
 
 
+def test_sieve_numpy_memory_guard():
+    """Numpy backend raises MemoryError before allocating CCL buffers."""
+    from unittest.mock import patch
+
+    from xrspatial.sieve import _sieve_numpy
+
+    # Use a tiny array but mock available memory to 1 byte so the
+    # guard fires regardless of host RAM.
+    arr = np.zeros((4, 4), dtype=np.float64)
+    with patch("xrspatial.sieve._available_memory_bytes", return_value=1):
+        with pytest.raises(MemoryError, match="working memory"):
+            _sieve_numpy(arr, 10, 4, None)
+
+
+def test_sieve_numpy_memory_guard_via_public_api():
+    """The public sieve() entry point trips the numpy guard."""
+    from unittest.mock import patch
+
+    from xrspatial.sieve import sieve as _sieve
+
+    raster = xr.DataArray(
+        np.zeros((4, 4), dtype=np.float64), dims=["y", "x"]
+    )
+
+    # Mock available memory to 1 byte so even a 4x4 raster trips it.
+    with patch("xrspatial.sieve._available_memory_bytes", return_value=1):
+        with pytest.raises(MemoryError, match="working memory"):
+            _sieve(raster, threshold=2)
+
+
 # ---------------------------------------------------------------------------
 # Numpy / dask consistency
 # ---------------------------------------------------------------------------