
Guard basin_d8() against unbounded memory allocations (#1357)#1359

Merged
brendancol merged 1 commit into main from issue-1357-basin-d8-memory-guard on Apr 29, 2026

Conversation

@brendancol (Contributor)

Closes #1357.

Summary

basin_d8() on the numpy and cupy backends allocated H*W working arrays without checking available memory:

  • numpy: fd cast (8 B), labels (8 B), state (1 B), plus path_r/path_c int64 inside _watershed_cpu (8 + 8 B). ~33 bytes/pixel.
  • cupy: flow_dir_f64, labels, state, and the final cp.where result. ~28 bytes/pixel on the device.

A 50000x50000 input on CPU would request ~83 GB before erroring out.
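The arithmetic behind those figures can be checked directly; the per-pixel costs below are taken from the bullets above, not from profiling:

```python
# Back-of-the-envelope check of the allocation sizes quoted above.
H = W = 50_000
cpu_bpp = 8 + 8 + 1 + 8 + 8   # fd cast, labels, state, path_r, path_c
gpu_bpp = 28                  # flow_dir_f64, labels, state, cp.where result

cpu_gb = H * W * cpu_bpp / 1e9
gpu_gb = H * W * gpu_bpp / 1e9
print(f"CPU: ~{cpu_gb:.1f} GB, GPU: ~{gpu_gb:.1f} GB")
```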

Fix

Add module-local _BYTES_PER_PIXEL / _GPU_BYTES_PER_PIXEL constants, _available_memory_bytes / _available_gpu_memory_bytes probes, and _check_memory / _check_gpu_memory helpers, wired into the numpy and cupy dispatch branches before any allocation happens. The 50% threshold matches the one used by the watershed_d8 / flow_accumulation_d8 / hand_d8 series.

The dask paths delegate to the watershed dask infrastructure and stay bounded per-tile by the user's chunk size, so they skip the guard.
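To illustrate why the dask paths stay bounded, compare a per-tile working set against the full grid; the 2048x2048 chunk shape and the ~33 B/pixel figure are illustrative, not taken from the code:

```python
# Per-tile working memory on the dask path is set by the chunk shape,
# not the grid shape (chunk size and bytes/pixel are assumptions).
chunk = 2048
per_tile_mb = chunk * chunk * 33 / 1e6
full_grid_gb = 50_000 * 50_000 * 33 / 1e9
print(f"per tile: ~{per_tile_mb:.0f} MB vs full grid: ~{full_grid_gb:.1f} GB")
```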

Tests

5 new tests in TestMemoryGuard:

  • numpy raises MemoryError when _available_memory_bytes is patched low
  • numpy normal-sized input succeeds against real memory
  • dask path skips the guard and computes
  • error message includes the grid dimensions and a dask hint
  • cupy raises MemoryError when _available_gpu_memory_bytes is patched low (skipped without CUDA)
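A self-contained sketch of the patching pattern the low-memory tests rely on; the guard and probe here are local stand-ins, whereas the actual tests would monkeypatch the helper on the xrspatial module:

```python
from unittest import mock

_BYTES_PER_PIXEL = 33

def _available_memory_bytes() -> int:
    return 1 << 40  # pretend ~1 TiB is available

def _check_memory(h: int, w: int) -> None:
    if h * w * _BYTES_PER_PIXEL > _available_memory_bytes() * 0.5:
        raise MemoryError(f"{h}x{w} grid too large; try the dask backend")

# Patch the memory probe low, as the tests patch
# _available_memory_bytes, and assert the guard trips.
with mock.patch(f"{__name__}._available_memory_bytes", return_value=1024):
    try:
        _check_memory(100, 100)
        tripped = False
    except MemoryError:
        tripped = True
print("guard tripped:", tripped)
```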

Test plan

  • pytest xrspatial/hydro/tests/test_basin_d8.py -- 22 passed
  • pytest xrspatial/hydro/tests/ -- 644 passed

github-actions bot added the performance label (PR touches performance-sensitive code) on Apr 29, 2026
brendancol merged commit 517f40f into main on Apr 29, 2026
11 checks passed
brendancol deleted the issue-1357-basin-d8-memory-guard branch on May 4, 2026



Development

Successfully merging this pull request may close these issues.

basin_d8: no memory guard on H*W working arrays
