
Guard basin_d8() against unbounded memory allocations (#1357)#1359

Merged
brendancol merged 1 commit into main from issue-1357-basin-d8-memory-guard on Apr 29, 2026

Conversation

@brendancol (Contributor)

Closes #1357.

Summary

basin_d8() on the numpy and cupy backends allocated H*W working arrays without checking available memory:

  • numpy: fd cast (8 B), labels (8 B), state (1 B), plus path_r/path_c int64 inside _watershed_cpu (8 + 8 B). ~33 bytes/pixel.
  • cupy: flow_dir_f64, labels, state, and the final cp.where result. ~28 bytes/pixel on the device.

A 50000x50000 input on CPU would request ~83 GB before erroring out.
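The arithmetic behind those figures can be checked directly; the per-pixel costs below are taken from the bullets above, not from profiling:

```python
# Back-of-the-envelope check of the allocation sizes quoted above.
H = W = 50_000
cpu_bpp = 8 + 8 + 1 + 8 + 8   # fd cast, labels, state, path_r, path_c
gpu_bpp = 28                  # flow_dir_f64, labels, state, cp.where result

cpu_gb = H * W * cpu_bpp / 1e9
gpu_gb = H * W * gpu_bpp / 1e9
print(f"CPU: ~{cpu_gb:.1f} GB, GPU: ~{gpu_gb:.1f} GB")
```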

Fix

Add module-local _BYTES_PER_PIXEL / _GPU_BYTES_PER_PIXEL constants, _available_memory_bytes / _available_gpu_memory_bytes probes, and _check_memory / _check_gpu_memory helpers, wired into the numpy and cupy dispatch branches before any allocation happens. The 50% threshold matches the one used by the watershed_d8 / flow_accumulation_d8 / hand_d8 series.

The dask paths delegate to the watershed dask infrastructure and stay bounded per-tile by the user's chunk size, so they skip the guard.
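To illustrate why the dask paths stay bounded, compare a per-tile working set against the full grid; the 2048x2048 chunk shape and the ~33 B/pixel figure are illustrative, not taken from the code:

```python
# Per-tile working memory on the dask path is set by the chunk shape,
# not the grid shape (chunk size and bytes/pixel are assumptions).
chunk = 2048
per_tile_mb = chunk * chunk * 33 / 1e6
full_grid_gb = 50_000 * 50_000 * 33 / 1e9
print(f"per tile: ~{per_tile_mb:.0f} MB vs full grid: ~{full_grid_gb:.1f} GB")
```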

Tests

5 new tests in TestMemoryGuard:

  • numpy raises MemoryError when _available_memory_bytes is patched low
  • numpy normal-sized input succeeds against real memory
  • dask path skips the guard and computes
  • error message includes the grid dimensions and a dask hint
  • cupy raises MemoryError when _available_gpu_memory_bytes is patched low (skipped without CUDA)
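A self-contained sketch of the patching pattern the low-memory tests rely on; the guard and probe here are local stand-ins, whereas the actual tests would monkeypatch the helper on the xrspatial module:

```python
from unittest import mock

_BYTES_PER_PIXEL = 33

def _available_memory_bytes() -> int:
    return 1 << 40  # pretend ~1 TiB is available

def _check_memory(h: int, w: int) -> None:
    if h * w * _BYTES_PER_PIXEL > _available_memory_bytes() * 0.5:
        raise MemoryError(f"{h}x{w} grid too large; try the dask backend")

# Patch the memory probe low, as the tests patch
# _available_memory_bytes, and assert the guard trips.
with mock.patch(f"{__name__}._available_memory_bytes", return_value=1024):
    try:
        _check_memory(100, 100)
        tripped = False
    except MemoryError:
        tripped = True
print("guard tripped:", tripped)
```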

Test plan

  • pytest xrspatial/hydro/tests/test_basin_d8.py -- 22 passed
  • pytest xrspatial/hydro/tests/ -- 644 passed

github-actions bot added the performance label (PR touches performance-sensitive code) on Apr 29, 2026
brendancol merged commit 517f40f into main on Apr 29, 2026
11 checks passed
brendancol deleted the issue-1357-basin-d8-memory-guard branch on May 4, 2026



Development

Successfully merging this pull request may close these issues.

basin_d8: no memory guard on H*W working arrays
