
Guard flow_accumulation() against unbounded eager allocations (#1318) #1319

Merged

brendancol merged 1 commit into main from fix/1318-flow-accumulation-memory-guard on Apr 29, 2026


Conversation

@brendancol (Contributor)

Summary

  • Adds _check_memory / _check_gpu_memory budget checks (29 B/px CPU, 16 B/px GPU, 50% threshold) to the eager numpy and cupy backends in flow_accumulation_d8.py.
  • Dask backends skip the guard because per-tile allocations are already bounded by chunk size.
  • Adds 4 memory-guard tests (numpy raise, normal-input pass, dask bypass, error message).

Fixes #1318.

Background

_flow_accum_cpu allocated accum (float64), in_degree (int32), valid (int8), and two H*W int64 BFS queues, ~29 B/pixel of working memory on top of the caller's input. _flow_accum_cupy allocated accum (float64), in_degree (int32), and state (int32), ~16 B/pixel of GPU memory. Neither backend checked against available memory, so a 50000x50000 numpy raster asked for ~72 GB of host memory before anything errored out.
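For illustration, a minimal sketch of the CPU-side guard described above, assuming psutil supplies the budget; the helper name and message wording mirror the summary but are not the merged code:

```python
import psutil

_CPU_BYTES_PER_PIXEL = 29  # accum f64 (8) + in_degree i32 (4) + valid i8 (1) + two i64 queues (8 + 8)
_THRESHOLD = 0.5           # refuse eager runs above 50% of available RAM

def _check_memory(height, width):
    required = height * width * _CPU_BYTES_PER_PIXEL
    available = psutil.virtual_memory().available  # assumed budget source
    if required > available * _THRESHOLD:
        raise MemoryError(
            f"flow_accumulation on a {height}x{width} grid needs "
            f"~{required / 1e9:.0f} GB of working memory; "
            f"use the dask backend for out-of-core processing"
        )
```

The 50000x50000 figure checks out: 50000 * 50000 * 29 B is roughly 72.5 GB. The cupy check would be the same shape with a 16 B/px budget against free device memory.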

Hydro is safety-critical, so the same asymmetric-guard pattern applies here as in sieve (#1298), kde (#1289), resample (#1297), sky_view_factor (#1300), and surface_distance (#1305).

The dinf and mfd flow_accumulation variants share the same allocation pattern and will be handled in separate follow-up PRs per the one-fix-per-security-PR policy.

Test plan

  • pytest xrspatial/hydro/tests/test_flow_accumulation_d8.py: 33 passed (29 existing + 4 new memory-guard cases)
  • A mocked tiny memory budget on the numpy path raises MemoryError with a "working memory" / "dask" message (see the sketch after this list)
  • Normal-size input still succeeds
  • The dask backend bypasses the guard even with a mocked 1-byte budget
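For illustration, a hedged sketch of what the mocked-budget test could look like; the module path follows the test plan above and the monkeypatched helper name comes from the commit message below, but the fixture shape and the flow_accumulation_d8() signature are assumptions:

```python
import numpy as np
import pytest
import xarray as xr

import xrspatial.hydro.flow_accumulation_d8 as mod  # assumed module path

def test_memory_guard_raises(monkeypatch):
    # Pretend almost no RAM is available so even a tiny raster trips the guard.
    monkeypatch.setattr(mod, "_available_memory_bytes", lambda: 1)
    flow_dir = xr.DataArray(np.zeros((8, 8)))  # hypothetical D8 flow-direction input
    with pytest.raises(MemoryError, match="working memory"):
        mod.flow_accumulation_d8(flow_dir)
```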


flow_accumulation() on the numpy and cupy backends had no memory check.
_flow_accum_cpu allocated accum (8 B/px) + in_degree (4 B/px) + valid
(1 B/px) + queue_r/queue_c (8 B/px each), ~29 B/pixel of working memory
on top of the caller's input array. _flow_accum_cupy allocated the
equivalent working set on the device at ~16 B/pixel. A 50000x50000
numpy raster asked for ~72 GB of host memory before anything errored out.

Adds _available_memory_bytes / _available_gpu_memory_bytes helpers and
_check_memory / _check_gpu_memory budget checks at 50% of available
RAM/VRAM. Wires them into the public flow_accumulation_d8() dispatch
before the eager numpy and cupy paths run. Dask paths skip the guard
because per-tile allocations are bounded by chunk size.

Mirrors the pattern from sieve (#1298), kde (#1289), resample (#1297),
sky_view_factor (#1300), surface_distance (#1305).
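For reference, one plausible shape for the availability helpers named above; the psutil and cupy calls are assumptions about how they might be implemented:

```python
import psutil

def _available_memory_bytes():
    # Host budget: currently available (not total) RAM.
    return psutil.virtual_memory().available

def _available_gpu_memory_bytes():
    # Device budget: free bytes reported by the CUDA runtime.
    import cupy
    free_bytes, _total_bytes = cupy.cuda.runtime.memGetInfo()
    return free_bytes
```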
github-actions bot added the performance (PR touches performance-sensitive code) label on Apr 29, 2026
brendancol merged commit 23c5aae into main on Apr 29, 2026
11 checks passed
brendancol added a commit that referenced this pull request Apr 29, 2026
…1321) (#1324)

Port the _check_memory / _check_gpu_memory helpers from #1319 into the
MFD variant. The numpy and cupy backends now reject grids whose
working set would exceed 50% of available host or device memory, with a
message that points the caller at the dask backends for out-of-core work.

CPU working memory: ~29 B/px (accum + in_degree + valid + queue_r +
queue_c). GPU working memory: ~16 B/px (accum + in_degree + state).
Dask backends are unaffected -- per-tile allocations are bounded by
chunk size.

Adds 4 memory-guard tests: oversize-rejection, valid-pass-through,
dimension-in-message, dask-suggestion-in-message.

Fixes #1321.
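As a usage note for the whole guard series, here is how a caller might route an oversize grid through the unguarded dask path; the chunk size and the flow_accumulation_d8 call are illustrative assumptions:

```python
import dask.array as da
import xarray as xr

from xrspatial.hydro.flow_accumulation_d8 import flow_accumulation_d8  # assumed import path

# Chunking bounds each tile's working set, so the eager-path guard never fires.
flow_dir = xr.DataArray(da.zeros((50_000, 50_000), chunks=(2_048, 2_048)))
accum = flow_accumulation_d8(flow_dir)  # dispatches to the dask backend
```

The MFD variant guarded in this commit would be called the same way.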
brendancol added a commit that referenced this pull request Apr 29, 2026
…1322) (#1325)

Adds _check_memory (29 B/px) and _check_gpu_memory (16 B/px) budget checks
at the 50% threshold to the eager numpy and cupy backends. Dask paths
already use bounded per-tile allocations so they skip the guard.

Same root cause and fix shape as #1318 / #1319 (flow_accumulation_d8).
brendancol added a commit that referenced this pull request Apr 29, 2026
…1331)

Mirror the asymmetric guard pattern from PR #1319: numpy and cupy
backends check projected working set against 50% of available
memory before allocating H*W arrays; dask backends skip the check
since per-tile allocations are already bounded.

CPU peak working set is ~40 B/px (Strahler kernel: order, in_degree,
max_in, cnt_max, queue_r, queue_c). GPU peak is ~37 B/px, budgeted
at 40 B/px conservatively.

Adds 5 tests covering oversize numpy rejection, normal pass, dask
bypass, dimensions in error message, and cupy oversize gating.
brendancol added a commit that referenced this pull request Apr 29, 2026
)

Add memory guards to the numpy and cupy dispatch branches.

CPU peak working set is ~33 B/pixel: float64 cast (8) + labels (8)
+ state (1) + path_r (8) + path_c (8). A 50000x50000 raster needs
~83 GB before the dispatch even runs.

GPU peak is ~28 B/pixel on the device: flow_dir_f64 (8) + pp_f64 (8)
+ labels (8) + state int32 (4).

Helpers _check_memory and _check_gpu_memory raise MemoryError with
the grid dimensions and a pointer to the dask backend when the
projected working set exceeds 50% of available memory. Dask paths
skip the guard since per-tile allocations are bounded by the user's
chunk size.

Same pattern as #1319, #1325, #1324, #1326.
brendancol added a commit that referenced this pull request Apr 29, 2026
Adds _check_memory and _check_gpu_memory budget checks to the eager
numpy and cupy backends in hand_d8.py. The kernel allocates ~38 B/px
of working memory (in_degree int32, valid int8, is_stream int8,
drain_elev float64, hand_out float64, plus two int64 BFS queues), so a
50000x50000 raster requested ~95 GB before any sanity check.

Dask backends skip the guard since per-tile allocations are bounded by
chunk size. Mirrors the pattern from #1319 (flow_accumulation_d8).

Adds 4 memory-guard tests (numpy raise, normal-input pass, dask bypass,
error message + dimension content) plus a cupy raise test that's
skipped without CUDA. 636 hydro tests pass.
brendancol added a commit that referenced this pull request Apr 29, 2026
…1332)

Adds a memory guard to flow_length_d8() matching the pattern from
#1319, #1324, #1325, and #1326. The eager numpy and cupy backends now
raise MemoryError before allocating an HxW working set that would
exceed 50% of available host or GPU memory. Dask paths skip the check
since per-tile allocations are bounded by chunk size.

CPU budget is 29 B/px (in_degree int32 + valid int8 + flow_len float64
+ order_r/order_c int64). GPU budget is 32 B/px covering the device
input + output copies in _flow_length_cupy.
brendancol added a commit that referenced this pull request Apr 29, 2026
The numpy and cupy paths allocate H*W working buffers (labels and BFS
queues on CPU, labels grid on GPU) before any sanity check on the input
size. Passing a sufficiently large in-memory raster can OOM the host
or device.

Add per-module _BYTES_PER_PIXEL (24) and _GPU_BYTES_PER_PIXEL (8)
constants and _check_memory / _check_gpu_memory helpers that raise
MemoryError when the projected working set exceeds 50% of available
RAM / free GPU memory. Wire the guards into the eager numpy and cupy
branches of sink_d8(); dask paths skip the guard since per-tile
allocations are bounded.

Mirrors the pattern from #1318/#1319 and the rest of the hydro guard
series.
brendancol added a commit that referenced this pull request Apr 29, 2026
… (#1366)

The numpy and cupy dispatches each allocate three full H*W float64
buffers (flow_accum cast, pour_points cast, output) -- ~24 B/px with no
memory check. Add per-module _check_memory and _check_gpu_memory helpers
modeled on flow_accumulation_d8 (#1318/#1319), wired into the public
dispatch before the eager allocations. Dask paths use windowed slicing
and skip the guard.
brendancol deleted the fix/1318-flow-accumulation-memory-guard branch on May 4, 2026

Labels

performance (PR touches performance-sensitive code)

Development

Successfully merging this pull request may close these issues.

flow_accumulation(): numpy and cupy backends have no memory guard

1 participant