
Fix diffusion dask OOM: pass scalar diffusivity directly to chunks #1117

Merged: brendancol merged 4 commits into master from issue-1116 on Mar 31, 2026

Conversation

@brendancol (Contributor) commented Mar 31, 2026

Summary

  • For scalar diffusivity, the dask chunk function now receives the float value directly instead of a full-raster numpy array captured in every task closure
  • For DataArray diffusivity, the dask path passes the dask array as a second argument to map_overlap so each chunk gets only its own slice
  • Eliminates np.full(agg.shape, ...) allocation (line 261) and diffusivity.values materialization (line 255) that caused OOM on large dask inputs
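
The two dask paths described above can be sketched roughly as follows. This is a minimal illustration, not the actual xrspatial code: the names `_step_chunk`, `diffuse_step_dask`, and `diffuse_step_numpy` are hypothetical, and a single explicit 5-point-stencil step with edge-replicated boundaries is assumed. The point is only that a scalar travels as a plain float keyword argument, while an array-valued diffusivity is handed to `map_overlap` as a second array so each task sees just its own padded block.

```python
import numpy as np
import dask.array as da


def _step_chunk(u, alpha):
    # Hypothetical chunk function. `u` arrives padded by one cell per side.
    # `alpha` is either a Python float (scalar path) or a padded block that
    # is aligned with `u` (array path).
    lap = (u[:-2, 1:-1] + u[2:, 1:-1] + u[1:-1, :-2] + u[1:-1, 2:]
           - 4.0 * u[1:-1, 1:-1])
    a = alpha if np.ndim(alpha) == 0 else alpha[1:-1, 1:-1]
    out = u.copy()
    out[1:-1, 1:-1] += a * lap
    return out


def diffuse_step_dask(agg, diffusivity):
    if np.ndim(diffusivity) == 0:
        # Scalar path: the float is passed through as a keyword argument,
        # so no full-raster array is built or captured by the tasks.
        return da.map_overlap(_step_chunk, agg, depth=1, boundary="nearest",
                              dtype=agg.dtype, alpha=float(diffusivity))
    # Array path: the dask array is a second positional input, so each
    # task receives only the matching slice of the diffusivity.
    return da.map_overlap(_step_chunk, agg, diffusivity, depth=1,
                          boundary="nearest", dtype=agg.dtype)


def diffuse_step_numpy(u, alpha):
    # Reference implementation for a scalar alpha on an in-memory array.
    padded = np.pad(u, 1, mode="edge")
    return _step_chunk(padded, alpha)[1:-1, 1:-1]
```

In this sketch neither path calls `np.full(agg.shape, ...)` or `.values`, so nothing proportional to the full raster is materialized before the graph runs.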

Context

Found during performance sweep triage (#1116). On a 512x512 input, the dask path used 35 MB versus 27 MB for numpy, with the 8 MB gap coming from the full-raster alpha_arr. At 30 TB scale, this allocation alone would OOM before any diffusion work starts.

Test plan

  • All 15 existing diffusion tests pass (verified)
  • test_dask_matches_numpy confirms dask scalar path matches numpy output

Parallel subagent triage + ralph-loop workflow for auditing all
xrspatial modules for performance bottlenecks, OOM risk under
30TB dask workloads, and backend-specific anti-patterns.

7 tasks covering command scaffold, module scoring, parallel subagent
dispatch, report merging, ralph-loop generation, and smoke tests.

For scalar diffusivity, the dask chunk function now receives the float
value directly instead of a full-raster numpy array captured in every
task closure. This eliminates the O(H*W) eager allocation and the
per-task serialization overhead.
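
To give a rough feel for the per-task serialization overhead mentioned above, the following standard-library sketch compares the pickled size of a full-raster float64 buffer with that of a single float. It is only an approximation of what a scheduler would ship: the 512x512 shape and the variable names are assumptions for illustration, and real dask serialization differs in detail.

```python
import pickle
from array import array

rows, cols = 512, 512  # assumed raster shape, for illustration only

# Stand-in for a full-raster float64 array captured by every task.
alpha_arr = array("d", [0.1]) * (rows * cols)

# Stand-in for the scalar passed directly to each chunk.
alpha_scalar = 0.1

array_payload = len(pickle.dumps(alpha_arr))
scalar_payload = len(pickle.dumps(alpha_scalar))

# The array payload grows as O(H*W); the scalar payload is constant.
print(f"array payload:  {array_payload} bytes")
print(f"scalar payload: {scalar_payload} bytes")
```

If the array payload is attached to each task, the total cost scales with both the raster size and the number of chunks, whereas the scalar payload stays a few bytes per task.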

For DataArray diffusivity, the dask path passes the dask array as a
second argument to map_overlap so each chunk gets only its own slice.
@github-actions (bot) added the performance label (PR touches performance-sensitive code) on Mar 31, 2026
@brendancol merged commit 74a6da9 into master on Mar 31, 2026
11 checks passed
@brendancol deleted the issue-1116 branch on May 4, 2026 13:07

Labels

performance PR touches performance-sensitive code
