Fix bump OOM: int32 coords, count cap, per-chunk dask partitioning (#1206) #1208

Merged
brendancol merged 1 commit into master from issue-1206 on Apr 16, 2026

@brendancol
Contributor

Summary

Fixes #1206. Three issues in bump() that cause wrong results and OOM on large rasters:

  • uint16 coordinate overflow: locs used dtype=np.uint16, so coordinates > 65,535 wrapped silently. Now uses int32.
  • Unbounded default count: count defaulted to w * h // 10 with no ceiling. Now capped at 10M bumps, plus a memory guard that raises MemoryError when the arrays would exceed 50% of available RAM.
  • Closure serialization in dask paths: _bump_dask_numpy and _bump_dask_cupy captured the full locs and heights arrays in every chunk closure. For a 2048x2048 raster at the default count with 128x128 chunks, that was 5 MB/chunk across 256 chunks = 1.29 GB of graph payload. The dask paths now pre-partition bumps by chunk using dask.delayed + da.block, so each task serializes only its own subset. This reduces graph serialization by roughly 250x.
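
The first and third items can be checked numerically. The sketch below is not taken from the PR diff: it reproduces the uint16 wrap with a plain NumPy cast, then recomputes the graph payload figures. The 12 bytes per bump (two uint16 coordinates plus one float64 height) is an assumption chosen because it is consistent with the 5 MB and 1.29 GB figures quoted above.

```python
import numpy as np

# uint16 wrap: a coordinate past 65,535 keeps only its low 16 bits.
coord = np.array([70_000], dtype=np.int64)
wrapped = int(coord.astype(np.uint16)[0])   # 70_000 - 65_536
kept = int(coord.astype(np.int32)[0])
print(wrapped, kept)  # 4464 70000

# Closure payload for a 2048x2048 raster with 128x128 chunks.
w = h = 2048
chunk = 128
count = w * h // 10                       # default count: 419430
n_chunks = (w // chunk) * (h // chunk)    # 256

# Assumption: 12 bytes per bump (two uint16 coordinates + one float64 height).
bytes_per_bump = 2 * 2 + 8
per_chunk = count * bytes_per_bump        # full arrays captured by every closure
total = per_chunk * n_chunks

print(f"{per_chunk / 1e6:.2f} MB per chunk")  # 5.03 MB per chunk
print(f"{total / 1e9:.2f} GB in the graph")   # 1.29 GB in the graph
```

If every bump is serialized once instead of once per chunk, the payload shrinks by about the number of chunks (256 here), which is in line with the ~250x figure.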

Test plan

  • Existing bump tests pass (all 10)
  • New test: int32 coordinates work at 70,000 (no uint16 wrap)
  • New test: default count capped for 20,000 x 20,000 raster
  • New test: MemoryError raised for huge explicit count
  • New test: dask graph size stays proportional to total bumps, not bumps * chunks
  • New test: single-chunk dask matches numpy bitwise
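
As a rough illustration of what the capped-count and MemoryError tests exercise, the cap and guard might look like the following. The names `default_count`, `check_bump_memory`, and `MAX_DEFAULT_COUNT` are hypothetical, as is the 16 bytes per bump; the available-memory value is passed in as a parameter because this page does not show how the PR queries it.

```python
MAX_DEFAULT_COUNT = 10_000_000  # the 10M ceiling described in the summary


def default_count(w, h):
    """Default number of bumps: w * h // 10, capped at 10M (hypothetical helper)."""
    return min(w * h // 10, MAX_DEFAULT_COUNT)


def check_bump_memory(count, available_bytes, bytes_per_bump=16):
    """Raise MemoryError if the bump arrays would exceed 50% of available RAM.

    bytes_per_bump is an assumption: two int32 coordinates + one float64 height.
    Returns the estimated size in bytes when the guard passes.
    """
    needed = count * bytes_per_bump
    if needed > 0.5 * available_bytes:
        raise MemoryError(
            f"{count} bumps need about {needed} bytes, more than half of "
            f"the {available_bytes} bytes available"
        )
    return needed


print(default_count(20_000, 20_000))  # 10000000 (uncapped would be 40000000)
print(default_count(2048, 2048))      # 419430 (below the cap, unchanged)
```

A guard like this only helps if it runs before the arrays are allocated, so an oversized explicit count fails fast instead of exhausting memory partway through.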

…oning (#1206)

- Change locs dtype from uint16 to int32 to fix silent coordinate
  wrap-around for rasters with dimensions > 65535
- Cap default count (w*h//10) at 10M bumps to prevent unbounded
  eager allocation on large rasters
- Add memory guard that raises MemoryError when bump arrays would
  exceed 50% of available RAM
- Replace closure-based dask paths with per-chunk partitioning via
  dask.delayed + da.block, so each task only serializes its own
  bump subset instead of the entire locs/heights arrays
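
The partitioning step in the last item can be sketched in plain NumPy. This is a minimal sketch, not the code from the PR: `partition_bumps` is a hypothetical helper, square chunks are assumed, `locs` is assumed to hold (x, y) pairs, and each bump is assigned to the chunk containing its centre, ignoring any spread across chunk boundaries. In the PR, each subset would then be handed to its own dask.delayed task and the results assembled with da.block; that part is omitted here.

```python
import numpy as np


def partition_bumps(locs, heights, shape, chunk):
    """Group bumps by chunk (hypothetical helper).

    Returns {(chunk_row, chunk_col): (chunk-local locs, heights)}.
    Assumes locs[:, 0] is x (column) and locs[:, 1] is y (row).
    """
    n_cols = -(-shape[1] // chunk)            # chunks per row, ceiling division
    chunk_id = (locs[:, 1] // chunk) * n_cols + (locs[:, 0] // chunk)

    # Sort once so every chunk's bumps form one contiguous slice.
    order = np.argsort(chunk_id, kind="stable")
    ids, starts = np.unique(chunk_id[order], return_index=True)
    ends = np.append(starts[1:], len(order))

    parts = {}
    for cid, start, end in zip(ids, starts, ends):
        row, col = int(cid // n_cols), int(cid % n_cols)
        sel = order[start:end]
        # Shift coordinates so they are relative to the chunk's top-left corner.
        origin = np.array([col * chunk, row * chunk], dtype=locs.dtype)
        parts[(row, col)] = (locs[sel] - origin, heights[sel])
    return parts


locs = np.array([[0, 0], [130, 5], [5, 130], [200, 200], [1, 1]], dtype=np.int32)
heights = np.arange(5, dtype=np.float64)
parts = partition_bumps(locs, heights, shape=(256, 256), chunk=128)
print(sorted(parts))  # [(0, 0), (0, 1), (1, 0), (1, 1)]
```

Each bump appears in exactly one subset, so the total serialized payload stays proportional to the number of bumps rather than bumps times chunks.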
@github-actions (bot) added the performance label (PR touches performance-sensitive code) on Apr 16, 2026
@brendancol merged commit c4bb608 into master on Apr 16, 2026
11 checks passed
@brendancol deleted the issue-1206 branch on May 4, 2026 13:05

Development

Successfully merging this pull request may close these issues.

bump: uint16 overflow, unbounded default count, and closure serialization OOM
