Add native CUDA kernel for hydraulic erosion (#961)#967

Merged
brendancol merged 3 commits into master from issue-961 on Mar 4, 2026

@brendancol
Contributor

Closes #961

Summary

  • Adds _erode_gpu_kernel (@cuda.jit), one CUDA thread per particle, with cuda.atomic.add for heightmap conflicts
  • Replaces the CPU-fallback path for CuPy and Dask+CuPy arrays with on-device execution
  • Refactors erode() to use ArrayTypeFunctionMapping dispatch instead of manual type checks
  • Adds test_erosion.py (15 tests across numpy, dask+numpy, cupy, dask+cupy)
  • Adds user guide notebook 21_Hydraulic_Erosion.ipynb
  • Updates README feature matrix from fallback to native for GPU columns
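
The first bullet relies on atomic adds because several particles can deposit into the same heightmap cell at once. The following is a CPU-side analogy in NumPy, not the PR's kernel: the array names and values are made up for illustration. It shows how a plain read-modify-write loses updates on a shared cell, while an accumulating scatter-add (the role cuda.atomic.add plays on the GPU) keeps all of them.

```python
import numpy as np

# Hypothetical data: four particles, three of which land on cell 1.
height = np.zeros(4, dtype=np.float64)
cells = np.array([1, 1, 1, 2])
deposit = np.array([0.5, 0.25, 0.25, 1.0])

# Plain read-modify-write: each write to cell 1 overwrites the previous one,
# so only one of the three deposits survives.
lossy = height.copy()
lossy[cells] += deposit

# Accumulating scatter-add: every deposit to cell 1 is applied.
safe = height.copy()
np.add.at(safe, cells, deposit)

print("lossy:", lossy)
print("safe: ", safe)
```

On a GPU the lost-update case comes from threads racing on the same address rather than from buffered indexing, but the symptom is the same: sediment silently disappears unless the add is atomic.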

Notes

  • GPU results are non-deterministic because cuda.atomic.add ordering depends on thread scheduling. GPU tests use correlation checks instead of exact comparison.
  • The GPU kernel uses float64 arithmetic to match the CPU path.
  • Erosion is a global operation (particles traverse the full grid), so dask arrays are still materialized before processing. They just run on-device now instead of round-tripping through NumPy.
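
The first note can be illustrated with a small NumPy sketch. It is an assumption-laden model, not the PR's test code: the helper name `accumulate`, the grid size, and the 0.999 threshold are all invented here, and it only models reordered float64 additions, not full particle simulation. Two runs apply the same deposits in different orders, then are compared by correlation instead of exact equality.

```python
import numpy as np

def accumulate(n_cells, cells, amounts, order):
    """Apply deposits to a flat float64 heightmap in the given order."""
    height = np.zeros(n_cells, dtype=np.float64)
    for i in order:
        height[cells[i]] += amounts[i]
    return height

rng = np.random.default_rng(0)
n_cells, n_deposits = 64, 5000
cells = rng.integers(0, n_cells, size=n_deposits)
amounts = rng.normal(scale=1e-3, size=n_deposits)

# Same deposits, two different application orders (stand-ins for two
# different thread schedules).
run_a = accumulate(n_cells, cells, amounts, np.arange(n_deposits))
run_b = accumulate(n_cells, cells, amounts, rng.permutation(n_deposits))

# Compare by correlation, tolerating differences in the last bits.
corr = np.corrcoef(run_a, run_b)[0, 1]
print("bitwise equal:", np.array_equal(run_a, run_b))
print("correlation:  ", corr)
```

Floating-point addition is not associative, so the two runs may differ in their last bits even though they describe the same terrain. A correlation threshold is robust to that kind of drift where an exact comparison would be flaky.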

Test plan

  • 15 new tests in test_erosion.py, all passing
  • Verify GPU tests pass on CI with CUDA
  • Run user guide notebook end-to-end to check plots render

Replaces the CPU-fallback path for CuPy and Dask+CuPy arrays with a
real GPU kernel. Each particle maps to one CUDA thread; conflicts at
shared heightmap cells are resolved with cuda.atomic.add.

Refactors erode() to use ArrayTypeFunctionMapping dispatch instead of
manual type checks.
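
As a rough picture of what type-based dispatch buys over manual type checks, here is a minimal sketch. It does not reproduce ArrayTypeFunctionMapping or its signature: `BackendMapping`, `erode_numpy`, `erode_cupy`, and the predicate helpers are hypothetical names used only for illustration.

```python
import numpy as np

def _is_numpy(data):
    return isinstance(data, np.ndarray)

def _is_cupy(data):
    # Checked by module name so this sketch runs without CuPy installed.
    return type(data).__module__.split(".")[0] == "cupy"

def erode_numpy(data):
    """Hypothetical CPU backend."""
    return "cpu"

def erode_cupy(data):
    """Hypothetical GPU backend."""
    return "gpu"

class BackendMapping:
    """Hypothetical stand-in: picks a backend function from the array type."""

    def __init__(self, handlers):
        self._handlers = handlers  # ordered (predicate, function) pairs

    def __call__(self, data):
        for predicate, func in self._handlers:
            if predicate(data):
                return func
        raise TypeError(f"unsupported array type: {type(data).__name__}")

mapping = BackendMapping([(_is_numpy, erode_numpy), (_is_cupy, erode_cupy)])
backend = mapping(np.zeros((4, 4)))
print(backend.__name__)
```

Keeping the backend table in one place means adding a backend is a one-line change, and unsupported inputs fail with a clear error instead of falling through a chain of if/elif checks.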

Also adds test_erosion.py with 15 tests covering numpy, dask+numpy,
cupy, and dask+cupy backends.
@github-actions (bot) added the performance label (PR touches performance-sensitive code) on Mar 4, 2026
@brendancol brendancol merged commit f08cb0f into master Mar 4, 2026
11 checks passed
@brendancol brendancol deleted the issue-961 branch May 4, 2026 13:06