
Add preview() for memory-safe raster downsampling#987

Merged
brendancol merged 5 commits into master from worktree-issue-986
Mar 6, 2026

Conversation

@brendancol
Contributor

Summary

Closes #986.

  • Adds preview() function that downsamples a DataArray (or Dataset) to a target pixel size using block averaging via xarray.coarsen. For dask arrays the operation stays lazy, so a 30 TB raster can be previewed at 1000x1000 without blowing memory.
  • Supports all four backends: numpy, cupy (stride-based subsampling), dask+numpy, and dask+cupy.
  • Wired into the .xrs accessor for both DataArray and Dataset.
  • Added to __init__.py exports and API reference docs.
  • Includes a user guide notebook demonstrating the workflow on a 1 TB tiled dask terrain.
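
The block-averaging idea in the first bullet can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: `coarsen_preview` is a hypothetical helper, its `width`/`height` parameters are assumed names, and it assumes the last two dimensions are the spatial ones. Only `DataArray.coarsen(...).mean()` is the real xarray call.

```python
import math

import numpy as np
import xarray as xr


def coarsen_preview(agg, width=1000, height=1000):
    """Hypothetical helper: block-average `agg` down to roughly width x height."""
    # Assumption: the last two dims are (y, x).
    ydim, xdim = agg.dims[-2], agg.dims[-1]
    # Integer block size per axis; 1 means that axis is already small enough.
    fy = max(1, math.ceil(agg.sizes[ydim] / height))
    fx = max(1, math.ceil(agg.sizes[xdim] / width))
    if fy == 1 and fx == 1:
        return agg  # small raster: pass through unchanged
    # boundary="trim" drops trailing pixels that do not fill a whole block,
    # so the output can be slightly smaller than the requested size.
    return agg.coarsen({ydim: fy, xdim: fx}, boundary="trim").mean()


raster = xr.DataArray(
    np.arange(16, dtype="float64").reshape(4, 4), dims=("y", "x")
)
small = coarsen_preview(raster, width=2, height=2)  # 2x2 block means
```

If `raster` were backed by a dask array, the same `coarsen(...).mean()` call would only build a task graph; nothing is read or computed until `.compute()` (or plotting) is requested.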

Test plan

  • 17 tests covering basic correctness, explicit height, custom name, small raster passthrough, one-axis reduction, NaN handling, block averaging values, dask laziness, dask-vs-numpy parity, cupy, dask+cupy, Dataset support, input validation, and accessor syntax
  • All tests pass locally (including GPU backends)

Uses xarray coarsen with block averaging for numpy and dask backends,
stride-based subsampling for CuPy. Dask arrays stay lazy so peak memory
is bounded by the largest chunk plus the output.
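
To make the difference between the two strategies concrete, here is a small sketch using NumPy as a stand-in for CuPy (the slicing syntax is the same in both). Both function names are hypothetical and are not taken from this PR.

```python
import numpy as np


def stride_subsample(data, fy, fx):
    """Keep every fy-th row and fx-th column; no averaging is performed."""
    return data[::fy, ::fx]


def block_average(data, fy, fx):
    """Average non-overlapping fy x fx blocks, ignoring NaNs within a block."""
    ny, nx = data.shape
    # Trim trailing rows/columns that do not fill a whole block.
    ny_t, nx_t = (ny // fy) * fy, (nx // fx) * fx
    blocks = data[:ny_t, :nx_t].reshape(ny_t // fy, fy, nx_t // fx, fx)
    return np.nanmean(blocks, axis=(1, 3))


grid = np.arange(16, dtype="float64").reshape(4, 4)
picked = stride_subsample(grid, 2, 2)   # top-left pixel of each 2x2 block
averaged = block_average(grid, 2, 2)    # mean of each 2x2 block
```

Striding is cheaper because it touches only the kept pixels, but it can alias fine detail; block averaging reads every pixel and smooths it into the output.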

Accepts DataArray or Dataset via @supports_dataset decorator.

Covers numpy, dask, cupy, dask+cupy, Dataset, NaN handling,
block averaging correctness, passthrough for small rasters,
input validation, and accessor integration.
github-actions bot added the performance label (PR touches performance-sensitive code) on Mar 6, 2026
brendancol merged commit f97050e into master on Mar 6, 2026
10 of 11 checks passed
brendancol deleted the worktree-issue-986 branch on May 4, 2026 13:06

Labels

performance PR touches performance-sensitive code

Development

Successfully merging this pull request may close these issues.

Add preview() for memory-safe downsampling of large dask arrays

1 participant