
Guard kriging() against unbounded memory allocations (#1307) #1309

Merged
brendancol merged 1 commit into main from issue-1307 on Apr 29, 2026

@brendancol
Contributor

Summary

Closes #1307

Test plan

  • pytest xrspatial/tests/test_interpolation.py (34 passed locally)
  • CI passes
  • No regressions on existing kriging numpy/dask tests

kriging() takes an arbitrary point count N and template grid size with
no upper bound.  Three eager allocations scale with these inputs:

- np.triu_indices(N) in _experimental_variogram (O(N^2) int64 pairs)
- the (N+1) x (N+1) kriging matrix and its inverse
- the (grid_pixels, N+1) prediction matrix in _kriging_predict

A caller passing 50k points or a 5000x5000 template silently triggers
tens of GB of allocation, and nothing checks the size beforehand.
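
For a sense of scale, the three allocations above can be estimated with plain arithmetic. This is a rough sketch, not the code in this PR: the function name, the float64 assumption for the two matrices, and the N=100 example are illustrative assumptions.

```python
def kriging_allocation_bytes(n_points, grid_pixels):
    """Rough byte counts for the three eager allocations (hypothetical helper)."""
    # np.triu_indices(N) returns two int64 arrays with N*(N+1)/2 entries each.
    n_pairs = n_points * (n_points + 1) // 2
    variogram_pairs = 2 * n_pairs * 8
    # (N+1) x (N+1) kriging matrix plus its inverse, assumed float64.
    kriging_matrix = 2 * (n_points + 1) ** 2 * 8
    # (grid_pixels, N+1) prediction matrix, assumed float64.
    prediction_matrix = grid_pixels * (n_points + 1) * 8
    return {
        "variogram_pairs": variogram_pairs,
        "kriging_matrix": kriging_matrix,
        "prediction_matrix": prediction_matrix,
    }


# 50k points on a small 100x100 template: the N-driven terms dominate.
many_points = kriging_allocation_bytes(50_000, 100 * 100)
# 100 points (an arbitrary illustrative N) on a 5000x5000 template:
# the grid-driven term dominates.
big_grid = kriging_allocation_bytes(100, 5000 * 5000)

for label, est in (("50k points", many_points), ("5000x5000 grid", big_grid)):
    for name, nbytes in est.items():
        print(f"{label:>15}  {name:<18} {nbytes / 1e9:8.2f} GB")
```

With these assumptions the 50k-point case comes to roughly 20 GB for the pair indices and 40 GB for the matrix and its inverse, and the 5000x5000 case to roughly 20 GB for the prediction matrix.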

Add _check_kriging_memory() that estimates the worst case of these
three and raises MemoryError when the estimate exceeds 80% of available
memory (using xrspatial.zonal._available_memory_bytes, same pattern as
balanced_allocation).  The error message names which allocation drove
the estimate so the user knows whether to reduce N or the grid size.
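
A minimal sketch of the shape of such a guard, for readers who do not want to open the diff. It is not the merged implementation: the name `check_kriging_memory`, the explicit `available_bytes` argument (standing in for the value the real code gets from `_available_memory_bytes`), and the message wording are all assumptions.

```python
def check_kriging_memory(n_points, grid_pixels, available_bytes, fraction=0.8):
    """Raise MemoryError if the largest estimated allocation is too big.

    Hypothetical stand-in for the guard described above; the caller passes
    available_bytes explicitly so the sketch stays self-contained.
    """
    estimates = {
        # two int64 index arrays of N*(N+1)/2 entries from np.triu_indices(N)
        "variogram pair indices (reduce the point count N)":
            2 * (n_points * (n_points + 1) // 2) * 8,
        # (N+1) x (N+1) matrix plus its inverse, assumed float64
        "kriging matrix and inverse (reduce the point count N)":
            2 * (n_points + 1) ** 2 * 8,
        # (grid_pixels, N+1) matrix, assumed float64
        "prediction matrix (reduce N or the template grid size)":
            grid_pixels * (n_points + 1) * 8,
    }
    # Worst case of the three, together with the name of what drove it.
    driver, worst = max(estimates.items(), key=lambda item: item[1])
    limit = fraction * available_bytes
    if worst > limit:
        raise MemoryError(
            f"kriging() would need about {worst / 1e9:.1f} GB for the "
            f"{driver}, above {fraction:.0%} of available memory "
            f"({available_bytes / 1e9:.1f} GB)."
        )
    return worst


# Example: 50k points with 16 GiB available is refused up front.
try:
    check_kriging_memory(50_000, 100 * 100, available_bytes=16 * 2**30)
except MemoryError as exc:
    print(exc)
```

Naming the driver in the message is what lets the caller tell an N problem from a grid-size problem without reading the source.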
github-actions bot added the performance label (PR touches performance-sensitive code) on Apr 29, 2026
@brendancol brendancol merged commit fe755a4 into main Apr 29, 2026
11 checks passed
brendancol added a commit that referenced this pull request Apr 30, 2026

The k-nearest path in `idw()` calls `cKDTree.query(query_pts, k=k)`,
which returns a `(grid_pixels, k)` float64 distance array and an
int64 index array. Peak allocation is `grid_pixels * k * 16` bytes
before any IDW arithmetic runs. A 50000 x 50000 template with k=12
needs about 480 GB and OOMs the process with no message naming
the inputs that caused it.
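
The 480 GB figure follows directly from the formula above; a quick check, with a hypothetical helper name:

```python
def idw_knearest_peak_bytes(grid_pixels, k):
    """Bytes for the (grid_pixels, k) distance and index arrays (hypothetical helper)."""
    # 8 bytes per float64 distance + 8 bytes per int64 index = 16 per entry.
    return grid_pixels * k * 16


peak = idw_knearest_peak_bytes(50_000 * 50_000, k=12)
print(f"{peak / 1e9:.0f} GB")
```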

Add `_check_idw_memory(grid_pixels, k)` and call it at the top of
the public `idw()` entrypoint when k is set on a numpy-backed
template. Dask templates dispatch `_idw_knearest_numpy` per chunk
via `map_blocks`, so chunk size already bounds the per-chunk
allocation; the guard skips dask paths to avoid refusing
legitimate chunked workloads. GPU backends reject k early.
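
The dispatch described above might look roughly like the sketch below. This is an illustration only: `guard_idw`, the `is_dask` flag, the explicit `available_bytes` argument, and the 80% threshold (borrowed from the kriging guard) are assumptions, not the committed code.

```python
import math


def check_idw_memory(grid_pixels, k, available_bytes, fraction=0.8):
    """Raise MemoryError if the k-nearest query result would be too large."""
    # float64 distances + int64 indices for every (pixel, neighbour) pair.
    needed = grid_pixels * k * 16
    if needed > fraction * available_bytes:
        raise MemoryError(
            f"idw() with k={k} on {grid_pixels} grid pixels would need about "
            f"{needed / 1e9:.1f} GB for the k-nearest query; reduce k or the "
            f"template size, or use a dask-backed template."
        )


def guard_idw(template_shape, k, is_dask, available_bytes):
    """Entry-point check: only the eager numpy k-nearest path is guarded."""
    if k is None:
        return  # the all-points path does not build the (grid_pixels, k) arrays
    if is_dask:
        return  # per-chunk dispatch already bounds the allocation
    check_idw_memory(math.prod(template_shape), k, available_bytes)


# Example: a 50000 x 50000 numpy template with k=12 and 64 GiB available.
try:
    guard_idw((50_000, 50_000), k=12, is_dask=False, available_bytes=64 * 2**30)
except MemoryError as exc:
    print(exc)
```

Skipping the check for dask inputs keeps a large but well-chunked template usable, at the cost of trusting the caller's chunk size.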

Same shape as the kriging guard from #1309.

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: brendancol <433221+brendancol@users.noreply.github.com>
@brendancol brendancol deleted the issue-1307 branch May 4, 2026 13:05

Labels

performance PR touches performance-sensitive code

Development

Successfully merging this pull request may close these issues.

kriging: unbounded N×N and grid×N allocations have no memory guard
