
Reject oversize rasterize outputs before allocation#1224

Merged: brendancol merged 1 commit into master from issue-1223 on Apr 22, 2026


Conversation

@brendancol
Contributor

Fixes #1223.

rasterize() accepted width, height, and resolution without any upper bound. The numpy and cupy backends each allocated np.full((height, width), fill, dtype=np.float64) plus an int8 written mask, so a caller could trigger a multi-gigabyte host or device allocation before any geometry work began. The resolution path was worse: resolution=1e-6 with 10x10 bounds resolves to 10M x 10M pixels (1e14 pixels in total).
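
To make the scale concrete, here is a rough back-of-the-envelope sketch. It assumes each side of the output is simply extent / resolution rounded to an integer, which may differ from how rasterize() actually derives final_width and final_height; `estimated_bytes` and `BYTES_PER_PIXEL` are hypothetical names used only for illustration.

```python
# Bytes per output pixel, per the description above:
# a float64 output array (8 bytes) plus an int8 written mask (1 byte).
BYTES_PER_PIXEL = 8 + 1


def estimated_bytes(width: int, height: int) -> int:
    """Rough allocation size for a (height, width) output plus its mask."""
    return width * height * BYTES_PER_PIXEL


# Resolution-derived path: a 10x10 extent at resolution=1e-6.
# Assumption: each side is extent / resolution, rounded to an integer.
side = round(10 / 1e-6)
total_pixels = side * side
total_bytes = estimated_bytes(side, side)

print(f"{side:,} x {side:,} = {total_pixels:.0e} pixels")
print(f"~{total_bytes / 1e12:,.0f} TB requested")
```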

The geotiff reader already ships this pattern as MAX_PIXELS_DEFAULT = 1_000_000_000 plus _check_dimensions(). This PR adds the equivalent pair to xrspatial/rasterize.py, since pulling them out of geotiff/_reader.py would create a cross-module coupling for a one-use-case check.

Changes

  • xrspatial/rasterize.py: add MAX_PIXELS_DEFAULT and _check_output_dimensions(). Add a max_pixels keyword to rasterize() (default 1e9). Call the guard right after final_width and final_height are resolved, so that it covers both the explicit-dimensions path and the resolution-derived path and runs before the first np.full / cupy.full.
  • xrspatial/tests/test_rasterize.py: add six cases to TestValidation. Two oversize cases (explicit dims and resolution-derived), a moderate realistic attack size, a boundary-equality case that must pass, a one-over case that must reject, and a max_pixels override that allows a larger-than-default raster through.
  • .claude/sweep-security-state.json: record the audit.

Test plan

  • pytest xrspatial/tests/test_rasterize.py: 138 pass, 2 skip (cuda).
  • pytest xrspatial/tests/test_polygon_clip.py xrspatial/tests/test_zonal.py: 144 pass. Both call rasterize() internally through **kwargs, so the new keyword is a no-op for them.

Audit scope

Also noted during the audit but out of scope here:

  • MEDIUM: _build_row_csr_numba accumulates total = row_ptr[height] in int32. For an extreme input (many tall edges on a very tall raster) this can overflow before allocating np.empty(total, dtype=np.int32). The new pixel cap indirectly bounds realistic inputs, but a future fix should widen row_ptr / running to int64.
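
The overflow risk can be illustrated with plain NumPy. This is not the numba kernel from _build_row_csr_numba, whose integer typing may behave differently; the per-row counts below are made-up values chosen only so that the running total exceeds the int32 range.

```python
import numpy as np

# Hypothetical per-row entry counts; three rows of one billion entries each.
counts = np.array([1_000_000_000] * 3, dtype=np.int64)

# Running total kept in int32: the third step exceeds 2**31 - 1 and wraps.
row_ptr32 = np.cumsum(counts, dtype=np.int32)

# Running total kept in int64: exact for any realistic raster.
row_ptr64 = np.cumsum(counts, dtype=np.int64)

total32 = int(row_ptr32[-1])
total64 = int(row_ptr64[-1])

print("int32 total:", total32)
print("int64 total:", total64)
print("int32 max:  ", np.iinfo(np.int32).max)
```

A wrapped total like this would then be passed as the size of the next allocation, which is why widening the accumulator to int64 is the suggested follow-up.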

rasterize() took width, height, and resolution without any upper bound,
so any caller could trigger a multi-gigabyte np.full / cupy.full before
the error surfaced.

- Add MAX_PIXELS_DEFAULT = 1_000_000_000 and _check_output_dimensions()
  in xrspatial/rasterize.py.
- Add a max_pixels keyword to rasterize() and call the guard after
  width/height are resolved, covering the explicit dims path and the
  resolution-derived path.
- Add TestValidation cases: oversize explicit dims, oversize
  resolution, boundary equality, one-over rejection, and
  max_pixels override.
- Record the audit in .claude/sweep-security-state.json.
@github-actions bot added the performance label (PR touches performance-sensitive code) on Apr 22, 2026
@brendancol merged commit 89499bc into master on Apr 22, 2026
10 of 11 checks passed
@brendancol deleted the issue-1223 branch on April 25, 2026 12:10

Labels

performance PR touches performance-sensitive code

Development

Successfully merging this pull request may close these issues.

rasterize: unbounded allocation DoS via width/height/resolution

1 participant