Reject oversize rasterize outputs before allocation #1224
Merged
rasterize() took width, height, and resolution without any upper bound, so any caller could trigger a multi-gigabyte np.full / cupy.full before the error surfaced.

- Add MAX_PIXELS_DEFAULT = 1_000_000_000 and _check_output_dimensions() in xrspatial/rasterize.py.
- Add a max_pixels keyword to rasterize() and call the guard after width/height are resolved, covering the explicit dims path and the resolution-derived path.
- Add TestValidation cases: oversize explicit dims, oversize resolution, boundary equality, one-over rejection, and max_pixels override.
- Record the audit in .claude/sweep-security-state.json.
Fixes #1223.
rasterize() accepted width, height, and resolution without any upper bound. The numpy and cupy backends each did np.full((height, width), fill, dtype=np.float64) plus an int8 written mask, so a caller could trigger a multi-gigabyte host or device allocation before any geometry work began. The resolution path was worse: resolution=1e-6 with 10x10 bounds resolves to 10M x 10M pixels.

The geotiff reader already ships this pattern as MAX_PIXELS_DEFAULT = 1_000_000_000 plus _check_dimensions(). This PR adds the equivalent pair to xrspatial/rasterize.py, since pulling them out of geotiff/_reader.py would create a cross-module coupling for a one-use-case check.

Changes
- xrspatial/rasterize.py: add MAX_PIXELS_DEFAULT and _check_output_dimensions(). Add a max_pixels keyword to rasterize() (default 1e9). Call the guard right after final_width, final_height are resolved so it covers the explicit path and the resolution-derived path, and so it runs before the first np.full / cupy.full.
- xrspatial/tests/test_rasterize.py: add six cases to TestValidation: two oversize cases (explicit dims and resolution-derived), a moderate realistic attack size, a boundary-equality case that must pass, a one-over case that must reject, and a max_pixels override that allows a larger-than-default raster through.
- .claude/sweep-security-state.json: record the audit.

Test plan
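The kinds of cases added to TestValidation can be approximated without xrspatial, as a rough sketch. Here check_dims and dims_from_resolution are hypothetical stand-ins, not the functions in xrspatial/rasterize.py, and the ceil-based rounding of the resolution-derived size is an assumption.

```python
import math

MAX_PIXELS_DEFAULT = 1_000_000_000  # value stated in this PR


def check_dims(width, height, max_pixels=MAX_PIXELS_DEFAULT):
    # Stand-in guard: raises before anything would be allocated.
    if width * height > max_pixels:
        raise ValueError(f"{width} x {height} exceeds max_pixels={max_pixels}")


def dims_from_resolution(bounds, resolution):
    # Hypothetical derivation: extent divided by resolution, rounded up.
    xmin, ymin, xmax, ymax = bounds
    width = max(1, math.ceil((xmax - xmin) / resolution))
    height = max(1, math.ceil((ymax - ymin) / resolution))
    return width, height


def rejected(width, height, **kwargs):
    # True if the guard refuses the requested size, False if it lets it through.
    try:
        check_dims(width, height, **kwargs)
    except ValueError:
        return True
    return False


cases = {
    # 1e10 pixels requested directly
    "oversize explicit dims": rejected(100_000, 100_000),
    # 10x10 bounds at resolution=1e-6 is roughly 10M x 10M pixels
    "oversize from resolution": rejected(*dims_from_resolution((0, 0, 10, 10), 1e-6)),
    # exactly 1_000_000_000 pixels must pass
    "boundary equality": rejected(1_000_000, 1_000),
    # 999_001 * 1_001 == 1_000_000_001, one pixel over the default cap
    "one over": rejected(999_001, 1_001),
    # raising max_pixels lets a larger-than-default raster through
    "max_pixels override": rejected(100_000, 100_000, max_pixels=10**10),
}
```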
- pytest xrspatial/tests/test_rasterize.py: 138 pass, 2 skip (cuda).
- pytest xrspatial/tests/test_polygon_clip.py xrspatial/tests/test_zonal.py: 144 pass. Both call rasterize() internally through **kwargs, so the new keyword is a no-op for them.

Audit scope
Also noted during the audit but out of scope here:
- _build_row_csr_numba accumulates total = row_ptr[height] in int32. For an extreme input (many tall edges on a very tall raster) this can overflow before allocating np.empty(total, dtype=np.int32). The new pixel cap indirectly bounds realistic inputs, but a future fix should widen row_ptr / running to int64.
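The int32 concern can be illustrated in isolation. This sketch is not the _build_row_csr_numba code; it only shows how a running total kept in int32 wraps silently once it passes 2**31 - 1, while the same accumulation in int64 stays correct. The per-row counts are made-up numbers chosen to force the wrap.

```python
import numpy as np

# Hypothetical per-row entry counts: three rows of 1e9 entries each.
counts = np.array([1_000_000_000, 1_000_000_000, 1_000_000_000], dtype=np.int32)

# CSR-style row pointer built with an int32 accumulator.
row_ptr32 = np.zeros(len(counts) + 1, dtype=np.int32)
row_ptr32[1:] = np.cumsum(counts, dtype=np.int32)  # wraps past 2**31 - 1

# Same construction with an int64 accumulator.
row_ptr64 = np.zeros(len(counts) + 1, dtype=np.int64)
row_ptr64[1:] = np.cumsum(counts, dtype=np.int64)

total32 = int(row_ptr32[-1])  # negative after wraparound
total64 = int(row_ptr64[-1])  # 3_000_000_000

print(total32, total64)
```

A wrapped total would then be passed to the allocation as a negative or too-small size, which is why widening the accumulator is the more direct fix than relying on the pixel cap.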