rasterize: unbounded allocation DoS via width/height/resolution #1223

@brendancol

Description

The public rasterize() API accepts width, height, and resolution with no upper bound on the resulting raster size. A caller can request arbitrarily large output arrays, which triggers unbounded host or device memory allocation before any geometry work runs.

Where

xrspatial/rasterize.py:

  • _run_numpy, lines 998-999: np.full((height, width), fill, dtype=np.float64) plus an int8 "written" mask, i.e. 9 bytes per pixel, with no guard.
  • _run_cupy, lines 1336-1337: the same allocation on device with cupy.full / cupy.zeros.
  • _rasterize_tile_numpy / _rasterize_tile_cupy allocate per tile, but the full-raster size still drives the tile count and the filtering work.
  • rasterize() public API, lines 2118-2137: both the explicit path, final_width, final_height = int(width), int(height), and the resolution path, final_width = max(int(np.ceil((xmax - xmin) / x_res)), 1), accept any positive size.
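
To make the 9 bytes per pixel figure concrete, here is a minimal sketch of the arithmetic. The helper name estimated_bytes is made up for illustration and is not part of xrspatial; it only multiplies out the float64 output and int8 mask sizes described above.

```python
FLOAT64_BYTES = 8  # output raster, dtype=np.float64
INT8_BYTES = 1     # "written" mask, dtype=int8


def estimated_bytes(width: int, height: int) -> int:
    """Hypothetical helper: bytes needed for the output array plus the mask."""
    return width * height * (FLOAT64_BYTES + INT8_BYTES)


# Python ints are arbitrary precision, so this never overflows.
requested = estimated_bytes(31623, 31623)
print(f"{requested / 1e9:.2f} GB")
```

Nothing is allocated here, so this kind of estimate is cheap enough to run before the np.full / cupy.full calls.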

Reproducer

from xrspatial.rasterize import rasterize
from shapely.geometry import box

# ~1e9 pixels: 8 GB float64 + 1 GB int8 = 9 GB, no guard
rasterize([(box(0, 0, 1, 1), 1.0)], width=31623, height=31623,
          bounds=(0, 0, 1, 1))

# Via resolution: 1e9 x 1e9 = 10^18 pixels requested
rasterize([(box(0, 0, 1, 1), 1.0)], resolution=1e-9,
          bounds=(0, 0, 1, 1))
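
For anyone who wants to see the size of the resolution case without actually triggering the allocation, the following sketch reproduces only the dimension arithmetic. It assumes bounds is ordered (xmin, ymin, xmax, ymax) and that the height formula mirrors the width formula quoted above; resolve_dims is a hypothetical name, and math.ceil stands in for np.ceil.

```python
import math


def resolve_dims(bounds, x_res, y_res):
    """Hypothetical: derive (width, height) from bounds and resolution.

    Mirrors max(int(ceil((xmax - xmin) / x_res)), 1) from the issue;
    the height expression is assumed to be symmetric.
    """
    xmin, ymin, xmax, ymax = bounds  # assumed ordering
    width = max(int(math.ceil((xmax - xmin) / x_res)), 1)
    height = max(int(math.ceil((ymax - ymin) / y_res)), 1)
    return width, height


width, height = resolve_dims((0, 0, 1, 1), 1e-9, 1e-9)
pixels = width * height
print(f"{width} x {height} = {pixels:.3e} pixels")
```

The pixel count comes out on the order of 10^18, which at 9 bytes per pixel is far beyond any host or device memory.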

Severity

HIGH. The existing xrspatial/geotiff/_reader.py already sets MAX_PIXELS_DEFAULT = 1_000_000_000 and ships a _check_dimensions(width, height, samples, max_pixels) helper. The rasterize path does not use it, so a caller can request terabyte-scale allocations before any input is parsed.

Fix direction

  • Add a max_pixels keyword to rasterize() defaulting to MAX_PIXELS_DEFAULT (shared with the geotiff reader).
  • After resolving final_width and final_height, call _check_dimensions(final_width, final_height, 1, max_pixels) and raise a clear ValueError before allocating.
  • Cover both the explicit width / height path and the resolution path.
  • Add regression tests for oversize width/height and oversize resolution-derived dimensions, plus a passing test with a reasonable size and a test that a larger explicit max_pixels lets the call through.
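
A rough sketch of what the guard could look like, under stated assumptions. check_dimensions below is a stand-in written for this issue, not the real _check_dimensions from xrspatial/geotiff/_reader.py, whose behaviour and error messages may differ; only the default value and the argument order are taken from the text above.

```python
MAX_PIXELS_DEFAULT = 1_000_000_000  # value quoted from the geotiff reader


def check_dimensions(width, height, samples=1, max_pixels=MAX_PIXELS_DEFAULT):
    """Hypothetical stand-in: raise ValueError before any allocation happens."""
    if width <= 0 or height <= 0:
        raise ValueError(f"invalid raster size {width} x {height}")
    total = width * height * samples
    if total > max_pixels:
        raise ValueError(
            f"requested raster of {width} x {height} ({total:,} pixels) "
            f"exceeds max_pixels={max_pixels:,}; pass a larger max_pixels "
            f"to allow it"
        )


def is_rejected(width, height, **kwargs):
    """Small helper for the checks below: True if the guard raises."""
    try:
        check_dimensions(width, height, **kwargs)
    except ValueError:
        return True
    return False


print(is_rejected(31623, 31623))                            # oversize
print(is_rejected(1024, 1024))                              # reasonable size
print(is_rejected(31623, 31623, max_pixels=2_000_000_000))  # explicit opt-in
```

The three calls at the end correspond to the regression cases listed above: an oversize request, a reasonable one, and an oversize request allowed through by a larger explicit max_pixels.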

Other findings in this audit

The audit also noted a MEDIUM-severity int32 overflow risk in _build_row_csr_numba (total = row_ptr[height], accumulated in int32), which can overflow under extreme edge-row cell counts. It is out of scope here; the allocation guard indirectly bounds realistic inputs.
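
For context, the failure mode looks roughly like the sketch below. This is plain NumPy, not the actual _build_row_csr_numba code, and the per-row counts are invented to force the wraparound; it only illustrates how an int32 prefix sum can silently wrap where an int64 one does not.

```python
import numpy as np

# Invented per-row cell counts, each valid on its own as int32.
counts = np.full(3, 2**30, dtype=np.int32)

# CSR-style row pointer built with an int32 accumulator: wraps silently.
row_ptr32 = np.zeros(len(counts) + 1, dtype=np.int32)
row_ptr32[1:] = np.cumsum(counts, dtype=np.int32)

# Same construction with an int64 accumulator: holds the true total.
row_ptr64 = np.zeros(len(counts) + 1, dtype=np.int64)
row_ptr64[1:] = np.cumsum(counts, dtype=np.int64)

total32 = int(row_ptr32[-1])
total64 = int(row_ptr64[-1])
print(total32, total64)
```

With a pixel cap in place, the counts needed to hit this are much harder to reach, which is why it is treated as a separate follow-up.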
