
geotiff: VRT SimpleSource DstRect xSize/ySize allows unbounded intermediate allocation in _resample_nearest #1737

@brendancol

Summary

A crafted VRT file with a <SimpleSource> whose <DstRect xSize=... ySize=...> is much larger than the declared <VRTDataset rasterXSize=... rasterYSize=...> extent triggers an unbounded intermediate allocation inside _resample_nearest. The output buffer is bounded by _check_dimensions(out_w, out_h, n_bands, max_pixels) at _vrt.py:517, but the resampled-source intermediate is allocated at the full dr.y_size x dr.x_size before it is clipped to the VRT extent.

The reproducer (under 30 lines of code) builds a <DstRect xSize='50000' ySize='50000'/> against a 10x10 source under a 100x100 VRT extent. The output array is 100x100 (10 KB), but the call passes through _resample_nearest(src_arr, 50000, 50000), which allocates ~2.5 GB of uint8 inside np.repeat(np.repeat(src_arr, ry, axis=0), rx, axis=1). Larger values (xSize='200000' ySize='200000') reach 40 GB on the same input. The peak allocation is not bounded by max_pixels and is not gated by any explicit check.
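To make the sizes concrete, here is a scaled-down sketch of the pattern described above. It is not the code in _vrt.py: resample_nearest_repeat and intermediate_bytes are hypothetical helpers, the former assumes integer scale factors and only mirrors the nested np.repeat expression quoted in this issue.

```python
import numpy as np


def resample_nearest_repeat(src, out_h, out_w):
    """Hypothetical stand-in for _resample_nearest (integer scale factors only)."""
    ry = out_h // src.shape[0]
    rx = out_w // src.shape[1]
    # The full out_h x out_w array is materialised here, before any clipping.
    return np.repeat(np.repeat(src, ry, axis=0), rx, axis=1)


def intermediate_bytes(dst_h, dst_w, itemsize=1):
    """Bytes needed for a dst_h x dst_w intermediate (itemsize=1 for uint8)."""
    return dst_h * dst_w * itemsize


src = np.zeros((10, 10), dtype=np.uint8)

# Scaled-down analogue: a 2000x2000 DstRect under a 100x100 VRT extent.
full = resample_nearest_repeat(src, 2000, 2000)
clipped = full[:100, :100]

print(full.nbytes, clipped.nbytes)
print(intermediate_bytes(50_000, 50_000))
print(intermediate_bytes(200_000, 200_000))
```

The intermediate grows with the product of the DstRect sizes, while the clipped result stays at the VRT extent.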

Reproducer

import os, tempfile, numpy as np
from xrspatial.geotiff import to_geotiff
from xrspatial.geotiff._vrt import read_vrt

with tempfile.TemporaryDirectory() as td:
    src = os.path.join(td, 'src.tif')
    to_geotiff(np.zeros((10, 10), dtype=np.uint8), src, compression='none')
    vrt = os.path.join(td, 'bomb.vrt')
    with open(vrt, 'w') as f:
        f.write('''<VRTDataset rasterXSize="100" rasterYSize="100">
  <VRTRasterBand dataType="Byte" band="1">
    <SimpleSource>
      <SourceFilename relativeToVRT="1">src.tif</SourceFilename>
      <SourceBand>1</SourceBand>
      <SrcRect xOff="0" yOff="0" xSize="10" ySize="10"/>
      <DstRect xOff="0" yOff="0" xSize="50000" ySize="50000"/>
    </SimpleSource>
  </VRTRasterBand>
</VRTDataset>''')
    # Reads a 100x100 array but allocates ~2.5 GB inside _resample_nearest
    arr, _ = read_vrt(vrt)

Severity

HIGH (Cat 1, unbounded allocation / denial-of-service via crafted file). Reachable through read_vrt / open_geotiff(path.vrt) and the public xrspatial.geotiff.read_vrt. The recent path-traversal hardening in #1671 means the source itself has to live under the VRT directory, but the DstRect size is parsed from the VRT XML directly with no upper bound and feeds the intermediate buffer.

Proposed fix

Cap the resample intermediate to the largest legitimate clip area. Two reasonable choices, both bounded by the VRT extent which is already constrained by _check_dimensions:

  1. Treat the resample as occurring on the clipped subwindow only: pass (clip_r1 - clip_r0, clip_c1 - clip_c0) plus the matching offset math instead of the full dr.y_size, dr.x_size. This requires care for fence-post errors but produces the smallest intermediate.
  2. Validate dr.y_size * dr.x_size against max_pixels (or a tighter VRT-scoped cap) before calling _resample_nearest and raise ValueError when a single DstRect's resample would exceed the budget.
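A minimal sketch of the check in option 2, under stated assumptions: the DstRect dataclass and check_dst_rect_budget are hypothetical names for illustration, and only the x_size / y_size attributes and the max_pixels budget come from this issue.

```python
from dataclasses import dataclass


@dataclass
class DstRect:
    """Hypothetical stand-in for the parsed <DstRect> element."""
    x_off: int
    y_off: int
    x_size: int
    y_size: int


def check_dst_rect_budget(dr, max_pixels):
    """Raise ValueError if resampling to this DstRect would exceed max_pixels.

    Intended to run before the resample call, so the oversized
    intermediate is never allocated.
    """
    n_pixels = dr.x_size * dr.y_size
    if n_pixels > max_pixels:
        raise ValueError(
            f"DstRect {dr.x_size}x{dr.y_size} needs {n_pixels} pixels, "
            f"which exceeds max_pixels={max_pixels}"
        )
    return n_pixels


ok = check_dst_rect_budget(DstRect(0, 0, 100, 100), max_pixels=1_000_000)

try:
    check_dst_rect_budget(DstRect(0, 0, 50_000, 50_000), max_pixels=1_000_000)
    rejected = False
except ValueError:
    rejected = True
```

Python integers do not overflow, so the product is exact even for very large attribute values.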

Option 2 is simpler and matches the existing tile/strip per-cell caps in _reader.py. Option 1 is a perf win on top of the security fix.
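Option 1 could look roughly like the sketch below, which computes nearest-neighbour source indices only for the clipped window. The function resample_nearest_window is hypothetical, handles a single 2D band, takes window bounds relative to the DstRect origin, and assumes the floor(r * src_h / dst_h) mapping that nested np.repeat produces for integer scale factors. The real _resample_nearest may round differently, so the index math would need to match it.

```python
import numpy as np


def resample_nearest_window(src, dst_h, dst_w, r0, r1, c0, c1):
    """Nearest-neighbour resample of src to dst_h x dst_w, returning only
    rows [r0, r1) and columns [c0, c1) of the resampled result.

    The intermediate is (r1 - r0) x (c1 - c0), independent of dst_h and dst_w.
    """
    src_h, src_w = src.shape
    # Map each destination row/column in the window to its source index.
    rows = (np.arange(r0, r1, dtype=np.int64) * src_h) // dst_h
    cols = (np.arange(c0, c1, dtype=np.int64) * src_w) // dst_w
    return src[np.ix_(rows, cols)]


src = np.arange(100, dtype=np.uint8).reshape(10, 10)

# Cross-check against the full np.repeat result at a small scale.
full = np.repeat(np.repeat(src, 20, axis=0), 20, axis=1)
win = resample_nearest_window(src, 200, 200, 30, 130, 50, 150)

# The 50000x50000 case from the reproducer, clipped to 100x100.
big = resample_nearest_window(src, 50_000, 50_000, 0, 100, 0, 100)
```

With this approach the 50000x50000 DstRect from the reproducer needs only a 100x100 intermediate, since every destination pixel in that window maps to source pixel (0, 0).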

Found via the deep-sweep security audit on 2026-05-12.

Metadata

Labels: bug
