Skip to content

Cap VRT DstRect resample intermediate against max_pixels (#1737)#1742

Merged
brendancol merged 3 commits into
mainfrom
deep-sweep-security-geotiff-2026-05-12-p3
May 12, 2026
Merged

Cap VRT DstRect resample intermediate against max_pixels (#1737)#1742
brendancol merged 3 commits into
mainfrom
deep-sweep-security-geotiff-2026-05-12-p3

Conversation

@brendancol
Copy link
Copy Markdown
Contributor

Summary

  • Closes geotiff: VRT SimpleSource DstRect xSize/ySize allows unbounded intermediate allocation in _resample_nearest #1737. A crafted VRT could declare a <SimpleSource><DstRect xSize ySize> orders of magnitude larger than the VRT's rasterXSize / rasterYSize. _resample_nearest then allocated (dr.y_size, dr.x_size) before the clip was taken, demanding multi-gigabyte intermediates on tiny outputs (10x10 source with DstRect 50000x50000 against a 100x100 VRT extent reached ~2.5 GB of uint8 in np.repeat).
  • Reject sizes that exceed max_pixels in read_vrt before _resample_nearest runs. Mirrors the _check_dimensions pattern already used by the tile / strip / HTTP / dask readers.
  • Six new tests in test_vrt_dstrect_resample_cap_1737.py cover huge X, huge Y, legitimate, max_pixels override, at-cap, and negative cases.

Test plan

  • pytest xrspatial/geotiff/tests/test_vrt_dstrect_resample_cap_1737.py (6 new, all pass)
  • pytest xrspatial/geotiff/tests/ -k vrt (201 existing VRT tests pass)
  • Demonstrate the unbounded allocation on main is now rejected with ValueError

@brendancol brendancol requested a review from Copilot May 12, 2026 20:02
@github-actions github-actions Bot added the performance PR touches performance-sensitive code label May 12, 2026
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR hardens the GeoTIFF VRT reader against denial-of-service via crafted <SimpleSource><DstRect xSize/ySize> values that force _resample_nearest to allocate unbounded intermediate arrays, by adding an explicit max_pixels-based cap in read_vrt and adding regression tests for the reported cases.

Changes:

  • Add a resample-intermediate size guard in xrspatial/geotiff/_vrt.py:read_vrt() to reject oversized DstRect resample intermediates before allocation.
  • Add a new regression test module covering oversized X/Y, legitimate upsample, max_pixels behavior, at-cap behavior, and negative-size cases.
  • Update the sweep-security audit state CSV entry to record the finding and fix status.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File Description
xrspatial/geotiff/_vrt.py Adds a max_pixels-based validation guard to prevent unbounded resample intermediate allocations from malicious VRT DstRect sizes.
xrspatial/geotiff/tests/test_vrt_dstrect_resample_cap_1737.py Introduces regression tests intended to ensure oversized DstRect resample intermediates are rejected and normal cases still work.
.claude/sweep-security-state.csv Records the security sweep finding (#1737) and notes the mitigation approach.
Comments suppressed due to low confidence (1)

xrspatial/geotiff/tests/test_vrt_dstrect_resample_cap_1737.py:120

  • The negative-DstRect test is internally inconsistent: the docstring/name say negative sizes "must surface as ValueError", but the test explicitly allows success, and the in-line comment claims the "cap branch catches the negative value" even though read_vrt skips non-overlapping rects before the cap check (negative xSize/ySize makes dst_c1/dst_r1 < dst_c0/dst_r0, so it will always be treated as no-overlap and continue). Update the test and comments to match the actual behavior you want (either assert it is rejected via earlier validation, or assert it is skipped and returns the fill array).
def test_negative_dstrect_rejected():
    """Negative ``xSize`` / ``ySize`` must surface as ``ValueError`` rather
    than degenerate into a negative-stride numpy slice."""
    with tempfile.TemporaryDirectory() as td:
        _write_source(td)
        vrt_path = _write_vrt(td, dst_x_size=-5, dst_y_size=100)
        # The negative size makes ``needs_resample`` true because it differs
        # from sr.x_size=10; the cap branch catches the negative value.
        # If the source dstrect doesn't overlap the window the source is
        # skipped silently (returns the empty fill array) and no resample
        # runs - that's also OK; we accept either behaviour.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +95 to +96
# Default cap is 1e9, 2000*2000=4e6 well under.
arr, _ = read_vrt(vrt_path)
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Restructured both test_max_pixels_kwarg_raises_cap and test_dstrect_at_cap_succeeds in c0f9ab8. Each now asserts rejection under a tiny max_pixels, then asserts success after the cap is bumped, using the same VRT across both calls. I also shrank the VRT raster to 10x10 so the output buffer stays under _check_dimensions; otherwise the dimensions guard rejects first and the resample-intermediate check is never exercised.

Comment thread xrspatial/geotiff/_vrt.py Outdated
Comment on lines +635 to +643
if (dr.x_size < 0 or dr.y_size < 0
or dr.x_size > max_pixels
or dr.y_size > max_pixels
or dr.x_size * dr.y_size > max_pixels):
raise ValueError(
f"VRT SimpleSource DstRect "
f"(xSize={dr.x_size}, ySize={dr.y_size}) requires "
f"a resample intermediate of "
f"{dr.x_size * dr.y_size:,} pixels, which exceeds "
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, unreachable: a negative size makes dst_*1 < dst_*0 so the overlap continue always fires first. In c0f9ab8 I moved the negative-size validation ahead of the overlap math and gave it a tailored message ("negative size" rather than "exceeds" the pixel cap). The remaining cap branch just enforces the pixel budget. I also added test_negative_dstrect_y_size_rejected and tightened test_negative_dstrect_rejected to assert against the new error string.

A crafted VRT could declare a <SimpleSource><DstRect xSize=..  ySize=..>
orders of magnitude larger than the VRT's rasterXSize/rasterYSize.
_resample_nearest then allocated (dr.y_size, dr.x_size) before the clip
was taken, triggering multi-gigabyte intermediates on tiny outputs. The
output buffer was already bounded by _check_dimensions; the resample
intermediate was not. Reject sizes that exceed max_pixels in read_vrt
before _resample_nearest runs. Six new tests cover huge X, huge Y, the
legitimate-upsample, max_pixels override, at-cap, and negative cases.

Updates the security sweep state to record pass 16 on 2026-05-12.
@brendancol brendancol force-pushed the deep-sweep-security-geotiff-2026-05-12-p3 branch from b389945 to 6100ccf Compare May 12, 2026 20:22
Address Copilot review feedback on #1742:

1. The negative-size branch in the resample-cap guard was unreachable.
   A negative xSize/ySize makes dst_*1 < dst_*0 so the source was
   already skipped by the overlap `continue`. Reject the malformed
   sizes before the overlap math with a tailored error message
   ("negative size") rather than reusing the pixel-budget message.

2. test_max_pixels_kwarg_raises_cap and test_dstrect_at_cap_succeeds
   did not actually exercise the override -- they used DstRects well
   under the default cap and never passed max_pixels=. Restructure
   each to first assert rejection under a tiny max_pixels then assert
   success when the cap is bumped, using the same VRT across both
   calls. Use a 10x10 VRT raster so the output buffer stays well
   under _check_dimensions while the resample intermediate is what
   the cap actually bites on.

3. Add test_negative_dstrect_y_size_rejected for symmetric coverage.
The new test cases on this PR write a source TIFF inside a
TemporaryDirectory before invoking read_vrt. On Windows the read
path can leave a file handle open just past the context exit, so
shutil.rmtree raises WinError 32 (same shape as #1748). Linux closes
in time and never trips.

ignore_cleanup_errors=True keeps the cleanup best-effort. The
assertions still run inside the context so a real failure still
fails the test; only the tempdir teardown is now tolerant.
@brendancol brendancol merged commit e7b9cde into main May 12, 2026
11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance PR touches performance-sensitive code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

geotiff: VRT SimpleSource DstRect xSize/ySize allows unbounded intermediate allocation in _resample_nearest

2 participants