Skip to content

geotiff: cap VRT XML read size#1818

Merged
brendancol merged 2 commits into
mainfrom
issue-1815
May 13, 2026
Merged

geotiff: cap VRT XML read size#1818
brendancol merged 2 commits into
mainfrom
issue-1815

Conversation

@brendancol
Copy link
Copy Markdown
Contributor

Closes #1815.

Adds a 64 MiB default cap on VRT XML reads, configurable via XRSPATIAL_VRT_MAX_XML_BYTES. Streams the file with a running size check and raises ValueError when the cap is exceeded.

Tested: pytest xrspatial/geotiff/tests/test_vrt_xml_size_cap_1815.py

@github-actions github-actions Bot added the performance PR touches performance-sensitive code label May 13, 2026
@brendancol brendancol requested a review from Copilot May 13, 2026 16:39
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds a configurable byte cap (default 64 MiB, via XRSPATIAL_VRT_MAX_XML_BYTES) on VRT XML reads in _vrt.read_vrt to prevent memory exhaustion from pathologically large VRT XML files, addressing issue #1815. Reading is now streamed in 64 KiB chunks with a running size check that raises ValueError when exceeded.

Changes:

  • Add _get_vrt_max_xml_bytes (env-driven cap with validation) and _read_vrt_xml (bounded streaming reader) helpers in _vrt.py.
  • Replace unbounded f.read() in read_vrt with the new bounded reader.
  • Add regression tests covering under-cap success, over-cap failure, env-var override, and invalid cap values.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File Description
xrspatial/geotiff/_vrt.py Introduces env-configurable XML size cap and streaming reader used by read_vrt.
xrspatial/geotiff/tests/test_vrt_xml_size_cap_1815.py New tests verifying cap behavior, env override, and invalid-cap handling.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

PR #1803 forwarded the caller's max_pixels to read_to_array inside
read_vrt's source loop so a tiny VRT output cannot force a huge source
decode (#1796). The output-window check at the source read enforces that
correctly. A separate per-tile dimension check at the same call sites
also consumed the caller's max_pixels, so a caller setting max_pixels as
an output budget (e.g. 10_000) failed the per-tile sanity check on any
normal source whose default tile size is 256x256 (= 65_536 pixels).

Use MAX_PIXELS_DEFAULT for the per-tile dim check at the two call sites
in _read_tiles (local) and _read_tiles_cog_http (HTTP). The output-window
check at the same functions continues to enforce the user-supplied
max_pixels, preserving the #1796 protection.
@brendancol brendancol merged commit 4b1de37 into main May 13, 2026
10 of 11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance PR touches performance-sensitive code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

geotiff: VRT XML parser reads file without a size cap

2 participants