geotiff: cap VRT XML read size #1818
Merged
Pull request overview
This PR adds a configurable byte cap (default 64 MiB, via XRSPATIAL_VRT_MAX_XML_BYTES) on VRT XML reads in _vrt.read_vrt to prevent memory exhaustion from pathologically large VRT XML files, addressing issue #1815. Reading is now streamed in 64 KiB chunks with a running size check that raises ValueError when exceeded.
Changes:
- Add _get_vrt_max_xml_bytes (env-driven cap with validation) and _read_vrt_xml (bounded streaming reader) helpers in _vrt.py.
- Replace unbounded f.read() in read_vrt with the new bounded reader.
- Add regression tests covering under-cap success, over-cap failure, env-var override, and invalid cap values.
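For readers who want a feel for the approach, the two helpers listed above can be sketched roughly as follows. This is an illustration under stated assumptions, not the merged code: the function bodies, the module-level constant names, and the choice to raise ValueError for an invalid environment value are guesses based on the summary, and the real helpers in _vrt.py may differ.

```python
import io
import os

# The env var name, 64 MiB default, and 64 KiB chunk size come from the PR text.
# The constant names and function bodies are assumptions made for this sketch.
ENV_VAR = "XRSPATIAL_VRT_MAX_XML_BYTES"
DEFAULT_MAX_XML_BYTES = 64 * 1024 * 1024  # 64 MiB default cap
CHUNK_SIZE = 64 * 1024  # 64 KiB per read


def get_vrt_max_xml_bytes():
    """Return the cap from the environment, or the default if unset."""
    raw = os.environ.get(ENV_VAR)
    if raw is None:
        return DEFAULT_MAX_XML_BYTES
    try:
        value = int(raw)
    except ValueError:
        # Assumption: invalid values are rejected rather than ignored.
        raise ValueError(f"{ENV_VAR} must be an integer, got {raw!r}") from None
    if value <= 0:
        raise ValueError(f"{ENV_VAR} must be positive, got {value}")
    return value


def read_vrt_xml(fileobj, max_bytes):
    """Read a binary stream in chunks, failing once max_bytes is exceeded."""
    chunks = []
    total = 0
    while True:
        chunk = fileobj.read(CHUNK_SIZE)
        if not chunk:
            break
        total += len(chunk)
        if total > max_bytes:
            # Stop before buffering the chunk that crosses the cap.
            raise ValueError(f"VRT XML exceeds cap of {max_bytes} bytes")
        chunks.append(chunk)
    return b"".join(chunks)
```

Because the size check runs before each chunk is appended, the sketch never buffers more than the cap, whereas a single unbounded f.read() allocates the whole file before any check can run.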
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| xrspatial/geotiff/_vrt.py | Introduces env-configurable XML size cap and streaming reader used by read_vrt. |
| xrspatial/geotiff/tests/test_vrt_xml_size_cap_1815.py | New tests verifying cap behavior, env override, and invalid-cap handling. |
PR #1803 forwarded the caller's max_pixels to read_to_array inside read_vrt's source loop so a tiny VRT output cannot force a huge source decode (#1796). The output-window check at the source read enforces that correctly.

A separate per-tile dimension check at the same call sites also consumed the caller's max_pixels, so a caller setting max_pixels as an output budget (e.g. 10_000) failed the per-tile sanity check on any normal source whose default tile size is 256x256 (= 65_536 pixels).

Use MAX_PIXELS_DEFAULT for the per-tile dim check at the two call sites in _read_tiles (local) and _read_tiles_cog_http (HTTP). The output-window check at the same functions continues to enforce the user-supplied max_pixels, preserving the #1796 protection.
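The split between the two checks described above can be illustrated with a small sketch. The helper names check_dimensions and validate_read are hypothetical, and the value given to MAX_PIXELS_DEFAULT here is a placeholder. Only the idea of using the default for the per-tile check and the caller's max_pixels for the output window is taken from the commit message.

```python
# Placeholder value for illustration; the real constant may differ.
MAX_PIXELS_DEFAULT = 1_000_000_000


def check_dimensions(width, height, limit, what):
    """Raise ValueError if width * height exceeds limit (hypothetical helper)."""
    pixels = width * height
    if pixels > limit:
        raise ValueError(f"{what} has {pixels} pixels, limit is {limit}")


def validate_read(tile_shape, window_shape, max_pixels):
    """Apply both checks the way the commit message describes (hypothetical)."""
    # Per-tile sanity check: uses the library default, not the caller's budget.
    check_dimensions(tile_shape[0], tile_shape[1], MAX_PIXELS_DEFAULT, "tile")
    # Output-window check: still enforces the caller-supplied budget.
    check_dimensions(window_shape[0], window_shape[1], max_pixels, "output window")


# A 256x256 tile (65_536 pixels) no longer trips a 10_000-pixel output budget,
# as long as the requested window itself fits within that budget.
validate_read((256, 256), (100, 100), max_pixels=10_000)
```

Under this split, an oversized output window is still rejected against the caller's budget, so the protection added for #1796 stays in place.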
Closes #1815.
Adds a 64 MiB default cap on VRT XML reads, configurable via XRSPATIAL_VRT_MAX_XML_BYTES. Streams the file with a running size check and raises ValueError when the cap is exceeded.
Tested: pytest xrspatial/geotiff/tests/test_vrt_xml_size_cap_1815.py