geotiff: VRT XML parser reads file without a size cap #1815

@brendancol

Summary

_vrt.read_vrt reads the VRT XML file with an unbounded f.read() (xrspatial/geotiff/_vrt.py:640-641):

with open(vrt_path, 'r') as f:
    xml_str = f.read()

safe_fromstring (the parser used downstream) blocks external entity expansion, but it cannot protect against a VRT XML file that is simply large on disk. Because f.read() runs first, a multi-gigabyte VRT file is loaded into memory in full before parsing even starts.

Why this matters

VRTs are pure XML metadata; pixel data lives in the source TIFFs. A 50k-source VRT runs around 25 MB. There is no realistic scenario where a VRT XML file is hundreds of megabytes, let alone gigabytes. Reading without a cap turns an untrusted (or malformed) VRT path into a memory-exhaustion vector.

This matches the bomb-cap style fixes already applied elsewhere in the geotiff reader (JPEG predecode #1792, DstRect resample cap #1737, max_pixels for VRT source reads #1803).

Proposed fix

Add a configurable size cap on the VRT XML read. Stream the file in bounded chunks; if the total exceeds the cap, raise ValueError with the cap value and the env-var name.

  • Default cap: 64 MiB. A 50k-source VRT (~25 MB) fits comfortably with margin.
  • Env var: XRSPATIAL_VRT_MAX_XML_BYTES for operators who legitimately need a larger cap.
  • Match the existing style in _vrt.py for env-driven limits.
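
A minimal sketch of the bounded read described above, not the actual patch. The function and constant names (read_vrt_xml_capped, _max_xml_bytes, _CHUNK_BYTES) are hypothetical; only the env-var name and the 64 MiB default come from this issue. It also assumes the file is opened in binary mode so the cap counts bytes, and that the XML is UTF-8, whereas the current code opens the file in text mode.

```python
import os

# Hypothetical names; the issue only fixes the env-var name and the default.
_ENV_VAR = "XRSPATIAL_VRT_MAX_XML_BYTES"
_DEFAULT_MAX_XML_BYTES = 64 * 1024 * 1024  # 64 MiB
_CHUNK_BYTES = 1024 * 1024  # read at most 1 MiB per call


def _max_xml_bytes():
    """Return the cap in bytes, honouring the env var when it is set."""
    raw = os.environ.get(_ENV_VAR)
    if raw is None or raw.strip() == "":
        return _DEFAULT_MAX_XML_BYTES
    try:
        value = int(raw)
    except ValueError:
        raise ValueError(
            f"{_ENV_VAR} must be an integer byte count, got {raw!r}"
        ) from None
    if value <= 0:
        raise ValueError(f"{_ENV_VAR} must be positive, got {value}")
    return value


def read_vrt_xml_capped(vrt_path, max_bytes=None):
    """Read a VRT XML file in bounded chunks, failing once the cap is passed."""
    cap = _max_xml_bytes() if max_bytes is None else max_bytes
    chunks = []
    total = 0
    with open(vrt_path, "rb") as f:
        while True:
            # Never request more than one byte past the cap, so memory use
            # stays bounded even for a multi-gigabyte file.
            chunk = f.read(min(_CHUNK_BYTES, cap + 1 - total))
            if not chunk:
                break
            total += len(chunk)
            if total > cap:
                raise ValueError(
                    f"VRT XML file {vrt_path!r} exceeds the {cap}-byte cap; "
                    f"set {_ENV_VAR} to raise it"
                )
            chunks.append(chunk)
    # Assumption: UTF-8 content. The real reader may need to respect the
    # encoding declared in the XML prolog instead.
    return b"".join(chunks).decode("utf-8")
```

Counting bytes rather than decoded characters keeps the limit aligned with the on-disk size an operator would see, and a file whose size equals the cap exactly is still accepted.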

Tests

  • A small VRT under the cap parses normally.
  • A synthetic VRT padded with comment whitespace past the cap raises ValueError mentioning the cap and the env-var name.
  • Setting XRSPATIAL_VRT_MAX_XML_BYTES below the padded VRT's size makes the read raise ValueError; raising it above that size lets the same file parse.
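
The test cases above could be exercised along these lines. This is a sketch under assumptions: read_capped is a simplified stand-in for the capped reader (a single bounded read rather than a chunked loop), write_padded_vrt and padded_vrt_outcome are hypothetical helpers, and the standard-library xml.etree.ElementTree stands in for safe_fromstring. Lowering the cap through the env var keeps the synthetic file small, so the test never has to write 64 MiB to disk.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

ENV_VAR = "XRSPATIAL_VRT_MAX_XML_BYTES"


def read_capped(path):
    """Stand-in reader (hypothetical): reject files larger than the cap."""
    cap = int(os.environ.get(ENV_VAR, 64 * 1024 * 1024))
    with open(path, "rb") as f:
        data = f.read(cap + 1)  # one byte past the cap reveals an overflow
    if len(data) > cap:
        raise ValueError(f"VRT XML exceeds {cap} bytes; raise {ENV_VAR} to allow it")
    return data


def write_padded_vrt(path, total_bytes):
    """Write a well-formed VRT skeleton padded to total_bytes with a comment."""
    head = b"<VRTDataset><!--"
    tail = b"--></VRTDataset>"
    padding = total_bytes - len(head) - len(tail)
    if padding < 0:
        raise ValueError("total_bytes is too small for the VRT skeleton")
    with open(path, "wb") as f:
        f.write(head + b" " * padding + tail)


def padded_vrt_outcome(total_bytes, cap):
    """Return the root tag if the padded VRT parses under cap, else the error text."""
    old = os.environ.get(ENV_VAR)
    os.environ[ENV_VAR] = str(cap)
    try:
        with tempfile.TemporaryDirectory() as tmp:
            path = os.path.join(tmp, "padded.vrt")
            write_padded_vrt(path, total_bytes)
            try:
                return ET.fromstring(read_capped(path)).tag
            except ValueError as exc:
                return str(exc)
    finally:
        # Restore the environment so other tests see the original setting.
        if old is None:
            os.environ.pop(ENV_VAR, None)
        else:
            os.environ[ENV_VAR] = old


if __name__ == "__main__":
    # Cap above the file size: the padded VRT parses.
    assert padded_vrt_outcome(4096, 8192) == "VRTDataset"
    # Cap below the file size: the error names the cap and the env var.
    message = padded_vrt_outcome(4096, 1024)
    assert "1024" in message and ENV_VAR in message
```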

Metadata

    Labels

    bug (Something isn't working), oom (Out-of-memory risk with large datasets)
