geotiff: jpeg_decompress lacks pre-decode size cap (bomb defense gap) #1792

@brendancol

Summary

xrspatial.geotiff._compression.jpeg_decompress (lines 1042-1066) accepts width, height, and samples keyword arguments but never uses them to validate the JPEG stream's declared SOF dimensions before Pillow allocates the decoded pixel buffer. This is inconsistent with the JPEG 2000 wrapper (which pre-parses the SIZ marker) and the LERC wrapper (which inspects getLercBlobInfo). Both reject blobs whose declared output exceeds the per-tile cap of expected_size * 1.05 + 1 before any decode allocation runs.

Impact

A crafted TIFF can declare a small JPEG tile (e.g. 256x256 RGB, ~196 KB expected) but ship a JPEG payload whose SOF marker declares a much larger image. Image.open(BytesIO(data)) is lazy, but np.asarray(img).tobytes() triggers a full decode. Pillow's default MAX_IMAGE_PIXELS is ~89M pixels: exceeding it only emits a DecompressionBombWarning, and DecompressionBombError is raised only above twice that limit (~178M pixels). An attacker-declared image of up to ~178M pixels (~500 MB as RGB) is therefore decoded and allocated without Pillow's own guard stopping it. The downstream chunk.size != expected reshape check in _decode_strip_or_tile runs only after the full buffer has been materialised.
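The sizes quoted above can be checked with a few lines of arithmetic. This is a minimal sketch that hard-codes Pillow's default limit rather than importing Pillow; the helper name `decoded_bytes` is illustrative only and is not part of xrspatial.

```python
# Pillow's default limit, as defined in PIL.Image:
#   MAX_IMAGE_PIXELS = int(1024 * 1024 * 1024 // 4 // 3)
MAX_IMAGE_PIXELS = int(1024 * 1024 * 1024 // 4 // 3)

# Above MAX_IMAGE_PIXELS Pillow warns; above twice that it raises.
WARN_ABOVE = MAX_IMAGE_PIXELS
ERROR_ABOVE = 2 * MAX_IMAGE_PIXELS


def decoded_bytes(width, height, samples, bytes_per_sample=1):
    """Size of the fully decoded pixel buffer (hypothetical helper)."""
    return width * height * samples * bytes_per_sample


# What the TIFF tags promise for one 256x256 RGB tile.
expected = decoded_bytes(256, 256, 3)

# Largest 8-bit RGB buffer that still stays at or below Pillow's error threshold.
worst_case = ERROR_ABOVE * 3

print(f"expected tile: {expected} bytes")
print(f"warn above:    {WARN_ABOVE} pixels")
print(f"error above:   {ERROR_ABOVE} pixels")
print(f"worst case:    {worst_case} bytes")
```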

Severity: MEDIUM. Pillow's DecompressionBombError is a partial guard; without a tile-aware pre-check, a single malicious tile can still allocate up to ~500 MB per call (and more per IFD when many tiles are decoded sequentially).

Repro sketch

A JPEG stream with SOF declaring 13000x13000x3 (~507 MB) embedded as a tile in a TIFF whose TileWidth=256, TileLength=256, SamplesPerPixel=3, Compression=7 lets jpeg_decompress allocate ~500 MB even though the caller only asked for ~196 KB.
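For test fixtures, the oversized SOF declaration can be produced without a real 13000x13000 image. The sketch below builds a header-only stream (SOI, one baseline SOF0 segment, EOI) using only the standard library. `make_sof0_header` is a hypothetical helper, and the result is not a decodable JPEG: it is only enough to exercise a pre-decode size check. An end-to-end repro would instead patch the SOF height and width fields of a real JPEG tile.

```python
import struct


def make_sof0_header(width, height, samples):
    """Hypothetical fixture: SOI + baseline SOF0 segment + EOI, no scan data."""
    # One 3-byte entry per component: id, sampling factors (1x1), quant table id.
    components = b"".join(bytes([i + 1, 0x11, 0]) for i in range(samples))
    # Segment length covers the length field itself (2), precision (1),
    # height (2), width (2), component count (1) and the component entries.
    length = 8 + len(components)
    sof0 = (
        b"\xff\xc0"
        + struct.pack(">HBHHB", length, 8, height, width, samples)
        + components
    )
    return b"\xff\xd8" + sof0 + b"\xff\xd9"


bomb_header = make_sof0_header(13000, 13000, 3)

declared = 13000 * 13000 * 3   # 507_000_000 bytes if decoded as 8-bit RGB
expected = 256 * 256 * 3       # 196_608 bytes promised by the TIFF tags
```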

Fix

Parse the JPEG SOF0/SOF2 marker to read declared width/height/samples without decoding pixels, then reject the blob when the declared decoded byte count exceeds expected_size * 1.05 + 1 (the same margin used by the other codec wrappers). The caller already passes the expected tile dimensions via width / height / samples; the helper just needs to use them.

The check should be additive: when called without dimensions (the round-trip / direct-caller path) it should fall back to Pillow's own guard.
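A minimal sketch of such a pre-check, under stated assumptions: the names `read_jpeg_sof` and `check_jpeg_declared_size` are hypothetical, the scanner accepts every SOF variant (0xC0-0xCF except DHT, JPG and DAC) rather than only SOF0/SOF2, and the expected size assumes one byte per sample. It walks the marker segments by their length fields and never touches entropy-coded data.

```python
import struct

# SOF0..SOF15 are 0xC0..0xCF, except DHT (0xC4), JPG (0xC8) and DAC (0xCC).
_SOF_MARKERS = frozenset(range(0xC0, 0xD0)) - {0xC4, 0xC8, 0xCC}
# Markers without a length field: TEM, SOI and RST0..RST7.
_STANDALONE = frozenset({0x01, 0xD8}) | frozenset(range(0xD0, 0xD8))


def read_jpeg_sof(data):
    """Return (width, height, samples, precision) from the first SOF segment."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG stream: missing SOI marker")
    pos, n = 2, len(data)
    while pos + 4 <= n:
        if data[pos] != 0xFF:
            raise ValueError(f"malformed JPEG: expected marker at offset {pos}")
        marker = data[pos + 1]
        if marker == 0xFF:            # fill byte preceding a marker
            pos += 1
            continue
        if marker in _STANDALONE:
            pos += 2
            continue
        if marker in (0xD9, 0xDA):    # EOI or SOS reached before any SOF
            break
        (length,) = struct.unpack_from(">H", data, pos + 2)
        if length < 2:
            raise ValueError("malformed JPEG: bad segment length")
        if marker in _SOF_MARKERS:
            if length < 8 or pos + 2 + length > n:
                raise ValueError("malformed JPEG: truncated SOF segment")
            precision, height, width, samples = struct.unpack_from(
                ">BHHB", data, pos + 4
            )
            return width, height, samples, precision
        pos += 2 + length
    raise ValueError("malformed JPEG: no SOF marker found")


def check_jpeg_declared_size(data, width, height, samples):
    """Raise ValueError if the SOF-declared output exceeds the per-tile cap."""
    if width <= 0 or height <= 0:
        return  # no expected dimensions: fall back to Pillow's own guard
    # Assumption: 8-bit samples, so expected bytes = width * height * samples.
    expected = width * height * max(samples, 1)
    cap = expected * 1.05 + 1
    w, h, s, precision = read_jpeg_sof(data)
    declared = w * h * s * ((precision + 7) // 8)
    if declared > cap:
        raise ValueError(
            f"JPEG decompression bomb: SOF declares {w}x{h}x{s} "
            f"({declared} bytes), cap is {int(cap)} bytes"
        )
```

In this sketch, jpeg_decompress would call `check_jpeg_declared_size(data, width, height, samples)` before `Image.open`, so an oversized declaration is rejected before any pixel buffer exists. A stream with no SOF marker raises the same ValueError type with a distinct message.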

Acceptance criteria

  • A new pre-decode helper that scans data for the SOF marker and rejects oversize blobs.
  • jpeg_decompress calls it when width > 0 and height > 0.
  • Tests cover: (1) a legitimate 256x256 JPEG decodes normally, (2) a JPEG declaring 13000x13000 against expected 256x256 raises ValueError mentioning the decompression bomb, (3) a malformed JPEG with no SOF marker still surfaces a clear error.
  • No regression in the JPEG-tile round-trip tests in tests/test_jpeg.py.

Notes

This is the third issue in this series. Issues #1533 (deflate/zstd/lz4/packbits) and #1625 (LERC/JPEG2000) covered the other codecs. LZW already caps via the caller-supplied buffer.

Labels: bug
