
Cap decompressed size for deflate/zstd/lz4/packbits to block bomb attacks#1533

Merged
brendancol merged 4 commits into
xarray-contrib:mainfrom
brendancol:security/decompression-output-caps
May 10, 2026

Conversation

@brendancol
Contributor

Summary

Four codecs in xrspatial/geotiff/_compression.py decompressed strip and tile payloads with no output-size cap. A small malicious TIFF (web download, fsspec, third-party catalog, user upload) could declare a 1024x1024 uint8 image (1 MiB expected) and supply a 1 MiB deflate-compressed strip that decodes to 1 GiB, OOM-killing the reader. The existing size check in _decode_strip_or_tile (_reader.py:565) runs after decompression, so it cannot bound peak RSS. An audit confirmed this: with RLIMIT_AS=300MB set, feeding a 1 MiB deflate-bombed TIFF to read_to_array kills the process.

This PR adds an expected_size cap inside each codec, rejecting any decode that would exceed expected_size * 1.05 + 1 bytes with a clean ValueError before peak allocation.

Codecs covered (one fix template, four codecs)

  • deflate (deflate_decompress): zlib.decompressobj().decompress(data, max_length=cap+1) with a drain loop. zlib.decompress has no cap parameter.
  • zstd (zstd_decompress): ZstdDecompressor().stream_reader(data).read(cap+1) plus a one-byte overflow probe. decompress(data, max_output_size=cap) is not enforced when the frame embeds the content size, which is exactly the case in the bomb.
  • lz4 (lz4_decompress): LZ4FrameDecompressor().decompress(data, max_length=cap+1) and treat needs_input == False as overflow.
  • packbits (packbits_decompress): pure-Python loop already; check the running output length after each opcode.
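As an illustration of the deflate pattern above, here is a minimal capped inflate sketch. It follows the PR's description (drain loop, `expected_size * 1.05 + 1` cap) but is hypothetical code, not the PR's exact implementation:

```python
import zlib

def deflate_decompress_capped(data: bytes, expected_size: int) -> bytes:
    """Inflate data, refusing to emit more than expected_size * 1.05 + 1 bytes."""
    cap = int(expected_size * 1.05) + 1
    d = zlib.decompressobj()
    out = bytearray(d.decompress(data, cap + 1))
    # Drain loop: zlib parks unread input in unconsumed_tail when
    # max_length is hit, so keep pulling until done or over the cap.
    while d.unconsumed_tail and len(out) <= cap:
        out += d.decompress(d.unconsumed_tail, cap + 1 - len(out))
    if len(out) > cap:
        raise ValueError(
            f"deflate decode exceeded expected size: {len(out)} > {cap}")
    return bytes(out)
```

Because the decoder is asked for at most `cap + 1` bytes per call, a 1 GiB bomb fails after roughly one megabyte of output instead of after full expansion.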

Margin

The 1.05x margin (5%) lets legitimate codec framing or trailing metadata bytes through while still rejecting the 1000:1 ratios characteristic of bomb attacks. A test (test_cap_includes_metadata_margin) covers the legitimate metadata case, and test_legitimate_high_compression_passes confirms an all-zero array at >50:1 ratio is accepted.
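Concretely, for the 1024x1024 uint8 case used throughout this PR (values computed here for illustration):

```python
expected = 1024 * 1024            # declared image: 1 MiB
cap = int(expected * 1.05) + 1    # 1_101_005 bytes

assert expected + 40_000 <= cap   # ~4% codec framing/metadata still fits
assert (1 << 30) > cap            # a 1 GiB bomb is far over the cap
```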

Tests

xrspatial/geotiff/tests/test_decompression_caps.py (14 tests):

  • Codec-direct bomb rejection (deflate, zstd, lz4, packbits) and round-trip.
  • End-to-end TIFF reads: 1 MiB declared image with a strip that decodes to 1 GiB, all four codecs.
  • Negative tests: legitimate >50:1 compression of zeros, plus the 5% metadata margin.

The end-to-end TIFFs are hand-built with struct.pack so no test ever materializes the 1 GiB payload outside the codec's own buffer.
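A sketch of how such a payload can be produced without the test process ever holding the decoded bytes: compress a zeros payload chunk by chunk through a `compressobj`. The helper name `deflate_bomb` and the chunking scheme are illustrative, not taken from the PR:

```python
import zlib

def deflate_bomb(decoded_size: int, chunk: int = 1 << 20) -> bytes:
    """Compress decoded_size zero bytes in chunks so peak memory stays ~chunk."""
    c = zlib.compressobj(9)
    zeros = b"\x00" * chunk
    parts = [c.compress(zeros) for _ in range(decoded_size // chunk)]
    parts.append(c.flush())
    return b"".join(parts)
```

Runs of zeros compress at well over 1000:1, so the on-disk strip stays tiny while the declared expansion is arbitrary.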

Out of scope

Project policy for security work is one fix per PR; the user explicitly opted to bundle these four codec caps into a single PR as an exception, since they share one vulnerability class and one fix template. Other audit findings (IFD count, IFD chain) ship in separate PRs.

Test plan

  • pytest xrspatial/geotiff/tests/test_decompression_caps.py -x -q (14 pass)
  • pytest xrspatial/geotiff/tests/test_compression.py xrspatial/geotiff/tests/test_compression_level.py xrspatial/geotiff/tests/test_lz4.py xrspatial/geotiff/tests/test_reader.py xrspatial/geotiff/tests/test_writer.py -x -q (63 pass, no regressions)

Cap decompressed size for deflate/zstd/lz4/packbits to block bomb attacks

The four codecs in ``xrspatial/geotiff/_compression.py`` decompressed strip
and tile payloads with no output-size cap, so a small malicious TIFF could
expand to multiple gigabytes during decode and OOM-kill the reader.  An
audit confirmed it: a 1 MiB deflate-compressed strip declaring a 1024x1024
image expands to 1 GiB, and a process with ``RLIMIT_AS=300MB`` is killed
before the existing post-decode size check (``_decode_strip_or_tile`` in
``_reader.py:565``) ever runs.  The threat model is untrusted TIFF input
from web downloads, fsspec, third-party catalogs, or user upload.

Each codec now accepts an ``expected_size`` kwarg (the byte count the
caller already computed for the post-check) and refuses to emit more than
``expected_size * 1.05 + 1`` bytes before raising ``ValueError`` with the
codec name and actual vs cap.  The 1.05x margin allows for legitimate
codec metadata that some encoders emit; bomb ratios (1000:1+) are
rejected long before peak RSS spikes.

Per-codec implementation:

- deflate: ``zlib.decompressobj().decompress(data, max_length=cap+1)``
  with a drain loop over ``unconsumed_tail``.  ``zlib.decompress`` has no
  cap parameter.
- zstd: ``ZstdDecompressor().stream_reader(data).read(cap+1)``, then
  probe one more byte to detect overflow.  ``decompress(data,
  max_output_size=cap)`` is not actually enforced when the frame embeds
  the content size, which the bomb does.
- lz4: ``LZ4FrameDecompressor().decompress(data, max_length=cap+1)`` and
  treat ``needs_input == False`` as overflow (decoder has buffered output
  it could not deliver).
- packbits: pure-Python loop already, so just check the running output
  length after each opcode.
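The packbits check is the easiest to show end to end. Below is a self-contained sketch of a PackBits decoder with the per-opcode length check; it is hypothetical code following the description above, not the project's implementation:

```python
def packbits_decode_capped(data: bytes, cap: int) -> bytes:
    """Decode PackBits, raising once the output passes cap (0 disables the cap)."""
    out = bytearray()
    i = 0
    while i < len(data):
        n = data[i] - 256 if data[i] > 127 else data[i]  # signed control byte
        i += 1
        if n >= 0:             # literal run: copy the next n + 1 bytes
            out += data[i:i + n + 1]
            i += n + 1
        elif n != -128:        # replicate run: next byte repeated 1 - n times
            out += data[i:i + 1] * (1 - n)
            i += 1
        if cap and len(out) > cap:  # check after each opcode, before peak growth
            raise ValueError(
                f"packbits decode exceeded expected size: {len(out)} > {cap}")
    return bytes(out)
```

A replicate run expands up to 128:2, so a short input of replicate opcodes trips the cap within a few iterations rather than after the full expansion.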

The dispatch ``decompress`` plumbs ``expected_size`` through to all four.

Tests in ``xrspatial/geotiff/tests/test_decompression_caps.py`` cover:
codec-direct bomb rejection, end-to-end TIFF bomb rejection (1 MiB
declared, 1 GiB decoded), legitimate high-ratio compression (all-zero
arrays at >50:1) passing without false rejection, and the 5% metadata
margin not over-rejecting.

Audit reproducer behaviour: the 1 MiB-to-1 GiB TIFF previously triggered
``MemoryError`` (or OS OOM kill) in zlib.decompress; now raises
``ValueError("deflate decode exceeded expected size: ...")`` with peak
RSS bounded by the cap.
@github-actions github-actions Bot added the performance PR touches performance-sensitive code label May 8, 2026
@brendancol brendancol requested a review from Copilot May 9, 2026 15:18
Contributor

Copilot AI left a comment


Pull request overview

Adds decompression output-size caps to several GeoTIFF codecs to mitigate decompression-bomb (zip-bomb) style attacks during strip/tile decode, ensuring the reader can fail fast before allocating unbounded output buffers.

Changes:

  • Add expected_size-based output caps (with a 5% margin) to deflate, zstd, lz4, and packbits decompressors in xrspatial/geotiff/_compression.py.
  • Thread expected_size through the codec dispatch so _reader._decode_strip_or_tile() can enforce the cap during decompression, not after.
  • Add a new test module covering direct-codec overflow rejection and end-to-end TIFF read rejection scenarios.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 8 comments.

Files reviewed:

  • xrspatial/geotiff/_compression.py: adds expected_size-capped decompression for deflate/zstd/lz4/packbits and wires caps into the decompression dispatch.
  • xrspatial/geotiff/tests/test_decompression_caps.py: new tests intended to validate bomb rejection and ensure legitimate high compression still succeeds.


Comment thread on xrspatial/geotiff/tests/test_decompression_caps.py, lines +211 to +214:

    def test_lz4_bomb_rejected(tmp_path):
        lz4_frame = pytest.importorskip("lz4.frame")
        payload = b'\x00' * _BOMB_BYTES
        strip = lz4_frame.compress(payload)
Comment thread on xrspatial/geotiff/_compression.py, lines +24 to +37:

    def _max_output_with_margin(expected_size: int) -> int:
        """Return the cap (in bytes) for a codec given the caller's expected size.

        Adds at least one byte of slack so that callers passing 0 (meaning
        "unknown") still get a usable buffer for tests, while a single byte of
        overflow is detected.
        """
        if expected_size <= 0:
            # No cap available: fall back to a generous default to preserve
            # backward compatibility with callers that don't supply a size.
            # The reader always supplies a size, so this branch is mainly for
            # direct callers and round-trip tests.
            return 0
        return int(expected_size * _DECOMPRESS_MARGIN) + 1
brendancol added 3 commits May 9, 2026 14:09
Eight findings, all acted on:

Source:
- _max_output_with_margin docstring claimed callers passing 0 still
  get a usable buffer, but the implementation returns 0 (= no cap).
  Update the docstring to match the implementation: positive size
  caps at 1.05x + 1; non-positive disables capping.
- Replace deflate's `out += more` bytes accumulation with a bytearray.
  The bytes path was O(n^2) for large strips because every chunk
  reallocated and copied the whole accumulated buffer; the bytearray
  path is amortized O(n).

Tests:
- Drop the 1 GiB host allocations from test_{deflate,zstd,lz4}_bomb_rejected.
  Each test was building `b'\x00' * (1 << 30)` then compressing it,
  which contradicted the file's own "no test materializes the 1 GiB
  payload" claim and would OOM or stall on shared CI runners. Shrink
  to 8 MiB; the cap (~1.05 MiB on a 1024x1024 uint8 declared image)
  is exceeded by ~7x, which still proves the codec rejects rather
  than truncates without forcing the test process to allocate
  gigabytes.
- Replace `pytest.importorskip(...)` calls inside `@pytest.mark.skipif`
  decorators with precomputed `_HAS_ZSTD` / `_HAS_LZ4` flags via
  `importlib.util.find_spec`. The previous form could skip the entire
  module at import time on hosts missing one optional dep, dropping
  unrelated tests with it.
- Update the module docstring to reflect the actual declared dimensions
  (1024x1024 uint8 = 1 MiB) instead of the misleading "~1 KiB pixel
  data" claim.
- Move the inline `pytest.importorskip(...)` calls inside the
  end-to-end zstd / lz4 tests to module-level skipif so the optional-
  dep skip is consistent across direct and end-to-end tests.
Two follow-ups on top of the earlier review-fix commit:

- Direct-codec bomb tests in TestCodecDirect / TestZstdDirect /
  TestLz4Direct were still allocating 100 MiB of zeros on the host
  (`big = b'\x00' * (100 * 1024 * 1024)`) to compress for the bomb
  payload. The earlier fix only shrank the end-to-end TIFF tests.
  Centralise on `_DIRECT_BOMB_BYTES = 4 MiB` -- still a 4000:1 ratio
  over the 1 KiB cap used in those tests, plenty to prove the
  rejection without 100 MiB of CI memory pressure per test.
- Add `expected_size=0` (default) backward-compat regression tests
  for deflate, zstd, lz4, and packbits. The cap implementation has
  a no-cap fallback for callers that don't pass a size; without
  these tests, a future change that silently enabled a default
  cap would break unmodified callers and not be caught.
`importlib.util.find_spec("lz4.frame")` imports the `lz4` parent
package to resolve the submodule spec, so it raises ModuleNotFoundError
rather than returning None when lz4 itself is not installed. macOS CI
(Python 3.12) failed at module collection with "No module named 'lz4'"
because of this. Wrap the spec lookup in a try/except helper so a
missing optional dependency cleanly turns into False regardless of
dotted depth, and route both the zstandard and lz4 detection through it.
@brendancol brendancol merged commit f00305c into xarray-contrib:main May 10, 2026
11 checks passed
brendancol added a commit that referenced this pull request May 11, 2026
Add decompression-bomb cap to LERC and JPEG 2000 codecs (#1625) (#1629)

* Add decompression-bomb cap to LERC and JPEG 2000 codecs (#1625)

The deflate, zstd, lz4, and packbits wrappers all received a pre-decode
output-size cap in #1533 so a crafted TIFF tile cannot expand to many
GB before the post-decode size check fires. LERC and JPEG 2000 were
missed: lerc_decompress_with_mask and jpeg2000_decompress called the
underlying lerc.decode / glymur.Jp2k[:] with no bound, so the existing
size check in _decode_strip_or_tile ran only after the full buffer had
already been materialised by the external library.

LERC compresses constant-value blocks at >700,000:1, so a 94-byte blob
can request 64 MiB of host memory. A 1 KB on-disk LERC tile can ask
for several GB before the reader rejects it. JPEG 2000 has similar
amplification potential through codestream-declared dimensions.

Fix: query each codestream's declared dimensions before decoding.
- LERC: lerc.getLercBlobInfo(blob) returns nCols, nRows, nBands, and
  dataType from the header without decoding. Compute the projected
  output bytes and raise ValueError when it exceeds the same
  expected_size * 1.05 + 1 cap used by every other codec.
- JPEG 2000: glymur.Jp2k(file).shape parses only the SIZ marker, so
  inspecting it before [:] is cheap.
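The shared pre-check pattern can be sketched as follows; the helper name and signature are illustrative, not the commit's exact code:

```python
import math

def check_declared_size(shape, itemsize, expected_size):
    """Reject before any pixel decoding if the header-declared output
    exceeds the same expected_size * 1.05 + 1 cap used by the other codecs."""
    declared = math.prod(int(d) for d in shape) * itemsize
    cap = int(expected_size * 1.05) + 1
    if declared > cap:
        raise ValueError(
            f"declared output {declared} bytes exceeds cap {cap} bytes")
    return declared
```

Because only header fields are inspected (LERC blob info, the JPEG 2000 SIZ marker), the check costs a few bytes of parsing regardless of how large the declared output is.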

Pillow's JPEG decoder already has Image.MAX_IMAGE_PIXELS and raises
DecompressionBombError, so no wrapper-level cap is needed there.

Plumb expected_size through the decompress() dispatcher,
_decode_strip_or_tile's LERC branch, and the _gpu_decode CPU
fallbacks for both codecs.

Tests cover bomb-rejection and legitimate round-trips at the codec
level for both LERC and JPEG 2000, plus the existing zero-expected-size
backward-compat path.

Found by /sweep-security on the geotiff module.

* Address review: harden JPEG 2000 / LERC bomb pre-checks

Copilot review on #1629 flagged four issues. Fixes:

- np.prod overflow (JPEG 2000): replace ``int(np.prod(shape))`` with
  ``math.prod(int(d) for d in shape)``. ``np.prod`` multiplies in
  fixed-width intp and can wrap to a small or negative value on
  attacker-declared dimensions (SIZ marker uses uint32 fields, so a
  malformed tile can declare shape whose product exceeds int64),
  silently under-counting the declared size and bypassing the cap.
  ``math.prod`` uses Python arbitrary-precision ints.

- Fail-closed on unreadable SIZ marker (JPEG 2000): previously, an
  exception from ``jp2.shape`` or ``jp2.dtype`` disabled the cap and
  fell through to ``jp2[:]`` (the bomb path). ``getattr(jp2, 'dtype',
  np.uint8)`` also defaulted to the smallest dtype, undercounting
  declared bytes. Now both failure modes raise ValueError before any
  pixel decoding runs. Added test_jpeg2000_unreadable_shape_fails_closed
  to lock it in (monkeypatches Jp2k.shape to raise; asserts jp2[:] is
  never invoked).

- LERC errCode not checked: ``lerc.getLercBlobInfo`` returns
  ``(errCode, ...)`` and when errCode != 0 the remaining fields can be
  garbage. ``_check_lerc_bomb`` now returns early in that case so the
  canonical error comes from ``lerc.decode``, not from a spurious
  declared-size computation.

- Unused ``import os`` in test_jpeg2000_bomb_raises (flake8 F401):
  removed.
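The ``np.prod`` wrap described in the first bullet can be made concrete without numpy by simulating a fixed-width int64 reduction with explicit mod-2^64 arithmetic:

```python
import math

shape = (2**32 - 1, 2**32 - 1, 4)   # attacker-declared SIZ dimensions
exact = math.prod(int(d) for d in shape)

# Simulate the int64 product reduction: reduce modulo 2**64, then
# reinterpret the top bit as the sign, exactly as two's complement does.
wrapped = exact % 2**64
if wrapped >= 2**63:
    wrapped -= 2**64

# exact is ~7.4e19 bytes declared; the int64 view wraps to a negative
# number, which would sail under any positive size cap.
assert exact > 2**63
assert wrapped < 0
```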