Skip to content

Bounds-check IFD entry table in parse_ifd (#1672)#1677

Merged
brendancol merged 2 commits into
mainfrom
issue-1672
May 12, 2026
Merged

Bounds-check IFD entry table in parse_ifd (#1672)#1677
brendancol merged 2 commits into
mainfrom
issue-1672

Conversation

@brendancol
Copy link
Copy Markdown
Contributor

Summary

Three struct.unpack_from calls inside parse_ifd read from the IFD entry table without first checking the buffer length: the num_entries field, each entry's tag/type/count, and the next-IFD pointer. A truncated file used to escape with struct.error, which is outside the ValueError/TypeError contract #1661's Hypothesis fuzz tests assume. The typed-error work in #1661 (commit 58a8b3d) covered the IFD value area but not the entry table itself.

This PR adds three bounds checks in parse_ifd, each raising ValueError with the offset, required byte count, and len(data). Message style matches the existing MAX_IFD_ENTRY_COUNT and value-area bounds-check messages.

Tests

  • test_ifd_entry_table_bounds_1672.py: unit tests building buffers exactly one byte short of each new bounds check, in both classic and BigTIFF layouts, plus a sanity case for a complete buffer and an out-of-bounds offset.
  • test_fuzz_hypothesis_1661.py: new Group 4 truncation property tests. One runs the existing corpus through random truncation; a second sweeps integer truncation points 0..400 against a fresh make_minimal_tiff. Both fail if struct.error escapes; ValueError, TypeError, MemoryError, and OverflowError are accepted.

Verification

pytest xrspatial/geotiff/tests/test_ifd_entry_table_bounds_1672.py xrspatial/geotiff/tests/test_fuzz_hypothesis_1661.py xrspatial/geotiff/tests/test_parse_ifd_bounds.py xrspatial/geotiff/tests/test_header.py -q

Fixes #1672.

@github-actions github-actions Bot added the performance PR touches performance-sensitive code label May 12, 2026
@brendancol brendancol requested a review from Copilot May 12, 2026 15:36
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR hardens the GeoTIFF parser’s parse_ifd routine by adding explicit bounds checks for the IFD entry table reads (entry count, per-entry header fields, and next-IFD pointer), ensuring truncated buffers raise typed ValueError instead of leaking struct.error—supporting the fuzz/property-test exception contract introduced in #1661.

Changes:

  • Add pre-unpack_from bounds checks in parse_ifd for num_entries, the entry table region, and the next-IFD pointer.
  • Add focused unit tests for classic TIFF and BigTIFF buffers that are 1 byte short of each guarded read.
  • Extend the Hypothesis fuzz suite with truncation-based properties to prevent regressions where struct.error escapes.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File Description
xrspatial/geotiff/_header.py Adds entry-table bounds checks in parse_ifd so truncation errors raise ValueError instead of struct.error.
xrspatial/geotiff/tests/test_ifd_entry_table_bounds_1672.py New unit tests covering truncations at each newly-guarded read in classic TIFF and BigTIFF layouts.
xrspatial/geotiff/tests/test_fuzz_hypothesis_1661.py Adds truncation fuzz/property tests to enforce typed-error-only behavior on truncated inputs.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +25 to +31
from xrspatial.geotiff._dtypes import LONG
from xrspatial.geotiff._header import (
TIFFHeader,
TAG_IMAGE_WIDTH,
parse_header,
parse_ifd,
)
Comment on lines +483 to +488
num_entries_size = 8 if is_big else 2
if offset + num_entries_size > data_len:
raise ValueError(
f"IFD num_entries at offset {offset} needs "
f"{num_entries_size} bytes but file length is {data_len}"
)
brendancol added a commit that referenced this pull request May 12, 2026
Addresses two Copilot review comments on PR #1677.

The new bounds checks added in #1672 only guarded
`offset + size > len(data)`, missing negative offsets. With a negative
`offset`, `struct.unpack_from` interprets it as Python's negative-index
semantic (read from the buffer end), silently returning wrong bytes or
raising raw `struct.error` and escaping the typed `ValueError` contract.
This commit adds explicit `< 0` checks on all three bounds-check sites
(num_entries read, entry-table region, next-IFD pointer) and an unused
`TIFFHeader` import is removed from the test module so flake8 stays
clean.

Tests cover negative offsets on both classic TIFF and BigTIFF, plus a
large negative value, and reuse existing fixtures to keep the file
self-contained.
`parse_ifd` was unpacking `num_entries`, every entry's tag/type/count,
and the next-IFD pointer with no prior length check. A truncated or
crafted TIFF could escape with `struct.error`, outside the
ValueError/TypeError contract the #1661 fuzz tests assume. The S2
typed-error work in #1661 covered the IFD value area but not the
entry table itself.

Adds three explicit bounds checks (num_entries field, full entry
table, next-IFD pointer) each raising ValueError with the offset,
required byte count, and file length, matching the existing
MAX_IFD_ENTRY_COUNT / value-area error message style.

Tests:
- test_ifd_entry_table_bounds_1672.py: one-byte-short buffers for
  each of the three new checks, in classic and BigTIFF layouts.
- test_fuzz_hypothesis_1661.py: new Group 4 truncation property
  tests sweeping every byte offset; fails on struct.error escape.
Addresses two Copilot review comments on PR #1677.

The new bounds checks added in #1672 only guarded
`offset + size > len(data)`, missing negative offsets. With a negative
`offset`, `struct.unpack_from` interprets it as Python's negative-index
semantic (read from the buffer end), silently returning wrong bytes or
raising raw `struct.error` and escaping the typed `ValueError` contract.
This commit adds explicit `< 0` checks on all three bounds-check sites
(num_entries read, entry-table region, next-IFD pointer) and an unused
`TIFFHeader` import is removed from the test module so flake8 stays
clean.

Tests cover negative offsets on both classic TIFF and BigTIFF, plus a
large negative value, and reuse existing fixtures to keep the file
self-contained.
@brendancol brendancol merged commit d95d2d0 into main May 12, 2026
10 of 11 checks passed
@brendancol brendancol deleted the issue-1672 branch May 15, 2026 04:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance PR touches performance-sensitive code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

parse_ifd raises raw struct.error before bounds checks on truncated TIFFs

2 participants