Describe the bug
write_streaming() in xrspatial/geotiff/_writer.py decides between classic TIFF and BigTIFF using only the uncompressed pixel volume at lines 1652-1658:
```python
uncompressed_bytes = height * width * bytes_per_sample * samples
UINT32_MAX = 0xFFFFFFFF
if bigtiff is not None:
    use_bigtiff = bigtiff
else:
    use_bigtiff = uncompressed_bytes > UINT32_MAX
```
The eager _assemble_tiff path adds the IFD, strip/tile tables, geo tags, and per-tile compressed payload estimate on top of the raw pixel count before deciding -- so it promotes to BigTIFF a little earlier than the streaming path. Streaming uses the bare pixel bound, so for a raster just under 4 GiB the IFD and strip table can push the real file size over UINT32_MAX while classic-TIFF was already selected. The failure mode is a late struct.error / overflow during LONG packing of strip offsets, well after the writer has committed to a layout.
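The late failure mode can be reproduced in isolation: classic TIFF stores strip offsets as unsigned 32-bit LONG values, and packing an offset past that range raises struct.error only at serialization time. A minimal sketch (the 512-byte overhead figure is illustrative, not the writer's actual IFD size):

```python
import struct

# Classic TIFF packs strip offsets as little-endian unsigned 32-bit LONGs.
# If metadata overhead pushes an offset past UINT32_MAX, the error surfaces
# only here -- long after the writer committed to the classic-TIFF layout.
offset_past_4gib = 0xFFFFFFFF + 512  # hypothetical: IFD/strip table pushed it over

try:
    struct.pack("<I", offset_past_4gib)
except struct.error as exc:
    print(f"late failure during LONG packing: {exc}")
```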
The right fix is not to make the streaming path as accurate as the eager one -- it cannot know the compressed payload size up-front -- but to reserve a conservative header/IFD overhead and promote when uncompressed_bytes + reserved_overhead >= UINT32_MAX.
Expected behavior
write_streaming() should add a conservative reserved-overhead constant to uncompressed_bytes before comparing against UINT32_MAX, so the BigTIFF decision survives the actual IFD layout and codec overhead. The threshold should be >= rather than > once the overhead is included, since the classic-TIFF format cannot address offsets equal to UINT32_MAX.
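A minimal sketch of the proposed decision logic. The function name, signature, and the 16 MiB reserve constant are illustrative assumptions, not the writer's actual API; the point is the added headroom and the >= comparison:

```python
UINT32_MAX = 0xFFFFFFFF

# Hypothetical constant: a conservative reserve for the header, IFD entries,
# strip/tile offset and bytecount tables, and geo tags. The exact value is a
# tuning choice; 16 MiB comfortably covers typical metadata.
RESERVED_OVERHEAD = 16 * 1024 * 1024

def should_use_bigtiff(height, width, bytes_per_sample, samples, bigtiff=None):
    """Decide BigTIFF promotion with headroom for metadata overhead."""
    if bigtiff is not None:
        return bigtiff  # an explicit caller override still wins
    uncompressed_bytes = height * width * bytes_per_sample * samples
    # >= rather than >: classic TIFF cannot address an offset equal to UINT32_MAX.
    return uncompressed_bytes + RESERVED_OVERHEAD >= UINT32_MAX

# A raster just under 4 GiB of pixel data now promotes up-front
# instead of overflowing during strip-offset packing later:
print(should_use_bigtiff(65536, 65535, 1, 1))  # → True
```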
Categories
- Cat 4 (error handling): late struct.error instead of an up-front BigTIFF promotion
- Cat 5 (backend inconsistency): eager and streaming write paths disagree on when to promote to BigTIFF