
perf(geotiff): drop bytes(bytearray) copy in TIFF layout assemble (#1756) #1762

Open
brendancol wants to merge 2 commits into main from deep-sweep-performance-geotiff-2026-05-12-aa72a2f

@brendancol
Contributor

Summary

  • _assemble_standard_layout and _assemble_cog_layout built the output TIFF in a bytearray and then ended with return bytes(output), which copies the entire buffer
  • Returning the bytearray directly cuts transient peak memory roughly in half on eager writes; downstream consumers (_write_bytes, the parse_header validation slice, BytesIO.write, file handle write) all accept buffer-protocol objects, so no caller changes are needed
  • Type annotations on _assemble_tiff, both layout helpers, and _write_bytes are updated to reflect that the return type is now bytearray
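
A minimal sketch of why no caller changes are needed, using only the standard library. `fake_output` is a hypothetical stand-in for the assembled TIFF buffer, not real writer output, and none of the geotiff helpers are called here:

```python
import io
import os
import tempfile

# Hypothetical stand-in for the assembled TIFF: an 8-byte little-endian
# header followed by zero padding. Not produced by the real writer.
fake_output = bytearray(b"II*\x00\x08\x00\x00\x00" + bytes(1024))

# bytes(bytearray) allocates a second, independent buffer of the same size.
copied = bytes(fake_output)
assert copied == fake_output and copied is not fake_output

# BytesIO.write accepts any bytes-like object, including bytearray.
sink = io.BytesIO()
written = sink.write(fake_output)

# A binary file handle accepts it as well.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "out.tif")
    with open(path, "wb") as f:
        f.write(fake_output)
    on_disk_size = os.path.getsize(path)

# Slicing a bytearray yields a bytearray that compares equal to the
# corresponding bytes slice, so a header check sees the same 16 bytes.
header_slice = fake_output[:16]
```

The one behavioural difference callers could notice is that the returned object is now mutable and unhashable, which matters only if something used the result as a dict key or relied on immutability.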

Fixes #1756.

Measurements

On a 10000x10000 uint8 raster (95 MB output):

Before: peak Python-allocated = 202 MB
After:  peak Python-allocated = 107 MB

The savings scale linearly with output size: a 1 GB write saves about 1 GB of peak memory, and a 10 GB write about 10 GB.
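
One way to reproduce the shape of this measurement is with `tracemalloc`. This is a sketch under assumptions: the PR does not say which tool produced the numbers above, the two `assemble_*` functions are hypothetical stand-ins for the layout helpers, and the buffer is 20 MiB rather than the 95 MB raster so it runs quickly:

```python
import tracemalloc

SIZE = 20 * 1024 * 1024  # 20 MiB stand-in for the assembled output


def assemble_with_copy(size):
    """Hypothetical stand-in for the old behaviour."""
    output = bytearray(size)
    return bytes(output)  # second full-size allocation while output is alive


def assemble_without_copy(size):
    """Hypothetical stand-in for the new behaviour."""
    output = bytearray(size)
    return output  # hand the buffer back as-is


def peak_bytes(func, size):
    """Return the peak traced allocation, in bytes, while func(size) runs."""
    tracemalloc.start()
    try:
        result = func(size)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    del result
    return peak


peak_before = peak_bytes(assemble_with_copy, SIZE)
peak_after = peak_bytes(assemble_without_copy, SIZE)
print(f"with copy:    {peak_before / 2**20:.0f} MiB")
print(f"without copy: {peak_after / 2**20:.0f} MiB")
```

On CPython the first figure should come out near twice the buffer size and the second near one times it, matching the 2x to 1x change reported above.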

Test plan

  • 6 new regression tests in test_assemble_layout_no_bytes_copy_1756.py cover the bytearray return type at the layout, assembler, and end-to-end levels (BytesIO and tmp_path round-trips)
  • 2123 existing geotiff tests pass; the 10 failures (in test_no_georef_windowed_coords_1710 and test_predictor2_big_endian_gpu_1517) are unrelated: they reference the now-private read_to_array attribute (commit 8adb749 / issue #1708, "geotiff: read_to_array leaks into public namespace but is not in __all__ or docs") and predate this change
  • CPU and GPU writer paths both exercise _assemble_tiff; the GPU path is covered by test_gpu_writer_attrs_1563.py and test_kwarg_behaviour_2026_05_12.py (all 36 pass)
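
The general shape of such regression tests can be sketched as follows. `toy_assemble` is a hypothetical stand-in, since the real helpers' signatures are not shown in this PR; it writes only a classic TIFF header followed by raw bytes, not a valid TIFF file:

```python
import io
import struct


def toy_assemble(pixels: bytes) -> bytearray:
    """Hypothetical stand-in for a layout helper (header + raw pixel bytes)."""
    output = bytearray()
    output += b"II"                  # little-endian byte-order mark
    output += struct.pack("<H", 42)  # TIFF magic number
    output += struct.pack("<I", 8)   # offset of the first IFD (placeholder)
    output += pixels
    return output                    # no bytes(output) copy


def test_returns_bytearray_with_valid_header():
    out = toy_assemble(bytes(range(16)))
    # The exact type matters: bytes would mean the copy crept back in.
    assert type(out) is bytearray
    byte_order, magic, ifd_offset = struct.unpack("<2sHI", out[:8])
    assert (byte_order, magic, ifd_offset) == (b"II", 42, 8)


def test_round_trip_through_bytesio():
    pixels = bytes(range(16))
    sink = io.BytesIO()
    sink.write(toy_assemble(pixels))
    # Everything after the 8-byte header must be the original payload.
    assert sink.getvalue()[8:] == pixels
```

Checking `type(out) is bytearray` rather than `isinstance(out, (bytes, bytearray))` is deliberate in this sketch, since the looser check would still pass if the copy were reintroduced.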

The eager (non-streaming) writer builds the output TIFF in a bytearray
inside _assemble_standard_layout and _assemble_cog_layout, then ends
with ``return bytes(output)``. The bytes() call copies the entire
buffer, transiently doubling peak Python-allocated memory for the
duration of the conversion.

Measured on a 95 MB uint8 raster:

  Before: peak 202 MB (95 MB bytearray + 95 MB bytes copy)
  After:  peak 107 MB (just the bytearray)

Returning the bytearray directly preserves correctness: ``_write_bytes``
already calls ``f.write(file_bytes)`` which accepts any buffer-protocol
object, and the post-write ``parse_header(file_bytes[:16])`` validation
slice works the same on bytearray and bytes. The streaming writer is
unaffected -- it writes straight to a file handle and never builds a
single contiguous output buffer.

Type annotations on _assemble_tiff, _assemble_standard_layout,
_assemble_cog_layout, and _write_bytes are updated to reflect the
buffer-protocol contract.

Tests in test_assemble_layout_no_bytes_copy_1756.py:
* Both layout helpers return bytearray (not bytes)
* _assemble_tiff propagates the bytearray return through CPU and GPU
  writer paths
* Round-trip via BytesIO and a tmp_path .tif still produces correct
  pixel data after the type change
* The assembler returns a writeable bytearray whose first 16 bytes
  parse as a valid TIFF header

Audited the geotiff subpackage on top of Pass 7 (2026-05-12). Found and
fixed one new MEDIUM: the eager TIFF layout assembler ended with a
``bytes(bytearray)`` copy that doubled peak Python-allocated memory for
the duration of the conversion. Filed #1756; the fix landed in the same
branch (PR pending).

SAFE / IO-bound verdict holds. Peak memory now scales 1x with the
output buffer size instead of 2x.
@github-actions github-actions Bot added the performance PR touches performance-sensitive code label May 12, 2026

Development

Successfully merging this pull request may close these issues.

geotiff: bytes(bytearray) at end of _assemble_*_layout doubles peak memory
