Widen read_vrt buffer to fit all selected band dtypes #1701

Merged
brendancol merged 2 commits into main from fix-vrt-multiband-dtype-2026-05-12
May 12, 2026

Conversation

@brendancol
Contributor

Summary

Fixes #1696.

read_vrt allocated the output buffer from selected_bands[0].dtype. Wider bands later in the list (declared Float32 or Int16, or scaled by a ComplexSource) were cast back to that first dtype on placement. For example, a Byte band 0 followed by a Byte band 1 with <ScaleRatio>0.5</ScaleRatio> returned uint8 with the fractional values truncated.
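The truncation is ordinary NumPy assignment behaviour rather than anything specific to VRT parsing. A minimal sketch with made-up pixel values (the array names are illustrative, not taken from `_vrt.py`):

```python
import numpy as np

# Destination buffer allocated from the first band's dtype (uint8),
# mirroring the pre-fix allocation described above.
result = np.zeros((2, 2, 2), dtype=np.uint8)

# Band 1 is Byte data scaled by 0.5, so it is float64 after scaling.
src_arr = np.array([[1, 3], [5, 7]], dtype=np.uint8).astype(np.float64) * 0.5

# Item assignment casts to the buffer dtype without warning,
# so the fractional part of every value is dropped.
result[..., 1] = src_arr
truncated = result[..., 1]
```

Here `src_arr` holds 0.5, 1.5, 2.5 and 3.5, while `truncated` holds 0, 1, 2 and 3.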

Fix

Pick the effective per-band dtype (the declared dtype, or float64 when any source has a scale or offset), take np.result_type across all selected bands, and allocate the result buffer with that dtype. The single-band branch uses the same logic.

The ComplexSource block at _vrt.py L562-565 already promotes src_arr to float64; the new code uses that same rule when picking the destination dtype.
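The rule can be sketched outside the library as follows. `Source` and `Band` are hypothetical stand-ins for the parsed VRT objects (only the `dtype`, `sources`, `scale` and `offset` attributes are assumed), and the helper names are illustrative:

```python
from dataclasses import dataclass, field
from typing import List, Optional

import numpy as np


@dataclass
class Source:  # hypothetical stand-in for a parsed VRT source
    scale: Optional[float] = None
    offset: Optional[float] = None


@dataclass
class Band:  # hypothetical stand-in for a parsed VRT band
    dtype: np.dtype
    sources: List[Source] = field(default_factory=list)


def effective_dtype(band: Band) -> np.dtype:
    """Declared dtype, or float64 if any source scales or offsets."""
    for src in band.sources:
        scaled = src.scale is not None and src.scale != 1.0
        shifted = src.offset is not None and src.offset != 0.0
        if scaled or shifted:
            return np.dtype(np.float64)
    return np.dtype(band.dtype)


def common_dtype(bands: List[Band]) -> np.dtype:
    """Common dtype across the effective dtypes of all bands."""
    return np.result_type(*[effective_dtype(b) for b in bands])


byte = np.dtype(np.uint8)
scaled_pair = [Band(byte), Band(byte, [Source(scale=0.5)])]
plain_pair = [Band(byte), Band(byte)]
```

Under these assumptions `common_dtype(scaled_pair)` is float64, while `common_dtype(plain_pair)` stays uint8.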

Test plan

  • New: xrspatial/geotiff/tests/test_vrt_multiband_dtype_1696.py (9 tests).
  • Mixed Byte + Float32: band 1 fractional values survive.
  • ComplexSource ScaleRatio=0.5 on Byte: fractional scaled values survive.
  • All-Byte, no scaling: stays uint8.
  • ScaleRatio + ScaleOffset combination: precision preserved.
  • NoData round-trip through widened dtype.
  • Single-band scaled VRT also widens.
  • band=N selection returns the per-band declared dtype.
  • All-Float32 multi-band stays float32 (no over-widening).
  • Full VRT suite: 89 prior tests still pass (98 total with the new file).
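As a quick cross-check of the dtypes these cases expect, the snippet below evaluates `np.result_type` directly on the declared dtypes. It is a sketch of the promotion rule only and does not exercise `read_vrt` or the test file itself; the dictionary keys are informal labels:

```python
import numpy as np

# Common dtype for a few band combinations from the test plan.
# float64 stands in for any band promoted by scale/offset.
outcomes = {
    "all Byte": np.result_type(np.uint8, np.uint8),
    "all Float32": np.result_type(np.float32, np.float32),
    "Byte + Float32": np.result_type(np.uint8, np.float32),
    "Byte + Int16": np.result_type(np.uint8, np.int16),
    "Byte + scaled Byte": np.result_type(np.uint8, np.float64),
}

for label, dtype in outcomes.items():
    print(f"{label}: {dtype}")
```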

@github-actions github-actions Bot added the performance PR touches performance-sensitive code label May 12, 2026
@brendancol brendancol requested a review from Copilot May 12, 2026 18:12
Contributor

Copilot AI left a comment

Pull request overview

Fixes read_vrt multi-band dtype truncation by allocating the destination buffer using a common dtype wide enough for all selected bands (and for ComplexSource scale/offset promotion), preventing silent down-casts when placing per-band source arrays into the result.

Changes:

  • Compute an “effective” dtype per selected band (promote to float64 when any source uses ScaleRatio/ScaleOffset) and allocate the output buffer using np.result_type across bands.
  • Apply the same dtype-selection logic to the single-band read_vrt path.
  • Add a regression test suite covering mixed dtypes, ComplexSource scaling/offset, nodata interactions, single-band widening, and band selection.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| xrspatial/geotiff/_vrt.py | Widens output buffer dtype based on all selected bands and scaling/offset rules to avoid truncation on assignment. |
| xrspatial/geotiff/tests/test_vrt_multiband_dtype_1696.py | Adds regression tests for mixed-band dtype widening and ComplexSource scaling/offset precision preservation. |

Comment thread xrspatial/geotiff/_vrt.py Outdated
Comment on lines +441 to +446
# values to ``float64`` before placement (see L562-565); the destination
# has to be float-typed too, otherwise the fractional part is lost.
#
# ``np.result_type`` produces the narrowest dtype that holds every
# contributing dtype, so an all-integer VRT stays integer and only mixes
# widen to float64. See issue #1696.
Comment on lines +11 to +14
worse. The decoded source is explicitly promoted to ``float64`` at
``_vrt.py`` L562-565 (``src_arr.astype(np.float64) * src.scale``), but
the destination buffer stays uint8 if all VRT bands declare ``Byte``,
so the post-scale fractional values are lost on assignment.
Comment thread xrspatial/geotiff/_vrt.py
Comment on lines +447 to +458
effective_dtypes = []
for vrt_band in selected_bands:
    eff = vrt_band.dtype
    for src in vrt_band.sources:
        scaled = src.scale is not None and src.scale != 1.0
        offset = src.offset is not None and src.offset != 0.0
        if scaled or offset:
            eff = np.dtype(np.float64)
            break
    effective_dtypes.append(eff)
dtype = np.result_type(*effective_dtypes)
fill = np.nan if dtype.kind in ('f', 'c') else 0
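The last line of the snippet chooses the buffer's initial fill from the dtype kind: NaN for float and complex buffers, 0 for everything else. A small sketch of how such a fill might be used to allocate the buffer (the `allocate` helper and the use of `np.full` are assumptions for illustration, not the code in `_vrt.py`):

```python
import numpy as np


def allocate(shape, dtype):
    """Allocate a buffer pre-filled with NaN (float/complex) or 0."""
    dtype = np.dtype(dtype)
    # Integer dtypes cannot represent NaN, so they fall back to 0.
    fill = np.nan if dtype.kind in ('f', 'c') else 0
    return np.full(shape, fill, dtype=dtype)


int_buf = allocate((2, 2), np.uint8)
float_buf = allocate((2, 2), np.float32)
```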
read_vrt allocated the multi-band output buffer from selected_bands[0].dtype
only. Each band's source array was then assigned with
result[..., band_idx] = src_arr[...], which silently casts the source to
the narrow buffer dtype. A Float32 band 1 after a Byte band 0 returned
uint8 with the float values truncated; a Byte band with
<ScaleRatio>0.5</ScaleRatio> returned uint8 with the post-scale
fractional part lost.

Compute the effective per-band dtype (declared dtype, or float64 when
any source has scale or offset, matching the existing promotion at
_vrt.py L562-565) and take np.result_type across all selected bands
before allocating the buffer. The single-band branch follows the same
logic so a single-band scaled VRT also widens. All-integer VRTs without
scaling stay integer, so memory is not blown up for the common case.

Fixes #1696
@brendancol brendancol force-pushed the fix-vrt-multiband-dtype-2026-05-12 branch from a97e5ab to ac2b1fc Compare May 12, 2026 18:28
…ty VRT

Three issues raised in review:

* `_vrt.py` allocation comment cited specific line numbers ("see L562-565")
  for the ComplexSource scaling block. Line numbers drift; replace with a
  named reference to the `# Apply ComplexSource scaling` block.
* The same comment claimed mixes "widen to float64". `np.result_type` may
  also produce `float32` (e.g. NumPy 2.x on `uint8 + float32`) or
  `complex128` when complex bands are present. Reword to describe the
  common-dtype rule and list the typical outcomes.
* `test_vrt_multiband_dtype_1696.py` module docstring and one test
  docstring cited `_vrt.py` L327-334 / L562-565. Replace with named
  references (`parse_vrt` ComplexSource branch, `# Apply ComplexSource
  scaling` block) that survive future edits.
* A VRT with zero `<VRTRasterBand>` elements made `np.result_type(*[])`
  raise the generic "at least one array or dtype is required" error. Add
  an explicit `if not selected_bands` guard that raises a clear
  ValueError, plus a regression test asserting the new message.
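A guard of the kind described in the last bullet can be sketched as below. The function name and the error text are hypothetical; the real guard checks `selected_bands` and its message may be worded differently:

```python
import numpy as np


def common_dtype(effective_dtypes):
    """Return the common dtype, failing clearly on an empty list."""
    if not effective_dtypes:
        # Hypothetical message; raised before np.result_type sees
        # an empty argument list.
        raise ValueError("VRT contains no raster bands to read")
    return np.result_type(*effective_dtypes)


try:
    common_dtype([])
except ValueError as exc:
    message = str(exc)
```

Checking the list before calling `np.result_type` keeps the error tied to the VRT content rather than to a NumPy argument check.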
@brendancol brendancol merged commit 1624d13 into main May 12, 2026
10 checks passed
@brendancol brendancol deleted the fix-vrt-multiband-dtype-2026-05-12 branch May 15, 2026 04:41

Labels

performance PR touches performance-sensitive code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

read_vrt: multi-band result allocated as band[0] dtype silently truncates wider bands

2 participants