Skip to content

read_vrt: multi-band result allocated as band[0] dtype silently truncates wider bands #1696

@brendancol

Description

@brendancol

Summary

read_vrt on a multi-band VRT allocates the output buffer from band 0's declared dataType. Each band's decoded source is then placed into that buffer with result[..., band_idx] = src_arr[...], which casts the source to the buffer dtype. A Float32 band 1 after a Byte band 0 gets truncated to uint8. No warning.

Reproduction

import numpy as np
from xrspatial.geotiff import read_vrt
from xrspatial.geotiff._writer import write

b0 = np.array([[10, 11], [12, 13]], dtype=np.uint8)
b1 = np.array([[10, 11], [12, 13]], dtype=np.uint8)
write(b0, "b0.tif", compression='none', tiled=False)
write(b1, "b1.tif", compression='none', tiled=False)

vrt_xml = '''<VRTDataset rasterXSize="2" rasterYSize="2">
  <GeoTransform>0.0, 1.0, 0.0, 0.0, 0.0, -1.0</GeoTransform>
  <VRTRasterBand dataType="Byte" band="1">
    <SimpleSource>
      <SourceFilename relativeToVRT="0">b0.tif</SourceFilename>
      <SourceBand>1</SourceBand>
      <SrcRect xOff="0" yOff="0" xSize="2" ySize="2"/>
      <DstRect xOff="0" yOff="0" xSize="2" ySize="2"/>
    </SimpleSource>
  </VRTRasterBand>
  <VRTRasterBand dataType="Byte" band="2">
    <ComplexSource>
      <SourceFilename relativeToVRT="0">b1.tif</SourceFilename>
      <SourceBand>1</SourceBand>
      <ScaleRatio>0.5</ScaleRatio>
      <SrcRect xOff="0" yOff="0" xSize="2" ySize="2"/>
      <DstRect xOff="0" yOff="0" xSize="2" ySize="2"/>
    </ComplexSource>
  </VRTRasterBand>
</VRTDataset>'''
# ... write vrt_xml and call read_vrt

r = read_vrt("test.vrt")
# Expected band 1 = [[5.0, 5.5], [6.0, 6.5]] (uint8 * 0.5)
# Got      band 1 = [[5, 5], [6, 6]]   (truncated)

A Byte + Float32 VRT hits the same bug. Band 1's Float32 values 1.5, 2.5, 3.5, 4.5 come back as 1, 2, 3, 4.

Root cause

xrspatial/geotiff/_vrt.py L431-439:

dtype = selected_bands[0].dtype
result = np.full((out_h, out_w, len(selected_bands)),
                 np.nan if dtype.kind == 'f' else 0, dtype=dtype)

L581-583 then assigns each band into the buffer. The ComplexSource block at L562-565 already promotes src_arr to float64, but the assignment casts that back to the narrow buffer dtype, so a uint8 buffer with a 0.5x scale ratio produces values like 11 * 0.5 = 5.5 that round to 5 on storage.

Fix plan

Pick the buffer dtype from all selected bands, not just band 0:

  1. For each selected band, the effective dtype is vrt_band.dtype, or np.float64 when any source on that band has a non-trivial scale or offset (matches the existing promotion at L562-565).
  2. result_dtype = np.result_type(*effective_dtypes).
  3. Allocate result with this dtype.
  4. The single-band branch (L432-435) takes the same path so a single-band scaled VRT also widens.
  5. Fill stays np.nan for float kinds, 0 otherwise.

np.result_type covers the corner cases: mixed integer widths keep an integer dtype, integer + float widens to a float dtype, complex stays complex.

Tests

Regression coverage in xrspatial/geotiff/tests/test_vrt_multiband_dtype_1696.py:

  • Mixed Byte + Float32 bands: fractional float32 values survive.
  • ComplexSource ScaleRatio=0.5 on a Byte band: fractional scaled values survive.
  • All-Byte bands without scaling: stays uint8 (no needless widening).
  • Per-band ScaleRatio + ScaleOffset: precision preserved.
  • NoData round-trip through widened dtype.
  • Single-band scaled VRT also widens.

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions