Skip to content

read_vrt: SrcRect/DstRect scaling not applied to source array placement #1694

@brendancol

Description

@brendancol

read_vrt does not resample source pixel data when a source band's <SrcRect> size differs from its <DstRect> size. Any VRT that uses GDAL's source-to-destination scaling either raises ValueError (on downsample) or returns spatially wrong pixels with the rest of the destination cell left at the fill value (on upsample).

Reproducer:

import numpy as np, tempfile, os
from xrspatial.geotiff._writer import write
from xrspatial.geotiff._vrt import read_vrt

with tempfile.TemporaryDirectory() as tmp:
    arr = np.arange(16, dtype=np.uint16).reshape(4, 4)
    src_path = os.path.join(tmp, 'src.tif')
    write(arr, src_path, compression='none', tiled=False)

    vrt_path = os.path.join(tmp, 'down.vrt')
    with open(vrt_path, 'w') as f:
        f.write(f'''<VRTDataset rasterXSize="2" rasterYSize="2">
  <GeoTransform>0.0, 2.0, 0.0, 0.0, 0.0, -2.0</GeoTransform>
  <VRTRasterBand dataType="UInt16" band="1">
    <SimpleSource>
      <SourceFilename relativeToVRT="0">{src_path}</SourceFilename>
      <SourceBand>1</SourceBand>
      <SrcRect xOff="0" yOff="0" xSize="4" ySize="4"/>
      <DstRect xOff="0" yOff="0" xSize="2" ySize="2"/>
    </SimpleSource>
  </VRTRasterBand>
</VRTDataset>''')

    read_vrt(vrt_path)
    # ValueError: could not broadcast input array from shape (4,4) into shape (2,2)

The upsample variant (SrcRect 2x2 -> DstRect 4x4) does not raise, but only the top-left 2x2 of each destination cell gets written; the remaining 12 cells stay at the fill value.

Root cause: xrspatial/geotiff/_vrt.py:464-472 computes a source-coordinate window from scale_x = sr.x_size / dr.x_size and scale_y = sr.y_size / dr.y_size, calls read_to_array(window=(src_r0, src_c0, src_r1, src_c1)), which returns an array sized to the source window, and copies it into the destination buffer at L577-583 with no resampling. The "size mismatch" guard at L573-575 compares src_arr.shape against out_r1 - out_r0 where out_r1 = out_r0 + src_arr.shape[0], so the guard never trips.

Fix plan: when sr.x_size != dr.x_size or sr.y_size != dr.y_size, resample the decoded src_arr from (sr.y_size, sr.x_size) to (dr.y_size, dr.x_size) before clipping and placing into result. Nearest-neighbour matches GDAL SimpleSource semantics. Carry the nodata mask through the resample. Clip after the resample so the existing window logic stays intact. No new dependencies; integer-ratio paths can use np.repeat or strided slicing, non-integer ratios use a per-axis nearest-index gather.

Regression tests will cover downsample 4x4 -> 2x2, upsample 2x2 -> 4x4, non-integer 3x3 -> 2x2, a per-band scale mix, window= on a downsampled source, and NoData preservation across the resample.

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions