Summary
`xrspatial.geotiff._vrt.read_vrt` silently treats `<NODATA>0</NODATA>` on a SimpleSource as if the element were absent. Pixels equal to 0.0 in the source file are returned as valid data instead of being masked to NaN. The root cause is the `or` truthiness fallback at line 370 of `_vrt.py`:
src_nodata = src.nodata or nodata
Python evaluates `0.0 or nodata` to `nodata` because `0.0` is falsy, so the sentinel of a SimpleSource that declares `<NODATA>0</NODATA>` is replaced by the band-level `<NoDataValue>` (or `None` if there isn't one).
The in-code comment acknowledges the quirk:
# ``src.nodata or nodata`` is kept for backward compatibility but
# intentionally treats ``0.0`` as unset (a long-standing quirk of this reader).
But the resulting behavior is silently wrong for any VRT whose sources use 0.0 as their sentinel (a common convention for unsigned imagery where 0 marks "no data").
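The difference between the two fallbacks can be seen without touching the reader at all. The sketch below is illustrative only: `resolve_with_or` and `resolve_with_none_check` are hypothetical helper names, not functions in `_vrt.py`, and the sentinel values are made up for the comparison.

```python
def resolve_with_or(src_nodata, band_nodata):
    # Mirrors the current expression: any falsy value, including 0.0,
    # falls through to the band-level sentinel.
    return src_nodata or band_nodata


def resolve_with_none_check(src_nodata, band_nodata):
    # Only a missing value (None) falls through to the band-level sentinel.
    return src_nodata if src_nodata is not None else band_nodata


cases = [
    (0.0, None),       # source declares 0, band declares nothing
    (0.0, -9999.0),    # source declares 0, band declares another sentinel
    (None, -9999.0),   # source declares nothing, band-level value applies
    (255.0, -9999.0),  # non-zero source sentinel is kept either way
]
for src_nodata, band_nodata in cases:
    print(src_nodata, band_nodata,
          resolve_with_or(src_nodata, band_nodata),
          resolve_with_none_check(src_nodata, band_nodata))
```

Only the first two cases differ between the helpers, which matches the conditions under which the bug shows up.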
Reproduction
import numpy as np
import tempfile, os
from xrspatial.geotiff._writer import write
from xrspatial.geotiff._geotags import GeoTransform
from xrspatial.geotiff._vrt import read_vrt
tmp = tempfile.mkdtemp()
arr = np.array([[1.0, 0.0, 3.0, 0.0]], dtype=np.float32)
src = os.path.join(tmp, "src.tif")
write(arr, src, geo_transform=GeoTransform(0, 0, 1, -1), crs_epsg=4326,
      compression='none', tiled=False)
vrt_xml = f'''<VRTDataset rasterXSize="4" rasterYSize="1">
<SRS>EPSG:4326</SRS>
<GeoTransform>0.0, 1.0, 0.0, 0.0, 0.0, -1.0</GeoTransform>
<VRTRasterBand dataType="Float32" band="1">
<SimpleSource>
<SourceFilename relativeToVRT="0">{src}</SourceFilename>
<SourceBand>1</SourceBand>
<SrcRect xOff="0" yOff="0" xSize="4" ySize="1"/>
<DstRect xOff="0" yOff="0" xSize="4" ySize="1"/>
<NODATA>0.0</NODATA>
</SimpleSource>
</VRTRasterBand>
</VRTDataset>
'''
vrt_path = os.path.join(tmp, "test.vrt")
with open(vrt_path, 'w') as f:
    f.write(vrt_xml)
result, _ = read_vrt(vrt_path)
# Expected: [[1.0, nan, 3.0, nan]]
# Actual: [[1.0, 0.0, 3.0, 0.0]]
print(result)
print(f"NaN count: {np.isnan(result).sum()}, expected: 2")
Severity
Medium. Silent data corruption occurs only when (a) `<NODATA>` on the SimpleSource is 0 (or any value that casts to 0) and (b) the band either has no `<NoDataValue>` of its own or declares a band-level sentinel other than 0. Datasets that declare 0 at both levels fall back to the band-level value and look correct, which hides the bug from most existing tests.
Suggested fix
Replace the `or` truthiness shortcut with an explicit `None` check so legitimate `0.0` sentinels survive:
src_nodata = src.nodata if src.nodata is not None else nodata
Apply the same change in the integer branch a few lines down. Add a regression test that exercises the `<NODATA>0</NODATA>` case.
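A rough shape for that regression check, written as a stand-in that does not import xrspatial: `resolve_nodata` and `mask_row` are hypothetical helpers that mimic the fixed fallback and the NaN masking, and parsing the `<NODATA>` text with `float()` is an assumption about how the reader handles it. The real test would instead build a VRT like the one in the reproduction and call `read_vrt` on it.

```python
import math


def resolve_nodata(src_nodata, band_nodata):
    """Hypothetical stand-in for the fixed fallback: only None means unset."""
    return src_nodata if src_nodata is not None else band_nodata


def mask_row(row, nodata):
    """Hypothetical stand-in for the masking step: sentinel pixels become NaN."""
    if nodata is None:
        return list(row)
    return [math.nan if value == nodata else value for value in row]


row = [1.0, 0.0, 3.0, 0.0]

# Cover several spellings of zero and both band-level situations.
for text in ("0", "0.0", "-0.0", "0e0"):
    src_nodata = float(text)  # assumption: the element text is parsed as a float
    for band_nodata in (None, -9999.0):
        masked = mask_row(row, resolve_nodata(src_nodata, band_nodata))
        nan_count = sum(math.isnan(value) for value in masked)
        assert nan_count == 2, (text, band_nodata, masked)
```

Covering both a missing band-level value and a non-zero one matters here, because a test that declares 0 at both levels would pass with or without the fix.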