Widen read_vrt buffer to fit all selected band dtypes#1701
Merged
Conversation
Contributor
There was a problem hiding this comment.
Pull request overview
Fixes read_vrt multi-band dtype truncation by allocating the destination buffer using a common dtype wide enough for all selected bands (and for ComplexSource scale/offset promotion), preventing silent down-casts when placing per-band source arrays into the result.
Changes:
- Compute an “effective” dtype per selected band (promote to
float64when any source usesScaleRatio/ScaleOffset) and allocate the output buffer usingnp.result_typeacross bands. - Apply the same dtype-selection logic to the single-band
read_vrtpath. - Add a regression test suite covering mixed dtypes,
ComplexSourcescaling/offset, nodata interactions, single-band widening, and band selection.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
xrspatial/geotiff/_vrt.py |
Widens output buffer dtype based on all selected bands and scaling/offset rules to avoid truncation on assignment. |
xrspatial/geotiff/tests/test_vrt_multiband_dtype_1696.py |
Adds regression tests for mixed-band dtype widening and ComplexSource scaling/offset precision preservation. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Comment on lines
+441
to
+446
| # values to ``float64`` before placement (see L562-565); the destination | ||
| # has to be float-typed too, otherwise the fractional part is lost. | ||
| # | ||
| # ``np.result_type`` produces the narrowest dtype that holds every | ||
| # contributing dtype, so an all-integer VRT stays integer and only mixes | ||
| # widen to float64. See issue #1696. |
Comment on lines
+11
to
+14
| worse. The decoded source is explicitly promoted to ``float64`` at | ||
| ``_vrt.py`` L562-565 (``src_arr.astype(np.float64) * src.scale``), but | ||
| the destination buffer stays uint8 if all VRT bands declare ``Byte``, | ||
| so the post-scale fractional values are lost on assignment. |
Comment on lines
+447
to
+458
| effective_dtypes = [] | ||
| for vrt_band in selected_bands: | ||
| eff = vrt_band.dtype | ||
| for src in vrt_band.sources: | ||
| scaled = src.scale is not None and src.scale != 1.0 | ||
| offset = src.offset is not None and src.offset != 0.0 | ||
| if scaled or offset: | ||
| eff = np.dtype(np.float64) | ||
| break | ||
| effective_dtypes.append(eff) | ||
| dtype = np.result_type(*effective_dtypes) | ||
| fill = np.nan if dtype.kind in ('f', 'c') else 0 |
read_vrt allocated the multi-band output buffer from selected_bands[0].dtype only. Each band's source array was then assigned with result[..., band_idx] = src_arr[...], which silently casts the source to the narrow buffer dtype. A Float32 band 1 after a Byte band 0 returned uint8 with the float values truncated; a Byte band with <ScaleRatio>0.5</ScaleRatio> returned uint8 with the post-scale fractional part lost. Compute the effective per-band dtype (declared dtype, or float64 when any source has scale or offset, matching the existing promotion at _vrt.py L562-565) and take np.result_type across all selected bands before allocating the buffer. The single-band branch follows the same logic so a single-band scaled VRT also widens. All-integer VRTs without scaling stay integer, so memory is not blown up for the common case. Fixes #1696
a97e5ab to
ac2b1fc
Compare
…ty VRT
Three issues raised in review:
* `_vrt.py` allocation comment cited specific line numbers ("see L562-565")
for the ComplexSource scaling block. Line numbers drift; replace with a
named reference to the `# Apply ComplexSource scaling` block.
* The same comment claimed mixes "widen to float64". `np.result_type` may
also produce `float32` (e.g. NumPy 2.x on `uint8 + float32`) or
`complex128` when complex bands are present. Reword to describe the
common-dtype rule and list the typical outcomes.
* `test_vrt_multiband_dtype_1696.py` module docstring and one test
docstring cited `_vrt.py` L327-334 / L562-565. Replace with named
references (`parse_vrt` ComplexSource branch, `# Apply ComplexSource
scaling` block) that survive future edits.
* A VRT with zero `<VRTRasterBand>` elements made `np.result_type(*[])`
raise the generic "at least one array or dtype is required" error. Add
an explicit `if not selected_bands` guard that raises a clear
ValueError, plus a regression test asserting the new message.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fixes #1696.
read_vrtallocated the output buffer fromselected_bands[0].dtype. Wider bands later in the list (declaredFloat32,Int16, scaled byComplexSource) were cast back to that first dtype on placement. AByteband 0 followed by aByteband 1 with<ScaleRatio>0.5</ScaleRatio>returned uint8 with the fractional values truncated.Fix
Pick the effective per-band dtype (the declared dtype, or
float64when any source hasscaleoroffset), takenp.result_typeacross all selected bands, allocateresultwith that. The single-band branch uses the same logic.The
ComplexSourceblock at_vrt.pyL562-565 already promotessrc_arrtofloat64; the new code uses that same rule when picking the destination dtype.Test plan
xrspatial/geotiff/tests/test_vrt_multiband_dtype_1696.py(9 tests).Byte+Float32: band 1 fractional values survive.ComplexSourceScaleRatio=0.5onByte: fractional scaled values survive.Byte, no scaling: stays uint8.ScaleRatio+ScaleOffsetcombination: precision preserved.band=Nselection returns the per-band declared dtype.Float32multi-band stays float32 (no over-widening).