geotiff: raise on unknown VRT dataType; support UInt64/Int64 (#1783)#1789
Merged
Conversation
Unknown ``<VRTRasterBand dataType="...">`` values used to fall back to Float32 via ``_DTYPE_MAP.get(name, np.float32)``. Complex types (CInt16, CInt32, CFloat32, CFloat64) silently lost the imaginary component, 64-bit integer types lost precision, and typos produced wrong values. * Add ``UInt64`` -> ``np.uint64`` and ``Int64`` -> ``np.int64`` to ``_DTYPE_MAP`` (GDAL 3.5+ names, lossless round-trip). * In ``parse_vrt``, distinguish a missing ``dataType`` attribute (still defaults to ``Float32`` per the GDAL spec) from one that is present but unsupported. The latter raises ``ValueError`` naming the band, the offending dataType, and the supported types, and notes that complex types are explicitly unsupported.
Contributor
There was a problem hiding this comment.
Pull request overview
This PR hardens VRT dtype handling in the GeoTIFF VRT reader to prevent silent data corruption when a VRT declares an unsupported or misspelled VRTRasterBand dataType, and extends dtype support to 64-bit integers.
Changes:
- Add
UInt64andInt64to the VRTdataType→ NumPy dtype mapping. - In
parse_vrt, keep the GDAL-default behavior for missingdataType, but raiseValueErrorwhendataTypeis present and unknown/unsupported (including complex types). - Add regression tests covering: complex dtype rejection, garbage/typo rejection,
UInt64/Int64round-trips, and the missing-attribute default.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
xrspatial/geotiff/_vrt.py |
Adds 64-bit integer dtype support and changes dataType resolution to raise on unknown/unsupported values. |
xrspatial/geotiff/tests/test_vrt_dtype_1783.py |
New regression tests validating the new dtype resolution behavior and 64-bit integer support. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Comment on lines
310
to
312
| dtype = np.dtype(_DTYPE_MAP[dtype_name]) | ||
| nodata_str = _text(band_elem, 'NoDataValue') | ||
| nodata = float(nodata_str) if nodata_str else None |
| return _xml_quoteattr(str(value)) | ||
|
|
||
| # Lazy imports to avoid circular dependency | ||
| # Lazy imports to avoid circular dependency. |
Addresses Copilot review feedback on PR #1789. Now that UInt64 / Int64 are supported VRT dataTypes, parsing <NoDataValue> as float silently loses precision for large integer sentinels. 2**64-1 rounds up to 2**64 in float64, which then fails the iinfo range check in the downstream masking pipeline and the mask never fires; the sentinel survives in the output as a 1.8e19 float that looks like real data. * New helper _parse_band_nodata(text, dtype) in _vrt.py parses integer-band sentinels as int and falls back to float only when the value is non-integer or out of range for the dtype. Used for both <NoDataValue> at the band level and <NODATA> on each source. * _Source / _VRTBand .nodata typed as Union[int, float, None]. * _sentinel_for_dtype in __init__.py grows a fast int path so an int sentinel does not go through float() and lose precision at the 64-bit extremes. * Integer-source-to-float-band masking branch in _vrt.read_vrt accepts int sentinels directly. * Removed the stale "Lazy imports to avoid circular dependency" comment above _DTYPE_MAP (it is a dtype map, not imports). * Tests in test_vrt_dtype_1783.py cover UInt64 max, INT64_MIN, Int32 -9999, Float32 nan, Float64 scientific notation, and an end-to-end masking test that puts 2**64-1 in the data and verifies the result is NaN.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes #1783.
Summary
xrspatial/geotiff/_vrt.pyresolved<VRTRasterBand dataType="...">with_DTYPE_MAP.get(name, np.float32). Any unknown name fell back to Float32 with no warning. Complex types (CInt16,CInt32,CFloat32,CFloat64) silently dropped the imaginary component, 64-bit integer types lost precision, and typos produced wrong values.UInt64->np.uint64andInt64->np.int64to_DTYPE_MAP(GDAL 3.5+ names, round-trip cleanly through numpy).parse_vrt: a missingdataTypeattribute still defaults to Float32 (GDAL's documented default, no behaviour change). An attribute that is present but not in_DTYPE_MAPnow raisesValueErrornaming the band, the offending dataType, and the supported set. The message also flags complex types as unsupported.Test plan
pytest -xvs xrspatial/geotiff/tests/test_vrt_dtype_1783.py(11 tests pass)pytest xrspatial/geotiff/tests/ -k vrt(261 pass, the only deselected failure on this worktree istest_size_param_validation_gpu_vrt_1776::test_tile_size_positive_workswhich is pre-existing and unrelated -- broken by the tile_size-multiple-of-16 requirement merged in geotiff: require tile_size to be a multiple of 16 (#1767) #1781)GarbageandFlaot32(typo) raiseUInt64round-trips2**64 - 1losslessly;Int64round-tripsiinfo(int64).min/max<VRTRasterBand>with nodataTypeattribute still reads as Float32