geotiff: float16 GeoTIFFs cannot be read despite writer auto-promotion #1941

@brendancol

Summary

The writer auto-promotes float16 input to float32 (in _writers/eager.py and _writer.py), but the read-side dtype map, tiff_dtype_to_numpy in _dtypes.py, has no entry for (16, SAMPLE_FORMAT_FLOAT). As a result, any externally produced GeoTIFF with BitsPerSample=16 and SampleFormat=3 (IEEE half-precision float) is rejected with:

ValueError: Unsupported BitsPerSample=16, SampleFormat=3

xrspatial's own files never hit this path because the writer never emits 16-bit floats, so the gap only shows up with files produced by other tools. This breaks read parity with rasterio / GDAL, which decode float16 to numpy float16 or upcast to float32.

Fix

Map (16, SAMPLE_FORMAT_FLOAT) to np.float32 on read. Promoting to float32 (rather than numpy float16) matches the writer's auto-promotion behaviour and avoids downstream numba issues; xrspatial's CPU JIT kernels do not have float16 numba paths, and dask + numba paths would also need new dispatch.
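
A minimal sketch of what the extra map entry could look like. The real table in _dtypes.py is not shown in this issue, so the dictionary layout, the other entries, and the name tiff_dtype_to_numpy_sketch are assumptions made for illustration only:

```python
import numpy as np

# TIFF SampleFormat tag values (from the TIFF 6.0 specification).
SAMPLE_FORMAT_UINT = 1
SAMPLE_FORMAT_INT = 2
SAMPLE_FORMAT_FLOAT = 3

# Hypothetical lookup table; the actual structure in _dtypes.py may differ.
_DTYPE_MAP = {
    (8, SAMPLE_FORMAT_UINT): np.uint8,
    (16, SAMPLE_FORMAT_UINT): np.uint16,
    (16, SAMPLE_FORMAT_INT): np.int16,
    (16, SAMPLE_FORMAT_FLOAT): np.float32,  # new: half floats are promoted on read
    (32, SAMPLE_FORMAT_FLOAT): np.float32,
    (64, SAMPLE_FORMAT_FLOAT): np.float64,
}


def tiff_dtype_to_numpy_sketch(bps, sample_format):
    """Return the numpy dtype handed to the caller for a TIFF sample type."""
    try:
        return np.dtype(_DTYPE_MAP[(bps, sample_format)])
    except KeyError:
        raise ValueError(
            f"Unsupported BitsPerSample={bps}, SampleFormat={sample_format}"
        ) from None
```

Note that the returned dtype (float32) is deliberately not the on-disk dtype (float16), which is why the decode path has to convert before the rest of the pipeline runs.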

The decode path needs to:

  1. Read the raw 2-byte half-float values from the file, honoring the TIFF byte order (little-endian in the common case).
  2. Convert to float32 once before the regular pipeline runs.

A clean implementation: in the decode site, detect the float16 case via bps == 16 and sample_format == 3, view the raw bytes as numpy.float16, then .astype(np.float32). The rest of the read path already expects the chunk dtype to match the returned numpy dtype.
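
The detect, view, and promote steps described above can be sketched with NumPy alone. The function name decode_half_float and its byteorder parameter are hypothetical and not part of xrspatial; the real decode site will have its own signature:

```python
import numpy as np

SAMPLE_FORMAT_FLOAT = 3  # TIFF SampleFormat value for IEEE floating point


def decode_half_float(raw, bps, sample_format, byteorder="<"):
    """Reinterpret raw chunk bytes as float16 and promote them to float32.

    byteorder is "<" for little-endian TIFFs or ">" for big-endian ones.
    """
    if bps != 16 or sample_format != SAMPLE_FORMAT_FLOAT:
        raise ValueError("not a float16 chunk")
    # View the bytes as half floats without copying, then upcast once.
    half = np.frombuffer(raw, dtype=np.dtype(byteorder + "f2"))
    return half.astype(np.float32)


# Usage: three values that are exactly representable in float16.
raw = np.array([0.5, -1.25, 1024.0], dtype="<f2").tobytes()
decoded = decode_half_float(raw, bps=16, sample_format=3)
```

Because every float16 value is exactly representable in float32, the upcast loses no information; astype also returns a native-byte-order array regardless of the file's byte order.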

Regression test

Round-trip a small float16 array written via tifffile and assert that open_geotiff returns float32 with the same numerical values (within float16 precision). Also assert that tiff_dtype_to_numpy(16, 3) returns np.float32; the existing test that asserts it raises should be updated to match.
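
The value checks for such a test might look like the sketch below. It covers only the comparison logic: the file round trip through tifffile and open_geotiff is replaced by an in-memory byte round trip, and the sample values are made up for illustration:

```python
import numpy as np

# Source data and the half-precision version that would be written to disk.
source = np.array([[0.1, 1.1], [-3.3, 1000.7]], dtype=np.float32)
half = source.astype(np.float16)

# Stand-in for "write with tifffile, read with open_geotiff":
# serialize as little-endian float16 bytes, then decode and promote.
raw = half.astype("<f2").tobytes()
result = np.frombuffer(raw, dtype="<f2").astype(np.float32).reshape(half.shape)

# The reader should hand back float32, never float16.
assert result.dtype == np.float32

# float16 -> float32 is exact, so the decoded values match the stored ones.
np.testing.assert_array_equal(result, half.astype(np.float32))

# Against the original float32 data, agreement is only to float16 precision.
np.testing.assert_allclose(result, source, rtol=np.finfo(np.float16).eps)
```

The exact comparison against the stored float16 values is the stricter check; the tolerance-based comparison is only needed when the reference data started out at higher precision.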
