Describe the bug
The CPU decode path for TIFF predictor=3 (floating-point predictor) mis-handles multi-sample chunky data. Reading an externally-written GeoTIFF with Predictor=3, PlanarConfiguration=1 (chunky), and SamplesPerPixel > 1 via open_geotiff() returns garbage pixel values.
The writer is unaffected because it never emits predictor=3, only predictor=2.
Root cause
_reader.py:290-291 calls:
chunk = _apply_predictor(chunk, pred, width, height, bytes_per_sample * samples)
which routes to:
fp_predictor_decode(chunk, width, height, bytes_per_sample * samples)
Because the last argument arrives as bps inside _fp_predictor_decode_row(row_data, width, bps), the row is treated as width super-samples, each bytes_per_sample * samples bytes wide, and is de-interleaved into bytes_per_sample * samples byte lanes of length width.
The TIFF Technical Note 3 spec (used by libtiff and GDAL) says the row should be de-interleaved into bytes_per_sample lanes of length width * samples. The GPU path at _gpu_decode.py:1350-1351 and :1598-1599 does this correctly:
_fp_predictor_decode_kernel[bpg, tpb](
d_decomp, d_tmp, tile_width * samples, total_rows, dtype.itemsize)
So CPU and GPU decode diverge for multi-band predictor=3 files, and the CPU output does not match libtiff or GDAL.
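The difference between the two lane layouts described above can be seen from index arithmetic alone. The sketch below is illustrative: `lane_source_offsets` is a hypothetical helper, not an xrspatial function, and it models only the byte-lane shuffle (most significant lane first, little-endian output), not the byte-differencing step.

```python
import numpy as np

def lane_source_offsets(n_values, n_bytes):
    """For each byte of the decoded row, return the offset it is read from
    in the byte-lane layout. Lane 0 holds the most significant bytes and the
    output is little-endian, so byte b of value v comes from lane n_bytes-1-b.
    """
    v = np.arange(n_values)[:, None]   # value index within the row
    b = np.arange(n_bytes)[None, :]    # byte index within one value
    return ((n_bytes - 1 - b) * n_values + v).ravel()

width, samples, bytes_per_sample = 4, 3, 4

# Layout the spec asks for: 4 lanes, each width * samples = 12 bytes long.
correct = lane_source_offsets(width * samples, bytes_per_sample)
# Layout implied by the current CPU call: 12 lanes, each 4 bytes long.
current = lane_source_offsets(width, bytes_per_sample * samples)

n_diff = int((correct != current).sum())
print(n_diff, "of", correct.size, "byte offsets differ")
```

Both arrays are permutations of the same 48 row offsets, so the mis-parameterized call reads every byte from somewhere in the row, which is why nothing raises.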
Reproducer
For a 4-pixel-wide, 3-band float32 row (48 bytes per row, 96 bytes for 2 rows), manually TN3-encoding the data and then decoding it via fp_predictor_decode with the current arguments gives 56 of 96 bytes wrong.
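A self-contained approximation of the reproducer above, for readers without the xrspatial source at hand. `lanes_encode` and `lanes_decode` are hypothetical stand-ins that implement only the byte-lane shuffle on little-endian data; they are not the library's `fp_predictor_decode`, and the byte-differencing step is left out, so the mismatch count printed here need not equal the 56 reported for the real decoder.

```python
import numpy as np

def lanes_encode(rows, n_values, n_bytes):
    """Split each row into n_bytes lanes of n_values bytes, MSB lane first."""
    height = rows.shape[0]
    per_value = rows.reshape(height, n_values, n_bytes)
    # Reverse byte order within each value, then gather by byte position.
    return per_value[:, :, ::-1].transpose(0, 2, 1).reshape(height, -1)

def lanes_decode(rows, n_values, n_bytes):
    """Inverse of lanes_encode for the same (n_values, n_bytes)."""
    height = rows.shape[0]
    lanes = rows.reshape(height, n_bytes, n_values)
    return lanes.transpose(0, 2, 1)[:, :, ::-1].reshape(height, -1)

height, width, samples, bytes_per_sample = 2, 4, 3, 4

# Chunky (pixel-interleaved) float32 data: shape (rows, pixels, bands).
pixels = (np.arange(height * width * samples) * 1.5 + 0.25).astype("<f4")
pixels = pixels.reshape(height, width, samples)
raw = pixels.view(np.uint8).reshape(height, -1)   # 48 bytes per row

# Encode the way the spec describes: lanes span width * samples values.
encoded = lanes_encode(raw, width * samples, bytes_per_sample)

# Proposed arguments: (width * samples, bytes_per_sample).
good = lanes_decode(encoded, width * samples, bytes_per_sample)
# Current CPU arguments: (width, bytes_per_sample * samples).
bad = lanes_decode(encoded, width, bytes_per_sample * samples)

n_wrong = int((bad != raw).sum())
print("proposed call round-trips:", bool(np.array_equal(good, raw)))
print("current call:", n_wrong, "of", raw.size, "bytes wrong")
```

A regression test could follow the same shape: encode a small multi-band buffer, decode it through the CPU path, and compare against the original bytes.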
Expected behavior
open_geotiff(path) on a GDAL-written multi-band float32 TIFF with predictor=3 should return the same pixel values as open_geotiff(path, gpu=True) and as GDAL.
Fix
In _reader.py:_apply_predictor, when pred == 3 use
```python
fp_predictor_decode(chunk, width * samples, height, bytes_per_sample)
```
instead of
```python
fp_predictor_decode(chunk, width, height, bytes_per_sample * samples)
```
This matches the GPU path and the TN3 spec. Predictor=2 keeps its current call: its stride is bytes_per_pixel, which equals bytes_per_sample * samples, so that path is already correct.
Severity
HIGH. Multi-band float32 TIFFs with predictor=3 are common in GDAL output. The failure is silent: no error, just wrong numbers.
Scope
xrspatial/geotiff/_reader.py: fix the dispatch.
xrspatial/geotiff/tests/test_predictor_multisample.py: add a regression test that decodes a TN3-encoded multi-band predictor=3 buffer.
Found by the /sweep-accuracy geotiff audit.