Fix integer COG overview poisoning by sentinel value by brendancol · Pull Request #1691 · xarray-contrib/xarray-spatial

brendancol · 2026-05-12T17:56:35Z

Summary

to_geotiff(int_data, cog=True, nodata=N) produced overview pyramids that mixed the sentinel into surrounding valid pixels for mean, min, max, and median resampling.
The reader can't mask poisoned pixels back to NaN because they don't equal the sentinel, so the user silently sees garbage at every zoom level above 0.
Fix masks the sentinel to NaN before the nan-aware reduction, gated on representability in the source integer dtype, and rewrites all-sentinel blocks back to the sentinel before the integer cast.
GPU mirror in _block_reduce_2d_gpu gets the same fix for byte parity with CPU.

Repro

import numpy as np, xarray as xr, tempfile
from xrspatial.geotiff import to_geotiff, open_geotiff

H, W = 256, 256
data = np.full((H, W), 100, dtype=np.int16)
data[64:129, 64:129] = -9999

da = xr.DataArray(data, dims=('y', 'x'),
                  coords={'y': np.arange(H, dtype=np.float64),
                          'x': np.arange(W, dtype=np.float64)},
                  attrs={'crs': 4326})

with tempfile.NamedTemporaryFile(suffix='.tif', delete=False) as f:
    path = f.name

to_geotiff(da, path, nodata=-9999, cog=True,
           overview_levels=[1], overview_resampling='mean')

# Before this PR:
# Level 1 unique: [-9999 -4950 -2425   100]
# The -4950 and -2425 are sentinel-poisoned averages.
#
# After this PR:
# Level 1 unique: [-9999  100]

Test plan

pytest xrspatial/geotiff/tests/test_cog_int_overview_nodata_2026_05_12.py -- 38 new tests pass.
pytest xrspatial/geotiff/tests/test_cog_overview_nodata_1613.py xrspatial/geotiff/tests/test_cog_cubic_overview_nodata_1623.py -- regression coverage for the float and cubic cases still passes.
Full geotiff suite (1606 tests pre-PR) stays green.
CPU and GPU integer overview output is byte-equivalent (covered by test_gpu_cpu_int_overview_byte_match).

Notes

Found via /sweep-accuracy on the geotiff module (pass 20). State CSV updated in this PR.
No new public API. Internal helpers _block_reduce_2d and _block_reduce_2d_gpu accept a more permissive nodata kwarg; existing callers (_make_overview / make_overview_gpu / write()) pass it through unchanged.
Deferred LOW for a future PR: _read_cog_http accepts band=-1 and silently returns the last channel, while the local / dask / GPU paths raise IndexError (added in Local eager band parameter accepts negative and out-of-range values silently #1673). Separate backend-parity gap; opening as a follow-up per the one-finding-per-PR policy.

Copilot

Pull request overview

This PR fixes integer COG overview “sentinel poisoning” during overview pyramid generation by ensuring integer nodata sentinel values are masked to NaN before nan-aware reductions, and that all-sentinel blocks are rewritten back to the sentinel prior to casting back to integer. This aligns CPU and GPU overview generation behavior and prevents corrupted overview values that the reader cannot later re-mask.

Changes:

Update _block_reduce_2d (CPU) to mask representable integer sentinels to NaN for nan-aware overview reductions, and rewrite all-NaN reduced blocks back to the sentinel before integer casting.
Apply the same integer-sentinel masking + NaN rewrite logic in _block_reduce_2d_gpu for byte-equivalent CPU/GPU overviews.
Add a comprehensive regression test suite covering unit-level reducers, end-to-end COG round-trips, and CPU/GPU parity.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

File	Description
`xrspatial/geotiff/_writer.py`	Fix integer overview reduction by sentinel-masking before nan-aware aggregation and restoring sentinel for all-missing blocks pre-cast; update related docstrings.
`xrspatial/geotiff/_gpu_decode.py`	Mirror the integer sentinel masking + all-missing rewrite logic on GPU for CPU/GPU parity.
`xrspatial/geotiff/tests/test_cog_int_overview_nodata_2026_05_12.py`	Add extensive regression coverage for integer nodata behavior across methods/dtypes, end-to-end COG, and CPU/GPU byte parity.
`.claude/sweep-accuracy-state.csv`	Update sweep accuracy state to record the finding/fix and test coverage notes.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

+    float case). The sentinel is ignored for ``nearest`` and ``mode``
+    methods (those pick existing values rather than synthesise new
+    averages). The ``cubic`` branch honours ``nodata`` by masking the
    sentinel to NaN, running cubic with ``prefilter=False`` to keep the
    kernel local, and rewriting any NaN in the output back to the
    sentinel before returning (issue #1623).


Per Copilot review on PR #1691: the docstring previously claimed the sentinel is "ignored" for mode and nearest, but neither method applies any nodata masking. nearest returns the top-left pixel of each 2x2 block, so the sentinel survives if it's in that corner. mode runs over raw values, so the sentinel can be selected as the overview pixel if it's the most frequent value in the block. Clarify the docstring to reflect actual behavior without claiming masking that doesn't happen. Docstring-only change. The existing tests already pin mode/nearest at their actual (sentinel-passthrough) behavior, so no test updates needed.

`to_geotiff(int_data, cog=True, nodata=N)` produced overview pyramids that mixed the sentinel into surrounding valid pixels for `mean`, `min`, `max`, and `median` resampling. The reader can't mask the poisoned values back to NaN because they don't equal the sentinel, so the user silently sees garbage at every zoom level above 0. Root cause: `_block_reduce_2d` (`_writer.py:258-264`) and `_block_reduce_2d_gpu` (`_gpu_decode.py:3027-3028`) promoted the integer block to `float64` but never masked the sentinel to NaN before calling `nanmean` / `nanmin` / `nanmax` / `nanmedian`. The reduction then averaged the sentinel as if it were signal -- `(-9999 + 100 + 100 + 100) / 4 = -2425` cast back to `int16`. Fix: apply the same sentinel-to-NaN mask the float branch already uses, gated on the sentinel being representable in the source integer dtype (mirrors `_int_nodata_in_range` in `_reader.py`). After the reduction, all-sentinel blocks come back as NaN; rewrite those to the sentinel before the integer dtype cast so the cast is well-defined. The caller's post-overview rewrite loop in `write()` only runs for floats, so the integer branch closes the loop itself. GPU mirror gets the same treatment for byte parity with CPU (the contract from #1623). Tests: 38 cases in test_cog_int_overview_nodata_2026_05_12.py covering the `_block_reduce_2d` per-dtype / per-method matrix (uint8 / uint16 / int16 / int32 x mean / min / max / median), the all-sentinel block case, no-nodata regression, out-of-range sentinel no-op, end-to-end uint16 + int16 round-trip, 3-band integer COG, GPU per-dtype / per-method matrix, and CPU/GPU byte-match parity. All 1606 existing geotiff tests still pass. Found via /sweep-accuracy on the geotiff module (pass 20). State CSV updated.

Per Copilot review on PR #1691: the docstring previously claimed the sentinel is "ignored" for mode and nearest, but neither method applies any nodata masking. nearest returns the top-left pixel of each 2x2 block, so the sentinel survives if it's in that corner. mode runs over raw values, so the sentinel can be selected as the overview pixel if it's the most frequent value in the block. Clarify the docstring to reflect actual behavior without claiming masking that doesn't happen. Docstring-only change. The existing tests already pin mode/nearest at their actual (sentinel-passthrough) behavior, so no test updates needed.

github-actions Bot added the performance PR touches performance-sensitive code label May 12, 2026

brendancol requested a review from Copilot May 12, 2026 17:57

Copilot started reviewing on behalf of brendancol May 12, 2026 18:00 View session

Copilot AI reviewed May 12, 2026

View reviewed changes

brendancol added 3 commits May 12, 2026 11:27

Add PR reference to sweep-accuracy state for geotiff pass 20

24262ff

brendancol force-pushed the deep-sweep-accuracy-geotiff-2026-05-12-v2 branch from df5a50d to 447f20f Compare May 12, 2026 18:28

brendancol merged commit 2bc44f6 into main May 12, 2026
10 checks passed

brendancol deleted the deep-sweep-accuracy-geotiff-2026-05-12-v2 branch May 15, 2026 04:38

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix integer COG overview poisoning by sentinel value#1691

Fix integer COG overview poisoning by sentinel value#1691
brendancol merged 3 commits into
mainfrom
deep-sweep-accuracy-geotiff-2026-05-12-v2

brendancol commented May 12, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

brendancol commented May 12, 2026

Summary

Repro

Test plan

Notes

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants