Skip to content

COG overview generation poisoned by nodata sentinel #1613

@brendancol

Description

@brendancol

Summary

When to_geotiff(..., cog=True, nodata=<finite>) writes a Cloud Optimized GeoTIFF from a float raster that contains NaN pixels, the resulting overview levels are corrupted. NaN pixels are replaced with the nodata sentinel before overview generation, so the np.nanmean / np.nanmin / np.nanmax / np.nanmedian reductions in _make_overview treat the sentinel as a real number and bias the overview values toward the sentinel.

The same bug affects write_geotiff_gpu (the GPU writer follows the same order: rewrite NaN to sentinel, then call make_overview_gpu).

Reproduction

import os, tempfile
import numpy as np
import xarray as xr
from xrspatial.geotiff import to_geotiff, open_geotiff

# Float raster with NaN pixels in the top-left 2x2 block
arr = np.array([
    [1.0, 2.0, 3.0, 4.0],
    [np.nan, np.nan, np.nan, np.nan],
    [10.0, 20.0, 30.0, 40.0],
    [10.0, 20.0, 30.0, 40.0],
], dtype=np.float32)

da = xr.DataArray(arr, dims=['y', 'x'])

with tempfile.TemporaryDirectory() as tmp:
    p = os.path.join(tmp, 'test.tif')
    to_geotiff(da, p, nodata=-9999.0, cog=True, compression='deflate',
               tiled=True, tile_size=2, overview_levels=[1],
               overview_resampling='mean')
    ov = open_geotiff(p, overview_level=1)
    print(ov.data)

Output:

[[-4998.75 -4997.75]
 [   15.      35.  ]]

Expected (nanmean ignoring NaN): top row should be [1.5, 3.5], not [-4998.75, -4997.75].

Root cause

to_geotiff (xrspatial/geotiff/__init__.py:1202-1206) replaces NaN with the nodata sentinel before calling write(). write() then calls _make_overview() (xrspatial/geotiff/_writer.py:1103-1105) on the sentinel-rewritten array. The nanmean / nanmin etc. inside _block_reduce_2d (_writer.py:178-198) cannot see the original NaN positions, so the sentinel value participates in the reduction.

The same flow exists in write_geotiff_gpu (__init__.py:2852-2898): NaN is rewritten to the sentinel before make_overview_gpu runs.

Severity

HIGH. COGs produced with nodata=... on float rasters with NaN pixels carry numerically wrong overview pyramids. The full-resolution band is fine; the corruption only shows up when a reader picks up a non-zero overview level (e.g. a tile viewer rendering low-zoom tiles, or a read_geotiff(overview_level=N) call). The bug affects both the CPU and GPU writers and is not currently covered by any test.

Proposed fix

Generate overviews on NaN-keyed data, then rewrite NaN to the sentinel per-level just before compression. Minimal change locations:

  1. to_geotiff (__init__.py): defer the NaN-to-sentinel rewrite from line 1202 to inside write() per-level.
  2. _writer.py:write(): rewrite NaN to sentinel on each level (full-res + each overview) before passing to _write_tiled / _write_stripped.
  3. write_geotiff_gpu (__init__.py): mirror the same change on the GPU path so make_overview_gpu sees NaN.

Alternative: extend _make_overview / make_overview_gpu to accept a nodata sentinel they internally mask back to NaN before reduction. Slightly higher API surface but localises the fix.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions