zonal_stats cupy backend silently ignores nodata_values=0 #1227

@brendancol

Describe the bug

zonal_stats(..., nodata_values=0) on a cupy-backed DataArray skips the nodata filter. Numpy, dask+numpy, and dask+cupy all drop zero-valued cells like they should. The cupy path leaves them in and doesn't say anything about it, so you get a different answer depending on which backend ran.

The check at xrspatial/zonal.py:547 uses truthiness:

if nodata_values:
    filter_values = cupy.isfinite(values_by_zone) & (
        values_by_zone != nodata_values)
else:
    filter_values = cupy.isfinite(values_by_zone)

With nodata_values=0 the condition is falsy, so the else branch runs and zeros stay in. The numpy path at line 332 gets this right by testing against None instead: if nodata_values is not None:.
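
For anyone who wants to see the difference without a GPU, here is a minimal numpy-only sketch of the two checks. The helper names build_mask_truthy and build_mask_not_none are made up for illustration and are not functions in xrspatial; the mask expressions just copy the shape of the snippet above, with numpy standing in for cupy.

```python
import numpy as np


def build_mask_truthy(values, nodata_values=None):
    # Hypothetical helper mirroring the truthiness check quoted above:
    # 0 is falsy, so the nodata comparison is skipped for nodata_values=0.
    if nodata_values:
        return np.isfinite(values) & (values != nodata_values)
    return np.isfinite(values)


def build_mask_not_none(values, nodata_values=None):
    # Hypothetical helper mirroring the `is not None` check:
    # only a missing argument skips the nodata comparison.
    if nodata_values is not None:
        return np.isfinite(values) & (values != nodata_values)
    return np.isfinite(values)


# Zone 1 cells from the reproducer in this issue.
zone1 = np.array([0.0, 10.0, 10.0, 10.0])

kept_buggy = zone1[build_mask_truthy(zone1, nodata_values=0)]
kept_fixed = zone1[build_mask_not_none(zone1, nodata_values=0)]

print(kept_buggy.mean(), kept_buggy.size)  # 7.5 4
print(kept_fixed.mean(), kept_fixed.size)  # 10.0 3
```

The two helpers agree for every nodata value except 0 (and None), which is why the bug only shows up when zero is the nodata marker.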

Expected behavior

Same input, same answer, whatever backend ran. Zero should be filtered when the user asks for it.

Reproduce

import cupy as cp
import xarray as xr
from xrspatial import zonal_stats

zones = xr.DataArray(cp.asarray([[1, 1, 2, 2], [1, 1, 2, 2]]))
values = xr.DataArray(cp.asarray([[0.0, 10.0, 0.0, 20.0],
                                  [10.0, 10.0, 20.0, 20.0]]))

result = zonal_stats(zones, values, nodata_values=0, stats_funcs=['mean', 'count'])
# cupy:  zone 1 mean=7.5,  count=4  (wrong, zero wasn't filtered)
# numpy: zone 1 mean=10.0, count=3

Fix

Change line 547 to if nodata_values is not None:. Add a cupy regression test that compares against the numpy result.

Context

Caught by the 2026-04-22 security sweep of the zonal module. Category 6 (backend-semantics confusion). HIGH because the wrong answer is silent.
