validate_arrays: weak type check passes mixed dask+numpy / dask+cupy pairs #1383

@brendancol

Summary

validate_arrays() in xrspatial/utils.py checks that all input arrays share the same backend by comparing the outer container type:

# xrspatial/utils.py, line 352
if not isinstance(first_array.data, type(arrays[i].data)):
    raise ValueError("input arrays must have same type")

A dask.array.Array backed by NumPy and a dask.array.Array backed by CuPy are both instances of dask.array.Array. The check passes them as compatible. They are not.

Reproduction

import numpy as np
import cupy as cp
import dask.array as da
import xarray as xr
from xrspatial.utils import validate_arrays

a = xr.DataArray(da.from_array(np.zeros((10, 10))))
b = xr.DataArray(da.from_array(cp.zeros((10, 10))))

validate_arrays(a, b)  # passes silently

The eager numpy/cupy pair is rejected today, but only by accident: np.ndarray and cupy.ndarray are unrelated classes, so the isinstance check happens to fail for them. That protection disappears as soon as both arrays are wrapped in dask.

Downstream impact

Multiple callers feed bands straight into compiled kernels after validate_arrays:

  • xrspatial.multispectral (NDVI, EVI, true_color, ARVI, NBR, NDMI, etc.)
  • xrspatial.fire (burn severity, KBDI, fire spread)
  • xrspatial.mahalanobis
  • xrspatial.zonal (stats, crosstab, regions, apply)

A mixed-backend pair propagates through the chunk graph and fails inside a numba/cupy kernel, with a confusing traceback far from the original misuse.

Fix

Classify each input array against the same four-way taxonomy used by ArrayTypeFunctionMapping (numpy / cupy / dask+numpy / dask+cupy) using the existing helpers is_cupy_array, is_cupy_backed, is_dask_cupy. Reject any pair whose classes differ and name both backends in the error message.
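A minimal sketch of the four-way classification described above, for illustration only. It does not use the existing helpers (is_cupy_array, is_cupy_backed, is_dask_cupy), whose signatures are not shown in this issue. Instead it assumes the backend can be read from the top-level module of the array's type, and from the `_meta` attribute that dask arrays carry for their chunk type. The names classify_backend and validate_same_backend are hypothetical.

```python
def _root_module(obj):
    """Top-level package defining obj's type, e.g. 'numpy', 'cupy', 'dask'."""
    return type(obj).__module__.split(".")[0]


def classify_backend(data):
    """Return 'numpy', 'cupy', 'dask+numpy' or 'dask+cupy' (hypothetical helper)."""
    root = _root_module(data)
    if root == "dask":
        # Assumption: dask arrays expose a zero-size `_meta` array whose
        # type matches the type of the underlying chunks.
        inner = _root_module(getattr(data, "_meta", None))
        if inner in ("numpy", "cupy"):
            return f"dask+{inner}"
        raise TypeError(f"unsupported dask chunk type: {inner!r}")
    if root in ("numpy", "cupy"):
        return root
    raise TypeError(f"unsupported array type: {type(data)!r}")


def validate_same_backend(*arrays):
    """Raise ValueError unless every input falls in the same backend class."""
    if not arrays:
        return
    first = classify_backend(arrays[0].data)
    for i, arr in enumerate(arrays[1:], start=1):
        other = classify_backend(arr.data)
        if other != first:
            # Name both backends so the misuse is obvious at the call site.
            raise ValueError(
                "input arrays must share a backend: "
                f"array 0 is {first}, array {i} is {other}"
            )
```

Under these assumptions, the pair from the reproduction would be classified as dask+numpy and dask+cupy and rejected with both names in the message, instead of passing silently.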

Severity

MEDIUM. No data corruption or remote vector. The failure is a confusing crash rather than a silent wrong result, but the validator is the contract advertised to users and it does not hold.
