Summary
xrspatial.geotiff exposes one logical read API (open_geotiff) that fans out to four backend modules (eager numpy in __init__.py, _backends/dask.py, _backends/gpu.py, _backends/vrt.py) and several source types (local path, HTTP, fsspec URI, BytesIO). Parity between those paths is asserted today, but the assertions are scattered across roughly 15+ files keyed to past bug numbers. Add a single matrix that names every supported (backend, source) pair and asserts the same set of fields on every cell.
Where parity is checked today
The existing parity files each pin a slice of the contract for one historical bug:
xrspatial/geotiff/tests/test_backend_parity_matrix.py: already a table-driven harness, but currently parametrized over only numpy and dask+numpy. Has the assert_parity helper, _FixtureSpec dataclass, and _materialise / _coord_view helpers ready to extend. Its docstring names it as "single source of truth" for fixture-vs-backend parity going forward.
xrspatial/geotiff/tests/test_backend_pixel_parity_matrix_1813.py: 4-backend pixel-byte parity across dtype/compression/layout combinations, with its own _BACKENDS list and _materialise helper.
xrspatial/geotiff/tests/test_attrs_parity_1548.py: 4-backend attrs key/value parity for one tifffile-written fixture.
xrspatial/geotiff/tests/test_backend_kwarg_parity_1561.py: kwargs reach each backend (dispatcher does not silently drop them).
xrspatial/geotiff/tests/test_signature_parity_1631.py: signature parity between open_geotiff and the explicit read_geotiff_* entry points.
xrspatial/geotiff/tests/test_miniswhite_backend_parity_1797.py: MinIsWhite photometric handling parity.
xrspatial/geotiff/tests/test_vrt_backend_coverage_2026_05_11.py: VRT GPU + dask+GPU coverage.
xrspatial/geotiff/tests/test_bytesio_source.py: BytesIO round-trip.
xrspatial/geotiff/tests/test_cog_http_*.py, test_http_*.py: HTTP COG behaviour, mostly read-side effects.
A backend change (for example #2127's _set_nodata_attrs rewire) has to be cross-checked against ~10 separate files to know whether it kept the contract. No single file says "this fixture is read by these 8 paths and they all return the same thing."
What the matrix should cover
The matrix is (fixture, backend) -> read DataArray, plus a small set of error fixtures.
Backends (8)
| id | dispatch | source type |
| --- | --- | --- |
| numpy | open_geotiff(path) | local path |
| dask+numpy | open_geotiff(path, chunks=N) | local path |
| gpu | open_geotiff(path, gpu=True) | local path |
| dask+gpu | open_geotiff(path, gpu=True, chunks=N) | local path |
| vrt-eager | open_geotiff(vrt_path) | local .vrt |
| vrt-dask | open_geotiff(vrt_path, chunks=N) | local .vrt |
| http-cog | open_geotiff('http://...') | HTTP URL |
| fsspec-memory | open_geotiff('memory://...') or BytesIO | fsspec / IO |
GPU rows skip via the existing _gpu_available() predicate from test_backend_pixel_parity_matrix_1813.py. HTTP rows use pytest-httpserver (already a dev dep, see test_cog_http_concurrent.py). fsspec rows use memory:// URIs and io.BytesIO. The BytesIO row also asserts the file-like rejection path for gpu=True / chunks=N separately, since those combinations are documented ValueError cases in open_geotiff.
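As a rough illustration of the eight rows and their skip conditions, the registry could be sketched as below. `BackendSpec`, `skip_reason`, the `requires` tuples, and the chunk size `N = 256` are all hypothetical names and values chosen for this sketch; the real harness keeps its rows in `_BACKENDS` and gates GPU rows with `_gpu_available()`, which this import check only stands in for.

```python
from dataclasses import dataclass, field
from importlib.util import find_spec
from typing import Optional


def _module_present(name: str) -> bool:
    """True when a top-level module can be located, without importing it."""
    return find_spec(name) is not None


@dataclass
class BackendSpec:  # hypothetical type, not the one in the existing harness
    id: str
    source: str                                  # "path", "vrt", "http", "memory"
    kwargs: dict = field(default_factory=dict)   # passed to open_geotiff
    requires: tuple = ()                         # assumed optional deps

    def skip_reason(self) -> Optional[str]:
        """Return a skip message when a required module is absent."""
        missing = [m for m in self.requires if not _module_present(m)]
        return f"missing: {', '.join(missing)}" if missing else None


N = 256  # placeholder; the issue only writes chunks=N

BACKENDS = [
    BackendSpec("numpy", "path"),
    BackendSpec("dask+numpy", "path", {"chunks": N}, ("dask",)),
    BackendSpec("gpu", "path", {"gpu": True}, ("cupy",)),
    BackendSpec("dask+gpu", "path", {"gpu": True, "chunks": N}, ("dask", "cupy")),
    BackendSpec("vrt-eager", "vrt"),
    BackendSpec("vrt-dask", "vrt", {"chunks": N}, ("dask",)),
    BackendSpec("http-cog", "http", requires=("pytest_httpserver",)),
    BackendSpec("fsspec-memory", "memory", requires=("fsspec",)),
]
```

In a pytest module, a non-`None` `skip_reason()` would typically be turned into a skip mark on the corresponding parametrized cell, so a missing dependency shows up as a skip rather than a failure.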
Fixtures (initial set)
Reuse the _FixtureSpec dataclass in xrspatial/geotiff/tests/test_backend_parity_matrix.py. Start with:
... mask_nodata=False (locks the #2092 / #2127 masked-flag contract)
... tifffile)
... ModelTransformationTag without allow_rotated
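Assertions
The existing assert_parity in test_backend_parity_matrix.py already covers most fields. Concrete checks per cell:
_assert_pixels_equal(ref, actual): byte-equal for integer dtype, np.array_equal(equal_nan=True) for float.
da.dims == spec.expected_dims and da.shape == ref.shape.
_coord_view(ref, axis).tobytes() == _coord_view(da, axis).tobytes() and ref_c.dtype == actual_c.dtype.
da.attrs.get(k) == ref.attrs.get(k) for a fixed canonical subset (raster_type, transform, crs, crs_wkt, nodata, masked_nodata).
da.dtype == spec.dtype (catches silent upcast that the reference read would also exhibit).
attrs['nodata'] semantics: present iff the fixture declares one; sentinel value matches; masked_nodata flag reflects whether the mask was applied (#2092).
Error fixtures: pytest.raises(ExpectedExc, match=expected_msg) runs on every backend that supports the source type.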
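The attrs and nodata checks can be sketched as plain dict comparisons, independent of xarray. This is only an illustration: `attrs_mismatches`, `nodata_problems`, and `_same` are hypothetical helpers, not part of `assert_parity`. It assumes attrs values are plain Python scalars, tuples, or strings, and it assumes the `masked_nodata` flag is only checked when the fixture declares a nodata value.

```python
import math

CANONICAL_ATTRS = ("raster_type", "transform", "crs", "crs_wkt",
                   "nodata", "masked_nodata")


def _same(a, b):
    """Equality that treats two float NaNs as equal (NaN nodata sentinels)."""
    if isinstance(a, float) and isinstance(b, float):
        if math.isnan(a) and math.isnan(b):
            return True
    return a == b


def attrs_mismatches(ref_attrs, actual_attrs, keys=CANONICAL_ATTRS):
    """Return {key: (ref, actual)} for every canonical key that differs."""
    out = {}
    for k in keys:
        r, a = ref_attrs.get(k), actual_attrs.get(k)
        if not _same(r, a):
            out[k] = (r, a)
    return out


def nodata_problems(attrs, declared, mask_applied):
    """Check presence, sentinel value, and masked flag for one read."""
    problems = []
    if ("nodata" in attrs) != (declared is not None):
        problems.append("nodata presence")
    elif declared is not None and not _same(attrs["nodata"], declared):
        problems.append("nodata value")
    # Assumption: the flag is only meaningful when a sentinel is declared.
    if declared is not None and bool(attrs.get("masked_nodata")) != mask_applied:
        problems.append("masked_nodata flag")
    return problems
```

Returning the mismatches instead of asserting inline lets a single failing cell report every differing key at once, for example via `assert not attrs_mismatches(ref.attrs, da.attrs)`.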
Helper location
Extend xrspatial/geotiff/tests/test_backend_parity_matrix.py instead of creating a new _parity_matrix.py. The dataclass, materialise helper, and assert_parity already live there with a docstring naming the file as single source of truth. Adding the remaining 6 backend entries to _BACKENDS and the fixtures above to _FIXTURES is mechanical.
The cross-backend helpers in xrspatial/tests/general_checks.py (general_output_checks, assert_numpy_equals_dask_numpy, assert_numpy_equals_cupy, assert_numpy_equals_dask_cupy) are designed for pure-function-of-DataArray operators and assume the input and output share attrs, dims, and coords. They do not fit here because the matrix compares different reads of the same file rather than an op applied across backends. The matrix stays in the geotiff test tree.
Migration
Keep the bug-numbered files as named regression markers. Do not delete them. They should keep their narrow per-bug assertions and stop accumulating new general parity cases. New parity assertions land in the matrix.
Concretely:
test_backend_parity_matrix.py: extend _BACKENDS and _FIXTURES, add the error-fixture sub-matrix.
test_backend_pixel_parity_matrix_1813.py: stays. Locks the dtype/compression/layout sweep specific to the #1813 refactor.
test_attrs_parity_1548.py, test_backend_kwarg_parity_1561.py, test_miniswhite_backend_parity_1797.py, test_vrt_backend_coverage_2026_05_11.py: stay as named regression markers.
Out of scope
Writer parity (to_geotiff / write_geotiff_gpu / write_vrt). Writer matrix is a separate follow-up; test_writer_matrix.py is the current single point.
Performance / throughput assertions.
Golden-corpus byte-equality (covered by test_golden_corpus_*_1930.py).
A new top-level helper module. The existing file is the home.
Property-based / fuzz coverage (test_fuzz_hypothesis_1661.py already does that for one slice).
Acceptance
xrspatial/geotiff/tests/test_backend_parity_matrix.py parametrizes over all 8 backends and at least the 7 fixtures listed above.
pytest xrspatial/geotiff/tests/test_backend_parity_matrix.py -v reports one cell per (fixture, backend) pair, with GPU / HTTP / fsspec cells skipping cleanly when their deps are absent.
Adding a new backend or fixture is one row appended to _BACKENDS or _FIXTURES; no other file edits.
The existing per-bug parity files remain green and unchanged.
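The "one cell per pair" and "one row appended" criteria amount to a cross product over the two tables. A minimal sketch, using hypothetical fixture ids and a hypothetical `fixture::backend` id format (the ids pytest actually prints depend on how the harness parametrizes):

```python
from itertools import product

# Hypothetical ids; the real rows live in _FIXTURES and _BACKENDS.
FIXTURE_IDS = ["fixture_a", "fixture_b"]
BACKEND_IDS = ["numpy", "dask+numpy", "gpu", "dask+gpu",
               "vrt-eager", "vrt-dask", "http-cog", "fsspec-memory"]


def cell_ids(fixture_ids, backend_ids):
    """One id per (fixture, backend) pair, in fixture-major order."""
    return [f"{f}::{b}" for f, b in product(fixture_ids, backend_ids)]


cells = cell_ids(FIXTURE_IDS, BACKEND_IDS)

# Every pair appears exactly once.
assert len(cells) == len(FIXTURE_IDS) * len(BACKEND_IDS)
assert len(set(cells)) == len(cells)
```

Appending one fixture id grows the matrix by exactly `len(BACKEND_IDS)` cells without touching any other table, which is the property the third criterion asks for.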