Skip to content

align/concat/combine_nested implementation#18

Merged
dcherian merged 27 commits intomainfrom
wip-align
Jun 16, 2025
Merged

align/concat/combine_nested implementation#18
dcherian merged 27 commits intomainfrom
wip-align

Conversation

@dcherian
Copy link
Copy Markdown
Contributor

@dcherian dcherian commented Apr 17, 2025

Closes #20

TODO:

  • add 2D combine_nested test
  • Add better error messages for the new tests
Details

@benbovy feel free to take over if you have time.

This is ugly BUT for the example in https://github.com/dcherian/rasterix/issues/16#issuecomment-2811585219

xr.concat([ds40, ds50], dim=("x", "y"))

now gets as far as

ile [~/miniforge3/envs/rasterix/lib/python3.12/site-packages/xarray/structure/alignment.py:511](http://localhost:8888/lab/tree/repos/workshop-cng-2025-zarr/~/miniforge3/envs/rasterix/lib/python3.12/site-packages/xarray/structure/alignment.py#line=510), in Aligner._get_dim_pos_indexers(self, matching_indexes)
    509     if obj_idx is not None:
    510         if self.reindex[key]:
--> 511             indexers = obj_idx.reindex_like(aligned_idx, **self.reindex_kwargs)
    512             dim_pos_indexers.update(indexers)
    514 return dim_pos_indexers

File [~/miniforge3/envs/rasterix/lib/python3.12/site-packages/xarray/core/indexes.py:305](http://localhost:8888/lab/tree/repos/workshop-cng-2025-zarr/~/miniforge3/envs/rasterix/lib/python3.12/site-packages/xarray/core/indexes.py#line=304), in Index.reindex_like(self, other)
    289 def reindex_like(self, other: Self) -> dict[Hashable, Any]:
    290     """Query the index with another index of the same type.
    291 
    292     Implementation is optional but required in order to support alignment.
   (...)    303         indexers.
    304     """
--> 305     raise NotImplementedError(f"{self!r} doesn't support re-indexing labels")

NotImplementedError: RasterIndex
'x':
    <rasterix.raster_index.AxisAffineTransformIndex object at 0x10e146510>
'y':
    <rasterix.raster_index.AxisAffineTransformIndex object at 0x10e146a50> doesn't support re-indexing labels

Comment thread rasterix/raster_index.py
@dcherian
Copy link
Copy Markdown
Contributor Author

dcherian commented Apr 17, 2025

Aha, here's a concrete feature request


This index is shared between dims x and y.

So I cannot do

xr.concat([ds50, ds40], dim="x")

because I get

ValueError: cannot exclude dimension(s) x from alignment because these are used by an index together with non-excluded dimensions y

So I cannot do

xr.concat([ds50, ds40], dim=("x", "y"))

Then the alignment does not exclude "x" and "y" individually, it excludes the tuple ("x", "y") so we end up in a bad situation with terrible reindexing.

Q: If asked to concat along a dim with a multi-dim index, then should we exclude all of those dims?


Using combine_nested seems promising, perhaps that is the solution

xr.combine_nested([[ds50, ds40]], concat_dim=("x", "y")) 

@dcherian
Copy link
Copy Markdown
Contributor Author

dcherian commented Apr 17, 2025

Nope, combine_nested is just concat in a loop so we are still blocked.

Q: If asked to concat along a dim with a multi-dim index, then should we exclude all of those dims?

I don't see any other option at the moment. especially if we want concat_nested to magically work.

@benbovy
Copy link
Copy Markdown
Member

benbovy commented Apr 18, 2025

If asked to concat along a dim with a multi-dim index, then should we exclude all of those dims?

It might be worth trying, although I don't know much what implications this rule would have in the general case.

Or maybe trying to fit this case into concat or combine_nested is not really worth the extra complexity and it would be fine redirecting users to a more specific method similar to Rioxarray's merge_arrays and merge_datasets?

From a UX perspective, however, the alignment error above while technically correct does not give a chance of providing more context. We should probably give this chance, e.g., via some dedicated Xarray Index API (the alignment error is raised from a loop iterating over index objects).

Comment thread tests/test_raster_index.py Outdated
@dcherian dcherian changed the title WIP align implementation WIP align/combine implementation May 16, 2025
@dcherian dcherian changed the title WIP align/combine implementation align/concat/combine implementation Jun 12, 2025
@dcherian dcherian requested a review from benbovy June 12, 2025 03:56
@dcherian dcherian marked this pull request as ready for review June 12, 2025 03:56
Comment thread src/rasterix/odc_compat.py
Comment thread src/rasterix/raster_index.py Outdated
Comment thread tests/test_raster_index.py Outdated
@dcherian dcherian changed the title align/concat/combine implementation align/concat/combine_nested implementation Jun 13, 2025
@dcherian
Copy link
Copy Markdown
Contributor Author

OK this should be good to go!

@dcherian dcherian merged commit d6242e1 into main Jun 16, 2025
9 checks passed
@dcherian dcherian deleted the wip-align branch June 16, 2025 15:42
Comment on lines +341 to +344
self._shape = {
"x": self._wrapped_indexes["x"].axis_transform.size,
"y": self._wrapped_indexes["y"].axis_transform.size,
}
Copy link
Copy Markdown
Member

@benbovy benbovy Jun 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should avoid hard-coding "x" and "y" coordinate names.

We should also fix self._shape for the following cases:

  • There's only one wrapped index, either AxisAffineTransformIndex or PandasIndex (e.g., result from indexing with a scalar on one of the raster dimensions)
  • There's two wrapped PandasIndex instances (e.g., result from indexing with arbitrary array values)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I got this on the PR branch with docs 😓

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm refactoring RasterIndex in #43. I think it will make our lives much easier... Supporting the cases above in RasterIndex was a (my) bad idea.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

concat and combine

2 participants