Save groupby codes after factorizing, pass to flox #7206
Merged
dcherian merged 30 commits into pydata:main on Mar 29, 2023
dcherian commented Oct 24, 2022
    "dimension"
    )

    full_index = None
Contributor
Author
Just refactored with no functionality changes except to return codes
dcherian commented Oct 24, 2022
1891fdb to ac521d4
ac521d4 to b64df5b
Fix cftime resampling
* upstream/main: (39 commits)
  Support the new compression argument in netCDF4 > 1.6.0 (pydata#6981)
  Remove setuptools-scm-git-archive, require setuptools-scm>=7 (pydata#7253)
  Fix mypy failures (pydata#7343)
  Docs: add example of writing and reading groups to netcdf (pydata#7338)
  Reset file pointer to 0 when reading file stream (pydata#7304)
  Enable mypy warn unused ignores (pydata#7335)
  Optimize some copying (pydata#7209)
  Add parse_dims func (pydata#7051)
  Fix coordinate attr handling in `xr.where(..., keep_attrs=True)` (pydata#7229)
  Remove code used to support h5py<2.10.0 (pydata#7334)
  [pre-commit.ci] pre-commit autoupdate (pydata#7330)
  Fix PR number in what’s new (pydata#7331)
  Enable `origin` and `offset` arguments in `resample` (pydata#7284)
  fix doctests: supress urllib3 warning (pydata#7326)
  fix flake8 config (pydata#7321)
  implement Zarr v3 spec support (pydata#6475)
  Fix polyval overloads (pydata#7315)
  deprecate pynio backend (pydata#7301)
  mypy - Remove some ignored packages and modules (pydata#7319)
  Switch to T_DataArray in .coords (pydata#7285)
  ...
* main:
  absolufy-imports - No relative imports - PEP8 (pydata#7204)
  [skip-ci] whats-new for dev (pydata#7351)
  Whats-new: 2022.12.0 (pydata#7345)
  Fix assign_coords resetting all dimension coords to default index (pydata#7347)
* main:
  Preserve `base` and `loffset` arguments in `resample` (pydata#7444)
  ignore the `pkg_resources` deprecation warning (pydata#7594)
  Update contains_cftime_datetimes to avoid loading entire variable array (pydata#7494)
  Support first, last with dask arrays (pydata#7562)
  update the docs environment (pydata#7442)
  Add xCDAT to list of Xarray related projects (pydata#7579)
  [pre-commit.ci] pre-commit autoupdate (pydata#7565)
  fix nczarr when libnetcdf>4.8.1 (pydata#7575)
  use numpys SupportsDtype (pydata#7521)
2d3f984 to 3f4dde2
Contributor
Author
I'd like to merge this soon. It's mostly a refactor that moves code around, plus a major improvement to the flox code path. Let me know if it'd be easier to split it into two PRs.
Contributor
Hmm, did I mess something up? I believe I only changed typing-related things. The `_bins` suffix seems to have disappeared:
dcherian added a commit to kmsquire/xarray that referenced this pull request Mar 29, 2023
* upstream/main:
  Save groupby codes after factorizing, pass to flox (pydata#7206)
  [skip-ci] Add compute to groupby benchmarks (pydata#7690)
  Delete built-in cfgrib backend (pydata#7670)
  Added a pronunciation guide to the word Xarray in the README.MD fil… (pydata#7677)
  boundarynorm fix (pydata#7553)
  Fix lazy negative slice rewriting. (pydata#7586)
  [pre-commit.ci] pre-commit autoupdate (pydata#7687)
  Adjust sidebar font colors (pydata#7674)
  Bump pypa/gh-action-pypi-publish from 1.8.1 to 1.8.3 (pydata#7682)
  Raise PermissionError when insufficient permissions (pydata#7629)
This was referenced Nov 1, 2023
Closed
This is an alternative to #6689.

[...] `by` variable directly to `flox`. Most GroupBy methods however depend on various steps in `__init__`, so it became messy.
[...] `flox`. This simplifies things a lot. Since we'll want to preserve the "for loop over groups" approach for `GroupBy.map`, we'll need something like this anyway.

The large amount of deleted code in `_flox_reduce` here suggests to me that this is the better approach.

I think we could also use this to generalize to multiple groupers:
[...] `ravel_multi_index` to generate a single variable to groupby
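
To make the idea concrete, here is a minimal sketch of "factorize once, keep the integer codes, reduce on the codes", and of combining two groupers into one code array with `np.ravel_multi_index`. It uses plain pandas and NumPy only; it is not the code in this PR or flox's API, and the arrays `letters`, `years`, and `values` are hypothetical illustration data.

```python
import numpy as np
import pandas as pd

# Hypothetical data: two grouping variables and one value per observation.
letters = np.array(["a", "b", "a", "c", "b", "a"])
years = np.array([2000, 2000, 2001, 2001, 2000, 2001])
values = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])

# Factorize each grouper once and keep the integer codes for later reuse.
letter_codes, letter_uniques = pd.factorize(letters, sort=True)
year_codes, year_uniques = pd.factorize(years, sort=True)

# Single grouper: a sum per group computed directly from the saved codes.
letter_sums = np.bincount(
    letter_codes, weights=values, minlength=len(letter_uniques)
)

# Multiple groupers: fold the per-grouper codes into one flat code array,
# so a reduction still only needs a single "variable to group by".
shape = (len(letter_uniques), len(year_uniques))
combined_codes = np.ravel_multi_index((letter_codes, year_codes), shape)
combined_sums = np.bincount(
    combined_codes, weights=values, minlength=shape[0] * shape[1]
).reshape(shape)

print(letter_sums)    # one sum per unique letter
print(combined_sums)  # rows: letters, columns: years
```

Combinations that never occur (for example `"c"` in 2000) still get a slot in the reshaped result, here filled with 0 by `np.bincount`; a real implementation would have to decide how to represent such empty groups.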