ENH: add maxlag and lags parameters to correlate/convolve #6
Open
honi-at-simspace wants to merge 14 commits into master from
Conversation
Brings the maxlag2025 branch up to date with current numpy/numpy main (843 commits since the branch base).

Conflict resolution in numpy/_core/src/multiarray/multiarraymodule.c:

- Adopt upstream's _pyarray_correlate signature change from `int typenum` to `PyArray_Descr *typec` (PR numpy#30931).
- Keep our refactor: the function still takes (minlag, maxlag, lagstep) instead of (mode, *inverted), and handles array swap, negative-step normalization, and output reversal internally.
- Update PyArray_Correlate, PyArray_Correlate2, and PyArray_CorrelateLags to use upstream's PyArray_DTypeFromObject + NPY_DT_CALL_ensure_canonical pattern for type resolution, with proper Py_DECREF(typec) cleanup.
- Update the new array_correlatelags argument parser to use upstream's brace syntax for npy_parse_arguments.

Add NPY_2_6_API_VERSION 0x00000016 and a matching version-string clause in numpy/_core/include/numpy/numpyconfig.h to support the bumped C API version, which registers PyArray_CorrelateLags at slot 369.

Tests: numpy/_core/tests/test_numeric.py: 512 passed, 1 skipped.
Update pythoncapi-compat to match upstream
PR summary
This PR adds maxlag and lags keyword parameters to np.correlate and np.convolve, plus a returns_lagvector flag that returns the corresponding lag indices alongside the result. It also adds a new public C API function, PyArray_CorrelateLags.

What was troubling me is that numpy.correlate does not have a maxlag feature. This means that even if I only want to see correlations between two time series with lags between -100 and +100 ms, for example, it will still calculate the correlation for every lag between -20000 and +20000 ms (which is the length of the time series).
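To make the cost concrete, here is a small pure-NumPy illustration of the current situation (the variable names are mine, not from the PR): today, the only way to restrict the lags is to compute the full correlation and slice afterwards.

```python
import numpy as np

# Correlating two length-20000 series in 'full' mode computes every lag,
# even when only lags in [-100, 100] are of interest.
rng = np.random.default_rng(0)
a = rng.standard_normal(20000)
b = rng.standard_normal(20000)

full = np.correlate(a, b, mode='full')    # all 2*N - 1 lags
print(len(full))                          # 39999

# Keeping only |lag| <= 100 still requires computing everything first:
lags = np.arange(-(len(b) - 1), len(a))   # lag index for each output element
keep = np.abs(lags) <= 100
window = full[keep]
print(len(window))                        # 201
```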
This PR revives numpy#5978, which I opened back in 2015 and never finished pushing through. I had raised this question in issues on numpy (numpy#5954) and scipy (scipy/scipy#4940) and on the scipy-dev list. It also got attention on Stack Overflow at the time.
Apologies for the very long downtime. I'm excited to be able to come back to this and finally get this much-needed feature included.
Proposed API
Parameter Design
Both parameters are keyword-only and mutually exclusive. When either is supplied, the default mode auto-resolves to 'lags'.
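As a sketch of the proposed semantics, a pure-NumPy emulation might look like the following. This `correlate_lags` helper is illustrative only and is not the PR's implementation; it just demonstrates the mutual exclusion of maxlag/lags and the returned lag vector described above.

```python
import numpy as np

def correlate_lags(a, v, *, maxlag=None, lags=None):
    # Illustrative emulation of the proposed behavior (NOT the PR's code):
    # maxlag and lags are keyword-only and mutually exclusive, and the
    # result is returned together with its lag vector.
    if maxlag is not None and lags is not None:
        raise ValueError("maxlag and lags are mutually exclusive")
    full = np.correlate(a, v, mode='full')
    all_lags = np.arange(-(len(v) - 1), len(a))
    if maxlag is not None:
        lags = np.arange(-maxlag, maxlag + 1)
    elif lags is None:
        lags = all_lags            # neither given: behave like mode='full'
    keep = np.isin(all_lags, lags)
    return full[keep], all_lags[keep]

r, l = correlate_lags(np.arange(5.0), np.arange(5.0), maxlag=2)
# l holds the lags -2..2; r[2] is the zero-lag value, np.dot(a, a) == 30.0
```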
Major file changes
Benchmark
For a user interested in a small number of lags around 0, there is a huge speedup:
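The mechanism behind the speedup can be sketched in pure Python (this direct-computation helper is illustrative, not the PR's C implementation): each lag costs one dot product of length about n, so computing only 2*maxlag + 1 lags is O(n * maxlag) work instead of O(n**2) for all lags.

```python
import numpy as np

def correlate_maxlag_direct(a, v, maxlag):
    # Compute only lags -maxlag..maxlag, one dot product per lag
    # (illustrative sketch of why restricting lags is cheap).
    n = len(a)
    out = np.empty(2 * maxlag + 1)
    for i, k in enumerate(range(-maxlag, maxlag + 1)):
        if k >= 0:
            out[i] = np.dot(a[k:], v[:n - k])
        else:
            out[i] = np.dot(a[:n + k], v[-k:])
    return out

a = np.random.default_rng(1).standard_normal(1000)
small = correlate_maxlag_direct(a, a, 5)   # 11 dot products
full = np.correlate(a, a, mode='full')     # 1999 lags
assert np.allclose(small, full[len(a) - 1 - 5 : len(a) - 1 + 6])
```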
maxlag=5 (benchmark figures not reproduced here; from benchmarks/benchmarks/bench_core.py::CorrConvLags, included in this PR.)

First time committer introduction
Hi! I have used NumPy for more than a decade now, both directly and through the wider scientific Python ecosystem it underpins (e.g. pandas, PyTorch). When I started working on this in 2015, I was using NumPy for the first time, building simulations of biological neural networks that were more performant and open source than the Matlab workflow I had previously used. This feature would have helped me quite a bit then, as I was cross-correlating very long time series but only needed a relatively small window of time lags around 0. I have come back to it because of the continued interest of others, even though I myself no longer need this specific functionality.
AI Disclosure
Claude Code is what finally allowed me to bring this PR to completion.
I used Claude as a pair-programming assistant for:
I take full responsibility for every line in this PR.