Description
It sounds like a lot of folks are trying to get our (time-)frequency functions to be faster. This issue supersedes the closed MKLFFT issue #1889 and is loosely related to the resampling issue #2035.
Potential ways to enhance:
- Use MKLFFT in our functions. Work in progress in #2623 (WIP: initialize mklfft). It looks like the performance gains are not clear?
- Use `rfft` instead of `fft` where we can. I assumed `fftpack.fft` would triage based on input type whether to do a real or complex DFT, but it sounds like it doesn't (!): see scipy/scipy#5592 (signal.resample could be twice as fast using rfft).
  This one is actually pretty low-hanging fruit. Any volunteers? (MRG: Faster raw resampling #2978)
- Use CUDA. I have experimented with this and not gotten very far. The catch is that doing just the FFT and IFFT on the GPU is not enough, because the CPU<->GPU transfer overhead is too large, so we need to do multiple operations on the GPU per transfer. This is why the overlap-add filtering, which can do the FFT, multiply, and IFFT all on the GPU, is an order of magnitude faster than multicore CPU. There should be a way to do something similar for e.g. the CWT functions (maybe batching the wavelets so a single segment is processed with multiple filters?), but I haven't figured it out.
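To make the `rfft` idea above concrete, here is a minimal sketch using NumPy's `np.fft` routines rather than `fftpack` or any of our own code. The helper names (`filter_full_fft`, `filter_rfft`) and the brick-wall gain are hypothetical and only for illustration. For real input the two paths should agree, while the `rfft` path only touches about half as many frequency bins.

```python
import numpy as np


def filter_full_fft(x, cutoff):
    """Frequency-domain low-pass using the full complex FFT (hypothetical helper)."""
    n = x.size
    freqs = np.abs(np.fft.fftfreq(n))        # cycles/sample, both signs folded
    gain = (freqs <= cutoff).astype(float)   # brick-wall gain, illustration only
    X = np.fft.fft(x)                        # n complex bins
    return np.fft.ifft(X * gain).real        # imaginary part is round-off only


def filter_rfft(x, cutoff):
    """Same filter using the real-input FFT (hypothetical helper)."""
    n = x.size
    freqs = np.fft.rfftfreq(n)               # non-negative frequencies only
    gain = (freqs <= cutoff).astype(float)
    X = np.fft.rfft(x)                       # n // 2 + 1 complex bins
    return np.fft.irfft(X * gain, n=n)       # pass n so odd lengths round-trip


rng = np.random.default_rng(0)
x = rng.standard_normal(1000)                # real-valued test signal

y_full = filter_full_fft(x, cutoff=0.1)
y_half = filter_rfft(x, cutoff=0.1)

print(np.allclose(y_full, y_half))
print(np.fft.fft(x).size, np.fft.rfft(x).size)
```

Whether this turns into an actual 2x speedup depends on the FFT backend and the signal length, so it is worth timing on realistic data before switching any function over.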
Feel free to post or edit this issue with other ideas.