perf: Parallelize siteRDF with ProductIterator #1882
The algorithms really enforce that the iterator must return a reference.
@rprospero and @RobBuchananCompPhys I may have an answer for you. I assumed that you were benchmarking previously by running the So almost a factor 10 improvement! For the large 5k argon box: A factor 30 improvement. Not too shabby! I can push up the new benchmarks on the end of this PR if you want to test for yourself.
Wow, great stuff. Would be good to get those benchmarks.
Just realised that this PR is no longer actually using the ProductIterator. I'm going to refactor to use it and check the benchmarks. If it's faster, we'll use it. Otherwise, I'll drop the class for now.
After benchmarking, the ProductIterator had comparable speed in serial, but was slower in parallel than just using the inner loop. Ultimately, the iterator is supposed to pay for its built-in performance penalty by smoothing out the cases where certain pairs take significantly longer to calculate than others. I've removed the unused class, since I'm now less convinced that it will ever be needed.
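To make the load-smoothing argument concrete, here is a toy model rather than anything taken from the PR. It assumes a made-up cost function (`pairCost`, where pairs in row 0 are ten times as expensive), a fixed split of rows into contiguous blocks, and a round-robin deal of the flattened pair index. All three names and the scheduling scheme are hypothetical. It reports the load on the busiest worker under each split.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical cost model: every pair in row 0 costs 10 units, all others cost 1.
inline std::size_t pairCost(std::size_t i, std::size_t /*j*/) { return i == 0 ? 10 : 1; }

// Busiest worker's load when each worker owns a contiguous block of rows.
inline std::size_t maxLoadByRows(std::size_t n, std::size_t nWorkers)
{
    assert(n > 0 && nWorkers > 0);
    std::vector<std::size_t> load(nWorkers, 0);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            load[i * nWorkers / n] += pairCost(i, j); // row i belongs to one worker
    return *std::max_element(load.begin(), load.end());
}

// Busiest worker's load when the flattened (i, j) index is dealt out round-robin.
inline std::size_t maxLoadByPairs(std::size_t n, std::size_t nWorkers)
{
    assert(n > 0 && nWorkers > 0);
    std::vector<std::size_t> load(nWorkers, 0);
    for (std::size_t flat = 0; flat < n * n; ++flat)
        load[flat % nWorkers] += pairCost(flat / n, flat % n); // recover i and j from flat
    return *std::max_element(load.begin(), load.end());
}
```

With `n = 4` and two workers, the row split leaves one worker with 44 units of the 52 total, while the pair split gives each worker 26. The model ignores the per-pair cost of the index arithmetic, which is the penalty the comment above refers to, and the benchmarks in this thread found that penalty was not repaid in practice.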
This adds a ProductIterator class that enables iterating over two separate indices. As a bit of a refresher:
We then use this product iterator to parallelise the siteRDF calculation.
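As a rough illustration of the idea, the sketch below shows one way an iterator over two indices might be written. It is not the class from this PR: the name is reused only for readability, and the layout (a single flat counter decoded into an `(i, j)` pair) and the helper `countUpperPairs` are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <utility>

// Hypothetical sketch: visits every (i, j) pair of a grid with nCols columns
// by advancing one flat counter and decoding it on dereference.
class ProductIterator
{
    public:
    using iterator_category = std::input_iterator_tag;
    using value_type = std::pair<std::size_t, std::size_t>;
    using difference_type = std::ptrdiff_t;
    using pointer = const value_type *;
    using reference = value_type; // a value, not a true reference

    ProductIterator(std::size_t flat, std::size_t nCols) : flat_(flat), nCols_(nCols) { assert(nCols > 0); }

    // Decode the flat counter into (row, column).
    value_type operator*() const { return {flat_ / nCols_, flat_ % nCols_}; }
    ProductIterator &operator++()
    {
        ++flat_;
        return *this;
    }
    ProductIterator operator++(int)
    {
        auto old = *this;
        ++flat_;
        return old;
    }
    bool operator==(const ProductIterator &other) const { return flat_ == other.flat_; }
    bool operator!=(const ProductIterator &other) const { return flat_ != other.flat_; }

    private:
    std::size_t flat_;
    std::size_t nCols_;
};

// Example use: count the pairs of an n x n grid with i < j.
inline std::size_t countUpperPairs(std::size_t n)
{
    std::size_t count = 0;
    for (ProductIterator it(0, n), end(n * n, n); it != end; ++it)
    {
        const auto pair = *it;
        if (pair.first < pair.second)
            ++count;
    }
    return count;
}
```

Because `operator*` returns the pair by value, this sketch is only an input iterator. The C++17 parallel algorithms require forward iterators, whose `operator*` must return a genuine reference, so a version intended for those algorithms would need a different design, for example storing the current pair as a member and returning a reference to it.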