Description
Details
Current
Following issues #3334 and #3605 and PRs #3454 and #3615, "parallelization" over rcut has been implemented for the ABACUS calculation of <jY|jY> and <psi|jY>. Further PRs refactoring class Numerical_Basis are planned, but orbital generation is still inefficient: with thread-level parallelization alone (as provided by PyTorch), the speed-up ratio quickly reaches a plateau.
Solution
Using only multithreaded parallelization wastes computational resources. Because the tasks of generating orbitals for different rcut values are fully independent, they can be parallelized across processes.
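A minimal sketch of this idea, using only the Python standard library: one process per rcut, each restricted to a fixed number of threads. The names `plan_processes`, `generate_orbital` and `generate_all` are hypothetical and not part of the existing code base; `generate_orbital` is a placeholder for the real per-rcut optimization.

```python
import os
from concurrent.futures import ProcessPoolExecutor


def plan_processes(n_rcut, threads_per_rcut, n_cores=None):
    """Number of worker processes that fit without oversubscribing the cores."""
    n_cores = n_cores or os.cpu_count() or 1
    return max(1, min(n_rcut, n_cores // threads_per_rcut))


def generate_orbital(rcut, n_threads):
    """Hypothetical stand-in for the orbital generation of a single rcut."""
    # Assumption: the real worker would call torch.set_num_threads(n_threads)
    # here, before starting the optimization for this rcut.
    return rcut, n_threads


def generate_all(rcuts, threads_per_rcut, n_cores=None):
    """Run one independent task per rcut, distributed over processes."""
    n_procs = plan_processes(len(rcuts), threads_per_rcut, n_cores)
    with ProcessPoolExecutor(max_workers=n_procs) as pool:
        futures = [pool.submit(generate_orbital, r, threads_per_rcut) for r in rcuts]
        # Results are collected in the same order as the input rcut list.
        return [f.result() for f in futures]
```

As a usage example, `generate_all([6.0, 7.0, 8.0, 9.0, 10.0], threads_per_rcut=4)` on a 16-core machine would run four rcut tasks at a time with four threads each. When called from a script, it should sit under an `if __name__ == "__main__":` guard so that worker processes can be spawned safely.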
POC
More consideration
A parameter will be exposed so that the user can specify the number of threads to use for each rcut orbital generation. If it is set to auto, a performance test of PyTorch multithreaded parallelization will be run to estimate a rough value, like the one provided by @caic99 (CPU threading and TorchScript interface):
import timeit

import torch

runtimes = []
threads = [1] + [t for t in range(2, 49, 2)]
for t in threads:
    # Limit PyTorch intra-op parallelism to t threads before timing.
    torch.set_num_threads(t)
    r = timeit.timeit(setup = "import torch; x = torch.randn(1024, 1024); y = torch.randn(1024, 1024)", stmt="torch.mm(x, y)", number=100)
    runtimes.append(r)
# ... plotting (threads, runtimes) ...
Task list for Issue attackers (only for developers)
- Reproduce the performance issue on a similar system or environment.
- Identify the specific section of the code causing the performance issue.
- Investigate the issue and determine the root cause.
- Research best practices and potential solutions for the identified performance issue.
- Implement the chosen solution to address the performance issue.
- Test the implemented solution to ensure it improves performance without introducing new issues.
- Optimize the solution if necessary, considering trade-offs between performance and other factors (e.g., code complexity, readability, maintainability).
- Review and incorporate any relevant feedback from users or developers.
- Merge the improved solution into the main codebase and notify the issue reporter.
