do some small optimization to ops #943
Conversation
1. Avoid `concat` or `add` in loops. Instead, append tensors to a list, and `concat` or `accumulate_n` after the loop.
2. Remove a duplicated `reshape`.
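The first point can be sketched with NumPy standing in for the TensorFlow ops (`tf.concat` / `tf.accumulate_n`); the function names below are illustrative, not the PR's actual code:

```python
import numpy as np

def concat_in_loop(chunks):
    # Anti-pattern: repeated concatenation copies the growing buffer
    # on every iteration, giving quadratic total work.
    out = np.empty((0,), dtype=np.float64)
    for c in chunks:
        out = np.concatenate([out, c])
    return out

def concat_after_loop(chunks):
    # Preferred: collect the pieces in a list, concatenate once at the
    # end (the TensorFlow analogue is a single tf.concat after the loop).
    parts = []
    for c in chunks:
        parts.append(c)
    return np.concatenate(parts)

def accumulate_after_loop(chunks):
    # For summation, the analogue of tf.accumulate_n: one fused sum
    # instead of a chain of pairwise adds inside the loop.
    return np.sum(np.stack(chunks), axis=0)

chunks = [np.full(3, float(i)) for i in range(4)]
print(concat_in_loop(chunks).shape)        # (12,)
print(accumulate_after_loop(chunks))       # [6. 6. 6.]
```

Both variants produce identical results; only the number of intermediate buffer copies (and, in TensorFlow, the number of graph ops) differs.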
Codecov Report
```diff
@@            Coverage Diff             @@
##             devel     #943      +/-  ##
==========================================
+ Coverage    75.67%   75.68%   +0.01%
==========================================
  Files           92       92
  Lines         7671     7671
==========================================
+ Hits          5805     5806       +1
+ Misses        1866     1865       -1
```
Continue to review full report at Codecov.
Have you benchmarked these optimizations? Do they improve efficiency?
I just benchmarked it. The answer is no😂

I think these optimizations may matter more on CPUs than on GPUs. I will recheck this PR.
```python
                          bavg = bavg,
                          trainable = trainable,
                          suffix = "_"+str(type_i))
        if type_i == 0:
```
Did we have a bug here? If `type_i == 0` and `(type_input, type_i) in self.exclude_types`, `ret` was still accumulated.
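The suspected control-flow issue can be illustrated with a hypothetical sketch (none of these names or structures are the PR's actual code; this only shows the class of bug the comment describes, where the first-type branch bypasses the exclusion check):

```python
def accumulate_buggy(type_input, types, exclude_types, contributions):
    # Hypothetical: the type_i == 0 branch initializes ret without
    # consulting exclude_types, so an excluded (type_input, 0) pair
    # still contributes.
    ret = 0.0
    for type_i in types:
        if type_i == 0:
            ret = contributions[type_i]          # bug: no exclusion check
        elif (type_input, type_i) not in exclude_types:
            ret += contributions[type_i]
    return ret

def accumulate_fixed(type_input, types, exclude_types, contributions):
    # Check the exclusion list on every branch before accumulating.
    ret = 0.0
    for type_i in types:
        if (type_input, type_i) in exclude_types:
            continue
        ret += contributions[type_i]
    return ret

contributions = {0: 1.0, 1: 2.0, 2: 4.0}
print(accumulate_buggy(0, [0, 1, 2], {(0, 0)}, contributions))  # 7.0: wrongly includes type 0
print(accumulate_fixed(0, [0, 1, 2], {(0, 0)}, contributions))  # 6.0
```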
@denghuilu Would the revised code be faster on GPUs?
I think one cannot see any difference if there are only one or two elements. A system with at least 10 atom types should be tested.
There is a slight performance penalty on a V100 GPU with the water benchmark system ([benchmark screenshots: optimize-ops branch vs. devel branch]). Maybe the GPU implementation did not use stream parallelization.
Why is the testing time of optimize-ops so long?
It was fixed by #1419 -- this branch is behind devel. |
merge from devel
It did have some benefits:


