Bug summary
Summary: two training runs with
- effectively the same data set (~80000 frames)
- the same other input parameters
- a single GPU
1) ~80000 frames split across ~50000 systems: the task takes 52 hours
2) ~80000 frames split across ~17 systems: the task takes 18 hours (the type_mixed format is used to collect the data)
DeePMD-kit Version
DeePMD-kit v2.1.5
TensorFlow Version
2.9.0
How did you download the software?
Offline packages
Input Files, Running Commands, Error Log, etc.
I previously discussed this with @iProzd and sent him the data sets for case 1); he said I/O should not influence the training time once the data statistics are done.
In case 1) it takes ~4 hours before training actually starts (data statistics finish and lcurve.out begins to be written).
The "training time" reported in the logs of both cases is effectively the same per batch; note that disp_freq is 100 times larger for case 1).
Training time for case 1):
train_origin.log
...
DEEPMD INFO batch 7800000 training time 1580.50 s, testing time 0.00 s
DEEPMD INFO batch 8000000 training time 1569.11 s, testing time 0.00 s
...
DEEPMD INFO wall time: 188106.747 s
Training time for case 2):
train_typeSel.log
...
DEEPMD INFO batch 7998000 training time 15.41 s, testing time 0.00 s
DEEPMD INFO batch 8000000 training time 15.60 s, testing time 0.00 s
...
DEEPMD INFO wall time: 65437.235 s
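To make the comparison concrete, here is a small sketch (using only the numbers from the two logs above, and assuming disp_freq equals the batch interval between consecutive log lines) showing that the per-batch "training time" is essentially identical in both cases, while the wall time per batch differs by roughly 3x; the gap must therefore come from overhead outside the timed training loop:

```python
def per_batch(reported_s, batches_in_interval):
    """Reported 'training time' divided by the number of batches in the
    disp_freq interval between two consecutive log lines."""
    return reported_s / batches_in_interval

# Case 1): 200000 batches between the two quoted log lines
case1 = per_batch(1580.50, 8000000 - 7800000)
# Case 2): 2000 batches between the two quoted log lines
case2 = per_batch(15.60, 8000000 - 7998000)

print(f"per-batch training time, case 1): {case1 * 1e3:.2f} ms")  # ~7.90 ms
print(f"per-batch training time, case 2): {case2 * 1e3:.2f} ms")  # ~7.80 ms

# Wall time per batch over the full 8M-batch run tells a different story:
wall1 = 188106.747 / 8_000_000  # ~23.5 ms/batch
wall2 = 65437.235 / 8_000_000   # ~8.2 ms/batch
print(f"wall time per batch, case 1): {wall1 * 1e3:.2f} ms")
print(f"wall time per batch, case 2): {wall2 * 1e3:.2f} ms")
```

In case 2) the wall time per batch is close to the timed per-batch training time, while in case 1) roughly two thirds of the wall time is unaccounted for by the training loop.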
Steps to Reproduce
dp train
Further Information, Files, and Links
No response