Summary
When training DeePMD-kit on a large number of systems, store the data in HDF5 files instead of many small NumPy files.
Detailed Description
With a large number of systems, transferring many small NumPy files to a supercomputer cluster with poor I/O performance takes a long time. A single HDF5 file can store multiple arrays, so it is faster to transfer. Test results confirm this behavior.
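The idea of packing many small arrays into one file can be sketched as follows. This is a minimal illustration, not DeePMD-kit's own conversion tool: the helper name pack_npy_tree, the directory names, and the dataset naming scheme (the relative file path, including the .npy suffix) are assumptions made for the example, and the layout DeePMD-kit expects inside an HDF5 file may differ. It uses only numpy.load/numpy.save and h5py.File.create_dataset.

```python
import tempfile
from pathlib import Path

import h5py
import numpy as np


def pack_npy_tree(src_dir, h5_path):
    """Copy every .npy file under src_dir into one HDF5 file.

    Hypothetical helper: each array becomes a dataset named after the
    file's relative path (for example "sys_0/set.000/coord.npy").
    Returns the number of arrays packed.
    """
    src_dir = Path(src_dir)
    count = 0
    with h5py.File(h5_path, "w") as h5:
        for npy_file in sorted(src_dir.rglob("*.npy")):
            name = npy_file.relative_to(src_dir).as_posix()
            # Intermediate groups in the name are created automatically.
            h5.create_dataset(name, data=np.load(npy_file))
            count += 1
    return count


def demo():
    """Build a small made-up tree of systems and pack it into one file."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp) / "systems"
        for i in range(3):
            set_dir = root / f"sys_{i}" / "set.000"
            set_dir.mkdir(parents=True)
            np.save(set_dir / "coord.npy", np.zeros((5, 9)))
            np.save(set_dir / "energy.npy", np.zeros(5))
        packed = pack_npy_tree(root, Path(tmp) / "systems.hdf5")
        print(f"packed {packed} arrays into a single HDF5 file")
        return packed


if __name__ == "__main__":
    demo()
```

After packing, only one file needs to be copied to the cluster, which avoids the per-file overhead that dominates when the file system handles many small files poorly.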
Further Information, Files, and Links
deepmodeling/deepmd-kit#1163