Discussed in #1289
Originally posted by TinacciL November 17, 2021
I installed the GPU version of DeePMD-kit (ghcr.io/deepmodeling/deepmd-kit:2.0.3_cuda10.1_gpu) via Docker. I tested it and it works fine with the provided example.
I am now trying to model water in cluster configurations (nopbc). I created a database of about 20,000 frames (energies and forces for H2O clusters of 1 to 200 molecules at different temperatures).
I trained the model with almost the same input as the one provided in example/water/se_e2_a:
{
    "_comment": " model parameters",
    "model": {
        "type_map": ["O", "H"],
        "descriptor" :{
            "type": "se_e2_a",
            "sel": [70, 140],
            "rcut_smth": 0.50,
            "rcut": 6.00,
            "neuron": [25, 50, 100],
            "resnet_dt": false,
            "axis_neuron": 16,
            "seed": 1,
            "_comment": " that's all"
        },
        "fitting_net" : {
            "neuron": [340, 340, 340],
            "resnet_dt": true,
            "seed": 1,
            "_comment": " that's all"
        },
        "_comment": " that's all"
    },
    "learning_rate" :{
        "type": "exp",
        "decay_steps": 5000,
        "start_lr": 0.001,
        "stop_lr": 3.51e-8,
        "_comment": "that's all"
    },
    "loss" :{
        "type": "ener",
        "start_pref_e": 0.02,
        "limit_pref_e": 1,
        "start_pref_f": 1000,
        "limit_pref_f": 1,
        "start_pref_v": 0,
        "limit_pref_v": 0,
        "_comment": " that's all"
    },
    "training" : {
        "training_data": {
            "systems": ["../data_gfn2/train_1WM/", "../data_gfn2/train_2WM/", "../data_gfn2/train_10WM/", "../data_gfn2/train_60WM/", "../data_gfn2/train_100WM/", "../data_gfn2/train_200WM/"],
            "batch_size": "auto",
            "_comment": "that's all"
        },
        "validation_data":{
            "systems": ["../data_gfn2/test_1WM/", "../data_gfn2/test_2WM/", "../data_gfn2/test_10WM/", "../data_gfn2/test_60WM/", "../data_gfn2/test_100WM/", "../data_gfn2/test_200WM/"],
            "batch_size": 1,
            "numb_btch": 3,
            "_comment": "that's all"
        },
        "numb_steps": 1000000,
        "seed": 10,
        "disp_file": "lcurve.out",
        "disp_freq": 100,
        "save_freq": 1000,
        "_comment": "that's all"
    },
    "_comment": "that's all"
}
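For readers checking the schedule implied by this input, here is a minimal Python sketch of how the learning rate and the loss prefactors would evolve over the run. It assumes a staircase exponential decay whose rate is chosen so that `stop_lr` is reached at `numb_steps`, and prefactors that interpolate linearly with the current learning rate. Both forms are assumptions based on how DeePMD-kit documents its `exp` schedule and `ener` loss, and the function names are hypothetical.

```python
import math

# Values copied from the "learning_rate" and "training" sections of the input.
START_LR = 0.001
STOP_LR = 3.51e-8
DECAY_STEPS = 5000
NUMB_STEPS = 1_000_000


def learning_rate(step: int) -> float:
    """Staircase exponential decay (assumed form, not taken from the source)."""
    n_stairs = NUMB_STEPS / DECAY_STEPS
    decay_rate = math.exp(math.log(STOP_LR / START_LR) / n_stairs)
    return START_LR * decay_rate ** (step // DECAY_STEPS)


def prefactor(start_pref: float, limit_pref: float, step: int) -> float:
    """Loss prefactor interpolated with the learning rate (assumed form)."""
    return limit_pref + (start_pref - limit_pref) * learning_rate(step) / START_LR


# Print the schedule at the start and at the last two logged steps.
for step in (0, 999_900, 1_000_000):
    print(
        step,
        f"lr={learning_rate(step):.1e}",
        f"pref_e={prefactor(0.02, 1, step):.4f}",
        f"pref_f={prefactor(1000, 1, step):.4f}",
    )
```

Under these assumptions both prefactors end up very close to 1 at the end of training, so the final loss weights energy and force errors about equally.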
At the end of training, the lcurve.out file contains these values:
# step rmse_val rmse_trn rmse_e_val rmse_e_trn rmse_f_val rmse_f_trn lr
999700 2.86e-02 2.12e-02 2.15e-04 1.64e-04 2.76e-02 2.08e-02 3.7e-08
999800 2.30e-02 2.31e-02 1.26e-04 2.77e-04 2.25e-02 2.25e-02 3.7e-08
999900 2.20e-02 1.87e-02 6.70e-04 4.38e-04 2.10e-02 1.81e-02 3.7e-08
1000000 2.40e-02 1.95e-02 4.62e-04 4.25e-04 2.32e-02 1.89e-02 3.5e-08
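Because validation here uses only 3 batches of size 1 per evaluation, individual rows fluctuate. A small sketch that averages the quoted rows gives a steadier number to compare against. The parsing helpers are hypothetical, and the units in the comments (eV per atom for the energy columns, eV/A for the force columns) are an assumption about how DeePMD-kit normalizes lcurve.out, not something stated in the output itself.

```python
# The four rows quoted above, copied verbatim from lcurve.out.
LCURVE_TAIL = """\
# step rmse_val rmse_trn rmse_e_val rmse_e_trn rmse_f_val rmse_f_trn lr
999700 2.86e-02 2.12e-02 2.15e-04 1.64e-04 2.76e-02 2.08e-02 3.7e-08
999800 2.30e-02 2.31e-02 1.26e-04 2.77e-04 2.25e-02 2.25e-02 3.7e-08
999900 2.20e-02 1.87e-02 6.70e-04 4.38e-04 2.10e-02 1.81e-02 3.7e-08
1000000 2.40e-02 1.95e-02 4.62e-04 4.25e-04 2.32e-02 1.89e-02 3.5e-08
"""


def parse_lcurve(text):
    """Split the header (after '#') into column names and rows into floats."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    names = lines[0].lstrip("#").split()
    rows = [[float(x) for x in ln.split()] for ln in lines[1:]]
    return names, rows


def column_means(names, rows):
    """Average every column over the given rows."""
    return {name: sum(r[i] for r in rows) / len(rows) for i, name in enumerate(names)}


names, rows = parse_lcurve(LCURVE_TAIL)
means = column_means(names, rows)

# Energy columns: assumed eV per atom. Force columns: assumed eV/A.
for key in ("rmse_e_val", "rmse_e_trn", "rmse_f_val", "rmse_f_trn"):
    print(f"{key}: {means[key]:.3e}")
```

For a full file, the same averaging could be applied to the last few hundred rows instead of four.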
After freezing the model, I ran the dp test command on some of the validation data and got these results:
DEEPMD INFO # number of test data : 10
DEEPMD INFO Energy RMSE : 6.584139e+03 eV
DEEPMD INFO Energy RMSE/Natoms : 1.097356e+02 eV
DEEPMD INFO Force RMSE : 3.281405e-01 eV/A
DEEPMD INFO Virial RMSE : 2.225164e+00 eV
DEEPMD INFO Virial RMSE/Natoms : 3.708607e-02 eV
I also ran it on the training data, to see whether this was an overfitting problem, and got:
DEEPMD INFO # number of test data : 10
DEEPMD INFO Energy RMSE : 6.584253e+03 eV
DEEPMD INFO Energy RMSE/Natoms : 1.097375e+02 eV
DEEPMD INFO Force RMSE : 3.373093e-01 eV/A
DEEPMD INFO Virial RMSE : 3.646905e+00 eV
DEEPMD INFO Virial RMSE/Natoms : 6.078175e-02 eV
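To put the two reports next to the training log, the following sketch does the arithmetic on the quoted numbers only: it recovers the atom count implied by each report and the ratio of each `dp test` error to the final lcurve.out row. The dictionary keys are hypothetical names, and the comparison assumes the lcurve.out energy column is a per-atom quantity.

```python
# Numbers copied from the two `dp test` reports above.
reports = {
    "validation": {
        "e_rmse": 6.584139e03,           # eV
        "e_rmse_per_atom": 1.097356e02,  # eV
        "f_rmse": 3.281405e-01,          # eV/A
    },
    "training": {
        "e_rmse": 6.584253e03,
        "e_rmse_per_atom": 1.097375e02,
        "f_rmse": 3.373093e-01,
    },
}

# Final row of lcurve.out (step 1000000), validation and training columns.
lcurve_final = {
    "validation": {"rmse_e": 4.62e-04, "rmse_f": 2.32e-02},
    "training": {"rmse_e": 4.25e-04, "rmse_f": 1.89e-02},
}

summary = {}
for name, rep in reports.items():
    # Atom count implied by the ratio of total to per-atom energy RMSE.
    natoms = rep["e_rmse"] / rep["e_rmse_per_atom"]
    # How much larger each dp test error is than the logged training error.
    e_ratio = rep["e_rmse_per_atom"] / lcurve_final[name]["rmse_e"]
    f_ratio = rep["f_rmse"] / lcurve_final[name]["rmse_f"]
    summary[name] = (natoms, e_ratio, f_ratio)
    print(f"{name}: natoms ~ {natoms:.1f}, energy x{e_ratio:.3g}, force x{f_ratio:.3g}")
```

The energy error is nearly identical on training and validation data while the force error is off by a much smaller factor, which may point to something systematic in how the energies are evaluated rather than to overfitting. That is an inference from the numbers, not a diagnosis.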
Why does the dp test command not give the same results as the on-the-fly testing during training?
Is this a problem with nopbc, or just my inexperience?
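One thing that can be checked quickly on the nopbc side is whether every system directory is actually flagged as non-periodic. The sketch below assumes the DeePMD-kit data convention of an empty file named `nopbc` inside each system directory. That convention should be verified against the documentation for version 2.0.3, and `missing_nopbc` is a hypothetical helper, not part of DeePMD-kit.

```python
from pathlib import Path

# System directories copied from the training input above.
SYSTEMS = [
    "../data_gfn2/train_1WM/", "../data_gfn2/train_2WM/", "../data_gfn2/train_10WM/",
    "../data_gfn2/train_60WM/", "../data_gfn2/train_100WM/", "../data_gfn2/train_200WM/",
    "../data_gfn2/test_1WM/", "../data_gfn2/test_2WM/", "../data_gfn2/test_10WM/",
    "../data_gfn2/test_60WM/", "../data_gfn2/test_100WM/", "../data_gfn2/test_200WM/",
]


def missing_nopbc(system_dirs):
    """Return the directories that do not contain a file named `nopbc`."""
    return [d for d in system_dirs if not (Path(d) / "nopbc").is_file()]


for d in missing_nopbc(SYSTEMS):
    print(f"no nopbc marker found in {d}")
```

If any directory is listed, the same check is worth repeating on whatever path was passed to dp test.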
Thanks