deepmodeling · amcadmus · Apr 20, 2021
diff --git a/doc/data-conv.md b/doc/data-conv.md
@@ -0,0 +1,41 @@
+# Data
+
+
+In this example we will convert DFT-labeled data stored in the VASP `OUTCAR` format into the data format used by DeePMD-kit. The example `OUTCAR` can be found in the directory
+```bash
+$deepmd_source_dir/examples/data_conv
+```
+
+
+## Definition
+
+DeePMD-kit organizes data in **`systems`**. Each `system` is composed of a number of **`frames`**. One may roughly view a `frame` as a snapshot of an MD trajectory, but it does not necessarily come from an MD simulation. A `frame` records the coordinates and types of the atoms, the cell vectors if periodic boundary conditions are assumed, the energy, the atomic forces and the virial. Note that all `frames` in one `system` share the same number of atoms, with the same types. 
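+
+As a minimal sketch of this layout, a `system` with 5 frames of a 3-atom molecule could be held in memory as follows. The dictionary keys, the sizes and the flattened array shapes are assumptions made for illustration; they are not taken from the text above.
+
+```python
+import numpy as np
+
+# Hypothetical sizes, chosen only for illustration.
+nframes, natoms = 5, 3
+
+system = {
+    # One type index per atom, shared by every frame of the system.
+    "type": np.array([0, 1, 1]),
+    # Per-frame quantities: the first axis runs over frames.
+    "coord": np.zeros((nframes, natoms * 3)),   # x, y, z of each atom
+    "box": np.zeros((nframes, 9)),              # 3 cell vectors, flattened
+    "energy": np.zeros(nframes),
+    "force": np.zeros((nframes, natoms * 3)),
+    "virial": np.zeros((nframes, 9)),
+}
+
+# Every frame has the same number of atoms, so one frame is a row slice.
+frame0_coord = system["coord"][0].reshape(natoms, 3)
+print(frame0_coord.shape)
+```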
+
+
+
+## Data conversion
+
+It is convenient to use [dpdata](https://github.com/deepmodeling/dpdata) to convert data generated by DFT packages to the data format used by DeePMD-kit.
+
+To install it, one can execute 
+```bash
+pip install dpdata
+```
+
+An example of converting [VASP](https://www.vasp.at/) data in `OUTCAR` format to DeePMD-kit data can be found at
+```
+$deepmd_source_dir/examples/data_conv
+```
+
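+Switch to that directory; one can then convert the data with the following Python script
+```python
+import dpdata
+# Load the labeled frames (coordinates, energies, forces) from the VASP OUTCAR.
+dsys = dpdata.LabeledSystem('OUTCAR')
+# Write all frames into a single set under the folder `deepmd_data`.
+dsys.to('deepmd/npy', 'deepmd_data', set_size = dsys.get_nframes())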
+```
+
+The `get_nframes()` method returns the number of frames in the `OUTCAR`, and the argument `set_size` enforces that the set size equals the number of frames in the system, i.e. only one `set` is created in the `system`. 
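+
+The effect of `set_size` can be sketched with a small helper. It assumes that the frames are split into consecutive chunks of at most `set_size` frames; `number_of_sets` is a hypothetical function written for this illustration and is not part of dpdata.
+
+```python
+def number_of_sets(nframes, set_size):
+    """Number of sets needed to hold `nframes` frames (ceiling division)."""
+    return -(-nframes // set_size)
+
+nframes = 200  # hypothetical number of frames in the OUTCAR
+print(number_of_sets(nframes, set_size=nframes))  # 1: one set holds everything
+print(number_of_sets(nframes, set_size=80))       # 3: a smaller set_size gives several sets
+```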
+
+The data in DeePMD-kit format is stored in the folder `deepmd_data`.
+
+A list of all [supported data formats](https://github.com/deepmodeling/dpdata#load-data) and more features of `dpdata` can be found on the [official website](https://github.com/deepmodeling/dpdata).
diff --git a/doc/train-hybrid.md b/doc/train-hybrid.md
@@ -0,0 +1,25 @@
+# Train a Deep Potential model using descriptor `"hybrid"`
+
+This descriptor hybridizes multiple descriptors to form a new descriptor. For example, given a list of descriptors denoted by D_1, D_2, ..., D_N, the hybrid descriptor is the concatenation of the list, i.e. D = (D_1, D_2, ..., D_N).
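+
+A minimal NumPy sketch of this concatenation is given below. The array shapes and the names `d_1`, `d_2` and `d_hybrid` are hypothetical; the sketch only illustrates how the per-atom feature vectors are joined and is not the DeePMD-kit implementation.
+
+```python
+import numpy as np
+
+# Hypothetical per-atom outputs of two descriptors for a system of 4 atoms.
+natoms = 4
+d_1 = np.ones((natoms, 6))    # e.g. 6 features per atom from the first descriptor
+d_2 = np.zeros((natoms, 3))   # e.g. 3 features per atom from the second descriptor
+
+# The hybrid descriptor joins the feature vectors of each atom end to end.
+d_hybrid = np.concatenate([d_1, d_2], axis=1)
+print(d_hybrid.shape)  # (4, 9)
+```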
+
+To use the descriptor in DeePMD-kit, one first sets the `type` to `"hybrid"`, then provides the definitions of the descriptors as the items of the `list`,
+```json=
+        "descriptor" :{
+            "type": "hybrid",
+            "list" : [
+                {
+		    "type" : "se_e2_a",
+		    ...		    
+                },
+                {
+		    "type" : "se_e2_r",
+		    ...
+                }
+            ]
+        },
+```
+
+A complete training input script of this example can be found in the directory
+```bash
+$deepmd_source_dir/examples/water/hybrid/input.json
+```
diff --git a/doc/train-se-e2-a.md b/doc/train-se-e2-a.md
@@ -0,0 +1,223 @@
+# Train a Deep Potential model using descriptor `"se_e2_a"`
+
+The notation of `se_e2_a` is short for the Deep Potential Smooth Edition (DeepPot-SE) constructed from all information (both angular and radial) of atomic configurations. The `e2` stands for the embedding with two-atom information. This descriptor is described in detail in [the DeepPot-SE paper](https://arxiv.org/abs/1805.09003).
+
+In this example we will train a DeepPot-SE model for a water system. A complete training input script of this example can be found in the directory
+```bash
+$deepmd_source_dir/examples/water/se_e2_a/input.json
+```
+With the training input script, data (please read the [warning](#warning)) are also provided in the example directory. One may train the model with the DeePMD-kit from the directory.
+
+The contents of the example:
+- [The training input](#the-training-input-script)
+- [Train a Deep Potential model](#train-a-deep-potential-model)
+- [Warning](#warning)
+
+## The training input script
+
+A working training script using descriptor `se_e2_a` is provided as `input.json` in the same directory as this README.
+
+The `input.json` is divided into several sections: `model`, `learning_rate`, `loss` and `training`. 
+
+For more information, one can consult the [full documentation](https://deepmd.readthedocs.io/en/master/train-input.html) on the training input script.
+
+### Model
+The `model` defines how the model is constructed, for example
+```json=
+    "model": {
+	"type_map":	["O", "H"],
+	"descriptor" :{
+            ...
+	},
+	"fitting_net" : {
+            ...
+	}
+    }
+```
+We are looking for a model for water, so we have two types of atoms. The atom types are recorded as integers. In this example, we denote `0` for oxygen and `1` for hydrogen. A mapping from the atom types to their names is provided by `type_map`. 
+
+The model has two subsections, `descriptor` and `fitting_net`, which define the descriptor and the fitting net, respectively. The `type_map` is optional; it provides the names (usually, but not necessarily, the element names) of the corresponding atom types.
+
+#### Descriptor
+The construction of the descriptor is given by section `descriptor`. An example of the descriptor is provided as follows
+```json=
+	"descriptor" :{
+	    "type":		"se_e2_a",
+	    "rcut_smth":	0.50,
+	    "rcut":		6.00,
+	    "sel":		[46, 92],
+	    "neuron":		[25, 50, 100],
+	    "axis_neuron":	16,
+	    "resnet_dt":	false,
+	    "seed":		1
+	}
+```
+* The `type` of the descriptor is set to `"se_e2_a"`. 
+* `rcut` is the cut-off radius for neighbor searching, and the `rcut_smth` gives where the smoothing starts. 
+* `sel` gives the maximum possible number of neighbors within the cut-off radius. It is a list whose length equals the number of atom types in the system, and `sel[i]` denotes the maximum possible number of neighbors of type `i`. 
+* `neuron` specifies the size of the embedding net. From left to right, the members denote the sizes of the hidden layers from the input end to the output end. If an outer layer is twice the size of the inner layer, then the inner layer is copied and concatenated, and a [ResNet architecture](https://arxiv.org/abs/1512.03385) is built between them.
+* `axis_neuron` specifies the size of the submatrix of the embedding matrix, i.e. the axis matrix, as explained in the [DeepPot-SE paper](https://arxiv.org/abs/1805.09003).
+* If the option `resnet_dt` is set `true`, then a timestep is used in the ResNet.
+* `seed` gives the random seed that is used to generate random numbers when initializing the model parameters.
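+
+The doubling rule stated for `neuron` can be sketched in NumPy as follows. This is only an illustration under stated assumptions: the `tanh` activation, the random weights, the scalar input and the helper `embedding_layer` are hypothetical, and the `resnet_dt` timestep is left out.
+
+```python
+import numpy as np
+
+def embedding_layer(x, w, b):
+    """One hidden layer with the size-doubling skip connection described above."""
+    n_in, n_out = w.shape
+    y = np.tanh(x @ w + b)                    # tanh is an assumed activation
+    if n_out == 2 * n_in:
+        # The inner layer is copied and concatenated, then added to the output.
+        return np.concatenate([x, x], axis=-1) + y
+    return y                                   # other sizes: no skip connection here
+
+rng = np.random.default_rng(1)
+x = rng.normal(size=(4, 1))                    # 4 hypothetical scalar inputs
+sizes = [1, 25, 50, 100]                       # input width followed by `neuron`
+for n_in, n_out in zip(sizes[:-1], sizes[1:]):
+    w = rng.normal(size=(n_in, n_out))
+    b = np.zeros(n_out)
+    x = embedding_layer(x, w, b)
+print(x.shape)  # (4, 100)
+```
+
+With `"neuron": [25, 50, 100]`, the steps 25 to 50 and 50 to 100 both double the width, so both of them receive the skip connection in this sketch.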
+
+
+#### Fitting
+The construction of the fitting net is given by the section `fitting_net`
+```json=
+	"fitting_net" : {
+	    "neuron":		[240, 240, 240],
+	    "resnet_dt":	true,
+	    "seed":		1
+	},
+```
+* `neuron` specifies the size of the fitting net. If two neighboring layers are of the same size, then a [ResNet architecture](https://arxiv.org/abs/1512.03385) is built between them. 
+* If the option `resnet_dt` is set `true`, then a timestep is used in the ResNet. 
+* `seed` gives the random seed that is used to generate random numbers when initializing the model parameters.
+
+### Learning rate
+
+The `learning_rate` section in `input.json` is given as follows
+```json=
+    "learning_rate" :{
+	"type":		"exp",
+	"start_lr":	0.001,
+	"stop_lr":	3.51e-8,
+	"decay_steps":	5000,
+	"_comment":	"that's all"
+    }
+```
+* `start_lr` gives the learning rate at the beginning of the training.
+* `stop_lr` gives the learning rate at the end of the training. It should be small enough to ensure that the network parameters satisfactorily converge. 
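+* During the training, the learning rate decays exponentially from `start_lr` to `stop_lr` following the formula
+    ```
+    lr(t) = start_lr * decay_rate ^ ( t / decay_steps )
+    ```
+    where `t` is the training step. `decay_rate` does not appear in the input; it is fixed by requiring that the learning rate reaches `stop_lr` at the end of the training.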
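+
+The schedule can be sketched in a few lines of Python. The sketch assumes that `decay_rate` is chosen such that the learning rate equals `stop_lr` at the last step (`numb_step` in the `training` section), and it evaluates the formula continuously; an implementation may instead update the rate only once every `decay_steps` steps. The helper names are hypothetical.
+
+```python
+import math
+
+def decay_rate(start_lr, stop_lr, decay_steps, numb_step):
+    """Decay rate for which lr(numb_step) equals stop_lr (assumed convention)."""
+    return math.exp(math.log(stop_lr / start_lr) / (numb_step / decay_steps))
+
+def lr(t, start_lr, rate, decay_steps):
+    """Learning rate at training step t, following the formula above."""
+    return start_lr * rate ** (t / decay_steps)
+
+# Values from the example input.json; numb_step is taken from its training section.
+start_lr, stop_lr, decay_steps, numb_step = 0.001, 3.51e-8, 5000, 1000000
+rate = decay_rate(start_lr, stop_lr, decay_steps, numb_step)
+print(rate)                                         # close to 0.95
+print(lr(0, start_lr, rate, decay_steps))           # start_lr
+print(lr(numb_step, start_lr, rate, decay_steps))   # stop_lr, up to rounding
+```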
+
+### Loss
+
+The loss function of DeePMD-kit is given by
+```
+loss = pref_e * loss_e + pref_f * loss_f + pref_v * loss_v
+```
+where `loss_e`, `loss_f` and `loss_v` denote the loss in energy, force and virial, respectively. `pref_e`, `pref_f` and `pref_v` give the prefactors of the energy, force and virial losses. The prefactors need not be constant; rather, they change linearly with the learning rate. Taking the force prefactor as an example, at training step `t` it is given by
+```math
+pref_f(t) = start_pref_f * ( lr(t) / start_lr ) + limit_pref_f * ( 1 - lr(t) / start_lr )
+```
+where `lr(t)` denotes the learning rate at step `t`. `start_pref_f` and `limit_pref_f` specify `pref_f` at the start of the training and in the limit `t -> inf`, respectively.
+
+The `loss` section in the `input.json` is 
+```json=
+    "loss" : {
+	"start_pref_e":	0.02,
+	"limit_pref_e":	1,
+	"start_pref_f":	1000,
+	"limit_pref_f":	1,
+	"start_pref_v":	0,
+	"limit_pref_v":	0
+    }
+```
+The options `start_pref_e`, `limit_pref_e`, `start_pref_f`, `limit_pref_f`, `start_pref_v` and `limit_pref_v` determine the start and limit prefactors of energy, force and virial, respectively.
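+
+The linear interpolation of a prefactor can be checked with a short sketch that uses the force prefactors of the example input. The helper name `pref` is hypothetical; the formula is the one given for `pref_f(t)` above.
+
+```python
+def pref(lr_t, start_lr, start_pref, limit_pref):
+    """Prefactor interpolated linearly in the learning rate lr_t."""
+    ratio = lr_t / start_lr
+    return start_pref * ratio + limit_pref * (1.0 - ratio)
+
+start_lr = 0.001
+# Force prefactors from the example input: start_pref_f = 1000, limit_pref_f = 1.
+print(pref(start_lr, start_lr, 1000, 1))        # at the first step: 1000.0
+print(pref(0.0, start_lr, 1000, 1))             # as lr -> 0: 1.0
+print(pref(0.5 * start_lr, start_lr, 1000, 1))  # halfway: 500.5
+```
+
+The force term therefore dominates the loss early in the training, and its weight decreases towards `limit_pref_f` as the learning rate decays.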
+
+If one does not want to train with the virial, the virial prefactors `start_pref_v` and `limit_pref_v` may be set to 0.
+
+### Training parameters
+
+Other training parameters are given in the `training` section.
+```json=
+    "training": {
+ 	"training_data": {
+	    "systems":		["../data_water/data_0/", "../data_water/data_1/", "../data_water/data_2/"],
+	    "batch_size":	"auto"
+	},
+	"validation_data":{
+	    "systems":		["../data_water/data_3"],
+	    "batch_size":	1,
+	    "numb_btch":	3
+	},
+
+	"numb_step":	1000000,
+	"seed":		1,
+	"disp_file":	"lcurve.out",
+	"disp_freq":	100,
+	"save_freq":	1000
+    }
+```
+The sections `"training_data"` and `"validation_data"` give the training dataset and validation dataset, respectively. Taking the training dataset for example, the keys are explained below:
+* `systems` provides the paths of the training data systems. DeePMD-kit allows you to provide multiple systems. This key can be a `list` or a `str`.
+    * `list`: `systems` gives the training data systems.
+    * `str`: `systems` should be a valid path. DeePMD-kit will recursively search all data systems in this path.
+* At each training step, DeePMD-kit randomly picks `batch_size` frame(s) from one of the systems. The probability of using a system is by default proportional to the number of batches in the system. More options are available for automatically determining the probability of using systems. One can set the key `auto_prob` to
+    * `"prob_uniform"` all systems are used with the same probability.
+    * `"prob_sys_size"` the probability of using a system is proportional to its size (number of frames).
+    * `"prob_sys_size; sidx_0:eidx_0:w_0; sidx_1:eidx_1:w_1;..."` the `list` of systems is divided into blocks. Block `i` contains the systems ranging from `sidx_i` up to, but not including, `eidx_i`. The probability of using a system from block `i` is proportional to `w_i`. Within one block, the probability of using a system is proportional to its size.
+* An example of using `"auto_prob"` is given below. The probability of using `systems[2]` is 0.4, and the sum of the probabilities of using `systems[0]` and `systems[1]` is 0.6. If the number of frames in `systems[1]` is twice that in `systems[0]`, then the probability of using `systems[1]` is 0.4 and that of `systems[0]` is 0.2.
+```json=
+ 	"training_data": {
+	    "systems":		["../data_water/data_0/", "../data_water/data_1/", "../data_water/data_2/"],
+	    "auto_prob":	"prob_sys_size; 0:2:0.6; 2:3:0.4",
+	    "batch_size":	"auto"
+	}
+```
+* The probability of using systems can also be specified explicitly with the key `"sys_prob"`, a list whose length equals the number of systems. For example
+```json=
+ 	"training_data": {
+	    "systems":		["../data_water/data_0/", "../data_water/data_1/", "../data_water/data_2/"],
+	    "sys_prob":	[0.5, 0.3, 0.2],
+	    "batch_size":	"auto:32"
+	}
+```
+* The key `batch_size` specifies the number of frames used to train or validate the model in a training step. It can be set to
+    * `list`: the length of which is the same as the `systems`. The batch size of each system is given by the elements of the list.
+    * `int`: all systems use the same batch size.
+    * `"auto"`: the same as `"auto:32"`, see `"auto:N"`
+    * `"auto:N"`: automatically determines the batch size so that the `batch_size` times the number of atoms in the system is no less than `N`.
+* The key `numb_btch` in `validation_data` gives the number of batches used for model validation. Note that the batches may not be from the same system.
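+
+The block form of `auto_prob` can be sketched as follows. The sketch reproduces the numbers quoted for `"prob_sys_size; 0:2:0.6; 2:3:0.4"`; the frame counts and the helper `block_probabilities` are hypothetical, and normalizing by the sum of the block weights is an assumption.
+
+```python
+def block_probabilities(sizes, blocks):
+    """Per-system probabilities for "prob_sys_size; s:e:w; ..." style blocks.
+
+    sizes  -- number of frames of each system
+    blocks -- (start, end, weight) tuples; `end` is exclusive
+    """
+    total_weight = sum(w for _, _, w in blocks)
+    probs = [0.0] * len(sizes)
+    for start, end, weight in blocks:
+        block_size = sum(sizes[start:end])
+        for i in range(start, end):
+            # Block weight, shared inside the block in proportion to system size.
+            probs[i] = weight / total_weight * sizes[i] / block_size
+    return probs
+
+# systems[1] holds twice as many frames as systems[0]; the sizes are hypothetical.
+sizes = [100, 200, 50]
+probs = block_probabilities(sizes, [(0, 2, 0.6), (2, 3, 0.4)])
+print([round(p, 3) for p in probs])  # [0.2, 0.4, 0.4]
+```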
+
+Other keys in the `training` section are explained below:
+* `numb_step` The number of training steps.
+* `seed` The random seed for getting frames from the training data set.
+* `disp_file` The file for printing learning curve.
+* `disp_freq` The frequency of printing the learning curve, given in units of training steps.
+* `save_freq` The frequency of saving a checkpoint.
+
+
+## Train a Deep Potential model
+When the input script is prepared, one may start training by 
+```bash=
+dp train input.json
+```
+By default, the verbosity level of DeePMD-kit is `INFO`, and one may see a lot of important information on the code and environment printed on the screen. Among them, two pieces of information regarding the data systems are worth special notice. 
+```bash=
+DEEPMD INFO    ---Summary of DataSystem: training     -----------------------------------------------
+DEEPMD INFO    found 3 system(s):
+DEEPMD INFO                                        system  natoms  bch_sz   n_bch   prob  pbc
+DEEPMD INFO                         ../data_water/data_0/     192       1      80  0.250    T
+DEEPMD INFO                         ../data_water/data_1/     192       1     160  0.500    T
+DEEPMD INFO                         ../data_water/data_2/     192       1      80  0.250    T
+DEEPMD INFO    --------------------------------------------------------------------------------------
+DEEPMD INFO    ---Summary of DataSystem: validation   -----------------------------------------------
+DEEPMD INFO    found 1 system(s):
+DEEPMD INFO                                        system  natoms  bch_sz   n_bch   prob  pbc
+DEEPMD INFO                          ../data_water/data_3     192       1      80  1.000    T
+DEEPMD INFO    --------------------------------------------------------------------------------------
+```
+DeePMD-kit prints detailed information on the training and validation data sets. The data sets are defined by `"training_data"` and `"validation_data"` in the `"training"` section of the input script. The training data set is composed of three data systems, while the validation data set is composed of one data system. The number of atoms, the batch size, the number of batches in the system and the probability of using the system are all shown on the screen. The last column shows whether periodic boundary conditions are assumed for the system. 
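+
+The `bch_sz` and `prob` columns of the training summary can be reproduced with a small sketch. It assumes that `"auto"` picks the smallest batch size satisfying the `"auto:N"` rule with `N = 32`, and that the default probability is proportional to the number of batches; both helper functions are hypothetical.
+
+```python
+import math
+
+def auto_batch_size(natoms, n=32):
+    """Smallest batch size with batch_size * natoms >= n (the "auto:N" rule)."""
+    return math.ceil(n / natoms)
+
+def default_probabilities(n_batches):
+    """Probability of each system, proportional to its number of batches."""
+    total = sum(n_batches)
+    return [n / total for n in n_batches]
+
+# Values read from the training summary above.
+print(auto_batch_size(natoms=192))            # 1, the bch_sz column
+print(default_probabilities([80, 160, 80]))   # [0.25, 0.5, 0.25], the prob column
+```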
+
+During the training, the error of the model is tested every `disp_freq` training steps with the batch used to train the model and with `numb_btch` batches from the validation data. The training error and validation error are printed correspondingly in the file `disp_file`. The batch size can be set in the input script by the key `batch_size` in the corresponding sections for the training and validation data sets. An example of the output is
+```bash=
+#  step      rmse_val    rmse_trn    rmse_e_val  rmse_e_trn    rmse_f_val  rmse_f_trn         lr
+      0      3.33e+01    3.41e+01      1.03e+01    1.03e+01      8.39e-01    8.72e-01    1.0e-03
+    100      2.57e+01    2.56e+01      1.87e+00    1.88e+00      8.03e-01    8.02e-01    1.0e-03
+    200      2.45e+01    2.56e+01      2.26e-01    2.21e-01      7.73e-01    8.10e-01    1.0e-03
+    300      1.62e+01    1.66e+01      5.01e-02    4.46e-02      5.11e-01    5.26e-01    1.0e-03
+    400      1.36e+01    1.32e+01      1.07e-02    2.07e-03      4.29e-01    4.19e-01    1.0e-03
+    500      1.07e+01    1.05e+01      2.45e-03    4.11e-03      3.38e-01    3.31e-01    1.0e-03
+```
+The file contains 8 columns, which from left to right are the training step, the validation loss, the training loss, the root mean square (RMS) validation error of the energy, the RMS training error of the energy, the RMS validation error of the force, the RMS training error of the force and the learning rate. The RMS error (RMSE) of the energy is normalized by the number of atoms in the system.
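+
+One possible way to read such a learning curve is NumPy's `genfromtxt`, sketched below on the first rows of the example output. The text is read from an in-memory buffer to keep the sketch self-contained; in practice one would pass the path of the `disp_file` instead.
+
+```python
+import io
+import numpy as np
+
+# First rows of the example learning curve, copied from the output above.
+lcurve = io.StringIO("""\
+#  step      rmse_val    rmse_trn    rmse_e_val  rmse_e_trn    rmse_f_val  rmse_f_trn         lr
+      0      3.33e+01    3.41e+01      1.03e+01    1.03e+01      8.39e-01    8.72e-01    1.0e-03
+    100      2.57e+01    2.56e+01      1.87e+00    1.88e+00      8.03e-01    8.02e-01    1.0e-03
+    200      2.45e+01    2.56e+01      2.26e-01    2.21e-01      7.73e-01    8.10e-01    1.0e-03
+""")
+
+# names=True takes the column names from the commented header line.
+data = np.genfromtxt(lcurve, names=True)
+for step, err in zip(data["step"], data["rmse_f_val"]):
+    print(int(step), err)
+```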
+
+## Warning
+Note that the example water data (in the folder `examples/water/data`) is very limited in amount. It is provided only for testing purposes and should not be used to train a production model.
+
+
+
diff --git a/doc/train-se-e2-r.md b/doc/train-se-e2-r.md
@@ -0,0 +1,23 @@
+# Train a Deep Potential model using descriptor `"se_e2_r"`
+
+The notation of `se_e2_r` is short for the Deep Potential Smooth Edition (DeepPot-SE) constructed from the radial information of atomic configurations. The `e2` stands for the embedding with two-atom information. 
+
+A complete training input script of this example can be found in the directory
+```bash
+$deepmd_source_dir/examples/water/se_e2_r/input.json
+```
+
+The training input script is very similar to that of [`se_e2_a`](train-se-e2-a.md#the-training-input-script). The only difference lies in the `descriptor` section
+```json=
+	"descriptor": {
+	    "type":		"se_e2_r",
+	    "sel":		[46, 92],
+	    "rcut_smth":	0.50,
+	    "rcut":		6.00,
+	    "neuron":		[5, 10, 20],
+	    "resnet_dt":	false,
+	    "seed":		1,
+	    "_comment": " that's all"
+	},
+```
+The type of the descriptor is set by the key `"type"`.
diff --git a/doc/train-se-e3.md b/doc/train-se-e3.md
@@ -0,0 +1,23 @@
+# Train a Deep Potential model using descriptor `"se_e3"`
+
+The notation of `se_e3` is short for the Deep Potential Smooth Edition (DeepPot-SE) constructed from all information (both angular and radial) of atomic configurations. The embedding takes angles between two neighboring atoms as input (denoted by `e3`).
+
+A complete training input script of this example can be found in the directory
+```bash
+$deepmd_source_dir/examples/water/se_e3/input.json
+```
+
+The training input script is very similar to that of [`se_e2_a`](train-se-e2-a.md#the-training-input-script). The only difference lies in the `descriptor` section
+```json=
+	"descriptor": {
+	    "type":		"se_e3",
+	    "sel":		[40, 80],
+	    "rcut_smth":	0.50,
+	    "rcut":		6.00,
+	    "neuron":		[2, 4, 8],
+	    "resnet_dt":	false,
+	    "seed":		1,
+	    "_comment":		" that's all"
+	},
+```
+The type of the descriptor is set by the key `"type"`.