Merged
8 changes: 4 additions & 4 deletions deepmd/train/trainer.py
```diff
@@ -384,10 +384,10 @@ def _build_training(self):
             optimizer = self.run_opt._HVD.DistributedOptimizer(optimizer)
         else:
             optimizer = tf.train.AdamOptimizer(learning_rate = self.learning_rate)
-        grads = tf.gradients(self.l2_l, trainable_variables)
-        apply_op = optimizer.apply_gradients (zip (grads, trainable_variables),
-                                              global_step=self.global_step,
-                                              name='train_step')
+        apply_op = optimizer.minimize(loss=self.l2_l,
+                                      global_step=self.global_step,
+                                      var_list=trainable_variables,
+                                      name='train_step')
         train_ops = [apply_op] + self._extra_train_ops
         self.train_op = tf.group(*train_ops)
         log.info("built training")
```
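The replaced pair of calls and the new single call do the same work: in TF1, `Optimizer.minimize()` is a thin wrapper that computes gradients and then applies them to the given variable list. A minimal plain-Python sketch of that split, with no TensorFlow dependency (`ToySGD` and `grad_fn` are hypothetical stand-ins, not DeePMD-kit code):

```python
class ToySGD:
    """Minimal gradient-descent optimizer mirroring the TF1 Optimizer split."""

    def __init__(self, lr):
        self.lr = lr

    def compute_gradients(self, grad_fn, var_list):
        # analogue of tf.gradients(loss, trainable_variables)
        return [(grad_fn(v), v) for v in var_list]

    def apply_gradients(self, grads_and_vars):
        # analogue of optimizer.apply_gradients(zip(grads, vars))
        return [v - self.lr * g for g, v in grads_and_vars]

    def minimize(self, grad_fn, var_list):
        # the one-call form used after this change:
        # compute_gradients followed by apply_gradients
        return self.apply_gradients(self.compute_gradients(grad_fn, var_list))

# toy loss = sum(v**2), so the gradient w.r.t. each variable is 2*v
grad_fn = lambda v: 2.0 * v
opt = ToySGD(lr=0.25)
variables = [1.0, -2.0]

two_step = opt.apply_gradients(opt.compute_gradients(grad_fn, variables))
one_step = opt.minimize(grad_fn, variables)
assert two_step == one_step == [0.5, -1.0]
```

The single-call form is shorter and lets a wrapped optimizer (such as Horovod's `DistributedOptimizer` above) hook gradient computation, which is why `minimize` is preferred here.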
14 changes: 10 additions & 4 deletions doc/train/parallel-training.md
@@ -5,13 +5,19 @@ Currently, parallel training is enabled in a synchronized way with the help of [Horovod](https://github.com/horovod/horovod).
Testing `examples/water/se_e2_a` on an 8-GPU host, near-linear acceleration is observed as the number of cards increases.

```diff
 | Num of GPU cards | Seconds every 100 samples | Samples per second | Speed up |
 | -- | -- | -- | -- |
-| 1 | 1.6116 | 62.05 | 1.00 |
-| 2 | 1.6310 | 61.31 | 1.98 |
-| 4 | 1.6168 | 61.85 | 3.99 |
-| 8 | 1.6212 | 61.68 | 7.95 |
+| 1 | 1.4515 | 68.89 | 1.00 |
+| 2 | 1.5962 | 62.65*2 | 1.82 |
+| 4 | 1.7635 | 56.71*4 | 3.29 |
+| 8 | 1.7267 | 57.91*8 | 6.72 |
```
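The derived columns in the updated rows follow directly from the timing column: per-card throughput is 100 samples divided by wall time, and speed-up is aggregate throughput relative to the 1-card baseline. A quick sketch of the arithmetic (values copied from the table above):

```python
# Wall time in seconds per 100 samples, per card count
seconds_per_100 = {1: 1.4515, 2: 1.5962, 4: 1.7635, 8: 1.7267}

# "Samples per second" is per-card throughput: 100 samples / wall time
per_card = {n: 100.0 / s for n, s in seconds_per_100.items()}

# "Speed up" is aggregate throughput (per-card * num cards) over the baseline
baseline = per_card[1]
speedup = {n: per_card[n] * n / baseline for n in per_card}

print(round(per_card[2], 2))  # 62.65
print(round(speedup[4], 2))   # 3.29
```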

To use this feature, please install Horovod and [mpi4py](https://github.com/mpi4py/mpi4py) first. For better performance on GPU, follow the tuning steps in [Horovod on GPU](https://github.com/horovod/horovod/blob/master/docs/gpus.rst).
```bash
# With GPU, prefer NCCL as communicator.
HOROVOD_WITHOUT_GLOO=1 HOROVOD_WITH_TENSORFLOW=1 HOROVOD_GPU_OPERATIONS=NCCL HOROVOD_NCCL_HOME=/path/to/nccl pip3 install horovod mpi4py
```

If you work in a CPU environment, prepare the runtime as below:
```bash
# By default, MPI is used as communicator.
HOROVOD_WITHOUT_GLOO=1 HOROVOD_WITH_TENSORFLOW=1 pip install horovod mpi4py
```
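Once Horovod and mpi4py are installed, parallel training is launched through MPI with one process per device. A hypothetical launch line (the rank count, device list, and input file name are placeholders to adapt to your setup):

```shell
# Launch 4 ranks, one per visible GPU; horovodrun wraps the MPI launcher.
# "dp train input.json" assumes the standard DeePMD-kit training entry point.
CUDA_VISIBLE_DEVICES=0,1,2,3 horovodrun -np 4 dp train input.json
```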