Multi-GPU Caffe training is slow with cuDNN #4901

I compared 8-GPU Caffe training with and without cuDNN. Surprisingly, cuDNN makes training slower in the 8-GPU case, even though it speeds up single-GPU training. Has anybody else seen this?

Here are some details:
OS: RHEL 6.5
CUDA: 7.5
CUDNN: 5.1
GPUs: 8 Tesla K80
Caffe model: CaffeNet reference model
Data set: ImageNet.

Speed:
   1 GPU with cuDNN = 1.7x the speed of 1 GPU without cuDNN.
   8 GPUs with cuDNN = 0.86x the speed of 8 GPUs without cuDNN.
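
To put the two ratios on the same footing, here is a small Python sketch of the arithmetic. It assumes both numbers are throughput ratios (speed with cuDNN divided by speed without cuDNN); the variable and function names are only illustrative. Dividing the 8-GPU ratio by the 1-GPU ratio gives how much worse the 1-to-8 GPU scaling is with cuDNN than without it.

```python
# Ratios reported above, assumed to be (speed with cuDNN) / (speed without cuDNN).
ratio_1gpu = 1.7
ratio_8gpu = 0.86


def relative_scaling(ratio_multi: float, ratio_single: float) -> float:
    """Return (1->N GPU scaling with cuDNN) / (1->N GPU scaling without cuDNN).

    With S = training speed:
      scaling_with    = S_N_with / S_1_with
      scaling_without = S_N_without / S_1_without
    so their quotient equals ratio_multi / ratio_single.
    """
    return ratio_multi / ratio_single


rel = relative_scaling(ratio_8gpu, ratio_1gpu)
print(f"Multi-GPU scaling with cuDNN is {rel:.2f}x the scaling without cuDNN")
```

In other words, going from 1 to 8 GPUs gains only about half as much with cuDNN as it does without, which suggests the loss is in the multi-GPU path rather than in the cuDNN kernels themselves.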

I can provide more information if needed.

