Skip to content

Comments

CuDNN 7 grouped convolution#5879

Open
kouyoumin wants to merge 4 commits intoBVLC:masterfrom
kouyoumin:cudnn7
Open

CuDNN 7 grouped convolution#5879
kouyoumin wants to merge 4 commits intoBVLC:masterfrom
kouyoumin:cudnn7

Conversation

@kouyoumin
Copy link

Uses CuDNN 7's new grouped convolution instead of for loop.

@fengziyong
Copy link

@kouyoumin Have you test the forward time?
I have try to use cuDNNv7 before, but I found it was slower than the original implementation in group mode.

@Noiredd
Copy link
Member

Noiredd commented Nov 2, 2017

I have restarted the tests that failed prior to #5973, it passes now. However I'm not sure if Travis is even able to test cuDNN 7 right now. Could you take a look at #5972, where the dependencies script was also modified, and compare?

Also, this needs a squash before merging, but I think that for a review it's good as is.

@twmht
Copy link
Contributor

twmht commented Nov 3, 2017

not fast enough.

Anyone have tested this on MobileNet?

@Noiredd
Copy link
Member

Noiredd commented Nov 7, 2017

@kouyoumin Please note before rebasing that #5972 modified the cudnn.hpp, adding the v7 cases.

Copy link
Member

@Noiredd Noiredd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kouyoumin I made a more thorough testing of this PR: from what I see (both on some synthetic benchmark nets like a 1x256x256x256 cube convolved with 5x5 filters in a group of 256, as well as more realistic nets), this does not seem to make it faster (on my old GTX TITAN Z), but the RAM saving is quite significant (from 4% to almost 60% - the larger group the better). Please take a look at the comments I left in the code review, squash and rebase and I will merge this.

cudnn::dataType<Dtype>::one,
bias_desc_, bias_diff + bias_offset_ * g));
#else
CUDNN_CHECK(cudnnConvolutionBackwardBias(handle_[g],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we leave CUDNN_CHECK(cudnnConvolutionBackwardBias(handle_[0*this->group_ + g], here?

case CUDNN_STATUS_RUNTIME_IN_PROGRESS:
return "CUDNN_STATUS_RUNTIME_IN_PROGRESS";
case CUDNN_STATUS_RUNTIME_FP_OVERFLOW:
return "CUDNN_STATUS_RUNTIME_FP_OVERFLOW";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding this fragement is no longer necessary (since #5972), please remove this while rebasing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants