Reshape Layers before calling Forward / Backward #35

Merged
lukeyeager merged 1 commit into NVIDIA:master from slayton58:reshape_fix on Nov 20, 2015
Conversation

@slayton58

Fixes an issue where cuDNN convolution algorithms were chosen at setup time, and subsequent layer reshapes then caused allocations that left too little memory available to actually allocate the workspace(s) in CuDNNConvolutionLayer::{Forward,Backward}_gpu().
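
In essence, the change is to re-run each layer's Reshape() immediately before its Forward()/Backward() call, so that cuDNN algorithm and workspace-size selection reflects the memory actually available at execution time rather than at network-setup time. A minimal sketch of the idea against upstream Caffe's Net class (not the literal diff; member names like `layers_` and `bottom_vecs_` are the upstream ones, and the real change covers the backward pass as well):

```cpp
// Sketch only: the key point is the Reshape() call added ahead of Forward().
template <typename Dtype>
Dtype Net<Dtype>::ForwardFromTo(int start, int end) {
  Dtype loss = 0;
  for (int i = start; i <= end; ++i) {
    // Re-run Reshape() so layers such as CuDNNConvolutionLayer re-pick
    // their algorithms and workspace sizes against the memory that is
    // free now, not what was free during net construction.
    layers_[i]->Reshape(bottom_vecs_[i], top_vecs_[i]);
    loss += layers_[i]->Forward(bottom_vecs_[i], top_vecs_[i]);
  }
  return loss;
}
```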

@lukeyeager added the bug label on Sep 17, 2015
@lukeyeager
Member

This fixes the bug that Andrew reported. Is it safe enough to merge? I don't understand all the implications of making this change.

@borisfom

Looks safe to me - definitely should help with memory issues. Luke, it's your call to merge.

@lukeyeager
Member

I'll defer to @thatguymike - he said this required some more thought.

@thatguymike

I think this is okay now.

@thatguymike

So the question is whether we need/want this path on 0.14 as well. With the CUB path I don't think we need it (and I don't think it causes an issue either), but it may still be needed with CNMeM. @borisfom and @slayton58, comments?

@borisfom

Doesn't it recalculate the actual workspace size needed for the forward/backward pass?
If so, it should affect performance (or even success?) regardless of which pool is used.
Also, with the change I merged last week (adding a buffer class that retains memory locally in the absence of a pool), any pool strategy should perform about the same.

@slayton58
Author

I think it'll still be needed (unless @borisfom has made changes I haven't kept up with) -- we can still hit the case where Reshape() on a convolution layer during network initialization sees that a large amount of memory is currently free and therefore picks a costly algorithm. We then finish initializing the network, potentially allocating all remaining memory for the subsequent layers and their parameters. When we later try to grab that workspace from the allocator during the forward or backward pass, there may not be enough memory left, causing the error.
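
For background, the cuDNN layers derive their workspace limit from the device memory that is free at the moment Reshape() runs, which is why an init-time choice can go stale. A simplified fragment of the pattern (not the actual CuDNNConvolutionLayer code: handle and descriptor setup are omitted, the variable names are illustrative, and the heuristic shown is an assumption, but the cuDNN calls are the 2015-era API the layer relies on):

```cpp
// Simplified illustration of why an init-time algorithm choice can go stale.
// handle and the tensor/filter/convolution descriptors are assumed to be
// set up already; names are illustrative, not the real member names.
size_t free_bytes = 0, total_bytes = 0;
CUDA_CHECK(cudaMemGetInfo(&free_bytes, &total_bytes));

// At setup time much of the device may be free, so a generous limit is
// passed and cuDNN may pick a fast but workspace-hungry algorithm.
size_t workspace_limit = free_bytes / 8;  // illustrative heuristic only

cudnnConvolutionFwdAlgo_t fwd_algo;
CUDNN_CHECK(cudnnGetConvolutionForwardAlgorithm(
    handle, bottom_desc, filter_desc, conv_desc, top_desc,
    CUDNN_CONVOLUTION_FWD_SPECIFY_WORKSPACE_LIMIT,
    workspace_limit, &fwd_algo));

size_t workspace_size = 0;
CUDNN_CHECK(cudnnGetConvolutionForwardWorkspaceSize(
    handle, bottom_desc, filter_desc, conv_desc, top_desc,
    fwd_algo, &workspace_size));

// The rest of the net is then constructed, consuming device memory, so by
// the time Forward_gpu() asks the allocator for workspace_size bytes, the
// memory that justified the choice may already be gone. Re-running
// Reshape() right before Forward()/Backward() (this PR) recomputes the
// limit against what is actually free at that point.
```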

@borisfom

Agreed with Simon - dynamics may be very different and recalculation would not hurt.
When using a pool we are actually more prone to the error: in the no-pool case the memory is already retained, so as long as the calculated workspace size hasn't changed since the last call, allocation won't fail. The pool case will fail if it tries to allocate the workspace using stale workspace-size information.

lukeyeager added a commit that referenced this pull request Nov 20, 2015
Reshape Layers before calling Forward / Backward
@lukeyeager merged commit a8cf019 into NVIDIA:master on Nov 20, 2015
@lukeyeager
Member

Merged into master and caffe-0.13.

