ReshapeLayer and FlattenLayer should not allow in-place computation #3393
kkhoot wants to merge 1 commit into BVLC:master
Conversation
While so far only neuron layers are supposed to be applied in-place, I am not sure whether anyone relies on in-place reshape.
Perhaps this should be generalized to throw an error for all layers that aren't neuron layers?

Yes, I believe that would be the correct guideline for in-place computation.
For me, unlike other common layers, these reshaping layers seemed to work in-place since they do not change the input values, and I had been using them that way for months. I only recently found out during debugging that this does not work. I am not sure how common this case is, but IMO it is better to prevent it, or at least document it.
Generally I think in-place should be disallowed except in layers where the coder for that layer has explicitly indicated that it is safe. I think this needs to be part of an explicit declaration in the layer interface. There are also considerations in terms of the previous layer. For example, if a layer needs the shape of its top blob for the backward pass, a following layer that reshapes that blob in-place will corrupt its gradients.
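A minimal sketch of what such an explicit declaration could look like; the AllowsInPlace() name and the setup-time check are hypothetical, not existing Caffe API:

```cpp
// Hypothetical sketch -- not Caffe's actual Layer interface.
template <typename Dtype>
class Layer {
 public:
  virtual ~Layer() {}
  // Conservative default: in-place computation is unsafe unless a layer
  // explicitly overrides this to declare otherwise.
  virtual bool AllowsInPlace() const { return false; }
};

// Neuron layers operate elementwise on their input, so they could opt in.
template <typename Dtype>
class ReLULayer : public Layer<Dtype> {
 public:
  virtual bool AllowsInPlace() const { return true; }
};
```

Net setup could then reject any layer configured with top[0] == bottom[0] whose AllowsInPlace() returns false, instead of each layer implementing its own ad-hoc guard.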
Thanks for the PR @kkhoot. I think these checks should be moved to the Reshape method. I started trying to clean up the in-place behavior a while ago but never really completed it. I was trying to have each layer explicitly declare whether it supports in-place computation.
Thanks for the comments. I also agree with @seanbell's idea. I hope @jeffdonahue's nice work will be merged someday.
I went ahead and merged this in 940b299 (with a couple of spacing fixes) as it's useful to have even without fancier/default checks. These checks should be removed if/when such checks are implemented. (And I realized @kkhoot put the check in the Reshape method already, as suggested above.)
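For reference, the merged guard boils down to a pointer comparison in the layer's Reshape; roughly the following (paraphrased here, see commit 940b299 for the exact code):

```cpp
template <typename Dtype>
void FlattenLayer<Dtype>::Reshape(const vector<Blob<Dtype>*>& bottom,
    const vector<Blob<Dtype>*>& top) {
  // In-place would mean reshaping the previous layer's top blob as a side
  // effect, so reject it by comparing the raw Blob pointers.
  CHECK_NE(top[0], bottom[0]) << this->type() << " Layer does not allow "
      "in-place computation.";
  // ... compute the flattened shape and reshape top[0] as before ...
}
```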
This PR prevents incorrect gradients caused by in-place reshaping of Blobs in ReshapeLayer and FlattenLayer.
If the bottom and top blobs are the same in either of these layers, and that layer sits on top of a layer which uses the shape of its top blob for backward (such as SliceLayer or PoolingLayer), then the lower layer's gradients can be computed incorrectly. In-place operation can also let bottom and top blobs have different counts, since the count checks are currently performed after the blob has been reshaped.
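To make the failure mode concrete, here is a minimal sketch using Caffe's Blob directly; the shapes are made up for illustration:

```cpp
#include <cassert>
#include <vector>
#include "caffe/blob.hpp"

int main() {
  caffe::Blob<float> blob(1, 2, 4, 4);  // e.g. a pooling layer's top blob
  caffe::Blob<float>* bottom = &blob;   // FlattenLayer's bottom...
  caffe::Blob<float>* top = &blob;      // ...and its top: in-place aliasing

  // Flattening reshapes the top blob -- but it is the same Blob object, so
  // the pooling layer's top blob silently changes shape as well.
  std::vector<int> flat_shape(2);
  flat_shape[0] = 1;
  flat_shape[1] = 32;
  top->Reshape(flat_shape);

  // PoolingLayer's Backward would now see height() == 1 and width() == 1
  // instead of 4 x 4, so its gradients come out wrong.
  // A count check performed after the reshape also cannot catch a mismatch:
  // bottom and top are the same object, so this passes trivially.
  assert(bottom->count() == top->count());
  return 0;
}
```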
A simple solution is to disallow in-place operation for these layers. An alternative would be to save the bottom shape and restore it during the backward pass.
This PR implements the former, simpler solution.
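For completeness, here is a rough sketch of the alternative; the helper names and free-function structure are hypothetical, purely to illustrate the save/restore idea on top of Caffe's Blob API:

```cpp
#include <vector>
#include "caffe/blob.hpp"

// Hypothetical illustration of the save/restore alternative (not this PR).
// 'target_shape' stands in for the shape requested by reshape_param.
static std::vector<int> saved_bottom_shape;

void ForwardReshapeInPlace(caffe::Blob<float>* blob,
                           const std::vector<int>& target_shape) {
  saved_bottom_shape = blob->shape();  // remember the pre-reshape shape
  blob->Reshape(target_shape);         // reshape the shared (in-place) blob
}

void BackwardRestoreShape(caffe::Blob<float>* blob) {
  // Restore the original shape so the layer below sees the blob exactly as
  // it produced it in Forward (e.g. for PoolingLayer's backward pass).
  blob->Reshape(saved_bottom_shape);
}
```

The tradeoff is extra bookkeeping and a reliance on backward always following forward, which is presumably why the simpler pointer check was preferred here.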