Conversation
See #1938 (comment) regarding the case of mean subtraction
Could our Brewers give some indication of whether they would prefer the MVNLayer fixes to be separated out of this PR, as cdoersch suggests here that they should? It makes abundant sense that reviewing bug fixes for an existing feature is a higher priority than reviewing code for a proposed new feature, especially since the MVNLayer fixes are a smallish subset of the changes in this PR. I can do that, but it's a fair bit of work, and I don't want to expend the effort if it is no more likely to get review attention. Thanks.
@shelhamer Are there any plans to merge this? If not, at least the fix to the MVNLayer gradient should be cherry-picked in. The gradient is currently incorrect (and it's not detected by the gradient checker because the test is also incorrect).
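For context, the kind of check referred to here compares the layer's analytic Backward gradient against a centered finite-difference estimate of a scalar loss; if the test wires up the wrong quantities or tolerances, a broken gradient can slip through. Below is a minimal, self-contained sketch of such a check in plain C++ (illustrative only, not Caffe's actual `GradientChecker` API), using a toy loss whose gradient is known:

```cpp
// Centered finite-difference gradient check: perturb each input element and
// compare the numeric derivative of a scalar loss against the analytic gradient.
#include <cmath>
#include <cstdio>
#include <functional>
#include <vector>

using Vec = std::vector<double>;

bool CheckGradient(const std::function<double(const Vec&)>& loss,
                   const std::function<Vec(const Vec&)>& grad,
                   Vec x, double eps = 1e-4, double tol = 1e-3) {
  const Vec analytic = grad(x);
  for (size_t i = 0; i < x.size(); ++i) {
    const double orig = x[i];
    x[i] = orig + eps; const double plus  = loss(x);
    x[i] = orig - eps; const double minus = loss(x);
    x[i] = orig;
    const double numeric = (plus - minus) / (2.0 * eps);
    if (std::fabs(numeric - analytic[i]) > tol) {
      std::printf("mismatch at %zu: analytic %g, numeric %g\n",
                  i, analytic[i], numeric);
      return false;
    }
  }
  return true;
}

int main() {
  // Toy loss L(x) = 0.5 * sum(x_i^2), whose analytic gradient is simply x.
  auto loss = [](const Vec& x) {
    double s = 0.0; for (double v : x) s += 0.5 * v * v; return s;
  };
  auto grad = [](const Vec& x) { return x; };
  std::printf("gradient %s\n",
              CheckGradient(loss, grad, {1.0, -2.0, 3.0}) ? "ok" : "WRONG");
}
```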
MVNLayer fixes were put in #2964; InverseMVNLayer doesn't seem interesting to others.
Replaces #1895.
This PR extends the MVNLayer to allow the mean and variance blobs to be exported as top blobs. It adds a new layer type, InverseMVNLayer, which takes the mean and variance as bottom blobs and performs the inverse operation (adding the mean back and undoing the variance normalization). A use case for this is an autoencoder that feeds input into the MVNLayer and generates the output from the InverseMVNLayer, with the autoencoding layers in between.
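To make the data flow concrete, here is a rough element-wise sketch of the idea (not the PR's actual layer code; blob shapes, `across_channels`, and the exact epsilon handling are simplified): MVN exports its mean and variance, and the inverse operation consumes them to restore the original scale.

```cpp
// Sketch: MVN forward that also exposes mean/variance, and the inverse
// operation that undoes the normalization given those values.
#include <cmath>
#include <cstdio>
#include <vector>

struct MVNOutputs { std::vector<double> normalized; double mean; double variance; };

MVNOutputs MVNForward(const std::vector<double>& x) {
  double mean = 0.0, sq_mean = 0.0;
  for (double v : x) { mean += v; sq_mean += v * v; }
  mean /= x.size(); sq_mean /= x.size();
  double variance = sq_mean - mean * mean;          // E(X^2) - (EX)^2
  if (variance < 0.0) variance = 0.0;               // guard against tiny negative rounding error
  MVNOutputs out{ {}, mean, variance };
  const double denom = std::sqrt(variance) + 1e-9;  // small eps to avoid division by zero
  for (double v : x) out.normalized.push_back((v - mean) / denom);
  return out;
}

// "InverseMVN": take the normalized data plus the exported mean/variance and
// restore the original mean and scale.
std::vector<double> InverseMVNForward(const std::vector<double>& y,
                                      double mean, double variance) {
  std::vector<double> x;
  const double scale = std::sqrt(variance) + 1e-9;
  for (double v : y) x.push_back(v * scale + mean);
  return x;
}

int main() {
  const std::vector<double> input = {2.0, 4.0, 6.0, 8.0};
  const MVNOutputs mvn = MVNForward(input);
  const std::vector<double> restored =
      InverseMVNForward(mvn.normalized, mvn.mean, mvn.variance);
  for (size_t i = 0; i < input.size(); ++i)
    std::printf("%g -> %g -> %g\n", input[i], mvn.normalized[i], restored[i]);
}
```

In the autoencoder use case, the normalized blob would pass through the encoding and decoding layers, and the mean/variance tops from the MVNLayer would be fed as extra bottoms to the InverseMVNLayer at the output.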
There was also a problem with the MVNLayer that was fixed: if it is given input that has exactly zero variance (e.g. a solid-color RGB image with across_channels=false), it computes the variance as E(X^2) - (EX)^2, and the result is often not exactly zero but a small negative value due to floating-point error. The subsequent square root then produces NaN. This PR also fixes this issue.
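A minimal sketch of that failure mode and the usual guard, assuming a simple clamp-at-zero before the square root (the PR's actual fix may differ in detail):

```cpp
// For constant input the true variance is zero, but E(X^2) - (EX)^2 evaluated
// in single precision can come out as a tiny negative number, and sqrt of a
// negative value is NaN. Clamping to zero before the square root avoids it.
#include <algorithm>
#include <cmath>
#include <cstdio>
#include <vector>

int main() {
  // A "solid color" plane: every element identical, so Var(X) should be 0.
  std::vector<float> x(512 * 512, 0.1f);
  float sum = 0.f, sum_sq = 0.f;
  for (float v : x) { sum += v; sum_sq += v * v; }  // float accumulation, as with Dtype=float
  const float mean = sum / x.size();
  const float raw_var = sum_sq / x.size() - mean * mean;   // E(X^2) - (EX)^2
  std::printf("raw variance     = %g -> sqrt = %g\n", raw_var, std::sqrt(raw_var));
  const float safe_var = std::max(raw_var, 0.f);           // guard against negative rounding error
  std::printf("clamped variance = %g -> sqrt = %g\n", safe_var, std::sqrt(safe_var));
}
```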