I think the code omits the second part of the derivative with respect to the mean during backpropagation.
// I think we should add the following code between lines 244 and 246. Correct me if I am wrong.
// EX across spatial
caffe_cpu_gemv(CblasNoTrans, N_ * C_, H_ * W_, Dtype(1), bottom_diff, spatial_multiplier_.cpu_data(), Dtype(0), spatial_mean_.mutable_cpu_data());
// EX across batch (CblasTrans: spatial_mean_ is N_ x C_ and the reduction runs over N_)
caffe_cpu_gemv(CblasTrans, N_, C_, Dtype(1), spatial_mean_.cpu_data(), batch_sum_multiplier_.cpu_data(), Dtype(0), batch_mean_.mutable_cpu_data());
caffe_cpu_gemm(CblasNoTrans, CblasNoTrans, N_, C_, 1, Dtype(1), batch_sum_multiplier_.cpu_data(), batch_mean_.cpu_data(), Dtype(0), spatial_mean_.mutable_cpu_data());
caffe_cpu_gemm(CblasNoTrans, CblasNoTrans, N_ * C_, H_ * W_, 1, Dtype(-1), spatial_mean_.cpu_data(), spatial_multiplier_.cpu_data(), Dtype(1), bottom_diff);
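For readers less familiar with the BLAS calls, the intent of the snippet above can be sketched with plain loops: reduce the gradient over the spatial and batch dimensions for each channel, then subtract that per-channel value from every element. This is only an illustration under stated assumptions, not the code from the PR. The helper name `subtract_channel_mean` is hypothetical, the blob is assumed to be stored as N x C x H x W in row-major order, and the `1 / (N * H * W)` scaling is an assumption (the snippet above passes `Dtype(1)`, so it accumulates sums, and any scaling would have to happen elsewhere).

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of Caffe.
// Subtracts from each gradient element the mean of the gradient over the
// batch (N) and spatial (H * W) dimensions of its channel.
// `diff` is assumed to be laid out as N x C x H x W, row-major.
void subtract_channel_mean(std::vector<float>& diff, int N, int C, int H, int W) {
  const int spatial = H * W;
  std::vector<float> channel_sum(static_cast<std::size_t>(C), 0.0f);

  // Reduce over spatial positions and over the batch, per channel.
  for (int n = 0; n < N; ++n) {
    for (int c = 0; c < C; ++c) {
      const std::size_t offset = (static_cast<std::size_t>(n) * C + c) * spatial;
      for (int s = 0; s < spatial; ++s) {
        channel_sum[c] += diff[offset + s];
      }
    }
  }

  // Broadcast the per-channel mean back and subtract it in place.
  // The 1 / (N * H * W) factor is an assumption of this sketch.
  for (int n = 0; n < N; ++n) {
    for (int c = 0; c < C; ++c) {
      const float mean = channel_sum[c] / static_cast<float>(N * spatial);
      const std::size_t offset = (static_cast<std::size_t>(n) * C + c) * spatial;
      for (int s = 0; s < spatial; ++s) {
        diff[offset + s] -= mean;
      }
    }
  }
}
```

After the call, the gradient of each channel sums to zero over the batch and spatial positions, which is the effect the missing mean term has when the forward pass subtracts a per-channel mean.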

@weiliu89,

@ChenglongChen

Replaced by #1965

Based on fixed PRs from Russell91 (base test) and ChenglongChen (implementation).