
Incorrect gradient from a SoftmaxWithLossLayer with loss_weight 0 #2895

@Nanne

I was debugging a network with two loss layers and wanted to disable one of them (a SoftmaxWithLossLayer), so I set its loss_weight to 0. However, this does not do what I expected at all. The clearest way to explain this is probably with an example of how to reproduce it.

To reproduce, take examples/mnist/lenet_train_test.prototxt and add a second loss layer with weight 0:

layer {
  name: "bad_loss"
  type: "SoftmaxWithLoss"
  bottom: "ip2"
  bottom: "label"
  top: "bad_loss"
  loss_weight: 0
}

and then run this Python script:

caffe_root = '/roaming/nanne/caffe/' # update this path to your Caffe root
import sys
sys.path.insert(0, caffe_root + 'python')
import os
os.chdir(caffe_root)
import caffe
import numpy as np

caffe.set_mode_gpu()
solver = caffe.SGDSolver(caffe_root + 'examples/mnist/lenet_solver.prototxt')

solver.step(1)

# Diffs of the two automatic splits of 'ip2' (one split per consumer of 'ip2'):
print(solver.net.blobs['ip2_ip2_0_split_0'].diff.squeeze()[5:7, :])
print(solver.net.blobs['ip2_ip2_0_split_1'].diff.squeeze()[5:7, :])

# Combined diff that is actually backpropagated into 'ip2':
print(solver.net.blobs['ip2'].diff.squeeze()[5:7, :])

The diff for the split belonging to the SoftmaxWithLoss with loss_weight 0 contains 64 (the batch size) values equal to the loss (NOT the gradient) for that input, and all the other elements are 0. The other split correctly contains all the diff values (64*10) for the loss with weight 1.
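
To make that concrete, here is a quick check one could append to the script above (a sketch only: it reuses the solver from that script, assumes numpy is available as np, and assumes 'ip2_ip2_0_split_1' is the split feeding the zero-weight bad_loss layer; swap the two split names if the ordering differs in your net):

import numpy as np

# Split assumed here to feed the zero-weight 'bad_loss' layer.
bad = solver.net.blobs['ip2_ip2_0_split_1'].diff.squeeze()
# Split assumed here to feed the original loss layer (loss_weight 1).
good = solver.net.blobs['ip2_ip2_0_split_0'].diff.squeeze()

# With loss_weight 0 this split should be all zeros, yet it has 64
# non-zero entries (one per example), each equal to the loss value.
print((bad != 0).sum())
# The weight-1 split carries a full 64x10 gradient, as expected.
print(good.shape)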

However, these two splits still get combined, producing the diff for 'ip2', in which the first 64 values are not comparable to the remaining 576. Am I misusing loss_weight, or is this a bug? (It doesn't seem to be specific to SoftmaxWithLoss, though it's most apparent for this layer.)
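
For reference, what I would expect (again just a sketch continuing the script above, with the same assumed blob names) is that the zero-weight split contributes nothing, so the combined diff of 'ip2' should equal the diff of the weight-1 split alone. Given the behaviour described above, both checks below fail:

import numpy as np

# Expected with loss_weight 0: combined diff == weight-1 split's diff.
print(np.allclose(solver.net.blobs['ip2'].diff,
                  solver.net.blobs['ip2_ip2_0_split_0'].diff))
# Expected with loss_weight 0: the zero-weight split has a zero diff.
print(not solver.net.blobs['ip2_ip2_0_split_1'].diff.any())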
