Add missing value support to SoftmaxLossLayer #1654
Conversation
Force-pushed from b172127 to 51d63a5.
Force-pushed from 51d63a5 to b14afd3.
This is currently passing, but there were some odd nondeterministic failures in the CPU_ONLY/CMake case, see: https://s3.amazonaws.com/archive.travis-ci.org/jobs/45421159/log.txt. No idea what's up with that.
@longjon What kind of failures? Do you mean the DB tests?
Force-pushed from b14afd3 to a9a1cac.
@bhack oops, the link wasn't to the failed run; it's been updated to the passing one. I also noticed some test sketchiness while debugging this; will make another PR.
I don't think this is safe, since the parameter is optional and doesn't have a default value.
Yes, it's safe; see https://developers.google.com/protocol-buffers/docs/reference/cpp-generated#fields and https://developers.google.com/protocol-buffers/docs/proto#optional. It might be a good idea to add a comment to make this clear.
For an int field the default value is 0.
Sergio
> In src/caffe/layers/softmax_loss_layer.cpp, #1654 (diff), @@ -17,6 +17,11 @@ void SoftmaxWithLossLayer::LayerSetUp(
>
>       softmax_top_vec_.clear();
>       softmax_top_vec_.push_back(&prob_);
>       softmax_layer_->SetUp(softmax_bottom_vec_, softmax_top_vec_);
>     +
>     + has_missing_values_ =
>     +     this->layer_param_.softmax_loss_param().has_missing_value();
>     + missing_value_ = this->layer_param_.softmax_loss_param().missing_value();
Yes, I know. has_missing_values_ gets set to false, and missing_value_ gets set to zero, but that value has no effect. Anyway I think I'll just add the unnecessary check, since it's less of a mental speed bump.
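For reference, this is roughly what the guarded read mentioned here could look like, reusing the member and field names from the quoted diff (illustrative only, not necessarily the code that was merged). For an optional int32 field with no explicit default, the generated has_missing_value() returns false and missing_value() returns 0 when the field is unset, so the guard is redundant but makes the intent explicit:

```cpp
has_missing_values_ =
    this->layer_param_.softmax_loss_param().has_missing_value();
if (has_missing_values_) {
  // Only meaningful when the user actually set the field; if it is unset,
  // the getter would return 0, but that value is never used.
  missing_value_ = this->layer_param_.softmax_loss_param().missing_value();
}
```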
@longjon I think it would be great if we could define a typical default.
Force-pushed from a9a1cac to 0fdee92.
I agree that
What do you think about renaming the new param
I agree that it makes sense to use a more generic name, but then what about the
That sounds good to me.

Sergio
With missing values (and batches of varying spatial dimension), normalizing each batch across instances can inappropriately give different instances different weights, so we give the option of simply normalizing by the batch size instead.
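A minimal sketch of the two normalization modes described in that message (the function and variable names here are illustrative, not the PR's actual code):

```cpp
// Contrast the two normalization modes: divide by the number of labels that
// actually contributed, or divide by the batch size.
template <typename Dtype>
Dtype NormalizeLoss(Dtype loss_sum, int num_valid_labels, int batch_size,
                    bool normalize) {
  if (normalize) {
    // Existing behavior: divide by the number of non-ignored labels. With
    // missing values this denominator varies from batch to batch, so the
    // effective weight of each instance varies too.
    return num_valid_labels > 0 ? loss_sum / num_valid_labels : Dtype(0);
  }
  // Otherwise divide by the batch size, so every instance keeps a fixed
  // weight regardless of how many of its labels were missing.
  return loss_sum / batch_size;
}
```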
Force-pushed from 0fdee92 to c7f63da.
Needs tests (just check loss and gradients for a toy input and truth pair) and docs, then merge.
I've added a doc comment describing the additional parameters, and simple tests that check the gradient with an ignore label set and with normalize set to false, so this should be ready pending any further comments.
@longjon the docs and gradient tests look good, but whether the ignored labels actually contribute to the loss deserves coverage. Once that test is in, let's merge.
Force-pushed from 5bd5684 to f1eada7.
Indeed, the pure gradient checks leave something out. I've added a simple test that checks that labels are being ignored.
Thanks Jon!
This PR adds an option, `missing_value`, which allows a specified label to indicate that the loss should not be computed at the corresponding location.

This creates an issue for normalization. To match the existing normalization, we should normalize by the number of labels actually present. However, this means that the weight of each label depends on the number of missing values in each batch, which may be undesirable. For this reason an extra boolean, `normalize`, is added; if `normalize` is false, normalization occurs only over the batch size.

Ideally this would be done in a way that can be used uniformly across different loss layers, but this is the form it exists in now.
Not yet well tested.
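For illustration, a usage sketch of the two parameters described above; the field names follow this PR's diff (`missing_value`, `normalize`) and may not match what is eventually merged:

```cpp
// Hypothetical configuration: skip a "void" label of 255 and normalize by
// the batch size rather than by the number of non-ignored labels.
LayerParameter param;
param.mutable_softmax_loss_param()->set_missing_value(255);
param.mutable_softmax_loss_param()->set_normalize(false);
SoftmaxWithLossLayer<float> loss_layer(param);
```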