Conversation
|
Why L2 regularization in InnerProductLayer? Should be equivalent to weight decay, no? (Though your implementation does save an axpy if using lambda instead of weight_decay, with weight_decay set to 0, but seems potentially hazardous if we're not going to remove weight_decay altogether and do something similar in all layers with parameters imho.)
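For reference, the equivalence at issue: the gradient of an L2 penalty (lambda/2) * ||W||^2 is lambda * W, which is exactly the term the solver adds for weight decay, so doing it inside a layer only saves the one extra axpy mentioned above. A minimal sketch (not the committed code), assuming Caffe's caffe_axpy(N, alpha, X, Y) with Y += alpha * X semantics:

```cpp
#include "caffe/util/math_functions.hpp"  // caffe_axpy

// Sketch only: adding the gradient of (lambda / 2) * ||W||^2 to a
// parameter's diff is a single axpy, the same update weight decay performs.
template <typename Dtype>
void AddL2PenaltyGradient(const int count, const Dtype lambda,
                          const Dtype* weights, Dtype* weight_diff) {
  // weight_diff += lambda * weights
  caffe::caffe_axpy<Dtype>(count, lambda, weights, weight_diff);
}
```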
|
Good point, wasn't thinking about that. One might conceivably want a different tradeoff parameter at the
|
Inside each
|
src/caffe/layers/loss_layer.cpp
Outdated
caffe_copy(count, bottom_data, bottom_diff)
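For context, a rough sketch (assumed names and blob layout, not necessarily the committed code) of how the suggested caffe_copy fits into a one-vs-all hinge loss forward pass, with the bottom diff used as scratch space for the margins:

```cpp
#include <algorithm>
#include "caffe/util/math_functions.hpp"  // caffe_copy, caffe_cpu_asum

// Sketch of a CPU forward pass for the one-vs-all hinge loss.
// bottom_data: num x dim scores; label: num true-class indices;
// bottom_diff: num x dim buffer that ends up holding the margins.
template <typename Dtype>
Dtype HingeForwardSketch(const int num, const int dim,
                         const Dtype* bottom_data, const Dtype* label,
                         Dtype* bottom_diff) {
  const int count = num * dim;
  caffe::caffe_copy(count, bottom_data, bottom_diff);  // diff = x
  for (int i = 0; i < num; ++i) {
    // flip the true class so every entry becomes -y_ij * x_ij
    bottom_diff[i * dim + static_cast<int>(label[i])] *= -1;
  }
  for (int i = 0; i < count; ++i) {
    // hinge: max(0, 1 - y_ij * x_ij)
    bottom_diff[i] = std::max(Dtype(0), Dtype(1) + bottom_diff[i]);
  }
  // loss = (1/n) sum_ij max(0, 1 - y_ij x_ij)
  return caffe::caffe_cpu_asum(count, bottom_diff) / num;
}
```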
|
@longjon can you share the prototxt for LeNet/hinge? I've got numerical overflow on gradient computations with your loss...
|
My apologies, @s9xie, I accidentally clobbered the working commit with a broken one (an errant minus sign). I've put up a fixed version that gets (e.g.) 0.9921 accuracy after 10k iterations. The only change to the prototxt is to replace
|
I'm new to hinge loss; how can it be applied to a multi-class problem?
|
@zgxiangyang, this layer implements one-vs-all hinge loss, so the loss for each example is the hinge loss for the binary problem of separating the true class of that example from all other classes. There is also (not implemented here) a different multiclass hinge loss, the Crammer and Singer version, that some feel is more natural (and that extends naturally to structured prediction problems); one-vs-all is, however, more common in practice.
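To make the distinction concrete, a small standalone sketch (plain C++, not Caffe code) of both losses for one example with scores x and true class t:

```cpp
#include <algorithm>
#include <limits>
#include <vector>

// One-vs-all hinge loss: one binary hinge per class, with target +1 for the
// true class and -1 for every other class: sum_j max(0, 1 - y_j * x_j).
double OneVsAllHinge(const std::vector<double>& x, int t) {
  double loss = 0.0;
  for (int j = 0; j < static_cast<int>(x.size()); ++j) {
    const double y = (j == t) ? 1.0 : -1.0;
    loss += std::max(0.0, 1.0 - y * x[j]);
  }
  return loss;
}

// Crammer-Singer multiclass hinge loss (not what this layer implements):
// only the largest violating margin is penalized,
// max(0, 1 + max_{j != t} x_j - x_t).
double CrammerSingerHinge(const std::vector<double>& x, int t) {
  double rival = -std::numeric_limits<double>::infinity();
  for (int j = 0; j < static_cast<int>(x.size()); ++j) {
    if (j != t) rival = std::max(rival, x[j]);
  }
  return std::max(0.0, 1.0 + rival - x[t]);
}
```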
|
@longjon thanks!
This layer implements a "one-vs-all" hinge loss, (1/n) sum_ij max(0, 1 - y_ij x_ij), with bottom blob x_ij (i ranging over examples and j over classes), and y_ij = +1/-1 indicating the label. No regularization is included, since regularization is done via weight decay or using the parameters of another layer. The gradient is taken to be zero at the hinge point. This commit only provides the CPU implementation.
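A sketch of the corresponding (sub)gradient, with the convention above that it is zero at the hinge point, i.e. dL/dx_ij = -y_ij / n when 1 - y_ij x_ij > 0 and 0 otherwise (names are assumed for illustration, not the committed code):

```cpp
// Sketch of the backward pass for L = (1/n) sum_ij max(0, 1 - y_ij x_ij):
// dL/dx_ij = -y_ij / n if 1 - y_ij x_ij > 0, else 0 (zero at the hinge point).
template <typename Dtype>
void HingeBackwardSketch(const int num, const int dim,
                         const Dtype* bottom_data, const Dtype* label,
                         Dtype* bottom_diff) {
  for (int i = 0; i < num; ++i) {
    for (int j = 0; j < dim; ++j) {
      const Dtype y = (j == static_cast<int>(label[i])) ? Dtype(1) : Dtype(-1);
      const Dtype margin = Dtype(1) - y * bottom_data[i * dim + j];
      bottom_diff[i * dim + j] = (margin > 0) ? -y / num : Dtype(0);
    }
  }
}
```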
In theory, layer functions could be nonsmooth anywhere; in all cases in use so far, they are nonsmooth at either zero or +1 and -1. In the future, it might be necessary to generalize the kink mechanism beyond this stopgap measure.
Based on SoftmaxWithLossLayerTest.
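For the test, the kink mechanism presumably amounts to telling the gradient checker not to take finite differences near the nonsmooth point. A hedged sketch, assuming a GradientChecker constructed as (stepsize, threshold, seed, kink, kink_range) as in Caffe's gradient check utility; the include paths and helper name are assumptions:

```cpp
#include <vector>

#include "caffe/layers/hinge_loss_layer.hpp"        // path assumed
#include "caffe/test/test_gradient_check_util.hpp"  // GradientChecker

// Sketch: place the kink at a margin of 1 and exclude a small range around
// it from the finite-difference comparison, so the check never straddles
// the hinge point.
template <typename Dtype>
void CheckHingeGradientSketch(const caffe::LayerParameter& layer_param,
                              const std::vector<caffe::Blob<Dtype>*>& bottom,
                              const std::vector<caffe::Blob<Dtype>*>& top) {
  caffe::HingeLossLayer<Dtype> layer(layer_param);
  caffe::GradientChecker<Dtype> checker(1e-2, 2e-3, 1701, 1, 0.01);
  checker.CheckGradientExhaustive(&layer, bottom, top);
}
```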
|
Now using caffe_copy. Can someone confirm that it is okay to store intermediate computations in the bottom diff? Other than that, this is ready for review.
|
looks great, thanks Jon!
|
@longjon I think in the long run we probably don't want to use
HingeLossLayer
- HingeLossLayer only provides a CPU implementation
- Adding L2 regularization directly to InnerProductLayer might seem heavy-handed (although the implementation is simple)
- AFAICT Implement regularizers #258 does not address regularization of parameters, so it will not make 86ef499 go away, but future work might provide a more general solution