
Implement regularizers #258

Closed

kloudkl wants to merge 16 commits into BVLC:dev from kloudkl:regularizer_class_hierarchy

Conversation

@kloudkl (Contributor) commented Mar 25, 2014

This PR replaces #113 in order to target the dev branch for merging. Please refer to #113 for the discussion of the design decisions.

@kloudkl (Contributor, Author) commented Mar 25, 2014

The original PR #113 passed all the tests. But to accommodate the change in the return types of Layer::Forward and Backward, the regularization in RegularizerAsLossLayer was moved from the latter to the former. After that change, many tests failed; the root cause and a solution have not been determined.

@shelhamer added this to the 1.1 milestone Mar 28, 2014
@longjon mentioned this pull request Apr 8, 2014
@longjon (Contributor) commented Apr 17, 2014

I wanted this so I took a look:

  • The main issue is that RegularizerAsLossLayer is redundant with the regularizer implementation. Making it almost trivial (just setting diff to zero in Forward) mostly fixes things, except for the points below.
  • Is there a reason for using sigma=10 Gaussian initialization in RegularizationAsLossTest? I can get things to pass iff I use the default sigma=1 instead.
  • The kink parameters of GradientChecker need to be used to avoid the nonsmooth region of the L1 loss (how did these tests pass before?); see the sketch after this list.
  • Making the changes described above fixes the tests but still does not leave a correct implementation, because the regularizer gradient is computed in Forward, and could be clobbered by the layer's Backward.
  • There are many redundant checks to see if things are nonzero (the number of regularizers or the size of a blob). Is there a good reason for these? To me they feel like noise.
  • Why regularize the bottom blob? To me the top feels more natural (but it does complicate the implementation of RegularizerAsLossLayer...)
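
For concreteness, here is a small standalone sketch of the kink issue above (illustrative only; this is not the Caffe test code, and the step and range values are made up): the centered finite difference of the L1 penalty |w| is unreliable for weights lying within the step size of the kink at zero, so a gradient checker has to skip those entries.

```cpp
// Why a nonzero kink range matters when checking the gradient of |w|.
#include <cmath>
#include <cstdio>
#include <vector>

int main() {
  const double step = 1e-2;        // finite-difference step size
  const double kink = 0.0;         // the nondifferentiable point of |w|
  const double kink_range = 1e-2;  // skip weights this close to the kink
  const std::vector<double> weights = {0.5, -0.3, 0.004, -1.2};

  for (double w : weights) {
    if (std::fabs(w - kink) <= kink_range) {
      std::printf("w = % .3f skipped (within kink_range of the kink)\n", w);
      continue;
    }
    // Analytic subgradient of |w| away from zero is sign(w).
    const double analytic = (w > 0) ? 1.0 : -1.0;
    // Centered finite difference; only trustworthy away from the kink.
    const double numeric = (std::fabs(w + step) - std::fabs(w - step)) / (2 * step);
    std::printf("w = % .3f  analytic = % .3f  numeric = % .3f\n", w, analytic, numeric);
  }
  return 0;
}
```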

@aravindhm commented

I changed the gradient checker call from Single to Exhaustive and all the tests passed. I don't think I changed anything else. Call CheckGradientExhaustive with only the first three arguments.

As for the gradient getting clobbered (Forward vs. Backward): is this addressed by the automatic insertion of a split node creating a dedicated bottom blob for the regularizer? If not, I don't mind calling the regularize_gpu/cpu function again in the layer's backward_cpu/gpu.

@longjon (Contributor) commented Apr 21, 2014

@aravindhm, CheckGradientExhaustive loops over top blobs, of which RegularizerAsLossLayer has none. The tests pass, but they don't test anything.
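
A standalone sketch of that point (illustrative only; this is not the GradientChecker code): an exhaustive check that loops over top blobs degenerates to a no-op when the layer produces no tops, so it can never fail.

```cpp
#include <cstddef>
#include <cstdio>
#include <vector>

struct Blob {};  // stand-in for caffe::Blob

// Pretend "exhaustive" checker: would compare analytic and numeric
// gradients for every top blob, if there were any.
bool CheckAllTops(const std::vector<Blob*>& top) {
  bool checked_anything = false;
  for (std::size_t top_id = 0; top_id < top.size(); ++top_id) {
    // ... gradient comparison would happen here ...
    checked_anything = true;
  }
  return checked_anything;
}

int main() {
  std::vector<Blob*> top;  // a layer with no top blobs
  std::puts(CheckAllTops(top) ? "gradients were checked"
                              : "nothing was checked; the test passes vacuously");
  return 0;
}
```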

To fix the clobbering problem, no extra storage or computation is needed. In my branch I've split Regularize into Loss and Gradient functions, calling the former in Forward and the latter in Backward. (I've also switched regularization to work against the top blob.) If @kloudkl wants, I can clean this up and send him a PR.
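
A minimal standalone sketch of that split, assuming a hypothetical L1 regularizer working against the top blob (class and method names here are illustrative, not the code in the actual branch):

```cpp
#include <cmath>
#include <cstddef>
#include <cstdio>
#include <vector>

struct SimpleBlob {              // stand-in for caffe::Blob
  std::vector<double> data;
  std::vector<double> diff;
};

class L1Regularizer {
 public:
  explicit L1Regularizer(double coeff) : coeff_(coeff) {}

  // Called from Forward: reads data only and returns the penalty term.
  double Loss(const SimpleBlob& top) const {
    double loss = 0;
    for (double v : top.data) loss += std::fabs(v);
    return coeff_ * loss;
  }

  // Called from Backward, after the layer's own backward pass has
  // written diff: accumulates the subgradient instead of overwriting it.
  void Gradient(SimpleBlob* top) const {
    for (std::size_t i = 0; i < top->data.size(); ++i) {
      top->diff[i] += coeff_ * ((top->data[i] > 0) - (top->data[i] < 0));
    }
  }

 private:
  double coeff_;
};

int main() {
  SimpleBlob top;
  top.data = {0.5, -2.0, 0.0};
  top.diff = {0.1, 0.1, 0.1};  // pretend gradient from layers above
  L1Regularizer reg(0.01);
  std::printf("regularization loss = %f\n", reg.Loss(top));
  reg.Gradient(&top);
  std::printf("diff[0] after Backward = %f\n", top.diff[0]);
  return 0;
}
```

Because the gradient is only ever added in Backward, nothing computed in Forward can be clobbered, and no extra storage is needed.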

@kloudkl (Contributor, Author) commented Apr 22, 2014

@longjon, I'm too busy and exhausted to take care of this at the moment. You can modify the code into whatever shape satisfies your needs and open a new PR.

@rodrigob (Contributor) commented Oct 6, 2014

It seems that I also implemented something similar to this on my private branch (by adding a "PostUpdate" stage to the layer class). I am somewhat confused by the current design. How is L1Regularizer different from an L1Loss? (And by the way, there is no L1Loss in the dev branch, right?)

@longjon (Contributor) commented Mar 9, 2015

Closing, as this is now out-of-date/abandoned and can be achieved through less intrusive means, e.g., explicit loss layers or per-param regularization options.

@longjon closed this Mar 9, 2015
@shelhamer removed this from the Future milestone Mar 10, 2015
@zhaogengyan commented

Hello @longjon, I have been searching all night for how to include both L1 and L2 norm regularization on W in the cost function, but have found no answer. Many people ask the same question in the user group, but nobody answers. I'm quite confused about how to use the

explicit loss layers or per-param regularization options

that you mentioned.

Can Caffe apply both L1 and L2 regularization at the same time? And can Caffe use different regularizers for different layers (L1 for some layers and L2 for others)? Thank you very much.

@robmosh commented Apr 24, 2017

plus one interested here for a regularization layer

1 similar comment
@foolwood commented

plus one interested here for a regularization layer
