
Bugs fixed in the euclidean loss layer. #137

Closed

aravindhm wants to merge 4 commits into BVLC:dev from aravindhm:euclideanlossbugfix

Conversation

@aravindhm

I have the following concerns:

  1. The backpropagated gradient is sent to only one of the bottom blobs, bottom[0], so the other bottom blob was not getting updated in my experiments. I think the gradient should go to both blobs.
  2. Some formulations use (1/n) * sum_i (x_i - y_i)^2, where n is the size of the x and y vectors. Can this be accommodated with the scale parameter already present in the layer (default value 1)?
  3. When the euclidean loss is used in both net.prototxt and val.prototxt, no error is reported over the validation data, because the layer does not output a measure of accuracy to a layer above it (there are no output layers in the network). I propose that this layer compute the error and publish it to a blob top[0] if one exists. In other words, we need to define forward_cpu and forward_gpu so that the network in test mode outputs an error corresponding to this layer.

Please provide feedback on this pull request. I've not tested it with MKL, though I've edited the test file. I think the gradient checker hasn't been used correctly in my version; please let me know.
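
For concreteness, here is a minimal standalone sketch of the three points above. This is illustrative only, not the PR's actual layer code, and all names are placeholders: the loss is averaged over the n elements, the gradient is propagated to both inputs with opposite signs, and the scalar loss is returned so a test-time network could report it.

```cpp
#include <cstddef>
#include <vector>

// Standalone sketch (illustrative only): loss = (1/n) * sum_i (x_i - y_i)^2,
// the gradient is sent to BOTH inputs (with opposite signs), and the scalar
// loss is returned so a test-time network could report it.
double EuclideanLossSketch(const std::vector<double>& x,
                           const std::vector<double>& y,
                           std::vector<double>* grad_x,
                           std::vector<double>* grad_y) {
  const std::size_t n = x.size();
  grad_x->resize(n);
  grad_y->resize(n);
  double loss = 0.0;
  for (std::size_t i = 0; i < n; ++i) {
    const double diff = x[i] - y[i];
    loss += diff * diff;
    (*grad_x)[i] =  2.0 * diff / static_cast<double>(n);  // d/dx_i of (1/n) * sum (x - y)^2
    (*grad_y)[i] = -2.0 * diff / static_cast<double>(n);  // d/dy_i has the opposite sign
  }
  return loss / static_cast<double>(n);
}
```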

Contributor


use EXPECT_NEAR instead of EXPECT_LE and EXPECT_GE
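
For reference, a small self-contained gtest sketch of that suggestion; the variable names and values are illustrative, not identifiers from the PR's test file.

```cpp
#include <gtest/gtest.h>

// Illustrative only: computed_loss, expected_loss, and kMargin are placeholder
// names, not identifiers from the PR.
TEST(EuclideanLossExample, LossIsCloseToExpected) {
  const double expected_loss = 1.0;
  const double computed_loss = 1.0 + 1e-5;
  const double kMargin = 1e-4;

  // Bounding from both sides, as in the current test...
  EXPECT_LE(computed_loss, expected_loss + kMargin);
  EXPECT_GE(computed_loss, expected_loss - kMargin);

  // ...versus a single EXPECT_NEAR that expresses the same check directly.
  EXPECT_NEAR(computed_loss, expected_loss, kMargin);
}
```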

@jeffdonahue
Contributor

Thanks @aravindhm! I added a couple of minor comments above, will merge once these are addressed.

@aravindhm
Author

I've fixed the problems and just checked that the code compiles and works with the boost-eigen branch (after a few changes specific to that branch).

@jeffdonahue
Contributor

Hi @aravindhm, I compiled and ran your tests and I get numerous gradient check failures (which look well beyond a simple precision issue):

The difference between computed_gradient and estimated_gradient is 5.5232507603383505, which exceeds threshold_ * scale, where
computed_gradient evaluates to 0.098407263758736319,
estimated_gradient evaluates to 5.621658024097087, and
threshold_ * scale evaluates to 0.05621658024097087.
debug: (top_id, top_data_id, blob_id, feat_id)=-1,-1,0,497
./src/caffe/test/test_gradient_check_util.hpp:124: Failure
The difference between computed_gradient and estimated_gradient is 9.9778362607820572, which exceeds threshold_ * scale, where
computed_gradient evaluates to -0.17777421436460128,
estimated_gradient evaluates to -10.155610475146659, and
threshold_ * scale evaluates to 0.10155610475146659.
debug: (top_id, top_data_id, blob_id, feat_id)=-1,-1,0,498
./src/caffe/test/test_gradient_check_util.hpp:124: Failure
The difference between computed_gradient and estimated_gradient is 7.921049009144614, which exceeds threshold_ * scale, where
computed_gradient evaluates to -0.14112862024779446,
estimated_gradient evaluates to -8.0621776293924086, and
threshold_ * scale evaluates to 0.080621776293924086.
debug: (top_id, top_data_id, blob_id, feat_id)=-1,-1,0,499
[  FAILED  ] EuclideanLossLayerTest/1.TestGradientCPU, where TypeParam = double (111 ms)
[----------] 4 tests from EuclideanLossLayerTest/1 (114 ms total)

[----------] Global test environment tear-down
[==========] 8 tests from 2 test cases ran. (782 ms total)
[  PASSED  ] 6 tests.
[  FAILED  ] 2 tests, listed below:
[  FAILED  ] EuclideanLossLayerTest/0.TestGradientCPU, where TypeParam = float
[  FAILED  ] EuclideanLossLayerTest/1.TestGradientCPU, where TypeParam = double

If you're able to fix these errors, please also remove the duplicated difference and loss computation in the backward pass, rebase to dev, and fix the style issues that `make lint` spits out.

@sergeyk
Contributor

sergeyk commented Mar 13, 2014

@aravindhm Will you be able to make the suggested changes? We will merge as soon as tests pass.

sergeyk added the bug label Mar 13, 2014
@shelhamer
Member

@jeffdonahue could you follow up on these fixes, or fix this yourself, since this PR seems to be abandoned?

@aravindhm
Author

I can work on this over this weekend.

What are the semantics of the return value for backward_cpu and backward_gpu in the context of a loss layer type? If the loss value is also provided to top[0] then does the meaning of this return value change? I am unable to reverse engineer the meaning of these quantities from the gradient checker code.

Also, if we have multiple data points in a batch, is the gradient chosen to be the average across the batch (1/bottom[0]->num()) or the total?

Also, is it better if I create a new PR based on a checkout of the most recent dev or simply rebase this PR?

@shelhamer
Member

Thanks for finishing this up!

Note that loss is now computed in the forward pass since #209 and the backward_* methods are void, so that settles your question about return semantics.

is the gradient chosen to be the average across the batch?

Yeah, both the loss and diff are scaled by dividing through by the number of instances. See loss_layers.cpp for examples.
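
As a rough standalone illustration of that averaging convention (plain C++, not the actual loss_layers.cpp code; the extra factor of 1/2 in the loss is an assumption matching the usual Euclidean-loss convention):

```cpp
#include <cstddef>
#include <vector>

// Sketch only (not loss_layers.cpp): loss = (1 / (2 * num)) * sum_i (pred_i - label_i)^2,
// and the gradient w.r.t. the predictions carries the same 1 / num averaging.
void EuclideanLossAveraged(const std::vector<double>& pred,
                           const std::vector<double>& label,
                           std::size_t num,  // number of instances in the batch
                           double* loss,
                           std::vector<double>* grad_pred) {
  grad_pred->resize(pred.size());
  double sum_sq = 0.0;
  for (std::size_t i = 0; i < pred.size(); ++i) {
    const double diff = pred[i] - label[i];
    sum_sq += diff * diff;
    (*grad_pred)[i] = diff / static_cast<double>(num);  // averaged across the batch
  }
  *loss = sum_sq / (2.0 * static_cast<double>(num));
}
```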

Re: how to update the PR, please rebase against the latest dev and force push (git push -f) to your branch. That keeps everything in this thread, which is easier to follow.

@shelhamer
Member

This is now irrelevant given the new backward protocol of #497.

shelhamer closed this Jun 26, 2014