Bugs fixed in the euclidean loss layer. #137
Conversation
Use EXPECT_NEAR instead of EXPECT_LE and EXPECT_GE.
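For reference, gtest's EXPECT_NEAR(val1, val2, abs_error) asserts |val1 - val2| <= abs_error in a single macro. A minimal sketch of the suggested swap; the variable and tolerance names here are illustrative placeholders, not taken from the actual test file:

```cpp
// Before: bounding the computed value from both sides with two assertions.
// computed_loss, expected_loss, and kErrorMargin are hypothetical names.
EXPECT_LE(computed_loss, expected_loss + kErrorMargin);
EXPECT_GE(computed_loss, expected_loss - kErrorMargin);

// After: one assertion that |computed_loss - expected_loss| <= kErrorMargin.
EXPECT_NEAR(computed_loss, expected_loss, kErrorMargin);
```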
Thanks @aravindhm! I added a couple of minor comments above; will merge once these are addressed.
…test the same. Changed testing macros to EXPECT_NEAR.
I've fixed the problems and just checked that the code compiles and works with the boost-eigen branch (after a few changes specific to that branch).
Hi @aravindhm, I compiled and ran your tests and got numerous gradient check failures, which look well beyond a simple precision issue. If you're able to fix these errors, please also remove the duplicated difference and loss computation in the backward pass, rebase onto dev, and fix the style issues.
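For context, a gradient checker compares a layer's analytic gradient against a centered finite-difference estimate of the loss; mismatches far larger than the perturbation step point to a real bug rather than float precision. Below is a self-contained sketch of the idea in plain C++, with a toy loss standing in for the layer's forward/backward passes; this is not the actual Caffe GradientChecker API, and all names are hypothetical:

```cpp
#include <cmath>
#include <cstddef>
#include <cstdio>
#include <vector>

// Toy stand-ins for a layer's Forward (loss) and Backward (analytic_grad).
// Here loss(x) = 0.5 * sum_i x_i^2, so the analytic gradient is x_i.
double loss(const std::vector<double>& x) {
  double sum = 0.0;
  for (double v : x) sum += v * v;
  return 0.5 * sum;
}

double analytic_grad(const std::vector<double>& x, std::size_t i) {
  return x[i];  // d/dx_i of 0.5 * sum x^2 is x_i
}

// Centered finite-difference check: perturb one input at a time and
// compare (loss(x+h) - loss(x-h)) / (2h) against the analytic gradient.
bool check_gradient(std::vector<double> x, double step, double tol) {
  for (std::size_t i = 0; i < x.size(); ++i) {
    const double orig = x[i];
    x[i] = orig + step;
    const double plus = loss(x);
    x[i] = orig - step;
    const double minus = loss(x);
    x[i] = orig;  // restore before comparing
    const double numeric = (plus - minus) / (2.0 * step);
    if (std::fabs(numeric - analytic_grad(x, i)) > tol) {
      std::printf("mismatch at %zu: numeric %g vs analytic %g\n",
                  i, numeric, analytic_grad(x, i));
      return false;
    }
  }
  return true;
}

int main() {
  std::vector<double> x = {0.3, -1.2, 2.5};
  std::printf("gradient check %s\n",
              check_gradient(x, 1e-4, 1e-6) ? "passed" : "FAILED");
  return 0;
}
```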
@aravindhm Will you be able to make the suggested changes? We will merge as soon as tests pass.
@jeffdonahue could you follow up on these fixes / fix this yourself, since this PR seems to be abandoned?
I can work on this over the weekend. A few questions:

1. What are the semantics of the return value of backward_cpu and backward_gpu for a loss layer type? If the loss value is also provided in top[0], does the meaning of this return value change? I was unable to reverse-engineer the meaning of these quantities from the gradient checker code.
2. If we have multiple data points in a batch, is the gradient the average across the batch (scaled by 1/bottom[0]->num()) or the total?
3. Is it better to create a new PR based on a checkout of the most recent dev, or simply rebase this PR?
Thanks for finishing this up! Note that the loss is now computed in the forward pass since #209.
Yeah, both the loss and the diff are scaled by dividing through by the number of instances; see loss_layers.cpp for examples. Re: how to update the PR, please rebase against the latest dev and force push.
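To make that scaling concrete, here is a rough sketch in plain C++, with simple vectors standing in for Caffe Blobs and num playing the role of bottom[0]->num(). The 1/(2N) factor in the loss is an assumption matching the usual Euclidean loss convention, and euclidean_loss is a hypothetical helper, not the actual layer code:

```cpp
#include <cstddef>
#include <vector>

// Sketch of the convention described above: both the loss and the gradient
// ("diff") are divided through by the number of instances N in the batch.
double euclidean_loss(const std::vector<double>& pred,
                      const std::vector<double>& label,
                      std::size_t num,
                      std::vector<double>* diff) {
  double sum_sq = 0.0;
  diff->resize(pred.size());
  for (std::size_t i = 0; i < pred.size(); ++i) {
    const double d = pred[i] - label[i];
    (*diff)[i] = d / static_cast<double>(num);  // gradient averaged over N
    sum_sq += d * d;
  }
  // Loss scaled by 1/(2N), assuming the usual 1/2 factor.
  return sum_sq / (2.0 * static_cast<double>(num));
}
```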
Irrelevant with the new backward protocol of #497.
Please provide feedback on this pull request. I have the following concerns: I've not tested it with MKL, though I've edited the test file, and I think the gradient checker hasn't been used correctly in my version. Please let me know.