Conversation
|
@Yangqing @shelhamer @jeffdonahue @longjon @sergeyk
|
I'm for it
|
I agree that the accuracy should not be coupled to the multinomial logistic loss like it is now. The first step I see to fix this would be to update the example prototxts. Factoring out the argmax might also be desirable, but I'd also be fine with leaving that out of this PR to keep things simple.
Changed all val.prototxt in examples to add a LossLayer to compute loss in Test
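For illustration, here is a minimal sketch (not taken from this PR's diff) of what the tail of a val/test prototxt looks like once the accuracy and the loss are computed by separate layers. It uses present-day Caffe prototxt syntax; the bottom blob names `ip2` and `label` are placeholders:

```
# Hypothetical tail of a val.prototxt: accuracy and loss are separate layers.
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "ip2"
  bottom: "label"
  top: "accuracy"
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"   # explicit loss layer, added so Test still reports a loss
  bottom: "ip2"
  bottom: "label"
  top: "loss"
}
```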
|
@sergeyk @longjon @shelhamer can each one of you test one of the examples to make sure they still work for you?
|
I'm for the separation. I agree with @longjon that splitting the accuracy and multinomial logistic loss is enough for now, provided that we update our prototxt. I'll test LeNet and the MNIST autoencoder and report back. The rest should be fine since the changes are the same, but someone else should check CaffeNet and AlexNet. Thanks for clearing up the design, Sergio!
|
A nice follow-up, not in this PR but in the future, would be to add a
|
@sguada I tested LeNet and the MNIST autoencoder, and checked that the CIFAR examples don't crash. (@jeffdonahue I noticed the consolidated LeNet example crashes, seemingly because of LevelDB lock contention. This was on OS X. I have some recollection of a similar issue cropping up in a data layer in the past -- I don't think the train net and the test-on-train net can exist concurrently and read the same LevelDB.)
|
I think it actually should work, but maybe the OSX implementation of leveldb is buggy? leveldb is built to handle reads from multiple threads within the same process; just not from multiple processes. See http://stackoverflow.com/questions/9177598/multiple-instances-of-a-leveldb-database-at-the-same-time |
|
It seems OS X leveldb is buggy then. I remember re-scoping tests to initialize / free the leveldb between tests because otherwise it crashed on OS X like it does now with this example. Perhaps another reason to adopt lmdb. |
|
Oh, now that I actually look at the leveldb documentation [1], it seems that the threads need to be sharing the same `leveldb::DB` object.

[1] http://leveldb.googlecode.com/svn/trunk/doc/index.html
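To make the documented constraint concrete, here is a small standalone C++ sketch (not Caffe code) where two threads read concurrently through one shared `leveldb::DB` handle inside a single process, which is the supported pattern; the database path and key below are made up:

```cpp
#include <cassert>
#include <string>
#include <thread>

#include "leveldb/db.h"

int main() {
  // Open the database once; the resulting DB* may be shared by threads in
  // this process. A second process opening the same path would hit the LOCK file.
  leveldb::DB* db = nullptr;
  leveldb::Options options;
  options.create_if_missing = true;  // so the sketch runs even without existing data
  leveldb::Status status =
      leveldb::DB::Open(options, "/tmp/example-leveldb", &db);  // hypothetical path
  assert(status.ok());

  auto reader = [db]() {
    std::string value;
    // Concurrent Get() calls on the same DB object need no external locking.
    db->Get(leveldb::ReadOptions(), "00000000", &value);  // hypothetical key
  };
  std::thread t1(reader), t2(reader);
  t1.join();
  t2.join();

  delete db;
  return 0;
}
```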
|
|
@shelhamer thanks for doing some extra testing with this new PR.
|
@sguada Oh, I actually took a different approach and gave it two top blobs max, so it can report the loss and the actual softmax output too: shelhamer@adab413. What do you think? I could also move it to
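As a rough illustration of the two-top idea (a sketch in present-day prototxt syntax, not the linked commit itself; blob names are placeholders), a loss layer declared with an optional second top can expose the softmax probabilities alongside the scalar loss:

```
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip2"
  bottom: "label"
  top: "loss"   # scalar loss, as before
  top: "prob"   # optional second top: the softmax output itself
}
```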
|
@shelhamer that's even better; maybe someone wants to get both things. I think it is a good idea to move it there. I have added you to my fork, so you should be able to push it now.
|
I have tested the examples and they seem fine. I will push my fix and the transplant of

As a follow-up, how does everyone feel about introducing a
|
Thanks @shelhamer for the merge and final retouches. |
Split accuracy and loss
|
How to download these 20 files?
This PR separates the loss from the accuracy. This will allow using different losses while computing the accuracy.

Once the `AccuracyLayer` doesn't compute the loss, one needs to add a `LossLayer` if one wants to compute it.

There are two options for the `AccuracyLayer`. Right now it does an ArgMax on the data to compute the predicted label and then compares it with the expected label; that could be removed, letting the `ArgMaxLayer` do it. Another option would be to create the `ArgMaxLayer`, and maybe a `LossLayer`, within the `AccuracyLayer` and then connect them.
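For the first option, a hedged sketch of what "let the `ArgMaxLayer` do that" could look like in today's prototxt syntax (blob names are placeholders; comparing the predicted labels against the expected `label` blob would still need a layer that consumes both):

```
# Hypothetical: factor the argmax out into its own layer.
layer {
  name: "predicted_label"
  type: "ArgMax"
  bottom: "prob"
  top: "predicted_label"
  argmax_param { top_k: 1 }
}
```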