
Accuracy without loss #522

Merged

shelhamer merged 5 commits into BVLC:dev from sguada:accuracy_without_loss on Jun 26, 2014

Conversation

@sguada
Contributor

sguada commented Jun 20, 2014

This PR separates the loss from the accuracy, which allows using different losses while still computing the accuracy.

Once the AccuracyLayer no longer computes the loss, one needs to add a LossLayer if the loss should still be computed.

There are two options for AccuracyLayer. Right now it does an ArgMax on the data to compute the predicted label and then compares it with the expected label. That ArgMax could be removed and delegated to the ArgMaxLayer.

Another option would be to create the ArgMaxLayer, and maybe a LossLayer, within the AccuracyLayer and then connect them internally.
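For concreteness, a minimal sketch of what a test-net tail could look like after the split, with accuracy and loss computed by separate layers on the same bottoms. The layer and blob names (fc8, label, loss) are illustrative, not taken from this PR, and whether the loss layer declares a top blob depends on the SOFTMAX_LOSS top-blob change discussed further down:

    layers {
      name: "accuracy"
      type: ACCURACY
      bottom: "fc8"
      bottom: "label"
      top: "accuracy"
    }
    layers {
      name: "loss"
      type: SOFTMAX_LOSS
      bottom: "fc8"
      bottom: "label"
      top: "loss"
    }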

@sguada
Contributor Author

sguada commented Jun 20, 2014

@Yangqing @shelhamer @jeffdonahue @longjon @sergeyk
What do you think about separating the Loss and the Accuracy, or at least making it more general?

@sergeyk
Contributor

sergeyk commented Jun 20, 2014

I'm for it

@longjon
Contributor

longjon commented Jun 20, 2014

I agree that the accuracy should not be coupled to the multinomial logistic loss like it is now. The first step I see to fix this would be to update the AccuracyLayer to compute only the accuracy, to update the example prototxts to also compute the appropriate losses, and to make sure the resulting logs still contain the desired loss information (and if there are any tools that read that information, that they still work).

Factoring out the argmax might also be desirable, but I'd also be fine with leaving that out of this PR to keep things simple.

sguada mentioned this pull request Jun 20, 2014
…o compute loss

Changed all val.prototxt in examples to add a LossLayer to compute loss in Test
@sguada
Contributor Author

sguada commented Jun 21, 2014

@sergeyk @longjon @shelhamer can each one of you test one of the examples to make sure they still work for you?

@shelhamer
Member

I'm for the separation. I agree with @longjon that splitting the accuracy and the multinomial logistic loss is enough for now, provided that we update our prototxts.

I'll test LeNet and the MNIST autoencoder and report back. The rest should be fine since the changes are the same, but someone else should check CaffeNet and AlexNet.

Thanks for clearing up the design Sergio!

@shelhamer
Member

A nice follow-up, not in this PR but in the future, would be to add a TYPE field to AccuracyLayer that could be argmax, top 5, absolute difference, squared error, and so on.
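As a rough illustration of that idea (nothing below exists in this PR: accuracy_param, its type enum, and top_k are all hypothetical), such a field might be configured along these lines:

    layers {
      name: "accuracy"
      type: ACCURACY
      bottom: "fc8"
      bottom: "label"
      top: "accuracy"
      accuracy_param {
        type: TOP_K   # hypothetical enum: ARGMAX, TOP_K, ABSOLUTE_DIFFERENCE, SQUARED_ERROR, ...
        top_k: 5
      }
    }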

shelhamer mentioned this pull request Jun 24, 2014
@shelhamer
Member

@sguada the SOFTMAX_LOSS layer needed its top blob declaration updated. The net tests were failing because of the current exact top blob count check and the missing top allocation. I pushed a fix to my fork, but can you add me as a repo collaborator so I can just push it to this PR?

I tested LeNet, the MNIST autoencoder, and that the CIFAR examples don't crash.

(@jeffdonahue I noticed the consolidated LeNet example crashes, seemingly because of LevelDB lock contention. This was on OS X. I have some recollection of a similar issue cropping up in a data layer in the past -- I don't think the train net and the test-on-train net can exist concurrently and read the same LevelDB.)

@jeffdonahue
Contributor

I think it actually should work, but maybe the OS X implementation of leveldb is buggy? leveldb is built to handle reads from multiple threads within the same process; just not from multiple processes. See http://stackoverflow.com/questions/9177598/multiple-instances-of-a-leveldb-database-at-the-same-time

@shelhamer
Member

It seems OS X leveldb is buggy then. I remember re-scoping tests to initialize / free the leveldb between tests because otherwise it crashed on OS X like it does now with this example. Perhaps another reason to adopt lmdb.

@jeffdonahue
Contributor

Oh, now that I actually look at the leveldb documentation [1] it seems that the threads need to be sharing the same leveldb::DB object, which makes sense as otherwise it's probably not any different from opening in two separate processes. So perhaps it's actually a bug that it doesn't crash on Linux?

[1] http://leveldb.googlecode.com/svn/trunk/doc/index.html

Concurrency

A database may only be opened by one process at a time. The leveldb implementation acquires a lock from the operating system to prevent misuse. Within a single process, the same leveldb::DB object may be safely shared by multiple concurrent threads. I.e., different threads may write into or fetch iterators or call Get on the same database without any external synchronization (the leveldb implementation will automatically do the required synchronization). However other objects (like Iterator and WriteBatch) may require external synchronization. If two threads share such an object, they must protect access to it using their own locking protocol. More details are available in the public header files.

sguada changed the title from Accuracy without loss to [Don't Merge yet] Accuracy without loss Jun 25, 2014
@sguada
Contributor Author

sguada commented Jun 26, 2014

@shelhamer thanks for doing some extra testing with this new PR.
I also hit the same problem with the SOFTMAX_LOSS but didn't have time to push a fix. Maybe now would be a good time to move it to loss_layers.hpp.

@shelhamer
Member

@sguada Oh, I actually took a different approach and gave it two top blobs max so it can report the loss and the actual softmax output too: shelhamer@adab413

What do you think? I could also move it to loss_layers.hpp then push to your branch for merge if you like.
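A sketch of how such a two-top declaration might read in a prototxt, assuming the referenced commit exposes the softmax probabilities as an optional second top; the layer and blob names are illustrative:

    layers {
      name: "loss"
      type: SOFTMAX_LOSS
      bottom: "fc8"
      bottom: "label"
      top: "loss"   # the multinomial logistic loss
      top: "prob"   # the softmax output, now optionally exposed as well
    }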

@sguada
Contributor Author

sguada commented Jun 26, 2014

@shelhamer that's even better; someone may well want both outputs. I think it is a good idea to move it to loss_layers.hpp where it belongs, even if that means adding some includes.

I have added you as a collaborator on my fork, so you should be able to push it now.

@shelhamer
Member

I have tested the examples and they seem fine. I will push my fix and the transplant of SoftmaxWithLoss to loss_layers.hpp soon and merge.

As a follow-up, how does everyone feel about introducing a common_layers.hpp for layers like Softmax, Flatten, Argmax etc. that aren't actually vision specific? Purely for the sake of organization of course.

shelhamer added a commit that referenced this pull request Jun 26, 2014
shelhamer merged commit 2cd155f into BVLC:dev Jun 26, 2014
@sguada
Contributor Author

sguada commented Jun 26, 2014

Thanks @shelhamer for the merge and the final touches.

@baimin1

baimin1 commented Jun 16, 2015

How do I download these 20 files?
