MNIST autoencoder example (sigmoid cross-entropy loss layer, sparse gaussian filler)

Conversation
Ready to merge, @shelhamer. The unit test problem with …

While I could propagate down to the 2nd input, the semantics of this layer (take the sigmoid of the first input and compute the cross-entropy error between the sigmoidal outputs and the second input, which is assumed to already be in the 0-1 range) seem like they would make it odd to use with weights below the second input, and it's wasteful to propagate down to the second input if there aren't any weights below. I think this should eventually be fixed somehow, e.g. by making the …
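A minimal standalone sketch of the loss described here may help readers following the thread. This is not Caffe's implementation; the function name and the per-element mean normalization are illustrative assumptions. The first input (the logits) is passed through a sigmoid, and cross-entropy is measured against the second input, which is assumed to already lie in [0, 1].

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Illustrative sigmoid cross-entropy: sigmoid the first input, then measure
// cross-entropy against the second input (a target already in [0, 1]).
// Written for clarity, not numerical stability.
double SigmoidCrossEntropy(const std::vector<double>& logits,
                           const std::vector<double>& targets) {
  double loss = 0.0;
  for (std::size_t i = 0; i < logits.size(); ++i) {
    const double p = 1.0 / (1.0 + std::exp(-logits[i]));  // sigmoid
    const double t = targets[i];                          // assumed in [0, 1]
    loss -= t * std::log(p) + (1.0 - t) * std::log(1.0 - p);
  }
  return loss / logits.size();  // mean over elements (assumed normalization)
}
```

With targets that are exactly 0 or 1 this reduces to the usual logistic loss on the logits.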
Ok, looks good. Thanks Jeff!
This seems like a good way to fix it to me. Agreed with not propagating to the 2nd input for now, for the reason given. For anyone with a model where both paths have parameters, they could hack in a field in the loss layer like …
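To make the "don't propagate to the 2nd input" choice concrete, here is a hedged sketch of a backward pass under that convention. It is not the actual Caffe code; the helper name and the mean normalization are assumptions. The gradient flows only to the logit input, using the standard sigmoid(x) - t form, and nothing is computed for the target input.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Illustrative backward pass: gradient only for the first (logit) input.
// No diff is produced for the second (target) input, matching the decision
// above not to propagate down to it.
void SigmoidCrossEntropyBackward(const std::vector<double>& logits,
                                 const std::vector<double>& targets,
                                 std::vector<double>* logit_diff) {
  logit_diff->resize(logits.size());
  for (std::size_t i = 0; i < logits.size(); ++i) {
    const double p = 1.0 / (1.0 + std::exp(-logits[i]));  // sigmoid
    (*logit_diff)[i] = (p - targets[i]) / logits.size();  // d(mean loss)/d(logit)
  }
}
```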
MNIST autoencoder example
I'd be very happy about some comments in mnist_autoencoder.prototxt or an accompanying readme about how exactly the autoencoder works. In particular, the roles of the two distinct loss layers (and why one of them operates on …) could use an explanation.
I agree with @moi90, a readme and tutorials would be very nice for anyone who wants to use them.
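Pending an actual readme, here is one plausible reading of why there are two loss-like computations, sketched as standalone code rather than quoted from the prototxt (the function and variable names are hypothetical): the sigmoid cross-entropy term is the training objective and consumes the raw decoder output, since that loss layer applies the sigmoid internally, while an L2 (Euclidean) error on the sigmoided reconstruction is computed only for reporting, which would match the reconstruction-error figures quoted in the description below.

```cpp
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Sketch of the two roles a reconstruction can play in this example:
// a cross-entropy training loss on the pre-sigmoid decoder output, and a
// reported L2 error on the sigmoided reconstruction.
std::pair<double, double> AutoencoderLosses(
    const std::vector<double>& decoder_output,  // pre-sigmoid reconstruction
    const std::vector<double>& data) {          // flattened input in [0, 1]
  double cross_entropy = 0.0;  // training objective
  double l2_error = 0.0;       // monitoring metric
  for (std::size_t i = 0; i < data.size(); ++i) {
    const double p = 1.0 / (1.0 + std::exp(-decoder_output[i]));  // sigmoid
    cross_entropy -= data[i] * std::log(p) + (1.0 - data[i]) * std::log(1.0 - p);
    const double d = p - data[i];
    l2_error += d * d;  // squared L2 distance, summed over pixels
  }
  return {cross_entropy, l2_error};
}
```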
This PR moves examples/lenet/ to examples/mnist/ and adds the necessary files to train an autoencoder on MNIST with the architecture of Hinton & Salakhutdinov [1], using SGD with no pre-training (e.g. via RBM). It uses a sparse Gaussian initialization (added to filler.hpp) as suggested by [2] as a strategy for training autoencoders via SGD* without pretraining. It uses a fixed LR of 0.0001, which could probably be greatly improved upon, but I haven't played with it much.

After 2 million iterations (which took a few hours on the GPU -- I didn't originally intend to train it this long, but this is what it was at when I came back to it) the test L2 reconstruction error was around 1.5-1.6. (For an idea of what this means, a reconstruction with 2 out of 784 pixels flipped from perfectly white to perfectly black, or vice versa, would have an L2 reconstruction error of 2.0.)
*actually [2] used Nesterov's accelerated gradient, but this is not currently implemented in Caffe and SGD seems to be fairly effective.
[1] http://www.cs.toronto.edu/~hinton/science.pdf
[2] http://jmlr.org/proceedings/papers/v28/sutskever13.pdf
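As a quick check of the flipped-pixel figure in the description, assuming the reported error is the plain sum of squared per-pixel differences (my reading of the 2.0 claim; a Euclidean loss layer may apply additional normalization that changes the scale):

```cpp
#include <cstddef>
#include <iostream>
#include <vector>

// Flipping 2 of 784 pixels between perfectly black (0) and perfectly white (1)
// contributes 2 * (1.0)^2 = 2.0 of summed squared error.
int main() {
  std::vector<double> truth(784, 0.0), recon(784, 0.0);
  recon[0] = 1.0;  // two flipped pixels
  recon[1] = 1.0;
  double err = 0.0;
  for (std::size_t i = 0; i < truth.size(); ++i) {
    const double d = recon[i] - truth[i];
    err += d * d;
  }
  std::cout << "L2 reconstruction error: " << err << "\n";  // prints 2
  return 0;
}
```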