Allow arbitrary output for HDF5 layer by pluskid · Pull Request #1183 · BVLC/caffe

pluskid · 2014-09-29T21:22:04Z

Modified the HDF5 layer to allow arbitrary outputs (no longer only label and data). This is useful, for example, when there are multiple streams of data: say one part of the data needs to go through convolution while the other part should go directly to a higher fully connected layer. In this case, people can split the data and store each part as a separate "dataset" in the HDF5 file.

The test case is also modified to cover the new behavior. The modification is backward compatible (except that we no longer constrain the dimensions of the label dataset).
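To make the multi-stream idea concrete, here is a minimal C++ sketch of looking up one dataset per top blob name. It is not the code from this PR: `DatasetSketch`, `FileSketch`, and `LoadTops` are hypothetical names, the HDF5 file is replaced by an in-memory map, and the requirement that all datasets share the same number of examples is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical in-memory stand-in for one HDF5 dataset.
struct DatasetSketch {
  std::vector<int> shape;     // shape[0] is the number of examples
  std::vector<float> values;  // flattened contents (unused by the lookup)
};

// Hypothetical stand-in for an HDF5 file: dataset name -> dataset.
typedef std::map<std::string, DatasetSketch> FileSketch;

// Looks up one dataset per top blob name, in order.
// Returns false if a name is missing, a dataset has no shape, or the
// datasets disagree on the number of examples (an assumed constraint).
bool LoadTops(const FileSketch& file,
              const std::vector<std::string>& top_names,
              std::vector<const DatasetSketch*>* out) {
  assert(out != NULL);
  out->clear();
  for (size_t i = 0; i < top_names.size(); ++i) {
    FileSketch::const_iterator it = file.find(top_names[i]);
    if (it == file.end() || it->second.shape.empty()) return false;
    if (!out->empty() && it->second.shape[0] != (*out)[0]->shape[0]) {
      return false;
    }
    out->push_back(&it->second);
  }
  return !out->empty();
}
```

With a file holding, say, an image dataset and a flat feature dataset, a layer with two top blobs named after them would receive both, and each could feed a different part of the network.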

jeffdonahue · 2014-09-29T21:36:48Z

This is great -- I've wanted it for a while; thanks @pluskid!

Tying the HDF5 dataset names to the top blob names seems like a natural way to do it, but it will break models for users that assume the HDF5 dataset names will always be "data" and "label" and then name their top blobs something else. Perhaps if taking the blob name to be the HDF5 dataset name results in a failed load, we should fall back to the existing two names? Or we could just accept the breaking change in this case.

jeffdonahue · 2014-09-29T21:39:37Z

include/caffe/data_layers.hpp

instead of removing this line, change it to:

virtual inline int MinTopBlobs() const { return 1; }

this will require the layer to have at least 1 output
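The difference between an exact and a minimum top-blob count can be sketched as follows. This is a simplified stand-in, not Caffe's actual `Layer` class: `LayerSketch`, `OldHDF5Sketch`, `NewHDF5Sketch`, and `TopCountOk` are hypothetical names, and the convention that -1 means "no constraint" is assumed here for illustration.

```cpp
#include <cassert>

// Simplified stand-in for the blob-count hooks on a layer.
// A return value of -1 is taken to mean "no constraint".
struct LayerSketch {
  virtual ~LayerSketch() {}
  virtual int ExactNumTopBlobs() const { return -1; }
  virtual int MinTopBlobs() const { return -1; }

  // Returns true when num_top satisfies every declared constraint.
  bool TopCountOk(int num_top) const {
    assert(num_top >= 0);
    if (ExactNumTopBlobs() >= 0 && num_top != ExactNumTopBlobs()) return false;
    if (MinTopBlobs() >= 0 && num_top < MinTopBlobs()) return false;
    return true;
  }
};

// Behavior before this PR, as described in the thread: exactly two outputs.
struct OldHDF5Sketch : LayerSketch {
  virtual int ExactNumTopBlobs() const { return 2; }
};

// Behavior suggested in the review: any number of outputs, but at least one.
struct NewHDF5Sketch : LayerSketch {
  virtual int MinTopBlobs() const { return 1; }
};
```

Under these assumptions, a three-output layer definition is rejected by the old sketch and accepted by the new one, while a zero-output definition is still rejected.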

pluskid · 2014-09-29T21:46:21Z

@jeffdonahue I am OK with adding a fallback to data and label, but personally I think this makes the code a bit messy -- what if we have three datasets to load? What if there was just a typo in the user's model file?

And since the HDF5 layer is fairly new and there is no detailed documentation yet on the old behavior, I think it won't be too difficult for users to adopt the new behavior. Actually, when I started using the HDF5 layer, I simply followed the IPython notebook example and named the two datasets data and label, and I didn't know the two names were hard-coded until I read the Caffe code.

jeffdonahue · 2014-09-29T21:54:48Z

> @jeffdonahue I am OK with adding a fallback to data and label, but personally I think this makes the code a bit messy -- what if we have three datasets to load? What if there was just a typo in the user's model file?

I agree the code would be a little messy. I was thinking the fallback would only happen if the user had exactly 2 top blobs -- i.e., their hdf5data layer specification would work using the current Caffe code (otherwise it would fail as usual). If it failed just due to a typo in the user's model file, the fallback presumably wouldn't succeed either (or at least it's hard for me to come up with a scenario when the fallback would succeed but should have failed).

Anyway, I think I'm personally fine with breaking the current workflow since this approach is cleaner and, as you said, it's kind of messy to fix and the current workflow is basically undocumented.
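For reference, the fallback discussed above (which was not adopted) could look roughly like the sketch below. It is an illustration only: `FakeHdf5File` and `ResolveDatasetNames` are hypothetical names, and the file is modeled as an in-memory map rather than read through the HDF5 library.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for an HDF5 file: dataset name -> flattened values.
typedef std::map<std::string, std::vector<float> > FakeHdf5File;

// Decides which dataset each top blob should read from.
// Returns an empty vector when resolution fails.
std::vector<std::string> ResolveDatasetNames(
    const FakeHdf5File& file, const std::vector<std::string>& top_names) {
  // Preferred path: each top blob name is also the dataset name.
  bool all_found = !top_names.empty();
  for (size_t i = 0; i < top_names.size(); ++i) {
    if (file.count(top_names[i]) == 0) all_found = false;
  }
  if (all_found) return top_names;

  // Fallback: only when there are exactly 2 top blobs, i.e. the layer
  // definition would have worked with the old hard-coded names.
  if (top_names.size() == 2 && file.count("data") > 0 &&
      file.count("label") > 0) {
    std::vector<std::string> legacy;
    legacy.push_back("data");
    legacy.push_back("label");
    return legacy;
  }
  return std::vector<std::string>();
}
```

In this sketch, a typo in a three-output model still fails, while a two-output model with arbitrary top names keeps working against a file that contains data and label.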

pluskid · 2014-09-29T22:11:36Z

Hi @jeffdonahue, thank you for the code review! I fixed them. Sorry my local cluster is temporarily broken so I'm pushing the code directly to see the Travis CI build output.

jeffdonahue · 2014-09-29T22:18:17Z

src/caffe/layers/hdf5_data_layer.cu

remove these two lines -- unused now (same in the CPU version if they are there as well)

jeffdonahue · 2014-09-29T22:20:49Z

Thanks for the fixes -- see my additional comment above. Once that's fixed, if we can hear from @shelhamer, @longjon, @sergeyk, or one of the other core devs that it's ok to break how the current HDF5 data layer works, I should be able to merge this. (Or if not ok, will need to add the fallback before merging.)

sergeyk · 2014-09-29T22:33:34Z

Fine to break current implicit name expectation, provided the example still works.

shelhamer · 2014-09-29T22:34:19Z

Agreed.

pluskid · 2014-09-30T00:11:27Z

all fixed now!

jeffdonahue · 2014-09-30T18:10:07Z

Whoops, this PR should have been made to dev. I rebased this on dev and merged it (24c9d8a). Thanks again @pluskid!

pluskid added 4 commits September 29, 2014 17:16

make HDF5 layer support multiple data output

4b7770d

remove the restriction that HDF5 layer generates exactly 2 outputs

fce42aa

added test case to cover new HDF5 behavior

6924251

fix style errors reported by lint.

2a1e929

jeffdonahue reviewed Sep 29, 2014

small fixes according to jeffdonahue

add0553

jeffdonahue reviewed Sep 29, 2014

clean up unused variables.

645dc3f

pluskid force-pushed the hdf5layer branch from 3b089da to 645dc3f on September 29, 2014 23:57

jeffdonahue added a commit that referenced this pull request Sep 30, 2014

Merge pull request #1183 from pluskid/hdf5layer

24c9d8a

jeffdonahue closed this Sep 30, 2014

pluskid deleted the hdf5layer branch October 5, 2014 03:58

mitmul pushed a commit to mitmul/caffe that referenced this pull request Oct 11, 2014

Merge pull request BVLC#1183 from pluskid/hdf5layer

44f84ec

RazvanRanca pushed a commit to RazvanRanca/caffe that referenced this pull request Nov 4, 2014

Merge pull request BVLC#1183 from pluskid/hdf5layer

6795e99

shelhamer mentioned this pull request Jan 9, 2015

Indirection layer #1414

Closed

pluskid wants to merge 6 commits into BVLC:master from pluskid:hdf5layer


4 participants
