Made load_hd5 check blob dims by default, instead of reshaping. #4630

Merged
shelhamer merged 1 commit into BVLC:master from BlGene:load_hdf5_fix
Mar 8, 2017

@BlGene BlGene commented Aug 25, 2016

Reason:

Reshaping by default, as the load_hdf5 function does, can cause problems. In my case I was:

  1. loading weights to a training net with solver.net.load_hdf5(model_fn)
  2. Copying to solver with solver.test_nets[0].share_with(solver.net)
  3. Getting the following error:
    F0825 01:06:18.144207 21739 net.cpp:703] Check failed: target_blobs[j]->shape() == source_blob->shape() Cannot share param 0 weights from layer 'fc6'; shape mismatch. Source param shape is 4096 25088 (102760448); target param shape is 4096 512 7 7 (102760448)
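
To make the failure above concrete, here is a minimal sketch using only the two shapes quoted in the error message. It does not call Caffe; the variable names are illustrative, and the comments about which net holds which shape are my reading of the steps above. The point is that both shapes hold the same number of elements, so a loader that reshapes accepts the file, while the exact shape comparison in share_with rejects the result.

```python
from math import prod

# Shapes copied from the error message above.
source_shape = (4096, 25088)      # fc6 weights in the train net after load_hdf5
target_shape = (4096, 512, 7, 7)  # fc6 weights as the test net defines them

# The element counts agree, so a reshape during loading goes unnoticed...
same_count = prod(source_shape) == prod(target_shape)

# ...but the shapes themselves differ, which is what share_with checks.
same_shape = source_shape == target_shape

print(prod(source_shape), same_count, same_shape)
```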

This was very confusing, because the actual train net differed from the version shown in the log files. I would expect an error when calling load_hdf5. If I recall correctly, this would match the behavior of the protobuf serialization.

Open Questions:

  1. Is the error message sufficiently clear, or should hdf5_load_nd_dataset_helper return false so that the error is thrown where we know the layer for which the problem occurred?
  2. An alternative implementation would be to call hdf5_load_nd_dataset with a temporary blob and only copy it over to the current one if the shape is the same. This would move the changes to net.cpp:846, but it might take more memory.
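
A rough sketch of the two options, in plain Python rather than the actual C++ so that it runs without Caffe. Everything here is hypothetical: blobs are modelled as dicts with "shape" and "data" keys, and shapes_match, load_param and load_param_via_temp are made-up names, not Caffe functions.

```python
def shapes_match(stored_shape, blob_shape):
    """Option 1: the helper only reports the outcome; the caller raises."""
    return tuple(stored_shape) == tuple(blob_shape)


def load_param(layer_name, stored, blob):
    """Caller that knows the layer name and can include it in the error."""
    if not shapes_match(stored["shape"], blob["shape"]):
        raise ValueError(
            "Cannot load params for layer '%s'; shape mismatch. "
            "Source shape is %s; target shape is %s"
            % (layer_name, tuple(stored["shape"]), tuple(blob["shape"])))
    blob["data"] = list(stored["data"])


def load_param_via_temp(stored, blob):
    """Option 2: load into a temporary, copy over only if the shapes agree."""
    # The temporary holds a second copy of the data: this is the memory cost.
    temp = {"shape": tuple(stored["shape"]), "data": list(stored["data"])}
    if temp["shape"] != tuple(blob["shape"]):
        return False  # target blob is left untouched
    blob["data"] = temp["data"]
    return True
```

With option 1 the message can name the layer; with option 2 the target blob is never modified on a mismatch, at the price of the temporary copy.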

BlGene commented Aug 25, 2016

@erictzeng
(haven't looked at what this means for unit tests.)

BlGene commented Oct 21, 2016

This should fix unit tests.

BlGene commented Oct 21, 2016

The error is because Travis is failing.

@erictzeng erictzeng left a comment

Minor typo that could use fixing, but otherwise this looks good to me.

string source_shape_string = stream.str();

CHECK(blob_dims == blob->shape()) << "Cannot load blob from hdf5; shape "
<< "missmatch. Source shape is " << source_shape_string

Minor nit: "missmatch" -> "mismatch"

@BlGene BlGene force-pushed the load_hdf5_fix branch 2 times, most recently from 98a3232 to 95a436c Compare October 22, 2016 00:07

BlGene commented Oct 22, 2016

OK, I squashed. Pending tests, it's ready.

Size checks are needed for loading parameters to avoid strange bugs;
when loading data we continue to reshape.
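
One way to picture the split described in the commit message is a single loader with a switch between the two behaviours. This is only a sketch under assumptions: load_dataset and its reshape argument are hypothetical names in plain Python, and the blob is again modelled as a dict.

```python
def load_dataset(stored_shape, blob, reshape=False):
    """Adopt the stored shape when reshape is set, otherwise demand a match."""
    stored_shape = tuple(stored_shape)
    if reshape:
        # Data path: the blob takes whatever shape the file provides.
        blob["shape"] = stored_shape
    elif tuple(blob["shape"]) != stored_shape:
        # Parameter path: a differing shape is an error, not a silent reshape.
        raise ValueError(
            "Cannot load blob from hdf5; shape mismatch. "
            "Source shape is %s; target shape is %s"
            % (stored_shape, tuple(blob["shape"])))
    return blob["shape"]


# Parameters: checked by default.
weights = {"shape": (4096, 512, 7, 7)}
try:
    load_dataset((4096, 25088), weights)
except ValueError as err:
    print(err)

# Data: still reshaped, because the caller opts in.
batch = {"shape": (1, 1)}
print(load_dataset((32, 3, 224, 224), batch, reshape=True))
```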
@shelhamer shelhamer merged commit e687a71 into BVLC:master Mar 8, 2017
@BlGene BlGene deleted the load_hdf5_fix branch March 21, 2017 10:15