Load weights from multiple caffemodels. #1456
Conversation
This is a helpful generalization for certain uses. Note that this can … The level and stage rules for layer inclusion / exclusion are helpful for …
With the advances of pycaffe one can copy weights from several models with Net.copy_from. I like preparing the nets through Python for its generality, but copying weights from multiple nets could be a useful special case. However, I'm inclined to keep the …
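For reference, a minimal pycaffe sketch of that approach might look like the following (the prototxt and caffemodel names are illustrative, and it assumes the current pycaffe `Net` interface):

```python
import caffe

# Build the combined net from its prototxt (file names here are made up).
net = caffe.Net('combined_net.prototxt', caffe.TRAIN)

# Each copy_from call copies weights into the layers whose names match
# the given caffemodel; layers absent from that model are left untouched.
net.copy_from('encoder_only.caffemodel')
net.copy_from('decoder_only.caffemodel')
```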
I think it's useful and non-intrusive. I'm not a huge fan of the interface (commas in a flag argument), but I can't think of anything better (gflags doesn't let you specify the same flag multiple times and give you a list).
…-caffemodels: Load weights from multiple models by listing comma-separated caffemodels as the `-weights` arg to the caffe command.
@jyegerlehner thanks for the convenient multi-model fine-tuning initialization. I merged this to master in a9bf7b9 (and collapsed this to a single commit).
At least one use case requiring this is doing layerwise or "stacked" autoencoder training: first I train the newly-added encoder and decoder layers by themselves (using features extracted from the net that contains only the previously-trained layers). Then, when I begin to train the combined network, it needs to pull weights from two different caffemodel files. So this change allows the `--weights` parameter to be a comma-separated list of caffemodels instead of just a single caffemodel.

The other code change is that the test nets are also initialized from the provided caffemodels, not just the train net. So if the trained net is a subset of the test net, some of the test nets' layers would otherwise have uninitialized weights, whereas with this change they are initialized from the specified models.
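For example, the combined net can then be fine-tuned with something like `caffe train -solver solver.prototxt -weights encoder_only.caffemodel,decoder_only.caffemodel` (file names are just illustrative). A rough pycaffe sketch of the intended behavior, not the actual tools/caffe.cpp code, would be:

```python
import caffe

# Illustrative names; the real change lives in the caffe command-line tool.
solver = caffe.SGDSolver('solver.prototxt')
weights = 'encoder_only.caffemodel,decoder_only.caffemodel'

for model in weights.split(','):
    # Copy matching layers into the train net...
    solver.net.copy_from(model)
    # ...and into every test net, so their shared layers start from the
    # trained weights instead of being left uninitialized.
    for test_net in solver.test_nets:
        test_net.copy_from(model)
```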