Use layer stages for all-in-one nets #628
Conversation
Force-pushed from ffd6258 to c005f75
Added some documentation. Updated with a few suggestions from @jmancewicz - thanks!
Force-pushed from c005f75 to 9d7c33b
if layer.type == 'Softmax':
    found_softmax = True
    break
assert found_softmax, 'Your deploy network is missing a Softmax layer! Read the documentation for custom networks and/or look at the standard networks for examples.'
Did I miss the bit in the documentation where it is explained to the user that a softmax layer is needed to display a probability distribution?
- Use layer.include.stage to specify train/val/deploy all in a single .prototxt description (see the sketch below)
- Stop automatically creating Softmax layers in deploy networks for classification jobs
- Only set inner_product_param.num_output for classification jobs if it was unset
- Update the standard networks
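For illustration, here is a minimal sketch of how stage rules can drive an all-in-one description (assuming pycaffe's caffe_pb2 is importable; the layer names and the layers_for_stage() helper are made up for this example, and DIGITS' own filtering also honours phase and not_stage rules):

# A minimal sketch: parse an all-in-one prototxt and keep only the layers
# whose include/exclude rules match a requested stage.
from caffe.proto import caffe_pb2
from google.protobuf import text_format

ALL_IN_ONE = """
layer { name: "train_data" type: "Data" top: "data" top: "label"
        include { stage: "train" } }
layer { name: "val_data" type: "Data" top: "data" top: "label"
        include { stage: "val" } }
layer { name: "fc" type: "InnerProduct" bottom: "data" top: "fc" }
layer { name: "loss" type: "SoftmaxWithLoss" bottom: "fc" bottom: "label"
        exclude { stage: "deploy" } }
layer { name: "softmax" type: "Softmax" bottom: "fc" top: "softmax"
        include { stage: "deploy" } }
"""

def layers_for_stage(net, stage):
    """Return the names of layers whose rules allow the given stage."""
    kept = []
    for layer in net.layer:
        if layer.include and not any(stage in rule.stage for rule in layer.include):
            continue  # include rules exist, but none names this stage
        if any(stage in rule.stage for rule in layer.exclude):
            continue  # an exclude rule names this stage
        kept.append(layer.name)
    return kept

net = caffe_pb2.NetParameter()
text_format.Merge(ALL_IN_ONE, net)
print(layers_for_stage(net, 'deploy'))  # ['fc', 'softmax']

Filtering the same NetParameter with stage 'train' would keep the training data layer and the loss instead of the Softmax.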
Force-pushed from 9d7c33b to 2f0873c
digits/model/tasks/caffe_train.py (outdated)
# Check to see if top_k > num_categories
if (layer.accuracy_param.HasField('top_k') and
        layer.accuracy_param.top_k >= num_categories):
    self.logger.warning(
self is not defined here, and the layer isn't actually being removed.
Whoops, that was sloppy. Thanks for the review! Fixed.
Surprisingly, you can edit an array while enumerating over it. Python is so convenient sometimes.
>>> a = range(10)
>>> for i, x in enumerate(a):
... if (x%3 == 0):
... del a[i]
...
>>> a
[1, 2, 4, 5, 7, 8]
Oops, I misspoke. The way I implemented it will break if two consecutive layers both have an invalid top_k, because the second one won't be processed.
I need to fix that tomorrow...
>>> a = range(10)
>>> for i, x in enumerate(a):
... print 'Processing %d (%d) ...' % (i, x)
... if (x%3 == 0):
... del a[i]
...
Processing 0 (0) ...
Processing 1 (2) ...
Processing 2 (3) ...
Processing 3 (5) ...
Processing 4 (6) ...
Processing 5 (8) ...
Processing 6 (9) ...
>>> a
[1, 2, 4, 5, 7, 8]
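For reference, a common way to avoid the skipped-element problem (just a sketch of the general pattern, not necessarily the fix that ended up in the PR) is to rebuild the list rather than deleting from it while enumerating:

>>> a = range(10)
>>> a = [x for x in a if x % 3 != 0]  # keep only the elements we want
>>> a
[1, 2, 4, 5, 7, 8]

Iterating over a copy (for x in list(a)) and removing from the original works too.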
Great PR! Looks good except for the small omission in the processing of accuracy layers. I've also pushed #632 to update examples.
<li>
The <i>num_output</i> for each <b>InnerProduct</b> layer which is a network output gets set to the number of labels in the chosen dataset.
The Deploy network <b>must contain a Softmax layer</b>.
This should produce the only network output.
I think GoogleNet does not abide by this principle. Do the auxiliary classifiers need to be sent to SilenceLayers in the deploy network?
Actually it does. I pruned the auxiliary classifiers from the deploy network (solving #335).
Oh yes indeed, my mistake, sorry.
Use layer stages for all-in-one nets
Since NVIDIA#628, DIGITS does not overwrite the number of outputs in the last fully-connected layer if it is already set to a value. If the user accidentally specifies too large a `num_output`, inference will fail, as reported in NVIDIA#678. This change causes classification outputs to be ignored if there is no corresponding label. Close NVIDIA#678.
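Roughly, the "ignore outputs without a label" behaviour amounts to the following sketch (the label_scores() helper is hypothetical, not DIGITS' actual code):

# Hypothetical helper, not DIGITS' actual implementation: pair each label
# with its score; zip() stops at the shorter sequence, so scores beyond the
# last label are silently ignored instead of breaking inference.
def label_scores(scores, labels):
    return list(zip(labels, scores))

# e.g. num_output was accidentally set to 4 but the dataset has only 3 labels
print(label_scores([0.70, 0.20, 0.05, 0.05], ['cat', 'dog', 'bird']))
# [('cat', 0.7), ('dog', 0.2), ('bird', 0.05)]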

Close #605, close #623, close #335
TODO: