This repository was archived by the owner on Jan 7, 2025. It is now read-only.


Changes to layer exclusion prefixes and Softmax-related code [DON'T MERGE] #623

Closed
lukeyeager wants to merge 6 commits into NVIDIA:master from lukeyeager:layer-exclusion-prefixes

Conversation

@lukeyeager (Member)

Close #605, close #335

NOTE: this will break networks which used to work in DIGITS

Changes:

  1. Add layer exclusion to classification nets (originally in #605, "Layer exclusion naming convention applied to classification nets [DON'T MERGE YET]"); see the sketch after this list
  2. Stop automatically creating Softmax layers in the deploy prototxt when SoftmaxWithLoss layers were present in the original prototxt (since you can do it manually with prefixes now)
  3. Only set inner_product_param.num_output when it was unset in the original prototxt
  4. Update standard networks to use the new features
    1. Prune useless GoogLeNet layers from deploy prototxt (Automatically prune GoogleNet useless branches in the deploy #335)
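For illustration only, a rough Python sketch of what the layer-name prefix convention amounts to when the deploy network is generated. This is not the code in this PR; it assumes pycaffe's caffe_pb2 bindings, and the helper name is hypothetical.

from caffe.proto import caffe_pb2

def strip_non_deploy_layers(net_param):
    # Hypothetical helper: return a copy of net_param (a caffe_pb2.NetParameter)
    # with layers whose names carry the 'train_' or 'val_' prefix removed,
    # since a deploy network does not need them.
    deploy = caffe_pb2.NetParameter()
    deploy.CopyFrom(net_param)
    del deploy.layer[:]
    for layer in net_param.layer:
        if not layer.name.startswith(('train_', 'val_')):
            deploy.layer.add().CopyFrom(layer)
    return deploy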

TODO:

  • Upgrade old internal network definitions
  • Documentation

@lukeyeager force-pushed the layer-exclusion-prefixes branch 2 times, most recently from a074edb to 5a635ca (March 9, 2016)
@lukeyeager (Member, Author)

Fixed tests and fixed a bug

@gheinrich (Contributor)

If the Softmax layer is omitted from the deploy network, the results look a bit funny:
[screenshot: lenet-without-softmax]
Can we raise an error if a classification network is created without a Softmax layer in (at least) the deploy network? Or, if we want to allow users to omit it, we could detect this and display the inference results as scores rather than probabilities. I'd vote for making it mandatory, though.
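For illustration, a minimal sketch of such a check (a hypothetical helper, not part of this PR), assuming the generated deploy prototxt is on disk and pycaffe's caffe_pb2 bindings are available:

from caffe.proto import caffe_pb2
from google.protobuf import text_format

def check_deploy_has_softmax(deploy_prototxt_path):
    # Hypothetical check: raise if a classification deploy network has no Softmax layer.
    net = caffe_pb2.NetParameter()
    with open(deploy_prototxt_path) as f:
        text_format.Merge(f.read(), net)
    if not any(layer.type == 'Softmax' for layer in net.layer):
        raise ValueError('Classification networks must include a Softmax layer '
                         'in the deploy network; without one, inference results '
                         'are unnormalized scores rather than probabilities.')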


gheinrich and others added 5 commits on March 10, 2016. The commit messages:

  • In case someone hits a problem like the one mentioned in NVIDIA#601 for a classification network.
  • Once Caffe implements input layers and phase control from Python, we should be able to remove those workarounds.
  • Important changes:
      * Only set inner_product_param.num_output when it was unset
    Also:
      * Stop trying to calculate network outputs (this was only needed for setting inner_product_param.num_output)
      * Ensure that layer prefix exclusion works the same way for both generic and classification nets
      * A little refactoring to make the generic and classification code paths more similar
  • Makes network specification more explicit
  • Explain how inner_product_param.num_output gets filled in automatically when missing (see the sketch below)
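As a rough illustration of the num_output behaviour described in these commits (a hypothetical helper, not the actual DIGITS code; net_param is a parsed caffe_pb2.NetParameter and num_classes comes from the dataset):

def fill_num_output(net_param, num_classes):
    # Fill in inner_product_param.num_output from the dataset's class count,
    # but only for InnerProduct layers where the user left it unset.
    for layer in net_param.layer:
        if layer.type == 'InnerProduct' and \
                not layer.inner_product_param.HasField('num_output'):
            layer.inner_product_param.num_output = num_classes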
@lukeyeager (Member, Author)

Can we raise an error if a classification network is created without a Softmax layer in (at least) the deploy network?

Yes, that makes sense for classification networks. Whenever we get around to merging "classification" and "generic" we can do something more clever. But that's the right solution for now.

@lukeyeager (Member, Author)

As I'm thinking more about this, should we ask users to differentiate between Train/Val/Deploy using layer.include.stage instead of layer.name prefixes? Since stages aren't fully supported yet, we'd still need to parse the network definition and convert it into train_val.prototxt and deploy.prototxt for now. But we could get users to start learning the stage syntax now, and remove the internal hackery once Caffe fully supports stages.

Example of an all-in-one network specified using layer.include.stage:
https://github.com/lukeyeager/caffe/blob/ed2621d775/python/caffe/test/test_net.py#L155-L206

PR giving greater stage support to Caffe:
BVLC/caffe#3736

# Using layer name prefixes
layer {
  name: "train_loss"
  type: "SoftmaxWithLoss"
}
layer {
  name: "deploy_softmax"
  type: "Softmax"
}
# Using stages
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  include { stage: "train" }
  include { stage: "val" }
}
layer {
  name: "softmax"
  type: "Softmax"
  include { stage: "deploy" }
}
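For comparison, a rough sketch (not part of this PR, and simplified: it ignores phase, level and exclude rules) of how an all-in-one definition using include.stage could be filtered into a deploy network until Caffe supports stages natively:

from caffe.proto import caffe_pb2

def filter_layers_by_stage(net_param, stage):
    # Keep only layers whose include rules list the given stage.
    # A layer with no include rules is kept for every stage.
    filtered = caffe_pb2.NetParameter()
    filtered.CopyFrom(net_param)
    del filtered.layer[:]
    for layer in net_param.layer:
        if not layer.include or any(stage in rule.stage for rule in layer.include):
            filtered.layer.add().CopyFrom(layer)
    return filtered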

@lukeyeager force-pushed the layer-exclusion-prefixes branch from 5a635ca to 96fe5d5 (March 10, 2016)
@lukeyeager changed the title from "Changes to layer exclusion prefixes and Softmax-related code" to "Changes to layer exclusion prefixes and Softmax-related code [DON'T MERGE]" (March 10, 2016)
@gheinrich (Contributor)

That sounds like a great idea. Is the stage syntax already established somewhere? I haven't found definitions of the train, val and deploy stages in Caffe.

@lukeyeager (Member, Author)

Yeah, I think that's what we need to do. I'm working on it now.

The number of ways you can specify the NetState is somewhat staggering.

https://github.com/NVIDIA/caffe/blob/v0.14.2/src/caffe/proto/caffe.proto#L330-L337
https://github.com/NVIDIA/caffe/blob/v0.14.2/src/caffe/proto/caffe.proto#L258-L274

DIGITS definitely can't support all of those options. We'll have to keep it to a subset.
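To make that concrete, Caffe decides whether an include/exclude rule applies roughly as follows (a simplified Python rendering of Net::StateMeetsRule, not DIGITS code; state is a caffe_pb2.NetState and rule is a caffe_pb2.NetStateRule):

def state_meets_rule(state, rule):
    # A rule matches only if every field it sets agrees with the current NetState.
    if rule.HasField('phase') and rule.phase != state.phase:
        return False
    if rule.HasField('min_level') and state.level < rule.min_level:
        return False
    if rule.HasField('max_level') and state.level > rule.max_level:
        return False
    if any(s not in state.stage for s in rule.stage):
        return False
    if any(s in state.stage for s in rule.not_stage):
        return False
    return True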

@lukeyeager (Member, Author)

Closing in favor of #628. I decided to make a new PR since it's such a different approach.
