New tutorial on how to create a new custom layer in Gluon #10607

piiswrong merged 4 commits into apache:master
Conversation
@Ishitori you are missing the download placeholder at the end of your tutorial and you haven't updated the tutorials/index.md file
We already have one here http://gluon.mxnet.io/chapter03_deep-neural-networks/custom-layer.html |
thomelane left a comment:
I learnt a lot! Mostly grammatical changes, and a few other questions about parameters.
> While Gluon API for Apache MxNet comes with [a decent number of predefined layers](https://mxnet.incubator.apache.org/api/python/gluon/nn.html), at some point one may find that a new layer is needed. Adding a new layer in Gluon API is straightforward, yet there are a few things that one needs to keep in mind.
> In this article, I will cover how to create a new layer from scratch, how to use it, what are possible pitfalls and how to avoid them.
Did we decide on we? Unless you were saying what instance/package you personally were using to run the code.
> ## The simplest custom layer
> To create a new layer in Gluon API, one must create a class that inherits from [Block](https://mxnet.incubator.apache.org/api/python/gluon/gluon.html#mxnet.gluon.Block) class. This class provides the most basic functionality, and all predefined layers inherit from it directly or via other subclasses. Because each layer in Apache MxNet inherits from `Block`, the words "layer" and "block" are used interchangeably inside of the Apache MxNet community.
> The only instance method needed to be implemented is [forward()](https://mxnet.incubator.apache.org/api/python/gluon/gluon.html#mxnet.gluon.Block.forward), which defines what exactly your layer is going to do during forward propagation. Notice, that it doesn't require to provide what the block should do during backpropagation. Backpropagation pass for blocks is done by Apache MxNet for you.
The only instance method that needs to be implemented
Does `__init__` count as a method?
during the forward pass.
Notice, that it's not required to provide what the block should do during backpropagation.
Apache MXNet performs backpropagation automatically for you.
Fixed these.
Yes, docs.python.org defines `__init__()` as a method with a reserved name.
> ```python
>     return (x - nd.min(x)) / (nd.max(x) - nd.min(x))
> ```
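For readers skimming the diff, here is a minimal, hedged sketch of a complete `Block`-based layer around that quoted `return` line; the class name `NormalizationLayer` is assumed for illustration and may differ from the tutorial's:

```python
import mxnet as mx
from mxnet import nd, gluon

class NormalizationLayer(gluon.Block):
    """Scales every input tensor to the [0, 1] range."""
    def __init__(self, **kwargs):
        super(NormalizationLayer, self).__init__(**kwargs)

    def forward(self, x):
        # Only the forward pass is defined; MXNet derives the backward pass
        return (x - nd.min(x)) / (nd.max(x) - nd.min(x))

# Usage: the layer behaves like any predefined Gluon block
layer = NormalizationLayer()
print(layer(nd.array([1, 2, 3, 4, 5])))  # [0. 0.25 0.5 0.75 1.]
```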
> The rest of methods of the `Block` class are already implemented, and majority of them are used to work with parameters of a block. There is one very special method named [hybridize()](https://mxnet.incubator.apache.org/api/python/gluon/gluon.html#mxnet.gluon.Block.hybridize), though, which I am going to cover before moving to a more complex example of a custom layer.
A bit more explanation needed here; "work with parameters of block".
The rest of methods -> The other methods
There is one very special method named hybridize(), though, which I am going to cover before moving to a more complex example of a custom layer. -> We will now discuss a special method called hybridize() before moving on to more complex examples of custom layers.
> Looking into the implementation of [existing layers](https://mxnet.incubator.apache.org/api/python/gluon/nn.html), one may find that more often a block inherits from a [HybridBlock](https://mxnet.incubator.apache.org/api/python/gluon/gluon.html#mxnet.gluon.HybridBlock), instead of directly inheriting from `Block` class.
> The reason for that is that `HybridBlock` allows to write custom layers that can be used in imperative programming as well as in symbolic programming. It is convenient to support both ways, because of the different values these programming models bring. The imperative programming eases the debugging of the code - one can use regular debugging tools available in modern IDEs to go line by line through the computation. The symbolic programming provides faster execution speed, but is harder to debug. You can learn more about the difference between symbolic vs. imperative programming from [this article](https://mxnet.incubator.apache.org/architecture/program_model.html).
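As a rough sketch of the difference, the same normalization layer written as a `HybridBlock` might look like the following; note the `F` argument, which MXNet binds to `mxnet.ndarray` in imperative mode and to `mxnet.symbol` after hybridization (explicit `broadcast_*` operators are used because symbolic arithmetic does not broadcast implicitly):

```python
import mxnet as mx
from mxnet import nd, gluon

class NormalizationHybridLayer(gluon.HybridBlock):
    def __init__(self, **kwargs):
        super(NormalizationHybridLayer, self).__init__(**kwargs)

    def hybrid_forward(self, F, x):
        # F is mx.nd before hybridize() is called and mx.sym after
        return F.broadcast_div(F.broadcast_sub(x, F.min(x)),
                               F.broadcast_sub(F.max(x), F.min(x)))

layer = NormalizationHybridLayer()
layer.hybridize()  # from now on, calls run through the compiled symbolic graph
print(layer(nd.array([1, 2, 3, 4, 5])))
```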
> * The calculation of dot product is done using `F.FullyConnected()` method instead of `F.dot()` method. The one was chosen over the other because the former supports automatic inferring of input shapes while the latter doesn't. This is important to know, if one doesn't want to hard code all the shapes. The best way to learn which operators support automatic inference of input shapes at the moment is browsing the C++ implementation of operators to see if one uses a method `SHAPE_ASSIGN_CHECK();` for `in_shape`. The output shape is always inferred automatically.
> * `hybrid_forward()` method signature has changed. It accepts two new arguments: `weights` and `scales`.
> The last peculiarity is due to support of imperative and symbolic programming by `HybridBlock`. During training phase, parameters are passed to the layer by Apache MxNet framework as additional arguments to the method, because they might need to be converted to `Symbols` depending on if the layer was hybridized. One shouldn't use parameters from the class instance directly or from `self.params.get()` method in `hybrid_forward()`, except to get shapes of parameters.
Some more explanation of this would be good. As you explained to me in person. About the self.weights being NDArray, but when hybridize these parameters will still be NDArray and the symbol equivalents will be passed as kwargs.
Yes, I added a few print statements to show how exactly it looks like + small note explaining it once again.
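For context in this thread, here is a hedged sketch of how such a parameterized layer could be declared; the constructor arguments `hidden_units` and `scales` and the deferred-shape trick are assumptions for illustration, not necessarily the tutorial's final code:

```python
import mxnet as mx
from mxnet import gluon

class NormalizationHybridLayer(gluon.HybridBlock):
    def __init__(self, hidden_units, scales):
        super(NormalizationHybridLayer, self).__init__()
        with self.name_scope():
            # Trainable parameter; the 0 in the shape defers input-dim inference
            self.weights = self.params.get('weights',
                                           shape=(hidden_units, 0),
                                           allow_deferred_init=True)
            # Constant, non-trainable parameter
            self.scales = self.params.get('scales',
                                          shape=scales.shape,
                                          init=mx.init.Constant(scales.asnumpy().tolist()),
                                          differentiable=False)

    def hybrid_forward(self, F, x, weights, scales):
        # weights and scales arrive as arguments (NDArray or Symbol), not via self
        normalized = F.broadcast_div(F.broadcast_sub(x, F.min(x)),
                                     F.broadcast_sub(F.max(x), F.min(x)))
        weighted = F.FullyConnected(normalized, weights,
                                    num_hidden=self.weights.shape[0],
                                    no_bias=True)
        return F.broadcast_mul(scales, weighted)
```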
> In the example below, we run one forward and one backward passes to show that `weights` do change while `scales` parameter doesn't change during the training.
one forward and one backward passes -> pass
> ```python
> for key, value in hybridlayer_params.items():
>     print('{} = {}\n'.format(key, value.data()))
> ```
Code block is quite hard to read. Could add double space here, and remove other spaces, or create multiple blocks. Also push out comments further so you don't need to wrap code as often.
| """ | ||
| Helper function to print out the state of parameters of NormalizationHybridLayer | ||
| """ | ||
| print(title) |
print "=========== Parameters after {} pass ===========\n" in function, then pass forward or backward as argument
> ## Conclusion
> One important quality of a Deep learning framework is extensibility. Empowered by flexible abstractions, like `Block` and `HybridBlock`, one can easily extend Apache MxNet functionality to match its needs.
one can easily extend Apache MxNet functionality to match its needs. -> you can easily extend Apache MXNet's functionality as you need.
'One' is a bit too formal in my opinion.
@piiswrong, while the topic is the same, the styles and level of detail are different, so it is hard to blend them together. And since they are in different repositories, they can co-exist.
* Add how to create a new custom layer in Gluon
* Fix code review comments
* Fix code review comments
* Add check for custom layer tutorial
Description
This pull request adds a new tutorial explaining how to create a new custom layer using Gluon API. It doesn't introduce any changes to the code base.