New tutorial on how to create a new custom layer in Gluon #10607

piiswrong merged 4 commits into apache:master
Conversation
@Ishitori you are missing the download placeholder at the end of your tutorial and you haven't updated the tutorials/index.md file
We already have one here http://gluon.mxnet.io/chapter03_deep-neural-networks/custom-layer.html |
thomelane left a comment:
I learnt a lot! Mostly grammatical changes, and a few other questions about parameters.
> While Gluon API for Apache MxNet comes with [a decent number of predefined layers](https://mxnet.incubator.apache.org/api/python/gluon/nn.html), at some point one may find that a new layer is needed. Adding a new layer in Gluon API is straightforward, yet there are a few things that one needs to keep in mind.
> In this article, I will cover how to create a new layer from scratch, how to use it, what are possible pitfalls and how to avoid them.
Did we decide on we? Unless you were saying what instance/package you personally were using to run the code.
> ## The simplest custom layer
> To create a new layer in Gluon API, one must create a class that inherits from [Block](https://mxnet.incubator.apache.org/api/python/gluon/gluon.html#mxnet.gluon.Block) class. This class provides the most basic functionality, and all predefined layers inherit from it directly or via other subclasses. Because each layer in Apache MxNet inherits from `Block`, the words "layer" and "block" are used interchangeably inside of the Apache MxNet community.
> The only instance method needed to be implemented is [forward()](https://mxnet.incubator.apache.org/api/python/gluon/gluon.html#mxnet.gluon.Block.forward), which defines what exactly your layer is going to do during forward propagation. Notice, that it doesn't require to provide what the block should do during backpropagation. Backpropagation pass for blocks is done by Apache MxNet for you.
The only instance method that needs to be implemented
Does `__init__` count as a method?
during the forward pass.
Notice, that it's not required to provide what the block should do during backpropagation.
Apache MXNet performs backpropagation automatically for you.
Fixed these.
Yes, docs.python.org defines `__init__()` as a method with a reserved name.
> ```python
>     return (x - nd.min(x)) / (nd.max(x) - nd.min(x))
> ```
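For readers skimming the diff, here is a minimal, hedged sketch of a complete `Block`-based layer around that quoted `return` line; the class name `NormalizationLayer` is assumed for illustration and may differ from the tutorial's:

```python
import mxnet as mx
from mxnet import nd, gluon

class NormalizationLayer(gluon.Block):
    """Scales every input tensor to the [0, 1] range."""
    def __init__(self, **kwargs):
        super(NormalizationLayer, self).__init__(**kwargs)

    def forward(self, x):
        # Only the forward pass is defined; MXNet derives the backward pass
        return (x - nd.min(x)) / (nd.max(x) - nd.min(x))

# Usage: the layer behaves like any predefined Gluon block
layer = NormalizationLayer()
print(layer(nd.array([1, 2, 3, 4, 5])))  # [0. 0.25 0.5 0.75 1.]
```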
> The rest of methods of the `Block` class are already implemented, and majority of them are used to work with parameters of a block. There is one very special method named [hybridize()](https://mxnet.incubator.apache.org/api/python/gluon/gluon.html#mxnet.gluon.Block.hybridize), though, which I am going to cover before moving to a more complex example of a custom layer.
A bit more explanation needed here; "work with parameters of block".
The rest of methods -> The other methods
There is one very special method named hybridize(), though, which I am going to cover before moving to a more complex example of a custom layer. -> We will now discuss a special method called hybridize() before moving on to more complex examples of custom layers.
> Looking into the implementation of [existing layers](https://mxnet.incubator.apache.org/api/python/gluon/nn.html), one may find that more often a block inherits from a [HybridBlock](https://mxnet.incubator.apache.org/api/python/gluon/gluon.html#mxnet.gluon.HybridBlock), instead of directly inheriting from `Block` class.
> The reason for that is that `HybridBlock` allows to write custom layers that can be used in imperative programming as well as in symbolic programming. It is convenient to support both ways, because of the different values these programming models bring. The imperative programming eases the debugging of the code - one can use regular debugging tools available in modern IDEs to go line by line through the computation. The symbolic programming provides faster execution speed, but is harder to debug. You can learn more about the difference between symbolic vs. imperative programming from [this article](https://mxnet.incubator.apache.org/architecture/program_model.html).
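As a rough sketch of the difference, the same normalization layer written as a `HybridBlock` might look like the following; note the `F` argument, which MXNet binds to `mxnet.ndarray` in imperative mode and to `mxnet.symbol` after hybridization (explicit `broadcast_*` operators are used because symbolic arithmetic does not broadcast implicitly):

```python
import mxnet as mx
from mxnet import nd, gluon

class NormalizationHybridLayer(gluon.HybridBlock):
    def __init__(self, **kwargs):
        super(NormalizationHybridLayer, self).__init__(**kwargs)

    def hybrid_forward(self, F, x):
        # F is mx.nd before hybridize() is called and mx.sym after
        return F.broadcast_div(F.broadcast_sub(x, F.min(x)),
                               F.broadcast_sub(F.max(x), F.min(x)))

layer = NormalizationHybridLayer()
layer.hybridize()  # from now on, calls run through the compiled symbolic graph
print(layer(nd.array([1, 2, 3, 4, 5])))
```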
> * The calculation of dot product is done using `F.FullyConnected()` method instead of `F.dot()` method. The one was chosen over the other because the former supports automatic inferring of input shapes while the latter doesn't. This is important to know, if one doesn't want to hard code all the shapes. The best way to learn which operators support automatic inference of input shapes at the moment is browsing the C++ implementation of operators to see if one uses a method `SHAPE_ASSIGN_CHECK();` for `in_shape`. The output shape is always inferred automatically.
> * `hybrid_forward()` method signature has changed. It accepts two new arguments: `weights` and `scales`.
> The last peculiarity is due to support of imperative and symbolic programming by `HybridBlock`. During training phase, parameters are passed to the layer by Apache MxNet framework as additional arguments to the method, because they might need to be converted to `Symbols` depending on if the layer was hybridized. One shouldn't use parameters from the class instance directly or from `self.params.get()` method in `hybrid_forward()`, except to get shapes of parameters.
Some more explanation of this would be good. As you explained to me in person. About the self.weights being NDArray, but when hybridize these parameters will still be NDArray and the symbol equivalents will be passed as kwargs.
Yes, I added a few print statements to show how exactly it looks like + small note explaining it once again.
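For context in this thread, here is a hedged sketch of how such a parameterized layer could be declared; the constructor arguments `hidden_units` and `scales` and the deferred-shape trick are assumptions for illustration, not necessarily the tutorial's final code:

```python
import mxnet as mx
from mxnet import gluon

class NormalizationHybridLayer(gluon.HybridBlock):
    def __init__(self, hidden_units, scales):
        super(NormalizationHybridLayer, self).__init__()
        with self.name_scope():
            # Trainable parameter; the 0 in the shape defers input-dim inference
            self.weights = self.params.get('weights',
                                           shape=(hidden_units, 0),
                                           allow_deferred_init=True)
            # Constant, non-trainable parameter
            self.scales = self.params.get('scales',
                                          shape=scales.shape,
                                          init=mx.init.Constant(scales.asnumpy().tolist()),
                                          differentiable=False)

    def hybrid_forward(self, F, x, weights, scales):
        # weights and scales arrive as arguments (NDArray or Symbol), not via self
        normalized = F.broadcast_div(F.broadcast_sub(x, F.min(x)),
                                     F.broadcast_sub(F.max(x), F.min(x)))
        weighted = F.FullyConnected(normalized, weights,
                                    num_hidden=self.weights.shape[0],
                                    no_bias=True)
        return F.broadcast_mul(scales, weighted)
```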
> In the example below, we run one forward and one backward passes to show that `weights` do change while `scales` parameter doesn't change during the training.
one forward and one backward passes -> pass
> ```python
> for key, value in hybridlayer_params.items():
>     print('{} = {}\n'.format(key, value.data()))
> ```
Code block is quite hard to read. Could add double space here, and remove other spaces, or create multiple blocks. Also push out comments further so you don't need to wrap code as often.
| """ | ||
| Helper function to print out the state of parameters of NormalizationHybridLayer | ||
| """ | ||
| print(title) |
print "=========== Parameters after {} pass ===========\n" in function, then pass forward or backward as argument
> ## Conclusion
> One important quality of a Deep learning framework is extensibility. Empowered by flexible abstractions, like `Block` and `HybridBlock`, one can easily extend Apache MxNet functionality to match its needs.
one can easily extend Apache MxNet functionality to match its needs. -> you can easily extend Apache MXNet's functionality as you need.
'One' is a bit too formal in my opinion.
@piiswrong, while the topic is the same, the styles and level of detail are different, so it is hard to blend them together. And since they are in different repositories, they can co-exist.
* Add how to create a new custom layer in Gluon
* Fix code review comments
* Fix code review comments
* Add check for custom layer tutorial
Description
This pull request adds a new tutorial explaining how to create a new custom layer using Gluon API. It doesn't introduce any changes to the code base.