
New tutorial on how to create a new custom layer in Gluon #10607

Merged

piiswrong merged 4 commits into apache:master from Ishitori:master on Apr 24, 2018

Conversation

@Ishitori
Contributor

Description

This pull request adds a new tutorial explaining how to create a new custom layer using the Gluon API. It doesn't introduce any changes to the code base.

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • Changes are complete (i.e. I finished coding on this PR)
  • Code is well-documented

Changes

  • New tutorial for Python/Gluon

@Ishitori Ishitori requested a review from szha as a code owner April 19, 2018 00:18
@ThomasDelteil
Contributor

@Ishitori you are missing the download placeholder at the end of your tutorial and you haven't updated the tutorials/index.md file

@piiswrong
Contributor

We already have one here http://gluon.mxnet.io/chapter03_deep-neural-networks/custom-layer.html
Can we merge these?

Contributor

@thomelane left a comment


I learnt a lot! Mostly grammatical changes, and a few other questions with parameters.


While Gluon API for Apache MxNet comes with [a decent number of predefined layers](https://mxnet.incubator.apache.org/api/python/gluon/nn.html), at some point one may find that a new layer is needed. Adding a new layer in Gluon API is straightforward, yet there are a few things that one needs to keep in mind.

In this article, I will cover how to create a new layer from scratch, how to use it, what are possible pitfalls and how to avoid them.
Contributor

Did we decide on we? Unless you were saying what instance/package you personally were using to run the code.

Contributor Author

Ok

Comment thread: docs/tutorials/python/custom_layer.md (Outdated)

## The simplest custom layer

To create a new layer in Gluon API, one must create a class that inherits from [Block](https://mxnet.incubator.apache.org/api/python/gluon/gluon.html#mxnet.gluon.Block) class. This class provides the most basic functionality, and all predefined layers inherit from it directly or via other subclasses. Because each layer in Apache MxNet inherits from `Block`, words "layer" and "block" are used interchangeably inside of the Apache MxNet community.
Contributor

MXNet instead of MxNet

Contributor Author

Fixed

Comment thread: docs/tutorials/python/custom_layer.md (Outdated)

To create a new layer in Gluon API, one must create a class that inherits from [Block](https://mxnet.incubator.apache.org/api/python/gluon/gluon.html#mxnet.gluon.Block) class. This class provides the most basic functionality, and all predefined layers inherit from it directly or via other subclasses. Because each layer in Apache MxNet inherits from `Block`, words "layer" and "block" are used interchangeably inside of the Apache MxNet community.

The only instance method needed to be implemented is [forward()](https://mxnet.incubator.apache.org/api/python/gluon/gluon.html#mxnet.gluon.Block.forward), which defines what exactly your layer is going to do during forward propagation. Notice, that it doesn't require to provide what the block should do during backpropagation. Backpropagation pass for blocks is done by Apache MxNet for you.
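(For reference, a minimal sketch of such a layer; the class name `NormalizationLayer` is illustrative, and the computation mirrors the normalization example quoted later in this thread:)

```python
from mxnet import nd
from mxnet.gluon import Block

class NormalizationLayer(Block):
    """Scales the input to the [0, 1] range (min-max normalization)."""
    def __init__(self, **kwargs):
        super(NormalizationLayer, self).__init__(**kwargs)

    def forward(self, x):
        # Only the forward computation is defined; Apache MXNet
        # derives the backward pass automatically via autograd
        return (x - nd.min(x)) / (nd.max(x) - nd.min(x))

layer = NormalizationLayer()
print(layer(nd.array([1, 2, 3, 4, 5])))  # -> [0. 0.25 0.5 0.75 1.]
```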
Contributor

The only instance method that needs to be implemented

Contributor

Does `__init__` count as a method?

Contributor

during the forward pass.

Contributor

Notice, that it's not required to provide what the block should do during backpropagation.

Contributor

Apache MXNet performs backpropagation automatically for you.

Contributor Author

Fixed these.
Yes, docs.python.org defines `__init__()` as a method with a reserved name.

Comment thread: docs/tutorials/python/custom_layer.md (Outdated)
```python
return (x - nd.min(x)) / (nd.max(x) - nd.min(x))
```

The rest of methods of the `Block` class are already implemented, and majority of them are used to work with parameters of a block. There is one very special method named [hybridize()](https://mxnet.incubator.apache.org/api/python/gluon/gluon.html#mxnet.gluon.Block.hybridize), though, which I am going to cover before moving to a more complex example of a custom layer.
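(As a quick aside illustrating one of those parameter-related methods, assuming the `NormalizationLayer` sketch above:)

```python
layer = NormalizationLayer()
# collect_params() is one of the inherited parameter-handling methods;
# it returns an empty ParameterDict here, since this layer defines no parameters
print(layer.collect_params())
```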
Contributor

A bit more explanation needed here; "work with parameters of block".

Contributor

The rest of methods -> The other methods

Contributor

There is one very special method named hybridize(), though, which I am going to cover before moving to a more complex example of a custom layer. -> We will now discuss a special method called hybridize() before moving on to more complex examples of custom layers.

Contributor Author

Fixed

Comment thread: docs/tutorials/python/custom_layer.md (Outdated)

Looking into the implementation of [existing layers](https://mxnet.incubator.apache.org/api/python/gluon/nn.html), one may find that more often a block inherits from a [HybridBlock](https://mxnet.incubator.apache.org/api/python/gluon/gluon.html#mxnet.gluon.HybridBlock), instead of directly inheriting from `Block` class.

The reason for that is that `HybridBlock` allows to write custom layers that can be used in imperative programming as well as in symbolic programming. It is convinient to support both ways, because of the different values these programming models bring. The imperative programming eases the debugging of the code - one can use regular debugging tools available in modern IDEs to go line by line through the computation. The symbolic programming provides faster execution speed, but harder to debug. You can learn more about the difference between symbolic vs. imperative programming from [this article](https://mxnet.incubator.apache.org/architecture/program_model.html).
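(For reference, a minimal sketch of the hybrid version of the same normalization layer; `broadcast_*` operators are used so the code runs both imperatively and symbolically:)

```python
from mxnet import nd
from mxnet.gluon import HybridBlock

class NormalizationHybridLayer(HybridBlock):
    def __init__(self, **kwargs):
        super(NormalizationHybridLayer, self).__init__(**kwargs)

    def hybrid_forward(self, F, x):
        # F is mxnet.nd when running imperatively, mxnet.sym once hybridized
        return F.broadcast_div(F.broadcast_sub(x, F.min(x)),
                               F.broadcast_sub(F.max(x), F.min(x)))

layer = NormalizationHybridLayer()
layer.hybridize()  # cache a symbolic graph for faster execution
print(layer(nd.array([1, 2, 3, 4, 5])))
```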
Contributor

but is harder to debug

Contributor Author

Fixed

Comment thread: docs/tutorials/python/custom_layer.md (Outdated)
* The calculation of dot product is done using `F.FullyConnected()` method instead of `F.dot()` method. The one was chosen over another because the former supports automatic infering shapes of inputs while the latter doesn't. This is important to know, if one doesn't want to hard code all the shapes. The best way to learn what operators supports automatic inference of input shapes at the moment is browsing C++ implementation of operators to see if one uses a method `SHAPE_ASSIGN_CHECK();` for `in_shape`. The output shape is always inferred automatically.
* `hybrid_forward()` method signature has changed. It accepts two new arguments: `weights` and `scales`.

The last peculiarity is due to support of imperative and symbolic programming by `HybridBlock`. During training phase, parameters are passed to the layer by Apache MxNet framework as additional arguments to the method, because they might need to be converted to `Symbols` depending on if the layer was hybridized. One shouldn't use parameters from the class instance directly or from `self.params.get()` method in `hybrid_forward()`, except to get shapes of parameters.
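(To make this concrete, a sketch of how such a parameterized layer might be declared; parameter names match the fragments quoted in this review, while the shapes and the constant initializer are illustrative. The second dimension of `weights` is 0, which leaves it to be inferred from the input and requires `allow_deferred_init=True`:)

```python
import mxnet as mx
from mxnet.gluon import HybridBlock

class NormalizationHybridLayer(HybridBlock):
    def __init__(self, hidden_units, scales):
        super(NormalizationHybridLayer, self).__init__()
        with self.name_scope():
            # Second dimension is 0: inferred from the first input
            self.weights = self.params.get('weights',
                                           shape=(hidden_units, 0),
                                           allow_deferred_init=True)
            # A constant, non-trainable parameter
            self.scales = self.params.get('scales',
                                          shape=scales.shape,
                                          init=mx.init.Constant(scales.asnumpy().tolist()),
                                          differentiable=False)

    def hybrid_forward(self, F, x, weights, scales):
        # `weights` and `scales` arrive as NDArrays or Symbols,
        # depending on whether the block has been hybridized
        normalized = F.broadcast_div(F.broadcast_sub(x, F.min(x)),
                                     F.broadcast_sub(F.max(x), F.min(x)))
        weighted = F.FullyConnected(normalized, weights,
                                    num_hidden=self.weights.shape[0],
                                    no_bias=True)
        return F.broadcast_mul(scales, weighted)
```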
Contributor

Some more explanation of this would be good, as you explained to me in person: `self.weights` is an NDArray, and when you hybridize, these parameters will still be NDArrays, but their symbol equivalents will be passed in as kwargs.

Contributor Author

Yes, I added a few print statements to show exactly how it looks, plus a small note explaining it once again.

Comment thread: docs/tutorials/python/custom_layer.md (Outdated)

The last peculiarity is due to support of imperative and symbolic programming by `HybridBlock`. During training phase, parameters are passed to the layer by Apache MxNet framework as additional arguments to the method, because they might need to be converted to `Symbols` depending on if the layer was hybridized. One shouldn't use parameters from the class instance directly or from `self.params.get()` method in `hybrid_forward()`, except to get shapes of parameters.

In the example below, we run one forward and one backward passes to show that `weights` do change while `scales` parameter doesn't change during the training.
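(A sketch of what that experiment could look like, reusing the `NormalizationHybridLayer` sketch above; the network, data shapes, and hyperparameters are illustrative:)

```python
import mxnet as mx
from mxnet import nd, autograd, gluon

net = gluon.nn.HybridSequential()
with net.name_scope():
    net.add(NormalizationHybridLayer(hidden_units=5,
                                     scales=nd.array([2.0])))
net.initialize(mx.init.Xavier())

trainer = gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': 0.1})
loss_fn = gluon.loss.L2Loss()

x = nd.random.uniform(shape=(2, 10))
label = nd.random.uniform(shape=(2, 5))

with autograd.record():          # forward pass, recorded for autograd
    loss = loss_fn(net(x), label)
loss.backward()                  # backward pass: gradient only for `weights`
trainer.step(x.shape[0])         # `weights` is updated, `scales` is not
```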
Contributor

one forward and one backward passes -> pass

Contributor Author

Fixed


```python
for key, value in hybridlayer_params.items():
    print('{} = {}\n'.format(key, value.data()))
```

Contributor

Code block is quite hard to read. Could add double space here, and remove other spaces, or create multiple blocks. Also push out comments further so you don't need to wrap code as often.

Contributor Author

Done

"""
Helper function to print out the state of parameters of NormalizationHybridLayer
"""
print(title)
Contributor

Put `print("=========== Parameters after {} pass ===========\n")` in the function, then pass "forward" or "backward" as an argument.

Contributor Author

Done
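(For reference, the revised helper along the lines suggested above might look like this; the parameter-name filter and the `net` variable are illustrative:)

```python
def print_params(pass_name, net):
    """
    Helper function to print out the state of parameters of NormalizationHybridLayer
    """
    print("=========== Parameters after {} pass ===========\n".format(pass_name))
    hybridlayer_params = {key: value for key, value in net.collect_params().items()
                          if 'normalizationhybridlayer' in key}
    for key, value in hybridlayer_params.items():
        print('{} = {}\n'.format(key, value.data()))

print_params("forward", net)
print_params("backward", net)
```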


## Conclusion

One important quality of a Deep learning framework is extensibility. Empowered by flexible abstractions, like `Block` and `HybridBlock`, one can easily extend Apache MxNet functionality to match its needs.
Contributor

one can easily extend Apache MxNet functionality to match its needs. -> you can easily extend Apache MXNet's functionality as you need.
'One' is a bit too formal in my opinion.

Contributor Author

Done

@Ishitori
Contributor Author

@piiswrong, while the topic is the same, the styles and levels of detail are different, so it is hard to blend them together. And since they are in different repositories, they can co-exist.

@piiswrong piiswrong merged commit 8011f3b into apache:master Apr 24, 2018
rahul003 pushed a commit to rahul003/mxnet that referenced this pull request Jun 4, 2018
* Add how to create a new custom layer in Gluon

* Fix code review comments

* Fix code review comments

* Add check for custom layer tutorial
zheng-da pushed a commit to zheng-da/incubator-mxnet that referenced this pull request Jun 28, 2018
* Add how to create a new custom layer in Gluon

* Fix code review comments

* Fix code review comments

* Add check for custom layer tutorial