Conversation
Force-pushed from ac43433 to 7f9663a.
I added a Jupyter notebook as an example, similar to http://nbviewer.ipython.org/gist/tnarihi/54744612d35776f53278
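For context, a minimal sketch of what such a deconvolution-based upsampling setup might look like in the Python API; the layer name, upsampling factor, and channel count below are illustrative assumptions, not taken from the notebook:

```python
import mxnet as mx

# 2x upsampling via a transposed convolution: kernel, stride, and pad
# follow the usual kernel = 2*factor - factor % 2, pad = factor // 2 rule.
factor = 2
data = mx.sym.Variable('data')
deconv = mx.sym.Deconvolution(
    data=data,
    kernel=(2 * factor - factor % 2,) * 2,  # (4, 4) for factor 2
    stride=(factor, factor),
    pad=(factor // 2, factor // 2),
    num_filter=3,
    no_bias=True,
    name='upsample')
print(deconv.list_arguments())  # ['data', 'upsample_weight']
```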
@pluskid How would I best turn off learning for a layer?
The API looks good to me! There is a hacky way of turning off learning by using `deconv_w = deconv.list_arguments()[2]` to get the key to be used in the dictionary. One nice thing we could have (as in Caffe) is a per-layer (per-operator) learning rate. Of the choices discussed, I think the 3rd option sounds best, as it requires minimal change to the backend codebase and actually makes more sense; that said, the 2nd option might be a good compromise.
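A fuller sketch of that hacky route: look up the parameter's argument name and map it to a zero learning-rate multiplier. `set_lr_mult` is the hook provided by later MXNet optimizers; whether it existed in this exact form at the time of this thread is an assumption.

```python
import mxnet as mx

data = mx.sym.Variable('data')
deconv = mx.sym.Deconvolution(data=data, kernel=(4, 4), stride=(2, 2),
                              pad=(1, 1), num_filter=3, name='deconv')

# Find the weight's argument name instead of hard-coding an index.
deconv_w = [a for a in deconv.list_arguments() if a.endswith('weight')][0]

opt = mx.optimizer.SGD(learning_rate=0.1)
opt.set_lr_mult({deconv_w: 0.0})  # effectively freezes this parameter
```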
Yeah, passing a dictionary in would be the least hacky, but also the most inconvenient. Maybe one could alleviate that by adding …
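The sentence is cut off in the source, so the concrete suggestion is lost; one hypothetical way to make the dictionary less inconvenient would be a small helper that builds it from name patterns. `freeze_lr_mult` below is not an MXNet API, just an illustration:

```python
def freeze_lr_mult(symbol, patterns, mult=0.0):
    """Map every argument whose name contains one of `patterns` to `mult`."""
    return {name: mult
            for name in symbol.list_arguments()
            if any(p in name for p in patterns)}

# e.g. freeze all deconvolution parameters at once:
# opt.set_lr_mult(freeze_lr_mult(deconv, ['deconv']))
```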
Force-pushed from 7f9663a to bf02531.
@pluskid So I was looking into using attributes to set …
Yes, let me think about it. When you construct a symbolic graph, the operators kind of get smashed into a single symbolic node at the end. Without looking at the libmxnet source code, I'm not even sure whether there is some graph rewriting to optimize runtime efficiency. @tqchen Is there an easy API to inspect the original symbolic hierarchy (other than dumping it to JSON)?
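A sketch of both ideas, assuming the attribute and introspection calls of later MXNet Python APIs (`attr` on `Variable`, `attr_dict`, `get_internals`); whether they were available when this thread was written is an assumption:

```python
import mxnet as mx

# Attach an attribute to the parameter at construction time
# (MXNet requires attribute values to be strings).
w = mx.sym.Variable('deconv_weight', attr={'lr_mult': '0.0'})
data = mx.sym.Variable('data')
deconv = mx.sym.Deconvolution(data=data, weight=w, kernel=(4, 4),
                              stride=(2, 2), pad=(1, 1), num_filter=3,
                              name='deconv')

print(deconv.attr_dict())      # per-node attributes, keyed by node name
print(deconv.get_internals())  # the operators hidden inside the composed node
```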
superseded by apache/mxnet#746 |
Fixes bilinear initializer following the approach in #34
This PR adds a Bilinear initializer, similar to BVLC/caffe#2213, which is useful for upsampling with deconvolution. Additionally, this allows setting different initializers for different layers.
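For reference, a sketch of the bilinear kernel such an initializer fills in, following the formula used in BVLC/caffe#2213; the weight-layout comment reflects an assumption about MXNet's Deconvolution weight shape:

```python
import numpy as np

def bilinear_kernel(size):
    """Return a (size, size) bilinear interpolation kernel."""
    f = np.ceil(size / 2.0)
    c = (2 * f - 1 - f % 2) / (2.0 * f)
    x, y = np.mgrid[:size, :size]
    return (1 - np.abs(x / f - c)) * (1 - np.abs(y / f - c))

# Each (input-channel, output-channel) slice of the Deconvolution weight
# would be filled with this kernel so the layer starts as bilinear upsampling.
print(bilinear_kernel(4))
```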
Todo
- Setting initializer per layer
- Proper upsampling

Ref: #31