Rework Xavier to be more flexible by vchuravy · Pull Request #32 · dmlc/MXNet.jl

vchuravy · 2015-11-19T03:52:59Z

Following the discussion in apache/mxnet#610 I took another swing at Xavier.

The idea is that the concepts proposed in the papers [1, 2, 3] can be generalized as three choices: the regularization factor (1/fan_in, 1/fan_out, or 2/(fan_out + fan_in)), the distribution to sample from, and a magnitude scaling factor. [1] proposes 3/fan_in and 6/(fan_out + fan_in), and [2, 3] propose 2/fan_in.

[1] X. Glorot and Y. Bengio (2010) http://jmlr.csail.mit.edu/proceedings/papers/v9/glorot10a.html
[2] K. He, X. Zhang, S. Ren, and J. Sun (2015) http://arxiv.org/abs/1502.01852
[3] A. M. Saxe, J. L. McClelland, and S. Ganguli (2013/2014) http://arxiv.org/abs/1312.6120v3
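
As a rough illustration of the generalization described in this comment, the three choices (regularization factor, distribution, magnitude) could be combined as in the Julia sketch below. The function name `xavier_sketch`, its keyword names, and the symbols `:in`, `:out`, `:avg`, `:uniform`, `:normal` are hypothetical and are not taken from the MXNet.jl API. The sketch also assumes that `sqrt(magnitude * factor)` is used as the half-width of the uniform distribution and as the standard deviation of the normal distribution, and it uses the `fan_in`/`fan_out` computation currently in the code.

```julia
using Random

# Hypothetical helper for illustration only (not the MXNet.jl API).
function xavier_sketch(rng::AbstractRNG, dims::Tuple;
                       distribution::Symbol = :uniform,
                       regularization::Symbol = :avg,
                       magnitude::Real = 3)
    # Fan computation as currently used: first dimension is the output.
    fan_in  = prod(dims[2:end])
    fan_out = dims[1]

    # Choice 1: the regularization factor.
    factor = if regularization == :in
        1 / fan_in
    elseif regularization == :out
        1 / fan_out
    elseif regularization == :avg
        2 / (fan_in + fan_out)
    else
        throw(ArgumentError("unknown regularization: $regularization"))
    end

    # Choice 3: the magnitude scales the factor, e.g. 3 * 1/fan_in = 3/fan_in.
    scale = sqrt(magnitude * factor)

    # Choice 2: the distribution to sample from.
    if distribution == :uniform
        # Samples in [-scale, scale) (assumed interpretation of `scale`).
        return (2 .* rand(rng, dims...) .- 1) .* scale
    elseif distribution == :normal
        # Zero-mean normal with standard deviation `scale` (assumed).
        return randn(rng, dims...) .* scale
    else
        throw(ArgumentError("unknown distribution: $distribution"))
    end
end

# 3/fan_in as in [1]: magnitude = 3 with the 1/fan_in factor.
w = xavier_sketch(MersenneTwister(0), (4, 3, 2, 2); regularization = :in)
```

With these assumptions, `magnitude = 3` and `regularization = :avg` gives 6/(fan_out + fan_in), and `magnitude = 2` with `regularization = :in` gives 2/fan_in.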

vchuravy · 2015-11-19T04:01:30Z

Another point we should discuss is the calculation of fan_out.

Currently we have:

fan_in  = prod(dims[2:end])
fan_out = dims[1]

But following [1] and [4]: "input blob has shape (num, a, b, c) where a * b * c = fan_in and num * b * c = fan_out".

Maybe we should have the following (could somebody double-check my logic? :) ):

fan_in = prod(dims[2:end])
fan_out = prod(dims[1:end]) / dims[2]

[4] https://github.com/BVLC/caffe/blob/603cbfb97767d1b9ebf102200646f5df237d1749/include/caffe/filler.hpp#L150-L151
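
To make the difference between the two proposals concrete, here is a small Julia sketch that evaluates both on one example shape. The helper names `current_fans` and `caffe_style_fans` and the example shape are hypothetical, and the shape is assumed to follow the `(num, a, b, c)` layout from the quoted Caffe comment. Integer division via `div` is used here because `/` returns a `Float64` in Julia.

```julia
# Hypothetical helpers for illustration; each returns (fan_in, fan_out).
current_fans(dims)     = (prod(dims[2:end]), dims[1])
caffe_style_fans(dims) = (prod(dims[2:end]), div(prod(dims), dims[2]))

dims = (64, 3, 5, 5)      # assumed: 64 filters, 3 input channels, 5x5 kernel

current_fans(dims)        # (75, 64)
caffe_style_fans(dims)    # (75, 1600), i.e. num * b * c = 64 * 5 * 5
```

Both variants agree on `fan_in`; they differ only in whether the kernel size enters `fan_out`.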

pluskid · 2015-11-19T06:19:51Z

The Caffe code you cited looks definitely weird to me. The most intuitive interpretation of fan-out is the number of output units, which for convolution filters is only the number of output filters. I'm not sure why Caffe chooses to include the kernel size in this calculation. If consistency is the goal, I guess we should really check what the cited papers say.

Also, things are quite different when it comes to the FullyConnected layer: the weights should be a matrix (instead of a 4D tensor), and the fan-in/fan-out calculation should handle this gracefully.
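
One way a single calculation could cover both matrices and 4D tensors is sketched below in Julia. The function `fans_sketch` and its `include_kernel` keyword are hypothetical, and the sketch assumes the first dimension is the output dimension, as in the snippets earlier in this thread.

```julia
# Hypothetical helper for illustration; returns (fan_in, fan_out).
function fans_sketch(dims::NTuple{N,Int}; include_kernel::Bool = false) where {N}
    N >= 2 || throw(ArgumentError("expected a weight with at least 2 dimensions"))
    fan_in = prod(dims[2:end])
    # With include_kernel = true this is the Caffe-style num * b * c;
    # otherwise only the number of output units is counted.
    fan_out = include_kernel ? div(prod(dims), dims[2]) : dims[1]
    return fan_in, fan_out
end

fans_sketch((10, 128))                              # (128, 10)
fans_sketch((10, 128); include_kernel = true)       # (128, 10)
fans_sketch((64, 3, 5, 5))                          # (75, 64)
fans_sketch((64, 3, 5, 5); include_kernel = true)   # (75, 1600)
```

Under these assumptions the two definitions coincide for a weight matrix, so the choice only matters for convolution weights.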

vchuravy · 2015-11-19T06:28:30Z

Yeah, I am unsure about 710dd01. In [1], fan_out is the size of the next layer, and [2] only uses fan_in.

vchuravy · 2015-11-20T08:56:21Z

So for me this would be ready.

codecov-io · 2015-11-20T09:14:39Z

Current coverage is `76.39%`

Merging #32 into master will increase coverage by +0.26% as of 813436d

@@            master     #32   diff @@
======================================
  Files           20      20       
  Stmts         1454    1449     -5
  Branches         0       0       
  Methods          0       0       
======================================
  Hit           1107    1107       
  Partial          0       0       
+ Missed         347     342     -5

Review entire Coverage Diff as of 813436d

Powered by Codecov. Updated on successful CI builds.

vchuravy force-pushed the vc/xavier branch from 710dd01 to 6d367a1 on November 20, 2015 08:54

vchuravy added 2 commits November 20, 2015 17:55

rework Xavier to be more flexible

62c9703

xavier: rebuild documentation

6081fce

vchuravy force-pushed the vc/xavier branch from 6d367a1 to 6081fce on November 20, 2015 08:55

pluskid added a commit that referenced this pull request Nov 20, 2015

Merge pull request #32 from vchuravy/vc/xavier

ea85774

pluskid merged commit ea85774 into dmlc:master Nov 20, 2015
