MSR weight filler by nickcarlevaris · Pull Request #1882 · BVLC/caffe

nickcarlevaris · 2015-02-16T22:24:11Z

This PR implements the weight initialization strategy from http://arxiv.org/abs/1502.01852. It is very similar to the Xavier filler, except that it is designed for ReLU instead of tanh non-linearities.

This would complement #1880, which implements the PReLU layer also proposed in this paper.
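
For readers unfamiliar with the scheme, here is a minimal standalone sketch of the idea, not the filler code in this PR. It assumes the Gaussian variant described in the paper, where each weight is drawn with zero mean and standard deviation sqrt(2 / fan_in). The names `msr_fill` and `sample_variance` are hypothetical helpers made up for this illustration, and `fan_in` is assumed to be the number of inputs feeding one output unit (for a convolution, input channels times kernel height times kernel width).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper (not part of Caffe or this PR): returns `count`
// weights sampled from N(0, 2 / fan_in), the ReLU-oriented scaling
// from He et al. (arXiv:1502.01852).
std::vector<float> msr_fill(std::size_t count, std::size_t fan_in,
                            unsigned seed) {
  assert(fan_in > 0);
  const float stddev = std::sqrt(2.0f / static_cast<float>(fan_in));
  std::mt19937 rng(seed);
  std::normal_distribution<float> gauss(0.0f, stddev);
  std::vector<float> weights(count);
  for (float& w : weights) {
    w = gauss(rng);
  }
  return weights;
}

// Hypothetical helper: unbiased sample variance, used only to sanity-check
// that the filled weights have variance close to 2 / fan_in.
double sample_variance(const std::vector<float>& v) {
  if (v.size() < 2) return 0.0;
  double mean = 0.0;
  for (float x : v) mean += x;
  mean /= static_cast<double>(v.size());
  double sum_sq = 0.0;
  for (float x : v) sum_sq += (x - mean) * (x - mean);
  return sum_sq / static_cast<double>(v.size() - 1);
}
```

Calling `msr_fill(n, fan_in, seed)` with a large `n` should give a sample variance near 2 / fan_in, which is twice the 1 / fan_in variance that a Xavier-style filler scaled by fan-in targets. The factor of two compensates for ReLU zeroing roughly half of its inputs.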

…nclude glog at some point. Including caffe/common.hpp. (2) I often misconfigure some layer and softmax breaks when the dimensionality is too small for the input layers. Check and fail fast. (3) CV_LOAD_IMAGE_COLOR is deprecated and has been removed in OpenCV 3.0. Replaced with cv::IMREAD_COLOR.

…t-increment operator so that it forks off a copy of the LevelDB or LMDB iterator/cursor when necessary. Neither of these APIs allow you to directly copy an iterator or cursor, so I create a new iterator and seek to the key that the previous one was currently on. This means the pre-increment operator can be much cheaper than the post-increment operator.

The gradient checker fails on certain elements of the PowerLayer checks, but only 1-3 sometimes fail out of the 120 elements tested. This is not due to any numerical issue in the PowerLayer, but the distribution of the random inputs for the checks. boost 1.56 switched the normal distribution RNG engine from Box-Muller to Ziggurat.

These layers are meant for generic inputs so they do not do transformations. As `transform_param` was pulled out of data layers' proto definitions for reuse, these layers silently ignored transformation parameters until now instead of signalling their refusal.

Kevin James Matzen and others added 30 commits October 11, 2014 15:24

OpenCV should be compiled using pkg-config options.

4587b2f

Added version dependent test for IMREAD_COLOR.

82b614b

Fixed CMakeList to work with OpenCV 3.

94394a0

Merge pull request BVLC#1261 from kmatzen/minor_changes

540ad6b

[fix] build with OpenCV 3 and other minor fixes

fix flaky test EXPECT_EQ code, using EXPECT_FLOAT_NEAR per Jeff

0160a0b

fix flaky math functions, remove unnecessary instantiations.

fc5ba94

Merge pull request BVLC#1264 from Yangqing/dev

e53aafa

Clean flaky code

some namespace simplification

94f00c0

Merge pull request BVLC#1269 from Yangqing/dev

aa4564c

Minor change: some namespace simplification

move registration code to corresponding cpp files.

2f88dab

Merge pull request BVLC#1270 from Yangqing/dev

98dbfd7

Give back to layer what is layer's, and to factory what is factory's

Revert "OpenCV should be compiled using pkg-config options." -- breaks

d6dbfc8

compilation on working systems This reverts commit 4587b2f.

some namespace cleaning.

370217b

Merge pull request BVLC#1277 from Yangqing/dev

39341b7

some more namespace cleaning.

Refactored leveldb and lmdb code.

6ad4f95

Some cleanup to make travis happy.

329e448

Updated interface to make fewer string copies.

7e504c0

Don't autocommit on close for the databases. If they were read-only, …

edff676

…then they might fail.

data layer test was relying on the autocommit on close db behavior th…

c7a5d8a

…at was recently removed.

Updated cifar10 build script to specify db backend.

b1150c9

Switched create_cifar10.sh output from leveldb to lmdb.

5a559f5

Updated extract_features to take a leveldb/lmdb config option.

a1ea5ba

Updated Database interface to use custom KV type rather than std::pai…

8fef285

…r. Removed two buffer copies in dereference operation for DB iterators.

Added a couple of sanity checks to make sure the datum buffer sizes m…

be2df84

…atched what we expected.

Added get interface to Database. Added test cases for Database. Fixed…

c31e444

… a few bugs related to ReadOnly mode in Database in order to pass test cases.

Updated Database interface so that rather than CHECKing for certain c…

73b16e4

…onditions inside open, put, get, and commit, these functions return a bool indicating whether or not the operation was successful or a failure. This means the caller is now responsible for error checking.

Updated Database interface to take key and value by const reference f…

e0b572d

…or put and key by const reference for put. Additional copies are made for get and put in the LMDB implementation.

The LevelDB iterator/DB deallocation order bug is pretty much fixed b…

2200a7a

…y having each iterator hold a shared pointer to the DB. I manually specified a deconstructor for the LeveldbState to make it clear what order these two things need to be deallocated in.

shelhamer and others added 27 commits February 6, 2015 01:37

Merge pull request BVLC#1840 from shelhamer/fix-power-test

a8f9e4b

Fix PowerLayer gradient check failures by reducing step size

Merge pull request BVLC#1837 from shelhamer/image-fail-die

a883ec6

Die on inputs that fail to load

Added GPU implementation of SoftmaxWithLossLayer.

4c894d0

Merge pull request BVLC#1789 from SaganBolliger/softmax_loss_gpu

4905366

GPU version of SoftmaxWithLossLayer

Added opencv vector<Mat> to memory data layer with tests

67c727a

MemoryDataLayer now accepts dynamic batch_size

cedefd7

MemoryDataLayer now correctly consumes batch_size elements

eb4ca16

small fixes

4901f83

removed needs_reshape_ and ChangeBatchSize is now set_batch_size

abe0fa2

groom BVLC#1416

a2fbea3

- keep current `DataTransformer` check so that datums can be transformed into a blob incrementally - standardize check messages - include opencv where needed, drop unneeded OS X guards TODO these tests need to be de-duped

Merge pull request BVLC#1416 from mtamburrano/matVector

02d9170

Feed cv::Mats to MemoryDataLayer and set batch size on-the-fly.

Allow using arrays with n_ * size_ > 2^31

d12101e

uint64_t size_t

Merge pull request BVLC#1838 from DmitryUlyanov/dev

7efcfe8

Bug in MemoryData that does not allow using arrays with n_ * size_ > 2^31

Fixes for CuDNN layers: only destroy handles if setup

8821c28

Merge pull request BVLC#1851 from jeffdonahue/cudnn-layer-factory-tes…

1504d89

…t-fix Fix for CuDNN layer tests: only destroy handles if setup

Merge pull request BVLC#1841 from shelhamer/no-memory-or-hdf5-transform

5e61f55

HDF5_DATA + MEMORY_DATA refuse loudly to transform

SoftmaxWithLossLayer fix: takes exactly 2 bottom blobs (inherited from

852a3c5

LossLayer)

Blob: add scale_{data,diff} methods and tests

c15d184

add Net::param_owners accessor for param sharing info

e235851

Add gradient clipping -- limit L2 norm of parameter gradients

f38ddef

Merge pull request BVLC#1757 from jeffdonahue/clip-grads

413ee83

Gradient clipping

BlobMathTest: fixes for numerical precision issues

6a309c1

Merge pull request BVLC#1874 from jeffdonahue/blob-math-test-precision

c09de35

BlobMathTest: fix precision issues

Fix Draw Net Problem BVLC#1709

1344d1b

Introduced by Layer type is a string BVLC#1694

Added MSR weight filler, which implements Xavier-like filler designed…

4ecea59

… for use with ReLUs instead of tanh. Based on paper: He et al., "Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification," 2015

nickcarlevaris closed this Feb 16, 2015

nickcarlevaris deleted the msr_init branch February 16, 2015 22:26

nickcarlevaris restored the msr_init branch February 16, 2015 22:26


nickcarlevaris wants to merge 380 commits into BVLC:master from nickcarlevaris:msr_init


20 participants
