MSR weight filler #1882

Closed
nickcarlevaris wants to merge 380 commits into BVLC:master from nickcarlevaris:msr_init
@nickcarlevaris

This PR implements the weight initialization strategy from http://arxiv.org/abs/1502.01852. It is very similar to the Xavier filler, except that it is designed for ReLU rather than tanh non-linearities.

This would complement #1880, which implements the PReLU layer also proposed in this paper.
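
For readers unfamiliar with the paper, the following is a minimal sketch of the initialization it proposes: weights drawn from a zero-mean Gaussian with standard deviation sqrt(2 / fan_in). This is not the code from this PR and does not use Caffe's filler classes. The names `msr_stddev` and `msr_fill` are hypothetical, and `fan_in` is assumed to be the number of inputs feeding each output unit (for a convolution, input channels times kernel height times kernel width).

```cpp
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper (not part of Caffe): standard deviation suggested by
// He et al. for layers followed by a ReLU, i.e. sqrt(2 / fan_in).
inline double msr_stddev(std::size_t fan_in) {
  return std::sqrt(2.0 / static_cast<double>(fan_in));
}

// Hypothetical helper (not part of Caffe): fills `count` weights with
// samples from N(0, 2 / fan_in). The seed is explicit so runs are repeatable.
inline std::vector<float> msr_fill(std::size_t count, std::size_t fan_in,
                                   unsigned seed) {
  std::mt19937 rng(seed);
  std::normal_distribution<float> gauss(
      0.0f, static_cast<float>(msr_stddev(fan_in)));
  std::vector<float> weights(count);
  for (float& w : weights) {
    w = gauss(rng);
  }
  return weights;
}
```

Under these assumptions the weight variance is 2 / fan_in, twice the 1 / fan_in that a fan-in-scaled Xavier filler produces. The factor of two compensates for a ReLU zeroing roughly half of its inputs.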

Kevin James Matzen and others added 30 commits October 11, 2014 15:24
…nclude glog at some point. Including caffe/common.hpp. (2) I often misconfigure some layer and softmax breaks when the dimensionality is too small for the input layers. Check and fail fast. (3) CV_LOAD_IMAGE_COLOR is deprecated and has been removed in OpenCV 3.0. Replaced with cv::IMREAD_COLOR.
[fix] build with OpenCV 3 and other minor fixes
Minor change: some namespace simplification
Give back to layer what is layer's, and to factory what is factory's
compilation on working systems

This reverts commit 4587b2f.
some more namespace cleaning.
…r. Removed two buffer copies in dereference operation for DB iterators.
… a few bugs related to ReadOnly mode in Database in order to pass test cases.
…t-increment operator so that it forks off a copy of the LevelDB or LMDB iterator/cursor when necessary. Neither of these APIs allow you to directly copy an iterator or cursor, so I create a new iterator and seek to the key that the previous one was currently on. This means the pre-increment operator can be much cheaper than the post-increment operator.
…onditions inside open, put, get, and commit, these functions return a bool indicating whether or not the operation was successful or a failure. This means the caller is now responsible for error checking.
…or put and key by const reference for put. Additional copies are made for get and put in the LMDB implementation.
…y having each iterator hold a shared pointer to the DB. I manually specified a destructor for the LeveldbState to make it clear what order these two things need to be deallocated in.
shelhamer and others added 27 commits February 6, 2015 01:37
The gradient checker fails on certain elements of the PowerLayer checks,
but only 1-3 sometimes fail out of the 120 elements tested. This is not
due to any numerical issue in the PowerLayer, but the distribution of
the random inputs for the checks.

boost 1.56 switched the normal distribution RNG engine from Box-Muller
to Ziggurat.
Fix PowerLayer gradient check failures by reducing step size
- keep current `DataTransformer` check so that datums can be transformed
  into a blob incrementally
- standardize check messages
- include opencv where needed, drop unneeded OS X guards

TODO these tests need to be de-duped
Feed cv::Mats to MemoryDataLayer and set batch size on-the-fly.
These layers are meant for generic inputs so they do not do
transformations. As `transform_param` was pulled out of data layers'
proto definitions for reuse, these layers silently ignored
transformation parameters until now instead of signalling their refusal.
Bug in MemoryData that does not allow using arrays with n_ * size_ > 2^31
…t-fix

Fix for CuDNN layer tests: only destroy handles if setup
HDF5_DATA + MEMORY_DATA refuse loudly to transform
Introduced by  Layer type is a string BVLC#1694
… for use

with ReLUs instead of tanh. Based on paper: He et al., "Delving Deep into
Rectifiers: Surpassing Human-Level Performance on ImageNet Classification," 2015
@nickcarlevaris nickcarlevaris deleted the msr_init branch February 16, 2015 22:26
@nickcarlevaris nickcarlevaris restored the msr_init branch February 16, 2015 22:26