Description
BatchNorm in NVIDIA/caffe is not compatible with BatchNorm in BVLC/caffe.
There is also no compatibility between engine: CAFFE and engine: CUDNN BatchNorm within NVIDIA/caffe itself (the blob shapes are different).
Please fix these issues, so that pre-trained models can be used for fine-tuning.
Please refer to NVIDIA/DIGITS#629 and BVLC#3919 as well, where similar issues are discussed.
I have some suggestions to fix these issues:
1. Rename NVIDIA/caffe's BatchNorm to BatchNormScale, since it now includes scaling as well.
2. Add a check/exit in the CUDNN BatchNormScale Reshape function when the top and bottom blobs are the same, so that the user gets a warning about in-place computation.
3. Fix the inconsistency in blob shapes between engine: CAFFE and engine: CUDNN.
4. Currently I have to specify many parameters in the new BatchNorm layer, which is unnecessary:
layer {
  name: "bn_conv1"
  bottom: "conv1"
  top: "conv1"
  type: "BatchNorm"
  param { # scale
    lr_mult: 1
    decay_mult: 1
  }
  param { # shift/bias
    lr_mult: 1
    decay_mult: 1
  }
  param { # global mean
    lr_mult: 0
    decay_mult: 0
  }
  param { # global variance
    lr_mult: 0
    decay_mult: 0
  }
  batch_norm_param {
    scale_filler {
      type: "constant"
      value: 1
    }
    bias_filler {
      type: "constant"
      value: 0
    }
    engine: CUDNN
  }
}
(4a). In BatchNormScale, if you change the order of the blobs to global_mean, global_variance, scale, bias, global_counter, then I don't have to specify 4 param fields for lr_mult and decay_mult, but only 2.
(4b). If the definition of the scale and bias fields in BatchNormParameter is changed to:
optional float scale_filler = 5 [default = 1];
optional float bias_filler = 6 [default = 0];
then I don't have to specify these in the prototxt either.
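To illustrate, assuming (4a) and (4b) were adopted (together with the rename to BatchNormScale from suggestion 1), the layer above could shrink to roughly the following sketch. The two param fields freeze the statistics blobs, which now come first; scale and bias can be omitted because unspecified params fall back to Caffe's defaults of lr_mult: 1 and decay_mult: 1, and per (4b) their initial values default to 1 and 0:
layer {
  name: "bn_conv1"
  bottom: "conv1"
  top: "conv1"
  type: "BatchNormScale"
  param { # global mean (frozen)
    lr_mult: 0
    decay_mult: 0
  }
  param { # global variance (frozen)
    lr_mult: 0
    decay_mult: 0
  }
  # scale and bias params omitted: they default to lr_mult: 1 / decay_mult: 1,
  # and per (4b) no explicit fillers are needed
  batch_norm_param {
    engine: CUDNN
  }
}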
5. Keep the original BatchNorm from BVLC/caffe as it is, untouched, so that compatibility with BVLC/caffe is not affected and old BVLC/caffe models can be used for fine-tuning. If possible, also provide a CUDNN version of this original BatchNorm (without scaling), so that it can be accelerated.
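For reference, this is the standard BVLC/caffe pattern (a statistics-only BatchNorm with three frozen blobs, followed by a separate learnable Scale layer) that existing pre-trained models depend on; a CUDNN-accelerated variant would only need to accept these same three blobs:
layer {
  name: "bn_conv1"
  bottom: "conv1"
  top: "conv1"
  type: "BatchNorm"
  param { lr_mult: 0 } # global mean
  param { lr_mult: 0 } # global variance
  param { lr_mult: 0 } # moving-average factor
}
layer {
  name: "scale_conv1"
  bottom: "conv1"
  top: "conv1"
  type: "Scale"
  scale_param {
    bias_term: true # learnable shift, analogous to the bias blob in BatchNormScale
  }
}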