Incompatibilities in BatchNorm. #276

@mathmanu

Description

BatchNorm in NVIDIA/caffe is not compatible with BatchNorm in BVLC/caffe.

There is also no compatibility between engine:CAFFE and engine:CUDNN BatchNorm within NVIDIA/caffe itself (the blob shapes are different).

Kindly fix these issues so that we can use pre-trained models for fine-tuning.

Please also refer to NVIDIA/DIGITS#629 and BVLC#3919, where similar issues are discussed.

I have some suggestions to fix these issues:

  1. Rename NVIDIA/caffe's BatchNorm to BatchNormScale, since it now includes scaling as well.

  2. Put a check/exit in the CUDNN BatchNormScale Reshape function if the top and bottom blobs are the same, so that the user gets a warning.

  3. Fix the inconsistency in blob shapes between engine:CAFFE and engine:CUDNN.

  4. Currently I have to specify many parameters in the new BatchNorm layer, as shown below. This is unnecessary.

layer {
  name: "bn_conv1"
  bottom: "conv1"
  top: "conv1"
  type: "BatchNorm"
  param { #scale
    lr_mult: 1
    decay_mult: 1
  }
  param { #shift/bias
    lr_mult: 1
    decay_mult: 1
  } 
  param { #global mean
    lr_mult: 0
    decay_mult: 0
  }
  param { #global var
    lr_mult: 0
    decay_mult: 0
  }

  batch_norm_param {
    scale_filler {
      type: "constant"
      value: 1
    }
    bias_filler {
      type: "constant"
      value: 0
    }
    engine: CUDNN
  }
}

(4a). In BatchNormScale, if you change the order of the blobs to: global_mean, global_variance, scale, bias, global_counter, then I don't have to specify 4 param fields for lr_mult and decay_mult, but only 2.
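
For illustration, a sketch of how the prototxt might look if only this reordering were applied (hypothetical; it assumes the BatchNormScale name from suggestion 1 and keeps the filler settings from the example above):

layer {
  name: "bn_conv1"
  bottom: "conv1"
  top: "conv1"
  type: "BatchNormScale"
  param { # global mean
    lr_mult: 0
    decay_mult: 0
  }
  param { # global variance
    lr_mult: 0
    decay_mult: 0
  }
  # scale and bias come later in the blob order, so they can simply
  # fall back to the default lr_mult/decay_mult of 1
  batch_norm_param {
    scale_filler { type: "constant" value: 1 }
    bias_filler { type: "constant" value: 0 }
    engine: CUDNN
  }
}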

(4b). If the definitions of the scale and bias fields in BatchNormParameter are changed to:
optional float scale_filler = 5 [default = 1];
optional float bias_filler = 6 [default = 0];
then I don't have to specify these in the prototxt either.
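
With both (4a) and (4b) applied, the layer definition could shrink to something like this (hypothetical sketch, again assuming the BatchNormScale name):

layer {
  name: "bn_conv1"
  bottom: "conv1"
  top: "conv1"
  type: "BatchNormScale"
  param { lr_mult: 0 decay_mult: 0 } # global mean
  param { lr_mult: 0 decay_mult: 0 } # global variance
  batch_norm_param {
    engine: CUDNN
  }
}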

  5. Keep the original BatchNorm from BVLC/caffe as it is, untouched, so that compatibility with BVLC/caffe is not affected and old BVLC/caffe models can be used for fine-tuning. If possible, also provide a CUDNN version of this original BatchNorm (without scaling), so that it can be accelerated.
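
For reference, a sketch of the BVLC-style pattern such pre-trained models use, where normalization and scaling are two separate layers (the commented-out engine line is hypothetical, only marking where a CUDNN variant of the original BatchNorm could be selected):

layer {
  name: "bn_conv1"
  bottom: "conv1"
  top: "conv1"
  type: "BatchNorm"
  # batch_norm_param { engine: CUDNN }   # hypothetical accelerated variant requested above
}
layer {
  name: "scale_conv1"
  bottom: "conv1"
  top: "conv1"
  type: "Scale"
  scale_param {
    bias_term: true
  }
}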
