This repository was archived by the owner on Nov 17, 2023. It is now read-only.

BatchNorm can not converge with scale=False #18475

@nttstar

Description

Training does not converge when the BatchNorm operator is used with scale=False.

Error Message

No error message, but the loss value and training accuracy are abnormal compared with BatchNorm using scale=True.

To Reproduce

Use https://github.com/nttstar/arcface.np to train ArcFace, and add one BatchNorm op with scale=False after the final embedding layer.
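
For readers unfamiliar with the flag, here is a minimal pure-Python sketch of what scale=False is generally understood to mean in a batch-norm forward pass: the learnable multiplier gamma is dropped (effectively fixed at 1), so the output is only normalized and shifted. This is not MXNet code and not taken from the arcface.np repository. The function name `batch_norm_forward` and its arguments are hypothetical, and it covers training-mode statistics only (no running mean/variance). If the Gluon block is what is being used, the corresponding call would presumably be `mx.gluon.nn.BatchNorm(scale=False)`, but that is an assumption about the reproduction setup.

```python
import math


def batch_norm_forward(batch, gamma=None, beta=None, eps=1e-5):
    """Normalize each feature column of `batch` over the batch axis.

    `batch` is a list of rows (one row per sample). Passing gamma=None
    mimics scale=False: the normalized value is not multiplied by a
    learnable gamma. Passing beta=None mimics center=False.
    Hypothetical helper for illustration only.
    """
    n = len(batch)
    dims = len(batch[0])
    out = [[0.0] * dims for _ in range(n)]
    for j in range(dims):
        col = [row[j] for row in batch]
        mean = sum(col) / n
        # Biased (population) variance, as used for normalization.
        var = sum((x - mean) ** 2 for x in col) / n
        g = 1.0 if gamma is None else gamma[j]
        b = 0.0 if beta is None else beta[j]
        for i in range(n):
            out[i][j] = g * (col[i] - mean) / math.sqrt(var + eps) + b
    return out


# Two samples with a 2-dimensional embedding.
embeddings = [[1.0, 10.0], [3.0, 30.0]]
no_scale = batch_norm_forward(embeddings)                       # like scale=False
with_scale = batch_norm_forward(embeddings, gamma=[2.0, 2.0])   # like scale=True, gamma=2
```

With gamma absent, every output feature has roughly zero mean and unit variance regardless of the input magnitude, so the embedding norm after this layer cannot be adjusted by the network through this op.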

What have you tried to solve it?

  1. Set scale=True: training converges, but with slightly worse test accuracy.

Environment

----------Python Info----------
Version : 3.6.9
Compiler : GCC 7.3.0
Build : ('default', 'Jul 30 2019 19:07:31')
Arch : ('64bit', '')
------------Pip Info-----------
Version : 19.3.1
Directory : /root/anaconda2/envs/py36/lib/python3.6/site-packages/pip
----------MXNet Info-----------
Version : 2.0.0
Directory : /root/anaconda2/envs/py36/lib/python3.6/site-packages/mxnet
Num GPUs : 8
Hashtag not found. Not installed from pre-built package.
----------System Info----------
Platform : Linux-3.10.0-327.el7.x86_64-x86_64-with-centos-7.5.1804-Core
system : Linux
node : gpu06
release : 3.10.0-327.el7.x86_64
version : #1 SMP Thu Nov 19 22:10:57 UTC 2015
