🐛 Describe the bug
Following the "Gradient Accumulation on GeminiPlugin" section of https://colossalai.org/docs/features/gradient_accumulation_with_booster/, I get the exception `AssertionError: you are calculating the l2 norm twice` at the second step.
Environment
PyTorch: 2.0.1
ColossalAI: 0.3.4