For now, ZeRO-1 with gradient accumulation is not supported in the plugin, which means users cannot combine the booster with ZeRO-1 and gradient accumulation (zero1+ga). We can use a plain `self.require_grad_sync = True` flag instead of an interval. Moreover, support for `plugin.no_sync()` is needed as well.
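To illustrate the idea, here is a minimal sketch of how a `require_grad_sync` flag and a `no_sync()` context manager could work together for gradient accumulation. It is a pure-Python toy, not the plugin's actual code: `ToyZeroOptimizer`, `ToyPlugin`, `run_accumulation`, and the `no_sync(optimizer)` signature are all hypothetical names chosen for illustration, and the cross-rank reduction is simulated.

```python
from contextlib import contextmanager


class ToyZeroOptimizer:
    """Hypothetical stand-in for a ZeRO-1 optimizer (assumption, not the real class)."""

    def __init__(self):
        self.require_grad_sync = True  # flag named in the description above
        self.local_grad = 0.0          # gradient accumulated on this rank
        self.synced_grad = None        # result of the last simulated reduction
        self.sync_calls = 0            # how many simulated reductions ran

    def backward(self, grad):
        # Always accumulate locally; reduce only when syncing is enabled.
        self.local_grad += grad
        if self.require_grad_sync:
            self._sync_grad()

    def _sync_grad(self):
        # Placeholder for the real cross-rank gradient reduction.
        self.synced_grad = self.local_grad
        self.sync_calls += 1

    def step(self):
        # A real optimizer would update its parameter shard here.
        applied = self.synced_grad
        self.local_grad = 0.0
        self.synced_grad = None
        return applied


class ToyPlugin:
    """Hypothetical plugin exposing no_sync(); the signature is an assumption."""

    @contextmanager
    def no_sync(self, optimizer):
        # Disable gradient sync inside the block, then restore the old value.
        previous = optimizer.require_grad_sync
        optimizer.require_grad_sync = False
        try:
            yield
        finally:
            optimizer.require_grad_sync = previous


def run_accumulation(micro_grads):
    """Accumulate all micro-batches, syncing only on the last one."""
    optimizer, plugin = ToyZeroOptimizer(), ToyPlugin()
    with plugin.no_sync(optimizer):
        for grad in micro_grads[:-1]:
            optimizer.backward(grad)
    optimizer.backward(micro_grads[-1])  # sync happens here
    return optimizer.step(), optimizer.sync_calls
```

With a boolean flag, the caller decides when to sync by entering or leaving `no_sync()`, so the optimizer does not need to track an accumulation interval itself.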