Hi,
First of all, thanks for the contribution. I'm trying to use DeepSpeed and ZeRO with a custom optimizer from https://github.com/mgrankin/over9000
More specifically, I'm using the implementation of RangerLars (RAdam + LARS + Lookahead), which that repo calls Over9000.
Is there any way to wrap a custom optimizer like this in a ZeRO-compatible one? I found this class in your repository:
class FP16_DeepSpeedZeroOptimizer(object):
"""
DeepSpeedZeroOptimizer designed to reduce the memory footprint
required for training large deep learning models.
For more details please see ZeRO: Memory Optimization Towards Training A Trillion Parameter Models
https://arxiv.org/abs/1910.02054
For usage examples, refer to TODO: DeepSpeed V2 Tutorial
"""
Can I use it for my purpose? If so, could you share a snippet of code?
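For context, here is roughly what I imagined this would look like. This is only a sketch under my own assumptions: I'm guessing that `deepspeed.initialize` accepts a pre-built client optimizer through its `optimizer` argument, that ZeRO is enabled via a `zero_optimization` section in the config, and that `Over9000` can be imported like this from the repo above (`build_model` is a placeholder for my actual model):

```python
# Sketch only -- assumes deepspeed.initialize takes a client optimizer via
# the `optimizer` argument and that ZeRO is enabled through the config dict.
import deepspeed
from over9000 import Over9000  # import path assumed from mgrankin/over9000

ds_config = {
    "train_batch_size": 32,
    "fp16": {"enabled": True},
    "zero_optimization": {"stage": 1},  # assumed knob for enabling ZeRO
}

model = build_model()  # placeholder for the actual model
optimizer = Over9000(model.parameters(), lr=1e-3)

# Pass the already-constructed optimizer instead of declaring one in ds_config,
# so DeepSpeed wraps it rather than building its own.
model_engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    optimizer=optimizer,
    model_parameters=model.parameters(),
    config=ds_config,
)
```

I'm not sure whether DeepSpeed then wraps this client optimizer in the `FP16_DeepSpeedZeroOptimizer` above automatically, or whether I need to do that myself, which is really my question.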
Thank you,
Cal