Currently, in `step()`, ZeRO communicates the master params, which leads to a long `optimizer_step` time.
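
A rough sketch of why this can matter, assuming the master params are fp32 copies and the working params are fp16/bf16 (the usual mixed-precision setup, not stated above). The helper name and the model size are hypothetical and only illustrate the difference in payload size, not the actual ZeRO implementation:

```python
# Bytes per element for the dtypes assumed in this sketch.
BYTES_PER_ELEMENT = {"fp32": 4, "fp16": 2, "bf16": 2}


def param_payload_bytes(num_params: int, dtype: str) -> int:
    """Total bytes of parameter data that must be exchanged to
    synchronize `num_params` parameters stored in `dtype`."""
    return num_params * BYTES_PER_ELEMENT[dtype]


num_params = 1_000_000_000  # hypothetical 1B-parameter model

# Communicating the master params (assumed fp32).
master_bytes = param_payload_bytes(num_params, "fp32")

# Communicating the working params instead (assumed fp16).
working_bytes = param_payload_bytes(num_params, "fp16")

ratio = master_bytes / working_bytes
print(f"master: {master_bytes} B, working: {working_bytes} B, ratio: {ratio}")
```

Under these assumptions, communicating the master params moves twice as many bytes per step as communicating the working params would.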