Describe the feature
In some VAE training setups, users may use a weight-adaptive loss that computes the gradient of some parameters more than once per step, for example:

This will trigger the backward hook (registered via `Tensor.register_hook`) twice. Based on PyTorch's documentation, we could instead use the post-accumulate-grad hook (`Tensor.register_post_accumulate_grad_hook`, available since PyTorch 2.1) to solve this problem, since it only fires after the gradient has actually been accumulated into `.grad`.
