[zero] refactor low level zero for shard evenly #4030
ver217 merged 18 commits into hpcaitech:feature/zero from Gy-Lu:llzero
Conversation
The design is in #3954.
This version distributes each master parameter across devices instead of distributing the parameter list, correct? Just be aware of the tensor precision used during communication, since it affects efficiency.
Right.
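To make the exchange above concrete, here is a minimal, hypothetical sketch (not the actual ColossalAI implementation) of sharding a single flattened master parameter evenly across ranks, padding it so every rank receives a slice of equal size:

```python
def shard_param_evenly(flat_param, world_size):
    """Pad a flattened parameter and split it into equal per-rank shards.

    Hypothetical helper for illustration only; names and behavior are
    assumptions, not the PR's API.
    """
    pad = (-len(flat_param)) % world_size      # elements needed to divide evenly
    padded = flat_param + [0.0] * pad          # `pad` must be recorded to strip later
    shard_size = len(padded) // world_size
    shards = [padded[r * shard_size:(r + 1) * shard_size]
              for r in range(world_size)]
    return shards, pad

# A 5-element parameter on 4 ranks: padded to 8 elements, 2 per rank.
shards, pad = shard_param_evenly([1.0, 2.0, 3.0, 4.0, 5.0], world_size=4)
```

Because every rank holds the same number of elements per parameter, collectives such as reduce-scatter operate on uniformly sized buffers regardless of the original parameter shapes.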
The code coverage for the changed files is 83%.
The code coverage for the changed files is 75%.
The code coverage for the changed files is 84%.
* refactor low level zero
* fix zero2 and support cpu offload
* avg gradient and modify unit test
* refactor grad store, support layer drop
* refactor bucket store, support grad accumulation
* fix and update unit test of zero and ddp
* compatible with tp, ga and unit test
* fix memory leak and polish
* add zero layer drop unittest
* polish code
* fix import err in unit test
* support different comm dtype, modify docstring style
* polish code
* test padding and fix
* fix unit test of low level zero
* fix pad recording in bucket store
* support some models
* polish
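Several commits above touch "pad recording in bucket store". A toy sketch of that idea, with hypothetical names (this is an illustration under assumptions, not the PR's `BucketStore`): the store remembers how many zero elements were appended to each parameter so the original gradient can be recovered after communication.

```python
class BucketStore:
    """Toy sketch: record per-parameter padding so it can be stripped later."""

    def __init__(self, world_size):
        self.world_size = world_size
        self._pad = {}  # parameter name -> number of padded elements

    def add_param(self, name, numel):
        # Pad each parameter's element count up to a multiple of world_size
        # and remember the pad so the gradient can be unpadded after
        # reduce-scatter. Returns the padded element count.
        self._pad[name] = (-numel) % self.world_size
        return numel + self._pad[name]

    def get_pad(self, name):
        return self._pad[name]

store = BucketStore(world_size=4)
padded_numel = store.add_param("weight", 10)  # 10 elements padded to 12
```

Recording the pad at registration time (rather than recomputing it) is what lets a later fix like "fix pad recording in bucket store" matter: every consumer of the bucket must agree on the same pad.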
📌 Checklist before creating the PR
[doc/gemini/tensor/...]: A concise description
🚨 Issue number
#3954
📝 What does this PR do?
This PR refactors low level zero for load balancing.
💥 Checklist before requesting a review
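Why even sharding improves load balancing can be shown with a small, hypothetical comparison (assumed numbers, not from the PR): distributing whole parameters to ranks leaves the rank holding a large parameter overloaded, while padding and splitting every parameter across all ranks equalizes the per-rank element counts.

```python
def round_robin_numel(param_sizes, world_size):
    # Old-style sketch: assign each whole parameter to a rank round-robin.
    loads = [0] * world_size
    for i, numel in enumerate(param_sizes):
        loads[i % world_size] += numel
    return loads

def even_shard_numel(param_sizes, world_size):
    # New-style sketch: pad each parameter to a multiple of world_size,
    # then split it evenly across all ranks.
    total = sum(numel + (-numel) % world_size for numel in param_sizes)
    return [total // world_size] * world_size

sizes = [1000, 10, 10, 10]       # one large parameter, three small ones
unbalanced = round_robin_numel(sizes, world_size=2)
balanced = even_shard_numel(sizes, world_size=2)
```

With these assumed sizes, round-robin leaves one rank holding far more elements than the other, while even sharding gives both ranks identical loads.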