Merged
2 changes: 1 addition & 1 deletion docs/source/en/Colossal-Auto/get_started/run_demo.md
@@ -9,7 +9,7 @@ Detailed instructions can be found in its `README.md`.

### 2. Integration with activation checkpoint

-Colossal-Auto's automatic search function for activation checkpointing finds the most efficient checkpoint within a given memory budget, rather than just aiming for maximum memory compression. To avoid a lengthy search process for an optimal activation checkpoint, Colossal-Auto has implemented a two-stage search process. This allows the system to find a feasible distributed training solution in a reasonable amount of time while still benefiting from activation checkpointing for memory management. The integration of activation checkpointing in Colossal-AI improves the efficiency and effectiveness of large model training. You can follow the [Resnet example](TBA).
+Colossal-Auto's automatic search function for activation checkpointing finds the most efficient checkpoint within a given memory budget, rather than just aiming for maximum memory compression. To avoid a lengthy search process for an optimal activation checkpoint, Colossal-Auto has implemented a two-stage search process. This allows the system to find a feasible distributed training solution in a reasonable amount of time while still benefiting from activation checkpointing for memory management. The integration of activation checkpointing in Colossal-AI improves the efficiency and effectiveness of large model training. You can follow the [Resnet example](https://github.com/hpcaitech/ColossalAI/tree/main/examples/tutorial/auto_parallel).
Detailed instructions can be found in its `README.md`.

<figure style={{textAlign: "center"}}>
@@ -8,7 +8,7 @@ Colossal-Auto can be used to find, for each operation, a […] covering data, tensor (

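### 2. Integration with activation checkpoint

-Activation checkpointing is an indispensable memory-compression technique in large-model training, and Colossal-AI provides an automatic search function for it. Unlike most approaches, which target maximum memory compression, Colossal-AI searches for the fastest activation checkpoint scheme that fits within the memory budget. In addition, to avoid the explosion in search time that would result from modeling the activation checkpoint search inside the SPMD solver, Colossal-AI uses a 2-stage search design, so it can find a valid, feasible distributed training plan in a reasonable amount of time. You can refer to the [Resnet example](TBA).
+Activation checkpointing is an indispensable memory-compression technique in large-model training, and Colossal-AI provides an automatic search function for it. Unlike most approaches, which target maximum memory compression, Colossal-AI searches for the fastest activation checkpoint scheme that fits within the memory budget. In addition, to avoid the explosion in search time that would result from modeling the activation checkpoint search inside the SPMD solver, Colossal-AI uses a 2-stage search design, so it can find a valid, feasible distributed training plan in a reasonable amount of time. You can refer to the [Resnet example](https://github.com/hpcaitech/ColossalAI/tree/main/examples/tutorial/auto_parallel).
Detailed instructions can be found in its `README.md`.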

<figure style={{textAlign: "center"}}>
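Note for readers of this diff: the changed paragraph describes the search objective as "fastest checkpoint plan within a memory budget" rather than "maximum memory compression". The toy sketch below illustrates only that objective. It is not Colossal-Auto's solver or API; the `Segment` type, the `fastest_plan_within_budget` function, the cost numbers, and the simplification that a checkpointed segment retains no activation memory are all assumptions made for illustration, and the brute-force enumeration is only workable for a handful of segments.

```python
from itertools import combinations
from typing import NamedTuple, Optional, Sequence, Tuple


class Segment(NamedTuple):
    """Hypothetical model segment with made-up cost units."""
    name: str
    activation_mem: int   # memory kept for backward if NOT checkpointed
    recompute_time: int   # extra forward time paid if checkpointed


def fastest_plan_within_budget(
    segments: Sequence[Segment], budget: int
) -> Optional[Tuple[int, Tuple[str, ...]]]:
    """Return (extra_time, names_to_checkpoint) with the least recompute time
    whose retained activation memory fits in `budget`, or None if no plan fits.

    Simplifying assumption: a checkpointed segment retains zero activation memory.
    """
    best = None
    n = len(segments)
    for k in range(n + 1):
        for chosen in combinations(range(n), k):
            # Memory still held by segments that are not checkpointed.
            kept_mem = sum(
                s.activation_mem for i, s in enumerate(segments) if i not in chosen
            )
            if kept_mem > budget:
                continue
            # Cost of the plan is the recompute time of the checkpointed segments.
            extra_time = sum(segments[i].recompute_time for i in chosen)
            if best is None or extra_time < best[0]:
                best = (extra_time, tuple(segments[i].name for i in chosen))
    return best


segments = [
    Segment("layer1", activation_mem=4, recompute_time=3),
    Segment("layer2", activation_mem=6, recompute_time=2),
    Segment("layer3", activation_mem=5, recompute_time=4),
]
plan = fastest_plan_within_budget(segments, budget=10)
max_compression_time = sum(s.recompute_time for s in segments)
```

With these made-up numbers, checkpointing every segment would cost `max_compression_time` units of recomputation, while the budget-aware plan checkpoints only the segment that frees enough memory most cheaply. The real search is described in the linked auto_parallel example and its `README.md`.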