@@ -3,8 +3,8 @@
Author: Hongxin Liu, Yongbin Li

**Example Code**
-- [ColossalAI-Examples GPT2](https://github.com/hpcaitech/ColossalAI-Examples/tree/main/language/gpt_2)
-- [ColossalAI-Examples GPT3](https://github.com/hpcaitech/ColossalAI-Examples/tree/main/language/gpt_3)
+- [ColossalAI-Examples GPT2](https://github.com/hpcaitech/ColossalAI-Examples/tree/main/language/gpt)
+- [ColossalAI-Examples GPT3](https://github.com/hpcaitech/ColossalAI-Examples/tree/main/language/gpt)

**Related Paper**
- [Colossal-AI: A Unified Deep Learning System For Large-Scale Parallel Training](https://arxiv.org/abs/2110.14883)
4 changes: 2 additions & 2 deletions docs/source/en/features/zero_with_chunk.md
@@ -7,7 +7,7 @@ Author: [Hongxiu Liu](https://github.com/ver217), [Jiarui Fang](https://github.c

**Example Code**

-- [Train GPT with Colossal-AI](https://github.com/hpcaitech/ColossalAI/tree/main/examples/language/gpt)
+- [Train GPT with Colossal-AI](../../../../examples/language/gpt/README.md)

**Related Paper**

@@ -242,6 +242,6 @@ def main():
torch.cuda.synchronize()
```
> ⚠️ Note: If you want to use the Gemini module, please do not use the [Gradient Accumulation](../features/gradient_accumulation.md) mentioned earlier.
-The complete example can be found on [Train GPT with Colossal-AI](https://github.com/hpcaitech/ColossalAI/tree/main/examples/language/gpt).
+The complete example can be found on [Train GPT with Colossal-AI](../../../../examples/language/gpt/README.md).

<!-- doc-test-command: torchrun --standalone --nproc_per_node=1 zero_with_chunk.py -->
@@ -3,8 +3,8 @@
Author: Hongxin Liu, Yongbin Li

**Example Code**
-- [ColossalAI-Examples GPT2](https://github.com/hpcaitech/ColossalAI-Examples/tree/main/language/gpt_2)
-- [ColossalAI-Examples GPT3](https://github.com/hpcaitech/ColossalAI-Examples/tree/main/language/gpt_3)
+- [ColossalAI-Examples GPT2](https://github.com/hpcaitech/ColossalAI-Examples/tree/main/language/gpt)
+- [ColossalAI-Examples GPT3](https://github.com/hpcaitech/ColossalAI-Examples/tree/main/language/gpt)

**Related Paper**
- [Colossal-AI: A Unified Deep Learning System For Large-Scale Parallel Training](https://arxiv.org/abs/2110.14883)
4 changes: 2 additions & 2 deletions docs/source/zh-Hans/features/zero_with_chunk.md
@@ -8,7 +8,7 @@

**Example Code**

-- [Train GPT with Colossal-AI](https://github.com/hpcaitech/ColossalAI/tree/main/examples/language/gpt)
+- [Train GPT with Colossal-AI](../../../../examples/language/gpt/README.md)

**Related Paper**

@@ -244,6 +244,6 @@ def main():
torch.cuda.synchronize()
```
> ⚠️ Note: If you want to use the Gemini module, please do not use the [Gradient Accumulation](../features/gradient_accumulation.md) mentioned earlier.
-The complete example can be found on [Train GPT with Colossal-AI](https://github.com/hpcaitech/ColossalAI/tree/main/examples/language/gpt).
+The complete example can be found on [Train GPT with Colossal-AI](../../../../examples/language/gpt/README.md).

<!-- doc-test-command: torchrun --standalone --nproc_per_node=1 zero_with_chunk.py -->