7 changes: 6 additions & 1 deletion colossalai/booster/plugin/torch_fsdp_plugin.py
@@ -3,10 +3,10 @@

import torch
import torch.nn as nn
import warnings
from packaging import version
from torch.distributed import ProcessGroup


if version.parse(torch.__version__) >= version.parse('1.12.0'):
from torch.distributed.fsdp import FullStateDictConfig
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP
@@ -202,6 +202,11 @@ def configure(

# wrap the model with PyTorch FSDP
fsdp_model = TorchFSDPModel(model, device_id=torch.cuda.current_device(), **self.fsdp_kwargs)

if len(optimizer.param_groups) > 1:
warnings.warn(
'TorchFSDPPlugin does not support optimizers that use multiple parameter groups; results may not be as expected.'
)
optimizer.__init__(fsdp_model.parameters(), **optimizer.defaults)

if not isinstance(optimizer, FSDPOptimizerWrapper):
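For context on the change above: FSDP flattens and shards a module's parameters when it wraps the model, so an optimizer constructed before wrapping holds references to tensors that are no longer the ones being trained. That is why the plugin re-runs `optimizer.__init__` against `fsdp_model.parameters()`. A minimal sketch of the same pattern outside the plugin (the tiny model is illustrative, and the script assumes a process group has already been initialized, e.g. under torchrun):

```python
import torch
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

model = nn.Linear(16, 16).cuda()
# Built before wrapping, this optimizer points at the original parameters.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# FSDP replaces the module's parameters with flattened shards.
fsdp_model = FSDP(model, device_id=torch.cuda.current_device())

# Rebuild the optimizer against the wrapped parameters, reusing its stored
# default hyperparameters, as the plugin does via optimizer.__init__.
optimizer = type(optimizer)(fsdp_model.parameters(), **optimizer.defaults)
```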
3 changes: 3 additions & 0 deletions docs/source/en/basics/booster_plugins.md
@@ -62,8 +62,11 @@ More details can be found in [Pytorch Docs](https://pytorch.org/docs/main/genera
### Torch FSDP Plugin

> ⚠ This plugin is not available when the torch version is lower than 1.12.0.

> ⚠ This plugin does not currently support saving/loading sharded model checkpoints.

> ⚠ This plugin does not support optimizers that use multiple parameter groups.

More details can be found in [Pytorch Docs](https://pytorch.org/docs/main/fsdp.html).

{{ autodoc:colossalai.booster.plugin.TorchFSDPPlugin }}
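A minimal usage sketch under these constraints (the launch call, model, and optimizer below are illustrative placeholders, and the script must be started with a distributed launcher such as torchrun):

```python
import torch.nn as nn
from torch.optim import Adam

import colossalai
from colossalai.booster import Booster
from colossalai.booster.plugin import TorchFSDPPlugin

colossalai.launch_from_torch(config={})

plugin = TorchFSDPPlugin()
booster = Booster(plugin=plugin)

model = nn.Linear(16, 16)
# Keep a single parameter group; multiple groups trigger the warning above.
optimizer = Adam(model.parameters(), lr=1e-3)

model, optimizer, *_ = booster.boost(model, optimizer)
```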
3 changes: 3 additions & 0 deletions docs/source/zh-Hans/basics/booster_plugins.md
@@ -62,8 +62,11 @@ Zero-2 does not support local gradient accumulation. If you insist on using it, although you can accumulate
### Torch FSDP Plugin

> ⚠ This plugin is not available when the torch version is lower than 1.12.0.

> ⚠ This plugin does not currently support saving/loading sharded model checkpoints.

> ⚠ This plugin does not currently support optimizers that use multiple parameter groups.

More details can be found in [Pytorch Docs](https://pytorch.org/docs/main/fsdp.html).

{{ autodoc:colossalai.booster.plugin.TorchFSDPPlugin }}