[BUG]: hybrid_parallel_plugin model saving problem #4547

@zryowen123

🐛 Describe the bug

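When testing the model-saving code of hybrid_parallel_plugin on the shardformer branch, the saved model is incomplete.
The original llama2-13b model is about 24 GB.
After training with hybrid_parallel_plugin, the saved checkpoint is only 13 GB.
Training setup: pp=2, tp=2, 4 × A800.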
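
For anyone trying to reproduce this, a quick size check can help quantify how much of the model is missing. The sketch below is only an illustration: it assumes roughly 13e9 parameters stored at 2 bytes each (fp16/bf16) with no optimizer state, and the helper names `expected_checkpoint_gib` and `directory_size_gib` are hypothetical, not part of ColossalAI.

```python
import os

GIB = 2**30  # bytes per GiB


def expected_checkpoint_gib(num_params, bytes_per_param=2):
    """Approximate size in GiB of a full weights-only checkpoint.

    Assumption: every parameter is stored once at `bytes_per_param`
    bytes (2 for fp16/bf16), with no optimizer state.
    """
    return num_params * bytes_per_param / GIB


def directory_size_gib(path):
    """Sum the sizes of all files under `path`, in GiB."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total / GIB


if __name__ == "__main__":
    # 13e9 is an approximate parameter count for llama2-13b (assumption).
    expected = expected_checkpoint_gib(13e9)
    saved = 13.0  # size reported in this issue, treated here as GiB
    print(f"expected ~{expected:.1f} GiB, saved {saved:.1f} GiB, "
          f"ratio {saved / expected:.2f}")
```

Under these assumptions the saved checkpoint is a little over half of the expected size. Pointing `directory_size_gib` at the actual output directory gives the measured figure to compare against.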

Environment

No response

Labels: bug
