[V1 Loader] Support qwen2(bf16) #3502
Merged
pcard-71500
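Note: V1 Loader relies on new `set_value` and `copy_` features in the Paddle develop branch, so it requires Paddle develop or version 3.2 (not yet released) or later.
This PR adapts the Qwen2 model to load through V1 Loader.
It has been verified under bf16 so far: accuracy matches develop, and both the old and the new load paths work correctly.
With V1 Loader, loading Qwen2-7B-Instruct at TP4 uses 29% of the original memory, and load time is unchanged.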
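The version requirement in the note above can be checked before enabling V1 Loader. The following is a minimal sketch, not part of this PR: `paddle_version_ok` is a hypothetical helper, and it assumes that develop builds of Paddle report their version string as `0.0.0`.

```python
import re


def paddle_version_ok(version: str, minimum=(3, 2)) -> bool:
    """Return True if `version` is a develop build or at least `minimum`.

    Hypothetical helper for illustration only.
    """
    # Assumption: Paddle develop builds report "0.0.0" as their version.
    if version == "0.0.0":
        return True
    # Compare only the leading "major.minor" part, numerically.
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        return False
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) >= tuple(minimum)
```

With Paddle installed, the check could be called as `paddle_version_ok(paddle.__version__)` before choosing the V1 load path, falling back to the old loader otherwise.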
Changes:
Test record:
Model: Qwen2-7B-Instruct
Performance record
Accuracy record
Test script: