[Loader][BugFix] Fix some parameters place on CPU in PaddleOCR-VL #5413
Pull request overview
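This PR fixes an issue where parameters were incorrectly placed on the CPU when loading the PaddleOCR-VL model. PR #4532 removed the forced H2D (host-to-device) call made through get_tensor, so parameters left uninitialized under LazyGuard stayed on the CPU after copy_, which severely degrades performance.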
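Main changes
- Added an initialization check to every weight_loader method so that uninitialized parameters call initialize() first
- Switched from param.copy_ to h2d_copy throughout to ensure correct device placement
- Fixed 3 different weight_loader methods (SiglipAttention.out_proj_weight_loader, SiglipMLP.weight_loader, Projector.weight_loader)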
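The initialize-before-copy pattern described in the list above can be illustrated with a small toy model. This is only a sketch under stated assumptions: it does not import Paddle or FastDeploy, and `ToyTensor`, `ToyParam`, `naive_weight_loader`, and `fixed_weight_loader` are hypothetical names invented for illustration. The rule that an uninitialized parameter adopts the source tensor's device is a simplification of the behavior reported in this PR, not a copy of the real implementation, and the plain `copy_` call merely stands in for the copy step that the PR performs with h2d_copy.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ToyTensor:
    """Hypothetical stand-in for a checkpoint tensor with a device tag."""
    data: List[float]
    place: str  # "cpu" or "gpu"


class ToyParam:
    """Hypothetical stand-in for a parameter created under a lazy-init guard."""

    def __init__(self, size: int, target_place: str = "gpu") -> None:
        self.size = size
        self.target_place = target_place
        self.data: Optional[List[float]] = None  # no storage allocated yet
        self.place: Optional[str] = None         # unknown until storage exists

    def is_initialized(self) -> bool:
        return self.data is not None

    def initialize(self) -> None:
        # Allocate storage on the intended device.
        self.data = [0.0] * self.size
        self.place = self.target_place

    def copy_(self, src: ToyTensor) -> None:
        # Assumption modelling the reported bug: with no storage of its own,
        # the parameter ends up wherever the source data lives.
        if not self.is_initialized():
            self.place = src.place
        self.data = list(src.data)


def naive_weight_loader(param: ToyParam, loaded: ToyTensor) -> None:
    # Copies straight into a possibly uninitialized parameter.
    param.copy_(loaded)


def fixed_weight_loader(param: ToyParam, loaded: ToyTensor) -> None:
    # Initialize first so the copy lands in storage on the target device.
    if not param.is_initialized():
        param.initialize()
    param.copy_(loaded)


checkpoint = ToyTensor([1.0, 2.0], place="cpu")

broken = ToyParam(size=2)
naive_weight_loader(broken, checkpoint)

fixed = ToyParam(size=2)
fixed_weight_loader(fixed, checkpoint)

print(broken.place, fixed.place)  # prints: cpu gpu
```

In this toy model the two loaders differ only in the initialization check, which is enough to change where the parameter ends up.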
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| fastdeploy/model_executor/models/paddleocr_vl/siglip.py | Adds parameter initialization logic to the weight_loader methods of SiglipAttention and SiglipMLP and switches them to h2d_copy; adds a get_tensor import that is never used |
| fastdeploy/model_executor/models/paddleocr_vl/projector.py | Adds parameter initialization logic to the Projector weight_loader and switches it to h2d_copy |
import paddle.nn.functional as F
from paddleformers.transformers.model_utils import PretrainedModel

from fastdeploy.model_executor.layers.utils import get_tensor
Copilot AI, Dec 6, 2025
The get_tensor import is unused in this file. Since PR #4532 removed the usage of get_tensor and this PR replaces it with h2d_copy (which internally calls get_tensor when needed), this import can be safely removed.
Suggested change: remove the line from fastdeploy.model_executor.layers.utils import get_tensor
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## develop #5413 +/- ##
==========================================
Coverage ? 58.26%
==========================================
Files ? 327
Lines ? 40566
Branches ? 6157
==========================================
Hits ? 23636
Misses ? 15098
Partials ? 1832
Motivation
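- Parameters created under LazyGuard are uninitialized, and the result of uninit_tensor.copy_(data) lives on the CPU, so almost all parameters in the OCR model ended up on the CPU after loading.
- Previously, get_tensor forced an H2D copy, so parameters were placed on the GPU correctly even without initialization.
- However, #4532 removed get_tensor, which left all of these parameters on the CPU and made performance terrible.
- The scope of #4532's impact is also unclear. This PR was tested only on the OCR model, so the other models should be checked as well.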
Modifications
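- Added the missing init logic for uninitialized parameters in the OCR model and switched to h2d_copy throughout.
- That said, the name h2d_copy does not seem quite right: it reads like a forced H2D copy, but the implementation is not one.
Usage or Command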
Accuracy Tests
None
Checklist
- PR title tags: [FDConfig], [APIServer], [Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]
- Run pre-commit before commit.
- For a PR to the release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.