[Cherry-Pick][CI]Fix multistep MTP in splitewise-prefill mode (#5723) #5724
Thanks for your contribution!
Pull request overview
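This is a Cherry-Pick PR that fixes MTP (multi-step speculative decoding) in splitwise-prefill mode. The fix adds logic to the config post-processing stage: when the MTP method is used and the instance has the prefill role, the number of speculative tokens and the number of model steps are both limited to 1.
Main changes:
- Added special handling for MTP in prefill mode to the config post-processing method
- When speculative_config uses the "mtp" method and splitwise_role is "prefill", the config parameters are adjusted automatically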
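The adjustment described above can be sketched in isolation as follows. This is a minimal illustration rather than the actual FastDeploy code: the `SpeculativeConfig` and `SchedulerConfig` dataclasses, their default values, and the `adjust_speculative_config` helper are hypothetical stand-ins, and only the field names (`method`, `num_speculative_tokens`, `num_model_steps`, `splitwise_role`) are taken from the diff in this PR.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SpeculativeConfig:
    # Hypothetical stand-in; only the field names come from the PR diff.
    method: Optional[str] = None
    num_speculative_tokens: int = 1
    num_model_steps: int = 1


@dataclass
class SchedulerConfig:
    # Hypothetical stand-in; the default role is an assumption.
    splitwise_role: str = "mixed"


def adjust_speculative_config(
    spec: Optional[SpeculativeConfig], sched: SchedulerConfig
) -> Optional[SpeculativeConfig]:
    """Limit MTP to one token and one model step on a prefill instance."""
    if spec is not None and spec.method == "mtp":
        if sched.splitwise_role == "prefill":
            spec.num_speculative_tokens = 1
            spec.num_model_steps = 1
    return spec


if __name__ == "__main__":
    spec = SpeculativeConfig(method="mtp", num_speculative_tokens=3, num_model_steps=3)
    adjust_speculative_config(spec, SchedulerConfig(splitwise_role="prefill"))
    print(spec.num_speculative_tokens, spec.num_model_steps)
```

Any other role, any other method, or a missing speculative config leaves the values untouched, which mirrors the two nested conditions in the diff.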
fastdeploy/config.py
Outdated
        # adjust speculative config
        if self.speculative_config is not None and self.speculative_config.method == "mtp":
            if self.scheduler_config.splitwise_role == "prefill:
Copilot
AI
Dec 23, 2025
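The closing quote is missing. The string "prefill needs a double quote after it to close the string literal; as written, this is a syntax error. It should be changed to "prefill".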
Suggested change:
-         # adjust speculative config
-         if self.speculative_config is not None and self.speculative_config.method == "mtp":
-             if self.scheduler_config.splitwise_role == "prefill:
+         # adjust speculative config
+         if self.speculative_config is not None and self.speculative_config.method == "mtp":
+             if self.scheduler_config.splitwise_role == "prefill":
fastdeploy/config.py
Outdated
Copilot
AI
Dec 23, 2025
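There are two consecutive blank lines here (lines 1707 and 1708), whereas the rest of the codebase generally uses a single blank line to separate code blocks. For consistency, keep only one blank line.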
        if self.speculative_config is not None and self.speculative_config.method == "mtp":
            if self.scheduler_config.splitwise_role == "prefill:
                self.speculative_config.num_speculative_tokens = 1
                self.speculative_config.num_model_steps = 1
Copilot
AI
Dec 23, 2025
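The new config adjustment for MTP in splitwise prefill mode has no corresponding unit test. Consider adding test cases to tests/utils/test_config.py that verify num_speculative_tokens and num_model_steps are both set to 1 when speculative_config.method is "mtp" and scheduler_config.splitwise_role is "prefill".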
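A rough sketch of what such tests might look like is given below. It does not construct the real config object from fastdeploy/config.py, whose constructor is not shown in this PR. Instead, `apply_mtp_prefill_adjustment` and `make_cfg` are hypothetical helpers that reproduce the diff's logic on `SimpleNamespace` stand-ins, and the `"decode"` role and `"ngram"` method are assumed placeholder values for the negative cases.

```python
from types import SimpleNamespace


def apply_mtp_prefill_adjustment(cfg):
    """Hypothetical copy of the logic added in this PR, for illustration only."""
    spec = cfg.speculative_config
    if spec is not None and spec.method == "mtp":
        if cfg.scheduler_config.splitwise_role == "prefill":
            spec.num_speculative_tokens = 1
            spec.num_model_steps = 1


def make_cfg(method, role, tokens=3, steps=3):
    """Build a stand-in config exposing only the attributes the logic reads."""
    spec = SimpleNamespace(
        method=method, num_speculative_tokens=tokens, num_model_steps=steps
    )
    sched = SimpleNamespace(splitwise_role=role)
    return SimpleNamespace(speculative_config=spec, scheduler_config=sched)


def test_mtp_prefill_is_limited_to_one():
    cfg = make_cfg("mtp", "prefill")
    apply_mtp_prefill_adjustment(cfg)
    assert cfg.speculative_config.num_speculative_tokens == 1
    assert cfg.speculative_config.num_model_steps == 1


def test_mtp_other_role_is_unchanged():
    cfg = make_cfg("mtp", "decode")  # "decode" is an assumed role name
    apply_mtp_prefill_adjustment(cfg)
    assert cfg.speculative_config.num_speculative_tokens == 3
    assert cfg.speculative_config.num_model_steps == 3


def test_other_method_prefill_is_unchanged():
    cfg = make_cfg("ngram", "prefill")  # "ngram" is an assumed method name
    apply_mtp_prefill_adjustment(cfg)
    assert cfg.speculative_config.num_speculative_tokens == 3
    assert cfg.speculative_config.num_model_steps == 3


def test_missing_speculative_config_is_ignored():
    cfg = SimpleNamespace(
        speculative_config=None,
        scheduler_config=SimpleNamespace(splitwise_role="prefill"),
    )
    apply_mtp_prefill_adjustment(cfg)  # must not raise
    assert cfg.speculative_config is None
```

In the real test file, the stand-ins would be replaced by the project's own config construction and post-processing call, so that the test exercises the code path changed by this PR rather than a copy of it.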
        if self.speculative_config is not None and self.speculative_config.method == "mtp":
            if self.scheduler_config.splitwise_role == "prefill:
                self.speculative_config.num_speculative_tokens = 1
                self.speculative_config.num_model_steps = 1
Copilot
AI
Dec 23, 2025
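The PR description is missing key information. A PR description should at least explain why the changes are being made and what problem they solve. The current description contains only the template, with required sections such as Motivation and Modifications left blank. Please add: 1) the specific problem this fix addresses; 2) why num_speculative_tokens and num_model_steps need to be set to 1 in splitwise prefill mode; 3) how this fix resolves the problem.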
268b468 to 3980252
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@               Coverage Diff                @@
##   release/online/20251131   #5724   +/-   ##
==========================================================
  Coverage    ?   59.08%
==========================================================
  Files       ?   319
  Lines       ?   39106
  Branches    ?   5893
==========================================================
  Hits        ?   23105
  Misses      ?   14148
  Partials    ?   1853
b018c49 into PaddlePaddle:release/online/20251131
Motivation
Modifications
Usage or Command
Accuracy Tests
Checklist
[FDConfig], [APIServer], [Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]
pre-commit before commit.
release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.