Revert "[RL] Support Rollout Routing Replay" #5402
This reverts commit 96d2d48.
Thanks for your contribution!
Pull request overview
This PR reverts the "Rollout Routing Replay" feature (PR #5321), systematically removing all related code, configuration, and test changes. The revert cleanly removes the routing replay functionality that was added for RL training scenarios.
Key changes:
- Removed `RoutingReplayConfig` class and all routing replay configuration parameters from the configuration system
- Deleted the entire `routing_indices_cache.py` file containing `RoutingReplayManager` and routing store implementations
- Removed `topk_ids_hookfunc` callback parameter from all MoE backend implementations
- Cleaned up routing replay manager initialization and usage in `gpu_model_runner.py`
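For readers unfamiliar with the hook pattern being removed, the following is a minimal sketch of how an optional top-k callback can sit on an expert-selection step, and what the same step looks like once the parameter is gone. Only the parameter name `topk_ids_hookfunc` comes from this PR; the function names, the list-based logits, and the tie-breaking rule are assumptions made for illustration and do not reproduce the FastDeploy backends, which operate on tensors.

```python
from typing import Callable, List, Optional, Sequence

# Hypothetical callback type: receives the selected expert ids for one token.
TopkHook = Callable[[List[int]], None]


def select_topk_experts(
    router_logits: Sequence[float],
    top_k: int,
    topk_ids_hookfunc: Optional[TopkHook] = None,
) -> List[int]:
    """Pick the top_k experts by logit; optionally report them to a hook."""
    # Sort expert indices by descending logit, breaking ties by lower index.
    ranked = sorted(range(len(router_logits)), key=lambda i: (-router_logits[i], i))
    topk_ids = ranked[:top_k]
    # With the hook in place, a caller can record routing decisions for replay.
    if topk_ids_hookfunc is not None:
        topk_ids_hookfunc(topk_ids)
    return topk_ids


def select_topk_experts_reverted(router_logits: Sequence[float], top_k: int) -> List[int]:
    """Same selection with the hook parameter removed (illustrative only)."""
    ranked = sorted(range(len(router_logits)), key=lambda i: (-router_logits[i], i))
    return ranked[:top_k]


# Usage: record routing decisions through the hook, then compare both variants.
recorded: List[List[int]] = []
logits = [0.1, 2.0, -1.0, 0.7]
with_hook = select_topk_experts(logits, top_k=2, topk_ids_hookfunc=recorded.append)
without_hook = select_topk_experts_reverted(logits, top_k=2)
```

In this sketch the hook is purely observational, so dropping it leaves the selected experts unchanged; only the ability to capture them is lost.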
Reviewed changes
Copilot reviewed 24 out of 24 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| `fastdeploy/config.py` | Removed `RoutingReplayConfig` class and `routing_replay_config` parameter from `FDConfig` |
| `fastdeploy/engine/args_utils.py` | Removed `routing_replay_config` CLI argument and `create_routing_repaly_config` method |
| `fastdeploy/engine/engine.py` | Removed `routing_replay_config` from worker service startup arguments |
| `fastdeploy/worker/worker_process.py` | Removed `RoutingReplayConfig` import and `routing_replay_config` initialization |
| `fastdeploy/worker/gpu_model_runner.py` | Removed `RoutingReplayManager` initialization, `is_chunk_step` tracking, and routing replay logic |
| `fastdeploy/model_executor/forward_meta.py` | Removed `routing_replay_table` field from `ForwardMeta` |
| `fastdeploy/model_executor/layers/moe/routing_indices_cache.py` | Deleted entire file containing routing replay functionality |
| `fastdeploy/model_executor/layers/moe/moe.py` | Removed `enable_routing_replay` flag, `topk_ids_hookfunc` parameter, and routing replay logic from forward methods |
| `fastdeploy/model_executor/layers/moe/fused_moe_*.py` | Removed `topk_ids_hookfunc` parameter from all backend apply methods (cutlass, deepgemm, marlin, triton, wint2) |
| `fastdeploy/model_executor/layers/backends/*` | Removed `topk_ids_hookfunc` parameter from backend-specific MoE implementations (metax, gcu, dcu) |
| `fastdeploy/model_executor/models/glm4_moe.py` | Removed default parameter from `Glm4Moe.forward` signature |
| `fastdeploy/rl/rollout_config.py` | Removed `routing_replay_config` parameter |
| `tests/layers/test_*.py` | Removed `RoutingReplayConfig` imports and initialization from test files |
| `tests/e2e/test_EB_Lite_serving.py` | Removed `--routing-replay-config` command line arguments |
| `tests/distributed/chunked_moe.py` | Removed `routing_replay_manager` and `enable_routing_replay` from mock objects |
    hidden_states = self.mlp(
        hidden_states,
        forward_meta,
    )
Copilot AI, Dec 5, 2025
[nitpick] This formatting change (splitting the function call across multiple lines) is not necessary for the revert and adds noise to the diff. The original single-line format hidden_states = self.mlp(hidden_states, forward_meta) would be sufficient. Consider keeping the formatting consistent with the rest of the codebase unless there's a specific style guide reason for the change.
Suggested change:

    - hidden_states = self.mlp(
    -     hidden_states,
    -     forward_meta,
    - )
    + hidden_states = self.mlp(hidden_states, forward_meta)
    )

    - def forward(self, x, forward_meta: ForwardMeta = None):
    + def forward(self, x, forward_meta):
Copilot AI, Dec 5, 2025
[nitpick] The forward_meta parameter lacks a type annotation. For consistency with the rest of the codebase (e.g., FusedMoE.forward in moe.py line 625 has forward_meta: ForwardMeta), consider adding the type hint: def forward(self, x, forward_meta: ForwardMeta):
Suggested change:

    - def forward(self, x, forward_meta):
    + def forward(self, x, forward_meta: ForwardMeta):
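To make the difference between the three signatures in this thread concrete, here is a small sketch contrasting a defaulted parameter with a required, annotated one. `ForwardMeta` below is a hypothetical stand-in dataclass and the two classes are illustrative; neither reflects the real FastDeploy definitions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ForwardMeta:
    """Hypothetical stand-in for the real ForwardMeta; fields are invented."""
    step: int = 0


class BlockWithDefault:
    # Pre-revert shape: forward_meta may be omitted and silently becomes None.
    def forward(self, x, forward_meta: Optional[ForwardMeta] = None):
        return x, forward_meta


class BlockRequired:
    # Shape suggested in the review: required argument, annotated for readers
    # and static checkers. Python does not enforce the annotation at runtime.
    def forward(self, x, forward_meta: ForwardMeta):
        return x, forward_meta


meta = ForwardMeta(step=3)
default_result = BlockWithDefault().forward(1)       # forward_meta falls back to None
required_result = BlockRequired().forward(1, meta)   # forward_meta must be passed

try:
    BlockRequired().forward(1)  # missing required argument
    missing_arg_error = None
except TypeError as exc:
    missing_arg_error = exc
```

Dropping the default turns a forgotten `forward_meta` into an immediate `TypeError` at the call site, while the annotation itself only documents the expected type.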
* [RL] Support Rollout Routing Replay
* add routing indices cache
* fix config bug and moe forward bug
* R3 Support GLM
* support eb4.5
* fix merge bug
* Apply suggestion from @Copilot
  Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Apply suggestion from @Copilot
  Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Apply suggestion from @Copilot
  Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Apply suggestion from @Copilot
  Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* add routing replay ci
* support glm topk
* support orther top_k
* fix ci bug
* pre-commit
* only support chatcmpl
* Revert "Revert "[RL] Support Rollout Routing Replay (#5321)" (#5402)"
  This reverts commit c45e064.
* Fix XPU and NPU bug

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Yuanle Liu <yuanlehome@163.com>

* [RL] Support Rollout Routing Replay
* add routing indices cache
* fix config bug and moe forward bug
* R3 Support GLM
* support eb4.5
* fix merge bug
* Apply suggestion from @Copilot
* Apply suggestion from @Copilot
* Apply suggestion from @Copilot
* Apply suggestion from @Copilot
* add routing replay ci
* support glm topk
* support orther top_k
* fix ci bug
* pre-commit
* only support chatcmpl
* Revert "Revert "[RL] Support Rollout Routing Replay (#5321)" (#5402)"
  This reverts commit c45e064.
* Fix XPU and NPU bug

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Yuanle Liu <yuanlehome@163.com>

… tools (#5418)

* feat(fmq): add ZMQ-based FMQ implementation and benchmark tools
* move FMQ_CONFIG_JSON to envs
* fix top_p_candidates (#5400)
  Co-authored-by: freeliuzc <lzc842650834@gmail.com>
* [RL] Support Rollout Routing Replay (#5321)
  * [RL] Support Rollout Routing Replay
  * add routing indices cache
  * fix config bug and moe forward bug
  * R3 Support GLM
  * support eb4.5
  * fix merge bug
  * Apply suggestion from @Copilot
    Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
  * Apply suggestion from @Copilot
    Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
  * Apply suggestion from @Copilot
    Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
  * Apply suggestion from @Copilot
    Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
  * add routing replay ci
  * support glm topk
  * support orther top_k
  * fix ci bug
  * pre-commit
  * only support chatcmpl

  ---------

  Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
  Co-authored-by: Yuanle Liu <yuanlehome@163.com>
* [Bug fix] Fix the multi-input accuracy issue in the pooling model. (#5374)
  * fix multi-inputs
  * fix threshold
  * fix threshold
  * fix
* [BugFix]remove _execute_empty_input (#5396)
* Revert "[RL] Support Rollout Routing Replay (#5321)" (#5402)
  This reverts commit 96d2d48.
* [New][RL] Support Rollout Routing Replay (#5405)
  * [RL] Support Rollout Routing Replay
  * add routing indices cache
  * fix config bug and moe forward bug
  * R3 Support GLM
  * support eb4.5
  * fix merge bug
  * Apply suggestion from @Copilot
    Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
  * Apply suggestion from @Copilot
    Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
  * Apply suggestion from @Copilot
    Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
  * Apply suggestion from @Copilot
    Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
  * add routing replay ci
  * support glm topk
  * support orther top_k
  * fix ci bug
  * pre-commit
  * only support chatcmpl
  * Revert "Revert "[RL] Support Rollout Routing Replay (#5321)" (#5402)"
    This reverts commit c45e064.
  * Fix XPU and NPU bug

  ---------

  Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
  Co-authored-by: Yuanle Liu <yuanlehome@163.com>
* bf16 deepseek (#5379)
* fix deepseek (#5410)
* Update tests/inter_communicator/test_fmq_factory.py
  Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Update benchmarks/benchmark_fmq.py
  Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Update fastdeploy/inter_communicator/fmq.py
  Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: GoldPancake <56388518+Deleter-D@users.noreply.github.com>
Co-authored-by: freeliuzc <lzc842650834@gmail.com>
Co-authored-by: RAM <gstian5555@outlook.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Yuanle Liu <yuanlehome@163.com>
Co-authored-by: lizexu123 <39205361+lizexu123@users.noreply.github.com>
Co-authored-by: 周周周 <39978853+zhoutianzi666@users.noreply.github.com>
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
Co-authored-by: bukejiyu <52310069+bukejiyu@users.noreply.github.com>

Reverts #5321