[WIP][R3] Support Full Async R3 and PrefixCache #6313
Closed
gongshaotian wants to merge 165 commits into PaddlePaddle:develop from gongshaotian:r3_prefix_2.4_opt
…addle#5408) * [RL] Support Rollout Routing Replay * add routing indices cache * fix config bug and moe forward bug * R3 Support GLM * support eb4.5 * fix merge bug * Apply suggestion from @Copilot * Apply suggestion from @Copilot * Apply suggestion from @Copilot * Apply suggestion from @Copilot * add routing replay ci * support glm topk * support orther top_k * fix ci bug * pre-commit * only support chatcmpl * Revert "Revert "[RL] Support Rollout Routing Replay (PaddlePaddle#5321)" (PaddlePaddle#5402)" This reverts commit c45e064. * Fix XPU and NPU bug --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Co-authored-by: Yuanle Liu <yuanlehome@163.com>
…dleOCR-VL (PaddlePaddle#5413) (PaddlePaddle#5414) * [BugFix] Fix some parameter place on CPU in PaddleOCR-VL * clean log * fix codestyle
…#5423) * fix bug * fix bug
…cess_group for RL (PaddlePaddle#5433) (PaddlePaddle#5434) * [fix] remove shutdown_process_group/restart_process_group for RL * [chore] remove log * [chore] remove log * [chore] set log to debug level
* [BugFix] fix instability after clearing weight * [chore] add todo
…Paddle#5492)(PaddlePaddle#5499) (PaddlePaddle#5498) * [BugFix] fix hung when n>1 and --enable-logprob (PaddlePaddle#5492) * check * check * check
…ing is done (PaddlePaddle#5527) (PaddlePaddle#5523) * [fix] fix ep loop * [fix] another try * [fix] again
…addlePaddle#5519) * fix dyname load bug * update * update
…ePaddle#5578) (PaddlePaddle#5583) * [CI] Remove test_metrics.py due to incompatible forced merge (PaddlePaddle#5578) * [CI] Adapt vl_model baseline changes due to Paddle update (PaddlePaddle#5576)
…dle#5468) * [RL] R3 support rdma store * refine code * refine notes * disable prefix cache * fix ci bug * support preempted task and put cpu tensor
…ddlePaddle#5568) (PaddlePaddle#5597) * fix mtp entropy drop in RL * optimize usage and fix unit test * optimize padding_sampling_params speed(vectorized)
…addlePaddle#5491) (PaddlePaddle#5617) * [liuzichang spend 10 dyas]fix write qknorm cache bug * fix 'fix cachekv bug''
…monitoring.(PaddlePaddle#5518) (PaddlePaddle#5614) * support spec metrics monitor per request
* [Model] tp+ep support v1_loader * fix * fix mtp_linear * fix mtp_linear * fix * fix * fix v0 loader * fix * Add get_tensor for EP * fix linear weight_loader * fix typo * fix
* Update download_dependencies.sh
…lash_mask_attn PaddlePaddle#6238 (PaddlePaddle#6232) * fash_mask_attn support mixed * enhance deep_ep and fix bug * update * fix
…addlePaddle#6193) * cherry pick * bug fix tool_calls (PaddlePaddle#6166) * fix image gen (PaddlePaddle#6175) * fix unit test
This reverts commit da9b356.
…#6096 (#…" (PaddlePaddle#6253) This reverts commit c424287.
…PaddlePaddle#6120) * fused put routing * fix bug * [draft commit]dynamic dtype * Updated to accommodate uint8 baseline changes * fix async put & numpy bug --------- Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com>
…ePaddle#6256) * support glm mtp rl model * update baseline
…itly to avoid pip cache issues (PaddlePaddle#6265)
Thanks for your contribution!
Motivation
Modifications
Usage or Command
Accuracy Tests
Checklist
* PR title tags: [FDConfig], [APIServer], [Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]
* Run pre-commit before commit.
* For the release branch: make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.