[Cherry-Pick][BugFix] Cp fix c8 bug(#5544) #5545
* update enable chunked_prefill * update code * update code * update code
…ddle#3794) Co-authored-by: Jiang-Jia-Jun <jiangjiajun@baidu.com>
* Update config.py * Update ep.py * Update fused_moe_backend_base.py * Update dynamic_weight_manager.py * Update worker_process.py * fix ci
* Update serving_chat.py * Update serving_completion.py * Update serving_completion.py
…) (PaddlePaddle#3804) * defer import of Config * support chunked_prefill * support chunked_prefill
…ePaddle#3810) * speed up eb45 * update
* fix scheduler bug * fix
* add moe noaux_tc tactics in triton backend * fix * add dp config
* Update no_proxy environment variable in CI workflow * Install lsof and kill api_server processes Install lsof tool and kill processes using it.
…se (PaddlePaddle#3855) * Update no_proxy environment variable in CI workflow * Install lsof and kill api_server processes Install lsof tool and kill processes using it. * Update dependency versions for stable release * Update CI script to use stable dependencies
…le#3771) (PaddlePaddle#3835) * fix w4afp8 * add centralized configuration * codestyle * fix fa3 append attn
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com>
* Support for async processor added. * remove yappi code
* [Feature] Set scheduler v1 as default * [Feature] Set scheduler v1 as default * [Feature] Set scheduler v1 as default * [Feature] Set scheduler v1 as default * [Feature] Set scheduler v1 as default * [Feature] Set scheduler v1 as default
* fix scheduler bug * fix * Update api_server.py
* add reasoning parser plugin * fix finish reason --------- Co-authored-by: Yuanle Liu <yuanlehome@163.com>
* [DEBUG] Adapt validation for paddleformers==0.2 in release/2.2 * [CI] update paddleformers==0.2 in release/2.2
* disable scheduler v1 in guided decoding * disable scheduler v1 in guided decoding
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com> Co-authored-by: gaoziyuan <88373061+gzy19990617@users.noreply.github.com>
…addlePaddle#5051) Co-authored-by: liqinrui <liqinrui@baidu.com>
* add async download * update code * fix bug * update code * update code * fix bugs * update code --------- Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
Co-authored-by: liqinrui <liqinrui@baidu.com>
…51112(PaddlePaddle#5…" (PaddlePaddle#5099) This reverts commit 59eeb9e.
* mm prefix cache * add _revert_match_blocks * update code * update code * update code * fix bugs * add test case * fix bug * update code * update reserved_dec_block_ids
Remove detailed string representation from Request class.
* [CP][BugFix]Dev fix custom ar unstable result (PaddlePaddle#4437) * code check * revert delete * check * pre_commit
…path(PaddlePaddle#5205) (PaddlePaddle#5232) * merge code * fix Request CONFLICT * remove unuse unittest --------- Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com>
* c8 prefix caching * update code * update code * update cache trans * update code * update code
* fix skip_quant * fix
* support r3 * update * support tp>1&&ep>1 * support cudagraph padding * support all backends * replace env with options * modularize * update * Add RoutingStore and refine code * add routing replay config * add routing replay config * successfully run routing store * convert request id to rollout id * fix rollout config bug * unify code * use rollout_id to replace request_id in routing store * delete code --------- Co-authored-by: yuanlehome <yuanlehome@163.com>
…o feature/experimental_feature_20250908
…dle#5389) * remove close prefix cache * fix mtp+dy-c8+prefixcache bug
…o feature/experimental_feature_20250908
Motivation
Fix a type-mismatch bug that occurs during dy-c8 cache transfer.
Cherry-picked from #5544.
Modifications
Usage or Command
Accuracy Tests
Checklist
[FDConfig], [APIServer], [Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]
… pre-commit before commit.
… release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.