[BugFix] Add prefill restrictions for chunked_prefill+VL #2983

zeroRains · 2025-07-23T07:53:37Z

pcard-71500

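The VL and text-only execution flows have now been merged into worker_process, but the merge overlooked how VL and text-only prefill differ under chunked_prefill: VL can have only one batch in prefill at a time, whereas text-only can have several.

This PR adds the VL restriction to the execution flow. In a test with 1000 samples at a concurrency of 100, no errors were reported and accuracy was unaffected.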
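
The restriction described above can be illustrated with a minimal sketch. It is not the code from this PR: `Request`, `admit_prefills`, and the `is_multimodal` flag are hypothetical names chosen for illustration, and the sketch assumes that "only one batch in prefill" means no new prefill is admitted while another request is still prefilling.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Request:
    """Hypothetical request record; not a FastDeploy type."""
    request_id: str
    is_prefilling: bool = False


def admit_prefills(
    running: List[Request],
    waiting: List[Request],
    is_multimodal: bool,
) -> List[Request]:
    """Return the waiting requests allowed to start prefill in this step.

    Assumption: under chunked prefill, a VL (multimodal) model may have at
    most one request in prefill at a time; a text-only model has no such cap.
    """
    if not is_multimodal:
        # Text-only: every waiting request may start prefill.
        return list(waiting)
    if any(r.is_prefilling for r in running):
        # VL: a prefill is already in flight, so admit nothing this step.
        return []
    # VL: nothing is prefilling, so admit exactly one request.
    return list(waiting[:1])
```

With one request already prefilling, the VL path admits nothing, while the text-only path admits every waiting request.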

paddle-bot · 2025-07-23T07:53:42Z

Thanks for your contribution!

* [MTP Fix] Fix code and register cpp operators (PaddlePaddle#2965)
* fix rl config local rank (PaddlePaddle#2957)
* [FIX]fix rejection sampling when topp=0 using _SAMPLING_EPS (PaddlePaddle#2967)
* fix rejection sampling when topp=0
* fix
* [SOT] Add sot warmup (NVIDIA GPU Only) (PaddlePaddle#2929)
* add sot warmup
* fix code style
* change batch_size list
* add param to config
* rm free_list settings && set sot_warmup_sizes
* finish debug with dynamic dims by type annotations
* add profile_run guard
* rm sth useless
* support chunk_prefill in fa3
* 【Infer】Improve the performance block_wise_fp8 of triton_moe_backend (PaddlePaddle#2942)
* Update README.md
* Update README.md
* delete max-len (PaddlePaddle#2959)
* [CI] add codestyle_check action (PaddlePaddle#2972)
* [CI] add codestyle_check action
* [CI] Integrate codestyle check via pre-commit in GitHub Actions
* fix mtp bug in pd-split mode (PaddlePaddle#2970)
* [BugFix] Add prefill restrictions for chunked_prefill+VL (PaddlePaddle#2983)
* Fix performance degradation bug of custom_all_reduce (PaddlePaddle#2981)
* FA3 fix bug (PaddlePaddle#2987)
* polish code for prefill restrictions (PaddlePaddle#2991)
* [Feature] Support block scheduler v1 for FD (PaddlePaddle#2928)
* Support FD block scheduler v1
* Support FD block scheduler v1
* Support FD block scheduler v1
* Fix according to copilot review
* Fix according to review
* Remove is_dummy
* Fix bug when real_bsz=1
* Fix infer first token cost time

---------

Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>

* update (PaddlePaddle#2978)
* [Code Simplification] fix init_distributed_environment() (PaddlePaddle#2982)
* support c4 attn && fix cache
* fix chunk_prefill
* [benchmark] add quantization for benchmark yaml (PaddlePaddle#2995)
* [Fix] fix mm ep empty run (PaddlePaddle#2999)
* add ci reuse action (PaddlePaddle#2968)
* add ci reuse action
* fix code formatting
* update
* [Feature] multi-source download (PaddlePaddle#2986)
* multi-source download
* multi-source download
* huggingface download revision
* requirement
* style
* add revision arg
* test
* pre-commit
* [LLM] update function name (PaddlePaddle#2985)
* [LLM] update function name
* [BugFix] fix multinode deployment (PaddlePaddle#2977)
* Update benchmark tools (PaddlePaddle#3004)
* update benchmark tools
* update benchmark tools
* update flake8 version to support pre-commit in python3.12 (PaddlePaddle#3000)
* update flake8 version to support pre-commit in python3.12
* polish code
* [Feature] multi source download (PaddlePaddle#3005)
* multi-source download
* multi-source download
* huggingface download revision
* requirement
* style
* add revision arg
* test
* pre-commit
* Change default download
* change requirements.txt
* modify English Documentation
* documentation
* [GCU] Update to develop (PaddlePaddle#2988)
* [Model] Provide clearer error for missing KV cache quantization scales (PaddlePaddle#3007)
* [Feature] Support_eplb (PaddlePaddle#2997)
* [Feature] support_eplb
* [Feature] support_eplb
* [Fix] fix mm ep
* Update setup.py
* [feat] add disable_chat_template in chat api as a substitute for previous raw_request (PaddlePaddle#3023)
* [feat] add disable_chat_template in chat api as a substitute for previous raw_request
* [fix] pre-commit code check

---------

Co-authored-by: GoldPancake <56388518+Deleter-D@users.noreply.github.com>
Co-authored-by: gaoziyuan <88373061+gzy19990617@users.noreply.github.com>
Co-authored-by: Sunny-bot1 <68891411+Sunny-bot1@users.noreply.github.com>
Co-authored-by: Ryan <zihaohuang@aliyun.com>
Co-authored-by: lizhenyun01 <1500424927@qq.com>
Co-authored-by: chen <103103266+ckl117@users.noreply.github.com>
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
Co-authored-by: lizexu123 <39205361+lizexu123@users.noreply.github.com>
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com>
Co-authored-by: freeliuzc <lzc842650834@gmail.com>
Co-authored-by: Zero Rains <linjunlu@zerorains.top>
Co-authored-by: zhink <33270771+zhink@users.noreply.github.com>
Co-authored-by: chenjian <1435317881@qq.com>
Co-authored-by: bukejiyu <52310069+bukejiyu@users.noreply.github.com>
Co-authored-by: xiegegege <46314656+xiegegege@users.noreply.github.com>
Co-authored-by: xiaoxiaohehe001 <49090790+xiaoxiaohehe001@users.noreply.github.com>
Co-authored-by: YUNSHEN XIE <1084314248@qq.com>
Co-authored-by: Yzc216 <101054010+Yzc216@users.noreply.github.com>
Co-authored-by: ltd0924 <32387785+ltd0924@users.noreply.github.com>
Co-authored-by: Zhang Yulong <35552275+ZhangYulongg@users.noreply.github.com>
Co-authored-by: EnflameGCU <118410644+EnflameGCU@users.noreply.github.com>
Co-authored-by: littledgg <61149469+littledgg@users.noreply.github.com>
Co-authored-by: 李泳桦 <39643373+liyonghua0910@users.noreply.github.com>

Add prefill restrictions for chunked_prefill+VL

40b222f

zeroRains mentioned this pull request Jul 23, 2025

[Cherry-Pick]Add prefill restrictions for chunked_prefill+VL #2984

Merged

yuanlehome approved these changes Jul 23, 2025

yuanlehome merged commit 850c9d9 into PaddlePaddle:develop Jul 23, 2025
4 of 6 checks passed

yuanlehome changed the title ~~Add prefill restrictions for chunked_prefill+VL~~ [BugFix] Add prefill restrictions for chunked_prefill+VL Jul 23, 2025

zeroRains deleted the vl+chunked branch July 23, 2025 11:38

zeroRains mentioned this pull request Jul 23, 2025

polish code for prefill restrictions #2991

Merged
