[Cherry-Pick]Add prefill restrictions for chunked_prefill+VL #2984
pcard-71500
Original PR: #2983
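The execution flows for VL and text-only models have been merged into worker_process, but the merge overlooked a difference in how the two perform prefill under chunked_prefill: VL can have only one batch in prefill at a time, whereas text-only can have several.
This PR adds the VL restriction to the execution flow. In a test with 1000 samples at a concurrency of 100, no errors were reported and accuracy showed no problems.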
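
A minimal sketch of the kind of restriction described above, not the code from this PR. The names `Request`, `select_prefill_requests`, `enable_mm`, `running_prefills`, and `max_prefills` are hypothetical and chosen only for illustration; the assumption is that the scheduler decides each step which pending requests may begin prefill.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Request:
    # Hypothetical request record; not a type from the actual code base.
    request_id: str
    needs_prefill: bool


def select_prefill_requests(
    pending: List[Request],
    running_prefills: int,
    enable_mm: bool,
    max_prefills: Optional[int] = None,
) -> List[Request]:
    """Return the pending requests allowed to start prefill in this step."""
    candidates = [r for r in pending if r.needs_prefill]
    if enable_mm:
        # VL path: only one prefill may be in flight at a time.
        if running_prefills > 0:
            return []
        return candidates[:1]
    # Text-only path: several prefills may run concurrently.
    if max_prefills is None:
        return candidates
    budget = max(max_prefills - running_prefills, 0)
    return candidates[:budget]
```

With three pending requests, the VL path in this sketch admits one when no prefill is running and none while one is still in progress, whereas the text-only path admits all three.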