[Optimization] compute real max_logprobs in batch #5430
Thanks for your contribution!
self.top_p_normalized_logprobs = True
self.prompt_logprobs_reqs: dict[str, Request] = {}
self.in_progress_prompt_logprobs: dict[str, LogprobsTensors] = {}
self.forward_batch_reqs_list: list[Request] = [None for _ in range(self.scheduler_config.max_num_seqs)]
Please also clean this up in clear_requests.
logprobs = d.get("logprobs", None)
if logprobs is not None:
    if logprobs is True:
        sampling_params.logprobs = d.get("top_logprobs", None)
    elif logprobs is False:
        sampling_params.logprobs = None
Please simplify this.
- logprobs = d.get("logprobs", None)
- if logprobs is not None:
-     if logprobs is True:
-         sampling_params.logprobs = d.get("top_logprobs", None)
-     elif logprobs is False:
-         sampling_params.logprobs = None
+ logprobs = d.get("logprobs", None)
+ if logprobs:
+     sampling_params.logprobs = d.get("top_logprobs", None)
+ else:
+     sampling_params.logprobs = None
logprobs may be true, false, or an int value [-1, 0, 1, 2, ...]; the chat interface needs to map the bool type to a number or None.
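To illustrate the point about mixed bool and int values, here is a minimal sketch. The helper names `resolve_logprobs` and `simplified` are hypothetical and not part of the PR, and passing integer values through unchanged is an assumption about the intended behavior. It shows why identity checks (`is True` / `is False`) behave differently from a plain truthiness test when `logprobs` can be `0`.

```python
from typing import Any, Optional


def resolve_logprobs(d: dict[str, Any]) -> Optional[int]:
    """Hypothetical helper: map a request's `logprobs` field to an int or None."""
    logprobs = d.get("logprobs", None)
    if logprobs is True:
        # Chat-style request: the actual count comes from `top_logprobs`.
        return d.get("top_logprobs", None)
    if logprobs is False or logprobs is None:
        return None
    # Integer values (including 0 and -1) pass through unchanged (assumption).
    return logprobs


def simplified(d: dict[str, Any]) -> Optional[int]:
    """The truthiness-based variant: treats 0 like False and any int like True."""
    logprobs = d.get("logprobs", None)
    if logprobs:
        return d.get("top_logprobs", None)
    return None


chat_style = resolve_logprobs({"logprobs": True, "top_logprobs": 5})
int_zero = resolve_logprobs({"logprobs": 0})
int_zero_simplified = simplified({"logprobs": 0})
```

With the truthiness test, an integer `logprobs` of `0` is indistinguishable from `False`, and a positive integer is replaced by `top_logprobs`, which may be absent.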
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## develop #5430 +/- ##
==========================================
Coverage ? 59.59%
==========================================
Files ? 327
Lines ? 40666
Branches ? 6175
==========================================
Hits ? 24233
Misses ? 14555
Partials ? 1878
Motivation
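Compute logprobs using the actual maximum logprobs value requested by the requests in each batch, instead of always computing with the maximum of 20. This improves end-to-end performance by 10%.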
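The idea can be sketched roughly as follows. This is not the PR's implementation: the helper name `real_max_logprobs`, the use of `None` for requests or empty batch slots without logprobs, and the constant name are all assumptions for illustration, and special values such as -1 are left out of scope.

```python
from typing import Optional, Sequence

MAX_LOGPROBS_CAP = 20  # the fixed upper bound mentioned in the description


def real_max_logprobs(
    requested: Sequence[Optional[int]], cap: int = MAX_LOGPROBS_CAP
) -> int:
    """Hypothetical helper: largest top-k actually requested in this batch."""
    # None marks a request (or empty slot) that did not ask for logprobs.
    # Negative special values are ignored in this sketch.
    wanted = [k for k in requested if k is not None and k >= 0]
    if not wanted:
        return 0  # nobody asked for logprobs, so no top-k is needed
    return min(max(wanted), cap)


batch = [None, 1, 5, None]
k = real_max_logprobs(batch)
```

A batch whose largest request is 5 then only needs a top-5 computation rather than a top-20 one, which is where the saving would come from.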
Modifications
No changes.
Usage or Command
No changes.
Accuracy Tests
Already exist.
Checklist
[FDConfig], [APIServer], [Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]
pre-commit before commit.
release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.