[Feature] Multimodal Scheduler V1 #3019
Thanks for your contribution!
// limitations under the License.

#include "paddle/extension.h"
#include <map>
If map is not used, this include can be dropped.
old_end_idx = request.num_computed_tokens
new_end_idx = old_end_idx + num_new_tokens
old/new: would prev/cur be easier to understand?
return {out};
}

PD_BUILD_OP(get_img_boundaries)
Does PD_BUILD_OP not go through pybind?
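This operator is used in the upper service layer and runs asynchronously with respect to the inference engine, so not using pybind has little impact.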
Update fix according to #3071
Background
Adapt Scheduler V1 for multimodality on the basis of #2928.
Perf
Performance is close to Scheduler V0.
Effect
Config kv_cache_ratio is deprecated, and the recovery stop issue will no longer occur in the multimodal service.
How to enable
Set the environment variable ENABLE_V1_KVCACHE_SCHEDULER to 1 to enable Scheduler V1.
export ENABLE_V1_KVCACHE_SCHEDULER=1
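When the service is launched from a Python script rather than a shell, the same switch can be set in the process environment before startup. This is a minimal sketch, assuming the flag is read from the environment when the service starts. The helper `scheduler_v1_enabled` is hypothetical and not part of this project, and the strict comparison against "1" is also an assumption, since the project may parse the value differently.

```python
import os


def scheduler_v1_enabled(env=None):
    """Hypothetical helper: True when ENABLE_V1_KVCACHE_SCHEDULER is exactly "1".

    `env` defaults to the current process environment; passing a dict
    makes the check easy to exercise without touching os.environ.
    """
    env = os.environ if env is None else env
    return env.get("ENABLE_V1_KVCACHE_SCHEDULER", "0") == "1"


# Equivalent of `export ENABLE_V1_KVCACHE_SCHEDULER=1` for this process
# and for any child processes started afterwards.
os.environ["ENABLE_V1_KVCACHE_SCHEDULER"] = "1"
print(scheduler_v1_enabled())
```

The variable has to be set before the service process starts (or before it is spawned as a child), because a change made afterwards is not visible to a process that is already running.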