[Feat] Adding support to turn on/off engine deployment #311
39c117e to 639c847
Hey @dumb0002, thanks for your PR! This is great! Will take a look soon.
Hey @dumb0002, thanks again for the PR! 🙌
It appears that when enableEngine is set to true, the intended behavior is to skip or exclude all fields related to .servingEngineSpec. However, I noticed that files like secrets.yaml and the vllmApiKey field have not been modified accordingly. Could you please clarify whether these should also be skipped or if there's a specific reason they remain unchanged?
Btw when engine instances are already running, does launching the router pod ensure that it detects these existing engines through the service discovery mechanism (for example, via |
@Shaoting-Feng, yes, the router is able to detect the existing engines by setting the flag |
3c0c589 to c239ccf
@Shaoting-Feng, you're correct. The intended behavior is to skip or exclude all fields related to Now, if engine instances are already running and secured with the api key |
@YuhanLiu11, is this attached example log what you're looking for? Thanks!
FYI, this PR is failing one of the tests. I checked, and the failure seems unrelated to this change; please let me know if otherwise. The test logs show the following error: X Exiting due to MK_USAGE: loading profile: cluster "minikube" does not exist
Could you rebase (merge) the main branch into your branch? There are some updates to the CI workflow (PR #325). Thanks!
c239ccf to a689fef
Done. Thanks!
helm/values.yaml
Outdated
One final minor comment :): Can you change this "VLMM" to "vLLM"? Thanks!
Oh! That was a typo. Fixed now, thanks!
Signed-off-by: Braulio Dumba <brauliodumba@gmail.com>
11db788 to 586cc81
YuhanLiu11 left a comment
LGTM!
Signed-off-by: Braulio Dumba <brauliodumba@gmail.com> Signed-off-by: Luke Tux <lwtucker@uchicago.edu>
Signed-off-by: Braulio Dumba <brauliodumba@gmail.com> Signed-off-by: allytotheson <82621261+allytotheson@users.noreply.github.com>
This PR updates the Helm chart so that deployment of the vLLM engine can easily be turned on or off. This supports use cases where engine instances are already running and only the router needs to be deployed with the right configuration.
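As a rough sketch of how the toggle might be used, the values override below turns the engine deployment off so that only the router is installed. The `enableEngine` key and the `servingEngineSpec` section are the names mentioned in the review discussion; the exact key path and its default value are assumptions here, so check helm/values.yaml in the merged change before relying on it.

```yaml
# Hypothetical values override; assumed key path: servingEngineSpec.enableEngine
servingEngineSpec:
  # false: do not deploy the vLLM engine, deploy only the router
  # (for clusters where engine instances are already running).
  enableEngine: false
```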
BEFORE SUBMITTING, PLEASE READ THE CHECKLIST BELOW AND FILL IN THE DESCRIPTION ABOVE
- Sign off your commit by using -s when doing git commit.
- Classify the PR with a prefix such as [Bugfix], [Feat], or [CI].
Detailed Checklist
Thank you for your contribution to production-stack! Before submitting the pull request, please ensure the PR meets the following criteria. This helps us maintain the code quality and improve the efficiency of the review process.
PR Title and Classification
Please try to classify PRs for easy understanding of the type of changes. The PR title is prefixed appropriately to indicate the type of change. Please use one of the following:
- [Bugfix] for bug fixes.
- [CI/Build] for build or continuous integration improvements.
- [Doc] for documentation fixes and improvements.
- [Feat] for new features in the cluster (e.g., autoscaling, disaggregated prefill, etc.).
- [Router] for changes to the vllm_router (e.g., routing algorithm, router observability, etc.).
- [Misc] for PRs that do not fit the above categories. Please use this sparingly.
Note: If the PR spans more than one category, please include all relevant prefixes.
Code Quality
The PR needs to meet the following code quality standards:
- Use pre-commit to format your code. See README.md for installation.
DCO and Signed-off-by
When contributing changes to this project, you must agree to the DCO. Commits must include a Signed-off-by: header, which certifies agreement with the terms of the DCO.
Using -s with git commit will automatically add this header.
What to Expect for the Reviews
We aim to address all PRs in a timely manner. If no one reviews your PR within 5 days, please @-mention one of YuhanLiu11, Shaoting-Feng or ApostaC.