
Conversation

@yunfeng-scale (Contributor) commented Oct 19, 2023

Hardcode the http forwarder to use 2 workers:

  1. we're only assigning 500m CPU to the forwarder
  2. given our traffic, I don't think we need more than 2

@yunfeng-scale yunfeng-scale requested a review from a team October 19, 2023 21:31
```diff
 {{- end }}
 - --num-workers
-- "${PER_WORKER}"
+- "${FORWARDER_WORKER_COUNT}"
```
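The rendered args above feed a `--num-workers` flag on the forwarder. A minimal sketch of how such a flag with the hardcoded value of 2 might be parsed (the function and help text here are illustrative, not taken from the repo):

```python
import argparse

def parse_forwarder_args(argv=None):
    # Hypothetical CLI sketch: --num-workers defaults to the hardcoded
    # value of 2 chosen in this PR (FORWARDER_WORKER_COUNT).
    parser = argparse.ArgumentParser(description="http forwarder (sketch)")
    parser.add_argument(
        "--num-workers",
        type=int,
        default=2,  # 500m CPU doesn't justify more worker processes
        help="number of HTTP forwarder worker processes",
    )
    return parser.parse_args(argv)
```

With this shape, the Helm template can pass either a templated value or the new hardcoded count without the forwarder code changing.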
@yunfeng-scale (Contributor, Author) commented:
We always run just one celery worker, and PER_WORKER is used to determine its concurrency: https://github.com/scaleapi/llm-engine/blob/main/model-engine/model_engine_server/inference/forwarding/celery_forwarder.py#L141
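A sketch of that arrangement, assuming PER_WORKER arrives as an environment variable (the actual wiring in celery_forwarder.py may differ):

```python
import os

def celery_worker_argv(env=None):
    # Sketch: a single celery worker whose --concurrency comes from
    # PER_WORKER; the worker process count itself stays at 1.
    env = env if env is not None else os.environ
    concurrency = int(env.get("PER_WORKER", "1"))
    return ["worker", "--concurrency", str(concurrency)]
```

So scaling PER_WORKER raises parallelism inside the one worker rather than adding worker processes.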

A reviewer (Contributor) commented:
IMO we could look into ways to increase the number of celery tasks in flight to raise async task throughput, which may involve increasing the number of workers; that's probably not necessary right now, though.
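As a rough capacity sketch of that tradeoff: with Celery's default prefetch behavior, the number of tasks a deployment will reserve scales with workers × concurrency × prefetch multiplier, so in-flight throughput can be raised on any of those axes (the numbers below are illustrative):

```python
def max_in_flight(workers: int, concurrency: int, prefetch_multiplier: int = 4) -> int:
    # Each worker reserves up to concurrency * prefetch_multiplier tasks;
    # 4 is Celery's default worker_prefetch_multiplier.
    return workers * concurrency * prefetch_multiplier

# One worker with PER_WORKER=2 concurrency:
print(max_in_flight(workers=1, concurrency=2))  # 8
# Doubling workers doubles the ceiling:
print(max_in_flight(workers=2, concurrency=2))  # 16
```

This is why raising concurrency (or the prefetch multiplier) can increase async throughput without adding worker processes.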

@yunfeng-scale yunfeng-scale enabled auto-merge (squash) October 20, 2023 00:19
@yunfeng-scale yunfeng-scale merged commit 1a3b5e0 into main Oct 20, 2023
@yunfeng-scale yunfeng-scale deleted the yunfeng-hardcode-forwarder-count branch October 20, 2023 00:43