Conversation

Contributor

@yunfeng-scale yunfeng-scale commented Feb 28, 2024

Pull Request Summary

  1. Add batch inference to the LLM Engine guides.
  2. Cache the vLLM batch inference Docker image.
  3. Use lower GPU memory utilization to avoid random OOMs, and log a warning if there is abnormal GPU memory usage at startup.
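Point 3 is not shown in this excerpt. Below is a minimal, hypothetical sketch of the startup check it describes; the 5% threshold and the MiB figures are illustrative assumptions, not values from the PR (the knob vLLM itself exposes for the first half of the point is its gpu_memory_utilization engine argument).

```shell
# Hypothetical sketch of the startup check (threshold and numbers are
# illustrative, not taken from the PR): warn if GPU memory already in
# use at startup exceeds a small fraction of the total, which suggests
# a stale process is still holding the device.
check_startup_gpu_memory() {
    local used_mib=$1 total_mib=$2 threshold_pct=${3:-5}
    if [ $(( used_mib * 100 / total_mib )) -gt "$threshold_pct" ]; then
        echo "WARNING: ${used_mib}MiB of ${total_mib}MiB GPU memory in use at startup" >&2
        return 1
    fi
}

# In a real worker the numbers would come from nvidia-smi, e.g.:
#   nvidia-smi --query-gpu=memory.used,memory.total --format=csv,noheader,nounits
check_startup_gpu_memory 2048 24576 || echo "abnormal GPU memory usage detected"
```

With the illustrative inputs above, 2048 MiB of 24576 MiB is about 8%, which exceeds the 5% threshold and triggers the warning.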

Test Plan and Usage Guide

How did you validate that your PR works correctly? How do you run or demo the code? Provide enough detail so a reviewer can reasonably reproduce the testing procedure. Paste example command line invocations if applicable.

@yunfeng-scale yunfeng-scale requested a review from a team February 28, 2024 01:59

RUN apt-get update && \
-    apt-get install -y dumb-init && \
+    apt-get install -y dumb-init psmisc && \
Contributor

just curious, what does psmisc do?

Contributor Author

psmisc provides the fuser command, which identifies (and can signal) the processes holding a given file or device open.
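For context on the reply above, here is a self-contained demo of fuser. The GPU-device comment at the end is an assumption about why the PR wants it; the excerpt does not show the actual invocation.

```shell
# fuser comes from the psmisc package (apt-get install -y psmisc).
# Demo: open a file in a background process, then ask fuser who holds it.
tmp=$(mktemp)
tail -f "$tmp" > /dev/null 2>&1 &
sleep 0.2

fuser -v "$tmp"       # verbose: lists the tail process holding the file
fuser -k "$tmp"       # sends SIGKILL to every process holding it open

rm -f "$tmp"

# In a GPU serving context the same idea applies to device files, e.g.
# "fuser -k /dev/nvidia0" to clear stale processes before restarting a
# worker (illustrative; the PR excerpt does not show this usage).
```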

@yunfeng-scale yunfeng-scale merged commit 39ef7c4 into main Mar 2, 2024
@yunfeng-scale yunfeng-scale deleted the yunfeng-batch-infer-improv branch March 2, 2024 02:17
@yunfeng-scale yunfeng-scale mentioned this pull request Mar 6, 2024


3 participants