[infer]Add and optimize vllm Continous Batching and Pageattention#4755

Merged
isky-cd merged 10 commits into hpcaitech:feature/vllm-continupus-batching from isky-cd:vllm_continous_batching
Sep 22, 2023

Conversation

@isky-cd
Contributor

@isky-cd isky-cd commented Sep 19, 2023

📌 Checklist before creating the PR

  • I have created an issue for this PR for traceability
  • The title follows the standard format: [doc/gemini/tensor/...]: A concise description
  • I have added relevant tags if possible for us to better distinguish different PRs

🚨 Issue number

Link this PR to your issue with words like fixed to automatically close the linked issue upon merge

e.g. fixed #1234, closed #1234, resolved #1234

📝 What does this PR do?

Summarize your work here.
If you have any plots/diagrams/screenshots/tables, please attach them here.

💥 Checklist before requesting a review

  • I have linked my PR to an issue (instruction)
  • My issue clearly describes the problem/feature/proposal, with diagrams/charts/table/code if possible
  • I have performed a self-review of my code
  • I have added thorough tests
  • I have added docstrings for all the functions/methods I implemented

⭐️ Do you enjoy contributing to Colossal-AI?

  • 🌝 Yes, I do.
  • 🌚 No, I don't.

Tell us more if you don't enjoy contributing to Colossal-AI.

@isky-cd isky-cd marked this pull request as ready for review September 20, 2023 10:30
Comment thread colossalai/inference/tensor_parallel/utils.py Outdated
Comment thread colossalai/inference/tensor_parallel/engine.py Outdated
Comment thread colossalai/inference/tensor_parallel/engine.py Outdated
Comment thread colossalai/inference/tensor_parallel/engine.py Outdated
Comment thread colossalai/inference/tensor_parallel/utils.py Outdated
Comment thread colossalai/inference/tensor_parallel/utils.py Outdated
Comment thread colossalai/inference/tensor_parallel/engine.py Outdated
Comment thread colossalai/inference/tensor_parallel/engine.py
Comment thread colossalai/inference/tensor_parallel/engine.py Outdated
Comment thread examples/inference/test_continous_batching.py
Comment thread examples/inference/bench_llama_continous_batching.py
Comment thread tests/test_infer/test_bloom_infer.py
Comment thread tests/test_infer/test_infer_engine.py
Comment thread tests/test_infer/test_llama_infer.py
@github-actions
Contributor

The code coverage for the changed files is 33%.

Click me to view the complete report
Name                                             Stmts   Miss  Cover
--------------------------------------------------------------------
colossalai/inference/tensor_parallel/engine.py     203    165    19%
colossalai/inference/tensor_parallel/utils.py       36     26    28%
tests/test_infer/test_bloom_infer.py                43     16    63%
tests/test_infer/test_infer_engine.py               64     35    45%
tests/test_infer/test_llama_infer.py                63     33    48%
--------------------------------------------------------------------
TOTAL                                              409    275    33%
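The Cover column above is simply (Stmts - Miss) / Stmts, rounded to a whole percent; the totals can be reproduced from the per-file rows:

```python
# Reproduce the coverage report: Cover = (Stmts - Miss) / Stmts.
# Figures are copied from the table above.
files = {
    "colossalai/inference/tensor_parallel/engine.py": (203, 165),
    "colossalai/inference/tensor_parallel/utils.py": (36, 26),
    "tests/test_infer/test_bloom_infer.py": (43, 16),
    "tests/test_infer/test_infer_engine.py": (64, 35),
    "tests/test_infer/test_llama_infer.py": (63, 33),
}
total_stmts = sum(s for s, _ in files.values())
total_miss = sum(m for _, m in files.values())
total_cover = round(100 * (total_stmts - total_miss) / total_stmts)
```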

@isky-cd isky-cd merged commit 78158cc into hpcaitech:feature/vllm-continupus-batching Sep 22, 2023
@isky-cd isky-cd changed the title [infer]Add Continous batching and pageattention [infer]Add vllm Continous batching and pageattention Oct 16, 2023
@isky-cd isky-cd changed the title [infer]Add vllm Continous batching and pageattention [infer]Add and optimize vllm Continous batching and pageattention Oct 16, 2023
@isky-cd isky-cd changed the title [infer]Add and optimize vllm Continous batching and pageattention [infer]Add and optimize vllm Continous Batching and Pageattention Oct 16, 2023


4 participants