
[Pipeline inference] support llama pipeline inference#4647

Merged
FoolPlayer merged 2 commits into hpcaitech:feature/pipeline-infer from
FoolPlayer:ppinfer-llama
Sep 7, 2023
Conversation

@FoolPlayer
Contributor

📌 Checklist before creating the PR

  • I have created an issue for this PR for traceability
  • The title follows the standard format: [doc/gemini/tensor/...]: A concise description
  • I have added relevant tags if possible for us to better distinguish different PRs

🚨 Issue number

Link this PR to your issue with words like fixed to automatically close the linked issue upon merge

e.g. fixed #1234, closed #1234, resolved #1234

📝 What does this PR do?

Summarize your work here.
If you have any plots/diagrams/screenshots/tables, please attach them here.
support llama pipeline inference
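As a rough illustration of what pipeline inference means here (this is a toy sketch, not the actual Colossal-AI API — `partition_layers` and `run_pipeline` are hypothetical names): the model's layers are partitioned across pipeline stages, and microbatches are streamed through the stages in order.

```python
# Minimal sketch of pipeline-style inference: the model's layers are
# partitioned into contiguous stage groups, and each microbatch flows
# through the stages in order. Purely illustrative, not Colossal-AI code.

def partition_layers(layers, num_stages):
    """Split a list of layer functions into contiguous stage groups."""
    per_stage = (len(layers) + num_stages - 1) // num_stages
    return [layers[i:i + per_stage] for i in range(0, len(layers), per_stage)]

def run_pipeline(stages, microbatches):
    """Push each microbatch through every stage in order."""
    outputs = []
    for mb in microbatches:
        x = mb
        for stage in stages:
            for layer in stage:
                x = layer(x)
        outputs.append(x)
    return outputs

if __name__ == "__main__":
    layers = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3, lambda x: x * 10]
    stages = partition_layers(layers, num_stages=2)
    print(run_pipeline(stages, [1, 2, 3]))  # [10, 30, 50]
```

In a real pipeline-parallel setup each stage lives on a different device and microbatches overlap in time across stages; the sketch above only shows the partitioning and dataflow, not the scheduling or communication.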

💥 Checklist before requesting a review

  • I have linked my PR to an issue (instruction)
  • My issue clearly describes the problem/feature/proposal, with diagrams/charts/table/code if possible
  • I have performed a self-review of my code
  • I have added thorough tests
  • I have added docstrings for all the functions/methods I implemented

⭐️ Do you enjoy contributing to Colossal-AI?

  • 🌝 Yes, I do.
  • 🌚 No, I don't.

Tell us more if you don't enjoy contributing to Colossal-AI.

Comment thread on colossalai/inference/pipeline/policy/llama_ppinfer.py (Outdated)
@github-actions
Contributor

github-actions Bot commented Sep 7, 2023

The code coverage for the changed files is 76%.

Click me to view the complete report
Name                                                            Stmts   Miss  Cover
-----------------------------------------------------------------------------------
colossalai/inference/__init__.py                                    2      0   100%
colossalai/inference/pipeline/__init__.py                           2      0   100%
colossalai/inference/pipeline/engine.py                            34      0   100%
colossalai/inference/pipeline/microbatch_manager.py               112      4    96%
colossalai/inference/pipeline/modeling/__init__.py                  0      0   100%
colossalai/inference/pipeline/modeling/gpt2.py                    124     43    65%
colossalai/inference/pipeline/modeling/llama.py                    91     91     0%
colossalai/inference/pipeline/policy/gpt2_ppinfer.py               43      5    88%
colossalai/inference/pipeline/utils.py                             15     10    33%
colossalai/pipeline/schedule/generate.py                           83      1    99%
colossalai/pipeline/stage_manager.py                               50      0   100%
tests/test_checkpoint_io/test_low_level_zero_checkpoint_io.py      48      1    98%
tests/test_generate/test_pipeline_infer.py                         43      1    98%
-----------------------------------------------------------------------------------
TOTAL                                                             647    156    76%

@FoolPlayer FoolPlayer merged commit 9abce92 into hpcaitech:feature/pipeline-infer Sep 7, 2023
@github-actions
github-actions Bot commented Sep 7, 2023

The code coverage for the changed files is 76%. (Same report as above.)
@FoolPlayer FoolPlayer deleted the ppinfer-llama branch September 7, 2023 11:10
FoolPlayer added a commit that referenced this pull request Sep 27, 2023
* support llama pipeline inference

* remove tie weight operation
FoolPlayer added a commit that referenced this pull request Oct 11, 2023
* [pipeline inference] pipeline inference (#4492)

* add pp stage manager as circle stage

* fix a bug when create process group

* add ppinfer basic framework

* add micro batch manager and support kvcache-pp gpt2 fwd

* add generate schedule

* use mb size to control mb number

* support generate with kv cache

* add output, remove unused code

* add test

* reuse shardformer to build model

* refactor some code and use the same attribute name of hf

* fix review and add test for generation

* remove unused file

* fix CI

* add cache clear

* fix code error

* fix typo

* [Pipeline inference] Modify to tieweight (#4599)

* add pp stage manager as circle stage

* fix a bug when create process group

* add ppinfer basic framework

* add micro batch manager and support kvcache-pp gpt2 fwd

* add generate schedule

* use mb size to control mb number

* support generate with kv cache

* add output, remove unused code

* add test

* reuse shardformer to build model

* refactor some code and use the same attribute name of hf

* fix review and add test for generation

* remove unused file

* modify the way of saving newtokens

* modify to tieweight

* modify test

* remove unused file

* solve review

* add docstring

* [Pipeline inference] support llama pipeline inference (#4647)

* support llama pipeline inference

* remove tie weight operation

* [pipeline inference] Fix the blocking of communication when ppsize is 2 (#4708)

* add benchmark verbose

* fix export tokens

* fix benchmark verbose

* add P2POp style to do p2p communication

* modify schedule as p2p type when ppsize is 2

* remove unused code and add docstring

* [Pipeline inference] Refactor code, add docsting, fix bug (#4790)

* add benchmark script

* update argparse

* fix fp16 load

* refactor code style

* add docstring

* polish code

* fix test bug

* [Pipeline inference] Add pipeline inference docs (#4817)

* add readme doc

* add a ico

* Add performance

* update table of contents

* refactor code (#4873)
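The commit log above mentions a micro batch manager that supports generation with a KV cache. As a loose illustration of that idea (a hypothetical toy, not the actual `MicroBatchManager` in colossalai): each in-flight microbatch keeps a cache of its past context so a decode step only needs to process the newest token.

```python
# Toy sketch of per-microbatch caching during autoregressive decoding.
# Real implementations cache attention keys/values per layer; here we just
# record each microbatch's past tokens to show the lifecycle (step/clear).
# Purely illustrative, not Colossal-AI code.

class ToyKVCacheManager:
    def __init__(self):
        self.cache = {}  # microbatch id -> list of previously seen tokens

    def step(self, mb_id, new_token):
        """Append one decoded token to the microbatch's cache and return
        the full context the next decode step would attend over."""
        past = self.cache.setdefault(mb_id, [])
        past.append(new_token)
        return list(past)

    def clear(self, mb_id):
        """Drop a finished microbatch's cache to free memory."""
        self.cache.pop(mb_id, None)

if __name__ == "__main__":
    mgr = ToyKVCacheManager()
    mgr.step(0, "The")
    mgr.step(0, "llama")
    print(mgr.step(0, "runs"))  # ['The', 'llama', 'runs']
    mgr.clear(0)
    print(mgr.cache)  # {}
```

Clearing a microbatch's cache when its sequence finishes matters in practice ("add cache clear" in the log above), since stale caches would otherwise accumulate GPU memory across generation requests.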
flybird11111 pushed a commit to flybird11111/ColossalAI that referenced this pull request Oct 18, 2023
…h#4820)
(Commit body identical to the squash commit above.)

Labels: none. Projects: none. 2 participants.