
[kernel] add kernels for llama inference #4462

Closed
tiandiao123 wants to merge 32 commits into hpcaitech:main from tiandiao123:feature/llama-kernel

Conversation

Contributor

@tiandiao123 tiandiao123 commented Aug 17, 2023

📌 Checklist before creating the PR

  • I have created an issue for this PR for traceability
  • The title follows the standard format: [doc/gemini/tensor/...]: A concise description
  • I have added relevant tags if possible for us to better distinguish different PRs

🚨 Issue number

Link this PR to your issue with words like fixed to automatically close the linked issue upon merge

e.g. fixed #1234, closed #1234, resolved #1234

📝 What does this PR do?

This PR adds useful CUDA kernels for Llama inference.

💥 Checklist before requesting a review

  • I have linked my PR to an issue (instruction)
  • My issue clearly describes the problem/feature/proposal, with diagrams/charts/table/code if possible
  • I have performed a self-review of my code
  • I have added thorough tests.
  • I have added docstrings for all the functions/methods I implemented

⭐️ Do you enjoy contributing to Colossal-AI?

  • 🌝 Yes, I do.
  • 🌚 No, I don't.

Tell us more if you don't enjoy contributing to Colossal-AI.

Review comment on requirements/requirements.txt (Outdated)
Contributor

@kurisusnowdeng kurisusnowdeng left a comment


Any difference from the original code? For example, I quickly checked the pos_encoding & layernorm kernels, but they seem exactly the same as the code in vllm's repository.
If this is just integrating external code into our project, I recommend adding the external tool as a dependency and importing the necessary APIs for our needs.

@tiandiao123
Contributor Author

Any difference from the original code? For example, I quickly checked the pos_encoding & layernorm kernels, but they seem exactly the same as the code in vllm's repository. If this is just integrating external code into our project, I recommend adding the external tool as a dependency and importing the necessary APIs for our needs.

There are some minor changes. I only integrated the useful parts, which will be used in our llama-inference attention here.
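For context on the layernorm kernel discussed above: RMSNorm (as used in Llama) normalizes by the root mean square of the activations without mean-centering. Below is a minimal NumPy reference sketch of that computation; the function name and signature are illustrative, not the PR's actual API.

```python
import numpy as np

def rmsnorm_ref(x: np.ndarray, weight: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    """Reference RMSNorm over the last dimension (no mean subtraction)."""
    variance = np.mean(x * x, axis=-1, keepdims=True)  # mean of squares
    return x / np.sqrt(variance + eps) * weight

x = np.random.randn(2, 8).astype(np.float32)
w = np.ones(8, dtype=np.float32)
out = rmsnorm_ref(x, w)
```

A CUDA kernel for this typically assigns one thread block per row and reduces the sum of squares in shared memory; the NumPy version above is only the numerical ground truth a test like test_rmsnorm would compare against.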

@github-actions
Contributor

The code coverage for the changed files is %.

Complete coverage report:
Name                                               Stmts   Miss  Cover
----------------------------------------------------------------------
op_builder/__init__.py                                11      0   100%
op_builder/rmsnorm.py                                 21      9    57%
op_builder/rotary_embedding.py                        42     25    40%
setup.py                                              73     73     0%
tests/test_kernels/cuda/test_rmsnorm.py               42      4    90%
tests/test_kernels/cuda/test_rotary_embedding.py      75      4    95%
tests/test_kernels/triton/test_self_attention.py      74     56    24%
tests/test_kernels/triton/test_softmax.py             17      8    53%
----------------------------------------------------------------------
TOTAL                                                355    179    50%
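The rotary-embedding kernel exercised by test_rotary_embedding computes the position-dependent rotation below. This is a hedged NumPy sketch of the underlying math (GPT-NeoX-style, non-interleaved pairing); the function name and layout are assumptions for illustration, not the kernel's actual interface.

```python
import numpy as np

def rotary_embed_ref(x: np.ndarray, positions: np.ndarray, base: float = 10000.0) -> np.ndarray:
    """Apply rotary positional embedding to x of shape (seq_len, head_dim)."""
    seq_len, head_dim = x.shape
    half = head_dim // 2
    inv_freq = 1.0 / (base ** (np.arange(half) / half))   # per-channel frequency
    angles = positions[:, None] * inv_freq[None, :]       # (seq_len, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    # rotate each (x1, x2) channel pair by its position-dependent angle
    return np.concatenate([x1 * cos - x2 * sin, x1 * sin + x2 * cos], axis=-1)

x = np.random.randn(4, 8).astype(np.float32)
pos = np.arange(4)
out = rotary_embed_ref(x, pos)
```

Since each channel pair is rotated, position 0 is the identity and vector norms are preserved, which makes convenient invariants for a kernel test.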

@ver217 ver217 changed the title Feature/llama kernel [kernel] add kernels for llama inference Aug 17, 2023
@tiandiao123
Contributor Author

Added a llama shardformer demo; trying to directly import useful third-party ops. @kurisusnowdeng

@tiandiao123
Contributor Author

Use this PR instead: #4485 @kurisusnowdeng

@tiandiao123
Contributor Author

Temporarily closed this.
