Add gpu optimizations to base model #14

Merged
jlamypoirier merged 79 commits into main from fast_inference_base on Mar 2, 2023
Conversation

jlamypoirier (Collaborator) commented Feb 28, 2023

This allows KV cache pre-allocation and key-length padding to happen outside of the inference runner. With this, the inference runner becomes exclusively a CPU optimization (aside from small GPU gains from CUDA graphs).
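To make the idea concrete, here is a minimal sketch of what pre-allocating a KV cache and padding the key length can look like. The function names, tensor shapes, and padding multiple below are illustrative assumptions, not this PR's actual implementation:

```python
# Minimal sketch of KV cache pre-allocation and key-length padding, assuming a
# PyTorch-style decoder. All names and shapes here are illustrative, not this
# repo's API.
import torch

def preallocate_kv_cache(batch_size: int, num_heads: int, max_seq_len: int,
                         head_dim: int, device: str = "cpu",
                         dtype: torch.dtype = torch.float16):
    # Allocate the full cache once, up front, so each generation step only
    # writes into existing memory instead of reallocating and copying.
    shape = (batch_size, num_heads, max_seq_len, head_dim)
    key_cache = torch.empty(shape, device=device, dtype=dtype)
    value_cache = torch.empty(shape, device=device, dtype=dtype)
    return key_cache, value_cache

def pad_key_length(key_length: int, multiple: int = 128) -> int:
    # Round the attended key length up to a fixed multiple so attention
    # kernels see a small set of static shapes across decoding steps.
    return ((key_length + multiple - 1) // multiple) * multiple

# Usage: each decoding step writes its new key/value into a cache slice and
# attends over a padded prefix of stable length.
keys, values = preallocate_kv_cache(batch_size=1, num_heads=16,
                                    max_seq_len=2048, head_dim=64)
print(pad_key_length(300))  # -> 384
```

Keeping shapes static in this way is also what makes CUDA graph capture practical, since a captured graph replays with fixed tensor sizes and addresses.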

Separate PR for now because the inference runner needs to be adapted.

bigximik and others added 30 commits on August 31, 2022 04:41
* Added onnx config whisper

* added whisper support onnx

* add audio input data

* added whisper support onnx

* fixed the seqlength value

* Updated the whisper onnx config

* restore files to old version

* removed attention mask from inputs

* Updated get_dummy_input_onnxruntime docstring

* Updated relative imports and token generation

* update docstring
* Add ESMFold code sample

* sorry sylvain

* make fixup

* sorry sylvain again
Base automatically changed from fast_inference to main on March 2, 2023 19:26
jlamypoirier marked this pull request as ready for review on March 2, 2023 19:28
jlamypoirier merged commit 9c3c548 into main on Mar 2, 2023
jlamypoirier deleted the fast_inference_base branch on March 2, 2023 19:28