Transformers compatibility #115
To test against the currently open transformers PR:
Signed-off-by: Max Jeblick <maximilianjeblick@gmail.com>
* Add DuoAttentionPress
* Fix tests and compression_ratio
* Address feedback
* Update plot
* Update version
Signed-off-by: Max Jeblick <maximilianjeblick@gmail.com>
Signed-off-by: SimJeg <sjegou@nvidia.com> Co-authored-by: giulio98 <corallo.giulio@yahoo.it> Co-authored-by: miriam-16 <miriam.lamari2@gmail.com> Co-authored-by: FaureElia <s283469@studenti.polito.it> Co-authored-by: YuhuiXu <yuhuixu1993@126.com> Co-authored-by: win10 <doss72180@gmail.com> Signed-off-by: Max Jeblick <maximilianjeblick@gmail.com>
Signed-off-by: Emmanuel Ferdman <emmanuelferdman@gmail.com> Signed-off-by: Max Jeblick <maximilianjeblick@gmail.com>
Also note that we now compute the flash attention varlen kwargs early in generate, to avoid expensive recomputation on each attention pass. This has been the case since huggingface/transformers#39474 and huggingface/transformers#40002. Unfortunately, this means kvpress might cause an out-of-bounds access, since the condition […] Have a look at 9986c31 as a possible fix. This script tests for such a failure:
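Separately from that script, the failure mode can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions, not code from transformers or kvpress: the helper names `cumulative_seq_lens` and `stale_offsets`, the sequence lengths, and the "keep half" compression are all hypothetical. It only shows why offsets computed once, before the cache is pruned, can point past the end of a shorter packed buffer.

```python
from itertools import accumulate


def cumulative_seq_lens(seq_lens):
    """Return varlen-style offsets [0, l1, l1 + l2, ...] for packed sequences."""
    return [0, *accumulate(seq_lens)]


def stale_offsets(cu_seqlens, packed_len):
    """Return the offsets that point past the end of a packed buffer."""
    return [offset for offset in cu_seqlens if offset > packed_len]


# Hypothetical batch: two prompts of 6 and 4 tokens.
seq_lens = [6, 4]
# Offsets computed once, up front, before any pruning happens.
cu_seqlens = cumulative_seq_lens(seq_lens)

# Hypothetical compression: keep half of each sequence's KV entries.
compressed_lens = [n // 2 for n in seq_lens]
packed_kv_len = sum(compressed_lens)

# The early offsets no longer fit the pruned buffer.
out_of_bounds = stale_offsets(cu_seqlens, packed_kv_len)

# Recomputing the offsets from the pruned lengths restores consistency.
fresh_cu_seqlens = cumulative_seq_lens(compressed_lens)
```

In this toy model, any kernel that trusts the early offsets would index beyond the pruned buffer, while offsets recomputed after pruning stay within bounds. Whether recomputation or a different guard is the right fix in practice depends on the actual change in 9986c31.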
Thanks a lot for the comments @manueldeprada, this is really appreciated!
Signed-off-by: alessiodevoto <devoto.alessio@gmail.com>
/ok to test f37b768
Thanks a lot @maxjeblick for handling this, and @Jack-Yu-815 for the review! |
PR description
Updates kvpress to upcoming transformers version 4.56.0.
The following issues/PRs need to be fixed before merging this PR:
- [Flash Attention] Fix flash attention integration huggingface/transformers#40002

PR was tested on huggingface/transformers#40002.
TODOS:
Checklist
- … (`make test`)
- … (`make style`, on errors try fix with `make format`)
- … `git commit -s`
- `mypress_press.py` is in the `presses` directory
- `MyPress` is in `__init__.py`
- `README.md` is updated with a 1 liner about the new press in the Available presses section
- … `default_presses` list in `tests/default_presses.py`