Add DuoAttentionPress #50

Merged
SimJeg merged 5 commits into main from simon/duo-attention-2
Feb 18, 2025

@SimJeg SimJeg commented Feb 17, 2025

Implementation of the DuoAttention paper (https://arxiv.org/abs/2410.10819). This implementation is not efficient because, like AdaKV, it relies on the attention patch for head-wise compression. A more efficient implementation is available here, but it is much harder to read, so this version was preferred.
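
For readers unfamiliar with head-wise compression, here is a minimal sketch of the idea as described in the DuoAttention paper: retrieval heads keep the full KV cache, while streaming heads keep only the first few (sink) tokens and the most recent tokens. This is plain Python for illustration only, not the code in this PR; the function names `duo_keep_mask` and `compression_ratio` and the parameters `sink_size` and `recent_size` are hypothetical.

```python
def duo_keep_mask(seq_len, is_streaming_head, sink_size, recent_size):
    """Return one boolean list per head: True where the KV pair is kept.

    Hypothetical helper, not part of kvpress.
    """
    masks = []
    for streaming in is_streaming_head:
        if not streaming:
            # Retrieval head: keep every position.
            masks.append([True] * seq_len)
            continue
        # Streaming head: keep only sink tokens and the recent window.
        masks.append([
            pos < sink_size or pos >= seq_len - recent_size
            for pos in range(seq_len)
        ])
    return masks


def compression_ratio(masks):
    """Fraction of KV pairs dropped across all heads."""
    total = sum(len(m) for m in masks)
    kept = sum(sum(m) for m in masks)
    return 1 - kept / total


# One retrieval head and one streaming head over 10 tokens.
masks = duo_keep_mask(10, [False, True], sink_size=2, recent_size=3)
ratio = compression_ratio(masks)
```

Because each head keeps a different number of KV pairs, the cache is no longer rectangular across heads, which is presumably why an implementation like this one needs to mask dropped entries inside attention rather than simply slicing the cache.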

Will report results before removing draft.

@SimJeg SimJeg marked this pull request as draft February 17, 2025 17:30
@SimJeg SimJeg changed the title Draft: Add DuoAttentionPress Add DuoAttentionPress Feb 17, 2025
@SimJeg SimJeg requested a review from maxjeblick February 18, 2025 07:54
@SimJeg SimJeg marked this pull request as ready for review February 18, 2025 07:55
Comment thread kvpress/presses/duo_attention_press.py

SimJeg commented Feb 18, 2025

Here is an extended version of the plot in the README:

[image: extended version of the README plot]

@maxjeblick maxjeblick left a comment

LGTM, thanks a lot!

@SimJeg SimJeg merged commit a94a78d into main Feb 18, 2025
@SimJeg SimJeg deleted the simon/duo-attention-2 branch February 18, 2025 16:50
maxjeblick pushed a commit that referenced this pull request Aug 12, 2025
* Add DuoAttentionPress

* Fix tests and compression_ratio

* Address feedback

* Update plot

* Update version

Signed-off-by: Max Jeblick <maximilianjeblick@gmail.com>