Skip to content

Integrating CriticalKV with Existing Methods#46

Merged
SimJeg merged 11 commits intoNVIDIA:mainfrom
FFY0:criticalkv
Feb 12, 2025
Merged

Integrating CriticalKV with Existing Methods#46
SimJeg merged 11 commits intoNVIDIA:mainfrom
FFY0:criticalkv

Conversation

@FFY0
Copy link
Copy Markdown
Contributor

@FFY0 FFY0 commented Feb 11, 2025

PR description

Support for a new work: CriticalKV Openreview ArXiv.

Fixes #45

This method introduces a novel metric for KV cache entry selection from an output perturbation perspective, which can be seamlessly integrated to enhance existing techniques. When integrating SnapKV, Expected Attention, and Ada-KV, the preliminary results on the 4K ruler with Llama-3.1-8B are shown below.

image

FFY0 added 7 commits January 25, 2025 12:23
Signed-off-by: yuanfeng <yfung@mail.ustc.edu.cn>
Signed-off-by: yuanfeng <yfung@mail.ustc.edu.cn>
Signed-off-by: yuanfeng <yfung@mail.ustc.edu.cn>
@FFY0 FFY0 changed the title CriticalKV Integrating CriticalKV with Existing Methods Feb 11, 2025
@FFY0 FFY0 marked this pull request as draft February 11, 2025 14:54
Comment thread tests/presses/test_presses.py Outdated
@SimJeg
Copy link
Copy Markdown
Collaborator

SimJeg commented Feb 11, 2025

@FFY0 thanks for opening the PR, as you can see you have style errors. Please fix them (this can be done automatically if you have black or something like this). You can run flake8 locally to check you fixed the errors

Comment thread evaluation/evaluate.py Outdated
@SimJeg SimJeg self-assigned this Feb 11, 2025
@SimJeg
Copy link
Copy Markdown
Collaborator

SimJeg commented Feb 11, 2025

also, please open the PR for main, not to on my branch ^^

Signed-off-by: yuanfeng <yfung@mail.ustc.edu.cn>
@FFY0 FFY0 changed the base branch from simon/update-vnorm to main February 11, 2025 15:27
@FFY0
Copy link
Copy Markdown
Contributor Author

FFY0 commented Feb 11, 2025

also, please open the PR for main, not to on my branch ^^

LOL, OK!😂

@SimJeg
Copy link
Copy Markdown
Collaborator

SimJeg commented Feb 11, 2025

two remaining style errors:

./evaluation/evaluate.py:143:121: E501 line too long (129 > 120 characters)
./kvpress/presses/expected_attention_press.py:42:19: E275 missing whitespace after keyword

@FFY0
Copy link
Copy Markdown
Contributor Author

FFY0 commented Feb 11, 2025

./kvpress/presses/expected_attention_press.py:42:19: E275 missing whitespace after keyword

This error wasn’t detected by flake8 when I ran it locally. I checked the corresponding file, and line 42 is just a blank line, so a "missing whitespace" issue shouldn’t appear there. It seems a bit strange.

Comment thread tests/presses/test_presses.py Outdated
@SimJeg
Copy link
Copy Markdown
Collaborator

SimJeg commented Feb 11, 2025

@maxjeblick I don't understand why there are style errors in files @FFY0 did not change...

@SimJeg SimJeg mentioned this pull request Feb 11, 2025
@maxjeblick
Copy link
Copy Markdown
Collaborator

Style checks are currently failing, see #48
I will debug, and push a fix.

@SimJeg SimJeg marked this pull request as ready for review February 12, 2025 11:08
@maxjeblick
Copy link
Copy Markdown
Collaborator

@FFY0 Style errors have been fixed, you can merge main into your branch.

@SimJeg
Copy link
Copy Markdown
Collaborator

SimJeg commented Feb 12, 2025

@FFY0, @maxjeblick fixed the style issue, please merge with main and commit the changes. Don't forget to sign-off your commits.

@SimJeg SimJeg merged commit e172648 into NVIDIA:main Feb 12, 2025
maxjeblick pushed a commit that referenced this pull request Aug 12, 2025
Signed-off-by: Max Jeblick <maximilianjeblick@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Request for New Press Support: Integrating CriticalKV with Existing Methods

3 participants