
Fix kvzip having poor performance #152

Merged
maxjeblick merged 1 commit into main from max/fix_kvzip_2
Nov 10, 2025

Conversation

@maxjeblick
Collaborator

PR description

Probable fix of #147.

Root cause of the issue:

  • kwargs.get("past_key_values", None) returns DynamicCache(layers=[]).
  • bool(DynamicCache(layers=[])) evaluates to False, because bool() falls back to __len__, and len(DynamicCache(layers=[])) == len([]) == 0.
  • Thus, self._cache is set to None, and the cache is never used to perform kvzip compression.
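
The pitfall described in the bullets above can be reproduced without transformers. The following is a minimal sketch, not the actual kvzip code: FakeCache is a hypothetical stand-in for an empty DynamicCache (it defines __len__ but no __bool__), and both helper functions are illustrative names. It assumes the faulty lookup relied on a truthiness check when choosing between the two keyword names.

```python
class FakeCache:
    """Hypothetical stand-in for an empty DynamicCache: has __len__, no __bool__."""

    def __init__(self, layers=None):
        self.layers = layers if layers is not None else []

    def __len__(self):
        return len(self.layers)


def pick_cache_truthy(kwargs):
    # Assumed buggy pattern: `or` tests truthiness, so an empty cache
    # (len == 0) is treated like a missing one and the result is None.
    return kwargs.get("past_key_values") or kwargs.get("past_key_value")


def pick_cache_is_none(kwargs):
    # Identity check: fall back to the old name only if the new one is absent.
    cache = kwargs.get("past_key_values")
    if cache is None:
        cache = kwargs.get("past_key_value")
    return cache


empty = FakeCache()
kwargs = {"past_key_values": empty}

buggy = pick_cache_truthy(kwargs)   # None: the empty cache is silently dropped
fixed = pick_cache_is_none(kwargs)  # the same FakeCache object that was passed in
```

Comparing against None explicitly keeps an empty but valid cache object, which is the case that matters at the start of prefilling.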

Cause of this bug:
Transformers 4.54 changed the naming convention from past_key_value to past_key_values.
During a transition period, however, there was some back and forth on the naming convention (transformers changed its cache implementation a few times, see e.g. #104). I decided to make kvzip compatible with both naming conventions while working on #115 to allow for smoother development. As it turned out, my implementation was faulty.

Comments:
There are other parts of kvzip that could be refactored. In this PR, I only address the actual bug fix so that the change stays easy to review.

Checklist

Before submitting a PR, please make sure:

  • Tests are working (make test)
  • Code is formatted correctly (make style, on errors try fix with make format)
  • Copyright header is included
  • All commits are signed-off using git commit -s

Signed-off-by: Max Jeblick <maximilianjeblick@gmail.com>
@maxjeblick maxjeblick requested a review from SimJeg November 7, 2025 14:36
@copy-pr-bot

copy-pr-bot Bot commented Nov 7, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@maxjeblick
Collaborator Author

/ok to test 7520bdc

@maxjeblick
Collaborator Author

RULER 4k @ 50% compression ratio, for Qwen3-8B:

{
  "cwe": {
    "string_match": 99.1
  },
  "fwe": {
    "string_match": 95.73
  },
  "niah_multikey_1": {
    "string_match": 100.0
  },
  "niah_multikey_2": {
    "string_match": 100.0
  },
  "niah_multikey_3": {
    "string_match": 100.0
  },
  "niah_multiquery": {
    "string_match": 99.95
  },
  "niah_multivalue": {
    "string_match": 99.95
  },
  "niah_single_1": {
    "string_match": 100.0
  },
  "niah_single_2": {
    "string_match": 100.0
  },
  "niah_single_3": {
    "string_match": 100.0
  },
  "qa_1": {
    "string_match": 80.0
  },
  "qa_2": {
    "string_match": 62.8
  },
  "vt": {
    "string_match": 100.0
  }
}

@maxjeblick maxjeblick changed the title from "Fix kvzip having poor perofrmance" to "Fix kvzip having poor performance" Nov 9, 2025
@maxjeblick maxjeblick merged commit 6a4f005 into main Nov 10, 2025
3 of 4 checks passed
@maxjeblick maxjeblick deleted the max/fix_kvzip_2 branch November 10, 2025 09:18