
kv-cache : support layer reuse#15504

Merged
ggerganov merged 2 commits into master from gg/kv-cache-reuse-layers
Aug 24, 2025

Conversation

@ggerganov
Member

The logic for KV cache layer reuse was hacked together quickly for the Gemma-3n release. This PR refactors the implementation to provide more generic support for this functionality:

  • Introduce llama_memory_i::layer_reuse_cb, similar to the existing llama_memory_i::layer_filter_cb
  • Add bool hparams.has_kv(il)
  • Remove per-model special-casing in llama_kv_cache

@ggerganov ggerganov merged commit b730706 into master Aug 24, 2025
1 check passed
@ggerganov ggerganov deleted the gg/kv-cache-reuse-layers branch August 24, 2025 10:07
qnixsynapse pushed a commit to janhq/llama.cpp that referenced this pull request Aug 25, 2025
* kv-cache : support layer reuse

ggml-ci

* cont : update comments [no ci]
blime4 referenced this pull request in blime4/llama.cpp Feb 5, 2026
* kv-cache : support layer reuse

ggml-ci

* cont : update comments [no ci]
Seunghhon pushed a commit to Seunghhon/llama.cpp that referenced this pull request Apr 26, 2026
* kv-cache : support layer reuse

ggml-ci

* cont : update comments [no ci]
