
completion : fix prompt cache for recurrent models #19045

Merged
ggerganov merged 1 commit into master from gg/completion-fix-prompt-cache on Jan 25, 2026
Conversation

@ggerganov (Member)

fix #19041

Recurrent memory does not support llama_memory_seq_rm(), so perform the removal only when it is actually necessary.
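
Below is a minimal sketch of that guard, assuming llama.cpp's C API (llama_get_memory() and llama_memory_seq_rm()); the helper name reuse_prompt_cache and the n_match/n_cached bookkeeping are illustrative, not the actual PR diff:

```cpp
#include "llama.h"

// Illustrative helper (not from the PR): reuse a cached prefix of
// n_match tokens for sequence seq_id, out of n_cached tokens currently
// stored for it. Only trim the memory when there is actually something
// past the matching prefix: recurrent (SSM-style) memory cannot remove
// a partial token range, so an unconditional llama_memory_seq_rm()
// would fail there.
static bool reuse_prompt_cache(llama_context * ctx, llama_seq_id seq_id,
                               int32_t n_match, int32_t n_cached) {
    if (n_cached > n_match) {
        llama_memory_t mem = llama_get_memory(ctx);

        // llama_memory_seq_rm() returns false when a partial removal is
        // not supported; fall back to clearing the whole sequence so the
        // caller can re-evaluate the prompt from position 0.
        if (!llama_memory_seq_rm(mem, seq_id, n_match, -1)) {
            llama_memory_seq_rm(mem, seq_id, -1, -1); // full clear: always supported
            return false;
        }
    }

    // Cached prefix [0, n_match) is kept; nothing was removed needlessly.
    return true;
}
```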


@danbev (Member) left a comment


Verified locally that this fixes #19041.

ggerganov merged commit 080b161 into master on Jan 25, 2026
75 of 78 checks passed
ggerganov deleted the gg/completion-fix-prompt-cache branch on January 25, 2026 07:12
shaofeiqi pushed a commit to qualcomm/llama.cpp that referenced this pull request Feb 6, 2026
Seunghhon pushed a commit to Seunghhon/llama.cpp that referenced this pull request Apr 26, 2026

Development

Successfully merging this pull request may close this issue:

Eval bug: lfm2 1.2B: GGML_ASSERT(cell.has_seq_id(seq_id)) when reusing --prompt-cache with llama-completion b7802 (aarch64)

2 participants