Conversation

@blob42 blob42 commented Oct 22, 2025

The llama model loading function expects KV overrides to be terminated with an empty key (key[0] == 0). Previously, the kv_overrides vector was not being properly terminated, causing an assertion failure.
140: GGML_ASSERT(params.kv_overrides.back().key[0] == 0 && "KV overrides not terminated with empty key") failed

This commit ensures that after parsing all KV override strings, we add a final terminating entry with an empty key to satisfy the C-style array termination requirement. This fixes the assertion error and allows the model to load correctly with custom KV overrides.

  • Also included a reference to the usage of the overrides option in the advanced-usage section.
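
For illustration, a minimal sketch of the termination pattern described above, assuming the llama_model_kv_override struct from llama.h (this paraphrases the idea, not the exact diff):

    #include <vector>
    #include "llama.h"

    // After all override strings are parsed, append a zero-initialized
    // sentinel entry whose key begins with '\0'; the model loader treats
    // that entry as the end of the C-style array.
    static void terminate_kv_overrides(std::vector<llama_model_kv_override> & kv_overrides) {
        if (!kv_overrides.empty()) {
            llama_model_kv_override terminator{};
            terminator.key[0] = '\0'; // empty key satisfies the loader's GGML_ASSERT
            kv_overrides.push_back(terminator);
        }
    }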

Description

This PR fixes #6643 and relates to #5745

Notes for Reviewers

@mudler I tested these changes with qwen3moe and could change the number of experts. This also means an API option to set the number of experts on MoE models is now possible with the llama.cpp backend.
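
For reference, such an override uses llama.cpp's KEY=TYPE:VALUE string syntax; the GGUF key name below is an assumption for qwen3moe and should be checked against the model's actual metadata:

    // Hypothetical override string for the number of active experts on a
    // MoE model (key name assumed, not taken from this PR):
    const char * override_str = "qwen3moe.expert_used_count=int:8";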

Also, compiling the backend takes ~40 min on my rig. Is there an easy way to quickly recompile the gRPC server without rebuilding the whole llama backend?

Signed commits

  • [x] Yes, I signed my commits.

netlify bot commented Oct 22, 2025

Deploy Preview for localai ready!

Name Link
🔨 Latest commit 86e14f6
🔍 Latest deploy log https://app.netlify.com/projects/localai/deploys/68f9140c55aece0008c378e8
😎 Deploy Preview https://deploy-preview-6672--localai.netlify.app

@blob42 blob42 changed the title from "fix: properly terminate (llama.cpp) kv_overrides array with empty key + doc" to "fix: properly terminate llama.cpp kv_overrides array with empty key + updated doc" Oct 22, 2025
@blob42 blob42 marked this pull request as draft October 22, 2025 15:33
@blob42 blob42 marked this pull request as ready for review October 22, 2025 15:39
mudler commented Oct 22, 2025

@blob42 thank you for opening the PR and looking at this!

Looking at the llama.cpp code, I'm not sure how this is handled upstream. As far as I can see, they use the same approach to populating the kv_overrides as ours, but don't terminate with 0 explicitly:

https://github.com/ggml-org/llama.cpp/blob/9b9201f65a22c02cee8e300f58f480a588591227/common/arg.cpp#L2976

However, I hadn't tested it yet, so my implementation was a bit naive here and I eventually forgot to test this out (sorry!). I'll give your PR a try soon.

For testing, usually I go with:

❯ make backends/llama-cpp build && ./local-ai run --debug --address "0.0.0.0:8080"

This builds only the llama-cpp backend and the gRPC cache (once); subsequent builds will rebuild only the llama-cpp backend.

mudler commented Oct 22, 2025

Looking at the llama.cpp code, I'm not sure how this is handled upstream. As far as I can see, they use the same approach to populating the kv_overrides as ours, but don't terminate with 0 explicitly:

Ah, just for the record, found it here:

https://github.com/ggml-org/llama.cpp/blob/9b9201f65a22c02cee8e300f58f480a588591227/common/arg.cpp#L1432
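
For reference, the upstream termination at that line looks roughly like this (paraphrased, not a verbatim copy of arg.cpp):

    // After argument parsing, upstream appends one zero-initialized entry so
    // that kv_overrides.back().key[0] == 0 and the loader's assert passes.
    if (!params.kv_overrides.empty()) {
        params.kv_overrides.emplace_back();
        params.kv_overrides.back().key[0] = 0;
    }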

Update:

@blob42 would you mind changing the approach of this PR to be closer to upstream? The rationale is to avoid diverging too much from the original implementation, which helps with maintenance long-term.

blob42 commented Oct 22, 2025

would you mind changing the approach of this PR to be closer to upstream? The rationale is to avoid diverging too much from the original implementation, which helps with maintenance long-term.

Sure, I will update the PR.

The llama model loading function expects KV overrides to be terminated
with an empty key (key[0] == 0). Previously, the kv_overrides vector was
not being properly terminated, causing an assertion failure.

This commit ensures that after parsing all KV override strings, we add a
final terminating entry with an empty key to satisfy the C-style array
termination requirement. This fixes the assertion error and allows the
model to load correctly with custom KV overrides.

Fixes mudler#6643

- Also included a reference to the usage of the `overrides` option in
  the advanced-usage section.

Signed-off-by: blob42 <contact@blob42.xyz>

blob42 commented Oct 22, 2025

@mudler should be good to go now. I copied the upstream approach verbatim and it works for me.

mudler commented Oct 23, 2025

@mudler should be good to go now. I copied the upstream approach verbatim and it works for me.

great, thanks!

@mudler mudler merged commit 32c0ab3 into mudler:master Oct 23, 2025
34 of 35 checks passed
@mudler mudler added the "bug" label Oct 23, 2025