llama-server: fix duplicate HTTP headers in multiple models mode #17698
Merged
ngxson merged 2 commits into ggml-org:master on Dec 3, 2025
Note: I first tried an is_proxied flag approach, but it required more code, with logic split between modules. Filtering at source is simpler.
ngxson
reviewed
Dec 2, 2025
looks good overall! would appreciate it if you can address some small comments
I just want to confirm that this PR solves my issue, nginx errors are gone and
- restrict scope of header after std::move
- simplify header check (remove unordered_set)
ngxson
approved these changes
Dec 2, 2025
Approach: Filter at source
This patch filters headers before forwarding them to avoid duplication.
Why headers get duplicated:
When the router proxies child process responses, both the router (via
set_default_headers) and the child send the same headers (Server,
Transfer-Encoding, Keep-Alive, CORS). The proxy was forwarding everything,
resulting in duplicates.
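To make the duplication concrete, here is a minimal sketch that is not taken from the patch. It assumes a header collection that allows repeated names (modelled with std::multimap); header_map, router_default_headers and proxy_without_filter are hypothetical names, and the header values are illustrative only.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for an HTTP header collection that permits repeated names.
using header_map = std::multimap<std::string, std::string>;

// Headers the router is assumed to set on every response (illustrative values).
inline header_map router_default_headers() {
    return {
        {"Server", "llama.cpp"},
        {"Access-Control-Allow-Origin", "*"},
    };
}

// Naive proxying: start from the router defaults, then copy every child
// header across unchanged. Any name present on both sides ends up twice.
inline header_map proxy_without_filter(const header_map & child) {
    header_map out = router_default_headers();
    out.insert(child.begin(), child.end());
    return out;
}
```

With a child response that also carries Server and Access-Control-Allow-Origin, each of those names appears twice in the result, which is the kind of response a strict reverse proxy may reject.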
Solution:
Skip headers that will be added by the router or httplib:
Handle Content-Type separately via msg_t.content_type to avoid duplication
when httplib calls set_chunked_content_provider() or set_content().
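The filtering idea can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the PR: should_skip_header, proxied_response and filter_child_headers are hypothetical names, and the skip list is inferred from the headers named in the description (Server, Transfer-Encoding, Keep-Alive, CORS, Content-Type). A plain chain of comparisons is used rather than a set, in the spirit of the review feedback.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <map>
#include <string>

// Hypothetical stand-in for an HTTP header collection that permits repeated names.
using header_map = std::multimap<std::string, std::string>;

// Lowercase a header name; HTTP header names are case-insensitive.
inline std::string to_lower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Hypothetical predicate: true when the router or the HTTP library is
// assumed to add this header itself, so the proxy must not forward it.
inline bool should_skip_header(const std::string & name) {
    const std::string n = to_lower(name);
    return n == "server"
        || n == "transfer-encoding"
        || n == "keep-alive"
        || n == "content-type"                  // carried in a separate field
        || n.rfind("access-control-", 0) == 0;  // CORS headers (prefix match)
}

// Result of filtering: forwardable headers plus the content type kept apart.
struct proxied_response {
    header_map  headers;
    std::string content_type;
};

// Filter the child's headers at the source, before they reach the router response.
inline proxied_response filter_child_headers(const header_map & child) {
    proxied_response res;
    for (const auto & kv : child) {
        if (to_lower(kv.first) == "content-type") {
            res.content_type = kv.second;  // handed over separately
            continue;
        }
        if (should_skip_header(kv.first)) {
            continue;
        }
        res.headers.insert(kv);
    }
    return res;
}
```

Keeping the content type in its own field lets the caller pass it to whichever call sets the response body, so it is written once rather than once as a forwarded header and once by the library.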
Tested with:
Before: duplicate Server, Transfer-Encoding, Keep-Alive, Access-Control-Allow-Origin, Content-Type
After: all headers appear exactly once
Fixes #17693