server: use random media marker #21962
Merged
ngxson merged 4 commits into ggml-org:master on Apr 15, 2026
ServeurpersoCom approved these changes on Apr 15, 2026
Comment (Contributor, Author): asking for 2nd approval @ggml-org/maintainers
CISC approved these changes on Apr 15, 2026
Member
ServeurpersoCom added 4 commits to ServeurpersoCom/llama.cpp that referenced this pull request, Apr 16, 2026
ggerganov pushed a commit that referenced this pull request, Apr 16, 2026

…#21980)
* server: tests: fetch random media marker via /apply-template (#21962 fix)
* server: allow pinning media marker via LLAMA_MEDIA_MARKER env var

  get_media_marker() checks LLAMA_MEDIA_MARKER at first call and uses it as-is if set, falling back to the random marker otherwise. Tests no longer need to fetch the marker dynamically via /apply-template: the fixture sets LLAMA_MEDIA_MARKER=<__media__> so the hardcoded prompts work as before.

  Address review feedback from ngxson
* server: make get_media_marker() thread-safe via magic statics

  Use a C++11 static local with a lambda initializer instead of a global static with an empty-check. The runtime guarantees initialization exactly once without explicit locking.

  Address review feedback from ggerganov
* nits
* nits
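The behaviour described in that commit message (an environment override read once, a random fallback, and one-time initialization through a function-local static) could look roughly like the sketch below. This is an illustration under stated assumptions, not the code from the pull request: the helper `make_random_marker()`, the return type, the choice of `std::mt19937_64`, and the treatment of an empty `LLAMA_MEDIA_MARKER` as "unset" are all hypothetical. Only the names `get_media_marker()` and `LLAMA_MEDIA_MARKER` come from the commit message.

```cpp
#include <cstdlib>
#include <random>
#include <string>

// Hypothetical helper (not from the PR): builds a string of `len` characters
// drawn from 0-9a-zA-Z. std::mt19937_64 is not a cryptographic generator; it
// is used here only to keep the sketch short.
static std::string make_random_marker(size_t len = 32) {
    static const char charset[] =
        "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
    std::random_device rd;
    std::mt19937_64 gen(rd());
    // sizeof(charset) counts the trailing '\0', so the last valid index is sizeof - 2.
    std::uniform_int_distribution<size_t> pick(0, sizeof(charset) - 2);
    std::string out;
    out.reserve(len);
    for (size_t i = 0; i < len; ++i) {
        out += charset[pick(gen)];
    }
    return out;
}

// Sketch of the accessor: the function-local static is initialized exactly
// once, even if several threads make the first call concurrently (guaranteed
// since C++11), so no explicit lock is needed.
const std::string & get_media_marker() {
    static const std::string marker = [] {
        // Assumption: an empty variable is treated the same as an unset one.
        const char * pinned = std::getenv("LLAMA_MEDIA_MARKER");
        if (pinned != nullptr && pinned[0] != '\0') {
            return std::string(pinned);
        }
        return make_random_marker();
    }();
    return marker;
}
```

Because the environment variable is read only during that first call, changing it afterwards has no effect in this sketch. That matches the "checks LLAMA_MEDIA_MARKER at first call" wording, and it is why a test fixture would need to set the variable before the server process starts.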
cnsiva pushed a commit to saas-home/llama.cpp that referenced this pull request, Apr 17, 2026 (same …#21980) commit message as ggerganov's commit above)
samuraieng pushed a commit to samuraieng/llama.cpp that referenced this pull request, Apr 19, 2026 (same commit message)
mudler added a commit to mudler/LocalAI that referenced this pull request, Apr 19, 2026

CI turboquant docker build was failing with:

    grpc-server.cpp:2825:40: error: use of undeclared identifier 'get_media_marker'

The call was added by 7809c5f (PR #9412) to propagate the mtmd random per-server media marker that landed upstream in ggml-org/llama.cpp#21962. The TheTom/llama-cpp-turboquant fork branched before that PR, so its server-common.cpp has no such symbol.

Extend patch-grpc-server.sh to substitute get_media_marker() with the legacy "<__media__>" literal in the build-time grpc-server.cpp copy under turboquant-<flavor>-build/. The fork's mtmd_default_marker() returns exactly that string, and the Go layer falls back to the same sentinel when media_marker is empty, so behavior on the turboquant path is unchanged. Only the patched copy is affected: the shared source under backend/cpp/llama-cpp/ keeps compiling against vanilla upstream.

Verified by running `make docker-build-turboquant` locally end-to-end: all five flavors (avx, avx2, avx512, fallback, grpc+rpc-server) now compile past the previous failure and the image tags successfully.
mudler added a commit to mudler/LocalAI that referenced this pull request, Apr 19, 2026

…9423)
* fix(turboquant): drop ignore-eos patch, bump fork to b8967-627ebbc

  The upstream PR #21203 (server: respect the ignore_eos flag) has been merged into the TheTom/llama-cpp-turboquant feature/turboquant-kv-cache branch. With the fix now in-tree, 0001-server-respect-the-ignore-eos-flag.patch no longer applies (git apply sees its additions already present) and the nightly turboquant bump fails.

  Retire the patch and bump the pin to the first fork revision that carries the merged fix (tag feature-turboquant-kv-cache-b8967-627ebbc). This matches the contract in apply-patches.sh: drop patches once the fork catches up.
* fix(turboquant): patch out get_media_marker() call in grpc-server copy

  (Body identical to the message of mudler's previous commit above.)
The following users each pushed two commits that referenced this pull request: the commit for this pull request (message below) and the follow-up …#21980) commit whose message appears above under ggerganov's push.

* mengqin, to mengqin/llama.cpp, Apr 20, 2026
* ArberSephirotheca, to ArberSephirotheca/llama.cpp, Apr 21, 2026
* arthw, to arthw/llama.cpp, Apr 23, 2026
* rsenthilkumar6, to rsenthilkumar6/llama.cpp, May 1, 2026
* jimbothigpen, to jimbothigpen/frankenturbo2, May 2, 2026

Commit message for this pull request:
* server: use random media marker
* nits
* remove legacy <__image__> token
* revert special char in random
Overview
Fix #21955
Generate a random media marker each time the server is launched. The string is random enough that a collision is practically impossible.
How random? 32 characters drawn from 0-9a-zA-Z, giving 62^32 (roughly 2.3 × 10^57) combinations. And according to Math Stack Exchange:
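As a back-of-the-envelope check on the figures above (this is not the calculation the description cites, only a sketch), the size of the marker space and a simple upper bound on an accidental collision can be written out. The bound assumes each of the 32 characters is drawn uniformly and independently, and that the text is chosen without knowledge of the marker; the example length n = 10^6 is purely illustrative.

```latex
% Size of the marker space: 32 positions, 62 symbols each.
\[
62^{32} \;=\; 2^{32 \log_2 62} \;\approx\; 2^{190.5} \;\approx\; 2.27 \times 10^{57}
\]

% A fixed text of n >= 32 characters has at most n - 31 substrings of
% length 32, so by the union bound:
\[
\Pr\bigl[\text{marker occurs in the text}\bigr] \;\le\; \frac{n - 31}{62^{32}}
\]

% Illustrative value for a one-million-character prompt:
\[
n = 10^{6}: \qquad \frac{10^{6} - 31}{62^{32}} \;\approx\; 4.4 \times 10^{-52}
\]
```

Under those assumptions, even very long prompts have a negligible chance of containing the marker by accident, which is consistent with the "practically impossible" claim.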
Requirements