Skip to content

server: use random media marker#21962

Merged
ngxson merged 4 commits intoggml-org:masterfrom
ngxson:xsn/server_random_media_marker
Apr 15, 2026
Merged

server: use random media marker#21962
ngxson merged 4 commits intoggml-org:masterfrom
ngxson:xsn/server_random_media_marker

Conversation

@ngxson
Copy link
Copy Markdown
Contributor

@ngxson ngxson commented Apr 15, 2026

Overview

Fix #21955

Generate a random media marker each time we launch the server. The string is random enough that collision is impossible to happen in practice

How random? 32 characters, 0-9a-zA-Z, making it 62^32 combinations. And according to math stackexchange:

image

Requirements

@ngxson ngxson requested a review from a team as a code owner April 15, 2026 17:36
@ngxson
Copy link
Copy Markdown
Contributor Author

ngxson commented Apr 15, 2026

asking for 2nd approval @ggml-org/maintainers

@ngxson
Copy link
Copy Markdown
Contributor Author

ngxson commented Apr 15, 2026

ping @CISC @pwilkin too if you can approve this

test result can be found in #21955

@ngxson ngxson merged commit 408225b into ggml-org:master Apr 15, 2026
47 of 50 checks passed
@CISC
Copy link
Copy Markdown
Member

CISC commented Apr 15, 2026

ServeurpersoCom added a commit to ServeurpersoCom/llama.cpp that referenced this pull request Apr 16, 2026
ServeurpersoCom added a commit to ServeurpersoCom/llama.cpp that referenced this pull request Apr 16, 2026
ServeurpersoCom added a commit to ServeurpersoCom/llama.cpp that referenced this pull request Apr 16, 2026
ServeurpersoCom added a commit to ServeurpersoCom/llama.cpp that referenced this pull request Apr 16, 2026
ggerganov pushed a commit that referenced this pull request Apr 16, 2026
…#21980)

* server: tests: fetch random media marker via /apply-template (#21962 fix)

* server: allow pinning media marker via LLAMA_MEDIA_MARKER env var

get_media_marker() checks LLAMA_MEDIA_MARKER at first call and uses it
as-is if set, falling back to the random marker otherwise.

Tests no longer need to fetch the marker dynamically via /apply-template:
the fixture sets LLAMA_MEDIA_MARKER=<__media__> so the hardcoded prompts
work as before.

Address review feedback from ngxson

* server: make get_media_marker() thread-safe via magic statics

Use a C++11 static local with a lambda initializer instead of a global
static with an empty-check. The runtime guarantees initialization exactly
once without explicit locking.

Address review feedback from ggerganov

* nits

* nits
cnsiva pushed a commit to saas-home/llama.cpp that referenced this pull request Apr 17, 2026
…g#21962) (ggml-org#21980)

* server: tests: fetch random media marker via /apply-template (ggml-org#21962 fix)

* server: allow pinning media marker via LLAMA_MEDIA_MARKER env var

get_media_marker() checks LLAMA_MEDIA_MARKER at first call and uses it
as-is if set, falling back to the random marker otherwise.

Tests no longer need to fetch the marker dynamically via /apply-template:
the fixture sets LLAMA_MEDIA_MARKER=<__media__> so the hardcoded prompts
work as before.

Address review feedback from ngxson

* server: make get_media_marker() thread-safe via magic statics

Use a C++11 static local with a lambda initializer instead of a global
static with an empty-check. The runtime guarantees initialization exactly
once without explicit locking.

Address review feedback from ggerganov

* nits

* nits
samuraieng pushed a commit to samuraieng/llama.cpp that referenced this pull request Apr 19, 2026
…g#21962) (ggml-org#21980)

* server: tests: fetch random media marker via /apply-template (ggml-org#21962 fix)

* server: allow pinning media marker via LLAMA_MEDIA_MARKER env var

get_media_marker() checks LLAMA_MEDIA_MARKER at first call and uses it
as-is if set, falling back to the random marker otherwise.

Tests no longer need to fetch the marker dynamically via /apply-template:
the fixture sets LLAMA_MEDIA_MARKER=<__media__> so the hardcoded prompts
work as before.

Address review feedback from ngxson

* server: make get_media_marker() thread-safe via magic statics

Use a C++11 static local with a lambda initializer instead of a global
static with an empty-check. The runtime guarantees initialization exactly
once without explicit locking.

Address review feedback from ggerganov

* nits

* nits
mudler added a commit to mudler/LocalAI that referenced this pull request Apr 19, 2026
CI turboquant docker build was failing with:

  grpc-server.cpp:2825:40: error: use of undeclared identifier
  'get_media_marker'

The call was added by 7809c5f (PR #9412) to propagate the mtmd random
per-server media marker upstream landed in ggml-org/llama.cpp#21962. The
TheTom/llama-cpp-turboquant fork branched before that PR, so its
server-common.cpp has no such symbol.

Extend patch-grpc-server.sh to substitute get_media_marker() with the
legacy "<__media__>" literal in the build-time grpc-server.cpp copy
under turboquant-<flavor>-build/. The fork's mtmd_default_marker()
returns exactly that string, and the Go layer falls back to the same
sentinel when media_marker is empty, so behavior on the turboquant path
is unchanged. Patched copy only — the shared source under
backend/cpp/llama-cpp/ keeps compiling against vanilla upstream.

Verified by running `make docker-build-turboquant` locally end-to-end:
all five flavors (avx, avx2, avx512, fallback, grpc+rpc-server) now
compile past the previous failure and the image tags successfully.
mudler added a commit to mudler/LocalAI that referenced this pull request Apr 19, 2026
…9423)

* fix(turboquant): drop ignore-eos patch, bump fork to b8967-627ebbc

The upstream PR #21203 (server: respect the ignore_eos flag) has been
merged into the TheTom/llama-cpp-turboquant feature/turboquant-kv-cache
branch. With the fix now in-tree, 0001-server-respect-the-ignore-eos-flag.patch
no longer applies (git apply sees its additions already present) and the
nightly turboquant bump fails.

Retire the patch and bump the pin to the first fork revision that carries
the merged fix (tag feature-turboquant-kv-cache-b8967-627ebbc). This matches
the contract in apply-patches.sh: drop patches once the fork catches up.

* fix(turboquant): patch out get_media_marker() call in grpc-server copy

CI turboquant docker build was failing with:

  grpc-server.cpp:2825:40: error: use of undeclared identifier
  'get_media_marker'

The call was added by 7809c5f (PR #9412) to propagate the mtmd random
per-server media marker upstream landed in ggml-org/llama.cpp#21962. The
TheTom/llama-cpp-turboquant fork branched before that PR, so its
server-common.cpp has no such symbol.

Extend patch-grpc-server.sh to substitute get_media_marker() with the
legacy "<__media__>" literal in the build-time grpc-server.cpp copy
under turboquant-<flavor>-build/. The fork's mtmd_default_marker()
returns exactly that string, and the Go layer falls back to the same
sentinel when media_marker is empty, so behavior on the turboquant path
is unchanged. Patched copy only — the shared source under
backend/cpp/llama-cpp/ keeps compiling against vanilla upstream.

Verified by running `make docker-build-turboquant` locally end-to-end:
all five flavors (avx, avx2, avx512, fallback, grpc+rpc-server) now
compile past the previous failure and the image tags successfully.
mengqin pushed a commit to mengqin/llama.cpp that referenced this pull request Apr 20, 2026
* server: use random media marker

* nits

* remove legacy <__image__> token

* revert special char in random
mengqin pushed a commit to mengqin/llama.cpp that referenced this pull request Apr 20, 2026
…g#21962) (ggml-org#21980)

* server: tests: fetch random media marker via /apply-template (ggml-org#21962 fix)

* server: allow pinning media marker via LLAMA_MEDIA_MARKER env var

get_media_marker() checks LLAMA_MEDIA_MARKER at first call and uses it
as-is if set, falling back to the random marker otherwise.

Tests no longer need to fetch the marker dynamically via /apply-template:
the fixture sets LLAMA_MEDIA_MARKER=<__media__> so the hardcoded prompts
work as before.

Address review feedback from ngxson

* server: make get_media_marker() thread-safe via magic statics

Use a C++11 static local with a lambda initializer instead of a global
static with an empty-check. The runtime guarantees initialization exactly
once without explicit locking.

Address review feedback from ggerganov

* nits

* nits
ArberSephirotheca pushed a commit to ArberSephirotheca/llama.cpp that referenced this pull request Apr 21, 2026
* server: use random media marker

* nits

* remove legacy <__image__> token

* revert special char in random
ArberSephirotheca pushed a commit to ArberSephirotheca/llama.cpp that referenced this pull request Apr 21, 2026
…g#21962) (ggml-org#21980)

* server: tests: fetch random media marker via /apply-template (ggml-org#21962 fix)

* server: allow pinning media marker via LLAMA_MEDIA_MARKER env var

get_media_marker() checks LLAMA_MEDIA_MARKER at first call and uses it
as-is if set, falling back to the random marker otherwise.

Tests no longer need to fetch the marker dynamically via /apply-template:
the fixture sets LLAMA_MEDIA_MARKER=<__media__> so the hardcoded prompts
work as before.

Address review feedback from ngxson

* server: make get_media_marker() thread-safe via magic statics

Use a C++11 static local with a lambda initializer instead of a global
static with an empty-check. The runtime guarantees initialization exactly
once without explicit locking.

Address review feedback from ggerganov

* nits

* nits
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Apr 23, 2026
* server: use random media marker

* nits

* remove legacy <__image__> token

* revert special char in random
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Apr 23, 2026
…g#21962) (ggml-org#21980)

* server: tests: fetch random media marker via /apply-template (ggml-org#21962 fix)

* server: allow pinning media marker via LLAMA_MEDIA_MARKER env var

get_media_marker() checks LLAMA_MEDIA_MARKER at first call and uses it
as-is if set, falling back to the random marker otherwise.

Tests no longer need to fetch the marker dynamically via /apply-template:
the fixture sets LLAMA_MEDIA_MARKER=<__media__> so the hardcoded prompts
work as before.

Address review feedback from ngxson

* server: make get_media_marker() thread-safe via magic statics

Use a C++11 static local with a lambda initializer instead of a global
static with an empty-check. The runtime guarantees initialization exactly
once without explicit locking.

Address review feedback from ggerganov

* nits

* nits
rsenthilkumar6 pushed a commit to rsenthilkumar6/llama.cpp that referenced this pull request May 1, 2026
* server: use random media marker

* nits

* remove legacy <__image__> token

* revert special char in random
rsenthilkumar6 pushed a commit to rsenthilkumar6/llama.cpp that referenced this pull request May 1, 2026
…g#21962) (ggml-org#21980)

* server: tests: fetch random media marker via /apply-template (ggml-org#21962 fix)

* server: allow pinning media marker via LLAMA_MEDIA_MARKER env var

get_media_marker() checks LLAMA_MEDIA_MARKER at first call and uses it
as-is if set, falling back to the random marker otherwise.

Tests no longer need to fetch the marker dynamically via /apply-template:
the fixture sets LLAMA_MEDIA_MARKER=<__media__> so the hardcoded prompts
work as before.

Address review feedback from ngxson

* server: make get_media_marker() thread-safe via magic statics

Use a C++11 static local with a lambda initializer instead of a global
static with an empty-check. The runtime guarantees initialization exactly
once without explicit locking.

Address review feedback from ggerganov

* nits

* nits
jimbothigpen pushed a commit to jimbothigpen/frankenturbo2 that referenced this pull request May 2, 2026
* server: use random media marker

* nits

* remove legacy <__image__> token

* revert special char in random
jimbothigpen pushed a commit to jimbothigpen/frankenturbo2 that referenced this pull request May 2, 2026
…g#21962) (ggml-org#21980)

* server: tests: fetch random media marker via /apply-template (ggml-org#21962 fix)

* server: allow pinning media marker via LLAMA_MEDIA_MARKER env var

get_media_marker() checks LLAMA_MEDIA_MARKER at first call and uses it
as-is if set, falling back to the random marker otherwise.

Tests no longer need to fetch the marker dynamically via /apply-template:
the fixture sets LLAMA_MEDIA_MARKER=<__media__> so the hardcoded prompts
work as before.

Address review feedback from ngxson

* server: make get_media_marker() thread-safe via magic statics

Use a C++11 static local with a lambda initializer instead of a global
static with an empty-check. The runtime guarantees initialization exactly
once without explicit locking.

Address review feedback from ggerganov

* nits

* nits
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Misc. bug: hardcoded media marker is not transparent for user input

3 participants