Name and Version
master
Operating systems
Linux
GGML backends
CUDA
Hardware
NA
Models
Voxtral
Problem description & steps to reproduce
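While tracing flaky Voxtral audio processing, I found that the mtmd prompt template does not emit the [BEGIN_AUDIO] tag for Voxtral. Inference mostly works without the tag, but the results are unreliable. Recommended fix: treat PROJECTOR_TYPE_VOXTRAL the same as PROJECTOR_TYPE_ULTRAVOX when selecting the audio begin marker: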
--- mtmd.cpp 2025-12-08 13:13:44.202285955 -0500
+++ mtmd.cpp.new 2025-12-08 13:13:29.850285270 -0500
@@ -330,10 +330,10 @@
             aud_beg = "<|audio_bos|>";
             aud_end = "<|audio_eos|>";
-        } else if (proj == PROJECTOR_TYPE_ULTRAVOX) {
+        } else if ((proj == PROJECTOR_TYPE_ULTRAVOX) ||
+                   (proj == PROJECTOR_TYPE_VOXTRAL)) {
             // [BEGIN_AUDIO] ... (embeddings) ...
             aud_beg = "[BEGIN_AUDIO]";
-
         }
     }
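To illustrate the intended effect of the patch, here is a minimal standalone sketch of the marker selection. It is not the actual mtmd.cpp code: the `ProjectorType` enum, `AudioMarkers` struct, and both functions are hypothetical stand-ins, and the condition guarding the `<|audio_bos|>` branch is an assumption because the diff context does not show it.

```cpp
#include <string>

// Hypothetical stand-ins for the projector types named in the patch.
// Qwen2A is an assumed name for the branch that uses <|audio_bos|>;
// the diff context does not show that condition.
enum class ProjectorType { Qwen2A, Ultravox, Voxtral, Other };

struct AudioMarkers {
    std::string beg; // text placed before the audio embeddings
    std::string end; // text placed after the audio embeddings
};

// Mirrors the branch structure after the proposed patch:
// Voxtral shares the [BEGIN_AUDIO] marker with Ultravox and has no end marker.
AudioMarkers audio_markers_for(ProjectorType proj) {
    AudioMarkers m;
    if (proj == ProjectorType::Qwen2A) {
        m.beg = "<|audio_bos|>";
        m.end = "<|audio_eos|>";
    } else if (proj == ProjectorType::Ultravox ||
               proj == ProjectorType::Voxtral) {
        m.beg = "[BEGIN_AUDIO]";
    }
    return m;
}

// Builds a textual view of the prompt around an audio chunk, with
// `placeholder` standing in for the position of the audio embeddings.
std::string wrap_audio(ProjectorType proj, const std::string & placeholder) {
    const AudioMarkers m = audio_markers_for(proj);
    return m.beg + placeholder + m.end;
}
```

With this sketch, `wrap_audio(ProjectorType::Voxtral, "<audio>")` yields `[BEGIN_AUDIO]<audio>`; before the patch, the equivalent Voxtral path would fall through with both markers empty.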
First Bad Commit
NA
Relevant log output