mtmd: qwen3-asr wrong output#22343

Closed
cora4 wants to merge 2 commits into ggml-org:master from cora4:master
cora4 commented Apr 25, 2026

Overview

qwen3-asr output is wrong with llama.cpp

cora4 requested a review from a team as a code owner April 25, 2026 01:51

ngxson (Contributor) commented Apr 25, 2026

Closing this because you did not disclose AI usage, and this is not the intended way to fix the problem.

ngxson closed this Apr 25, 2026

ngxson (Contributor) commented Apr 25, 2026

The code comments are clearly written by AI

ngxson (Contributor) commented Apr 25, 2026

Your fix cannot be accepted as-is because it introduces a hack into the existing preprocessor; it is not clean and could potentially break other models.

The cgraph implementation is redundant because (1) you can use the 4th dim for batching and (2) you replaced the old build_vit with what is essentially an over-complicated equivalent of it.

But overall, I do not wish to proceed with contributions where AI is used but not properly disclosed.
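
For readers unfamiliar with point (1), the following is a minimal sketch of one possible reading of "use the 4th dim for batching": equally sized audio chunks are stacked along the outermost of four dimensions, so a single tensor (and a single graph) covers all of them. The `tensor4d` struct and the `stack_chunks` and `offset_of` helpers are hypothetical stand-ins written for this illustration; they are not ggml or mtmd APIs, and the sketch only assumes the ggml convention that `ne[0]` is the innermost dimension.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a contiguous 4-D float tensor; NOT the ggml API.
struct tensor4d {
    int64_t ne[4];           // elements per dimension, ne[0] is innermost
    std::vector<float> data; // contiguous storage
};

// Stack equally sized [ne0, ne1] chunks into one [ne0, ne1, 1, n_chunks] tensor,
// so that the 4th dimension (ne[3]) indexes the batch entry.
tensor4d stack_chunks(const std::vector<std::vector<float>> & chunks, int64_t ne0, int64_t ne1) {
    tensor4d t;
    t.ne[0] = ne0;
    t.ne[1] = ne1;
    t.ne[2] = 1;
    t.ne[3] = static_cast<int64_t>(chunks.size());
    t.data.reserve(static_cast<std::size_t>(ne0 * ne1) * chunks.size());
    for (const auto & c : chunks) {
        // every chunk must have the same shape to be batched this way
        assert(static_cast<int64_t>(c.size()) == ne0 * ne1);
        t.data.insert(t.data.end(), c.begin(), c.end());
    }
    return t;
}

// Flat offset of element (i0, i1) inside batch entry b (i2 is always 0 here).
std::size_t offset_of(const tensor4d & t, int64_t i0, int64_t i1, int64_t b) {
    assert(i0 >= 0 && i0 < t.ne[0]);
    assert(i1 >= 0 && i1 < t.ne[1]);
    assert(b  >= 0 && b  < t.ne[3]);
    return static_cast<std::size_t>(i0 + t.ne[0] * (i1 + t.ne[1] * t.ne[2] * b));
}
```

With two 3x2 chunks, `stack_chunks` yields a tensor whose `ne[3]` is 2, and `offset_of(t, 1, 1, 1)` addresses an element of the second chunk without any per-chunk graph.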

cora4 changed the title from "Fixes qwen3-asr" to "mtmd: qwen3-asr wrong output" Apr 25, 2026

cora4 (Author) commented Apr 25, 2026

qwen3a could be implemented as a separate mtmd_audio_preprocessor_qwen3a, but it was merged into the whisper preprocessor instead.

mtmd: qwen3 audio support (qwen3-omni and qwen3-asr) (#19441)

tools/mtmd/mtmd.cpp:
    // set preprocessor
    switch (proj) {
        case PROJECTOR_TYPE_QWEN2A:
        case PROJECTOR_TYPE_QWEN3A:
        case PROJECTOR_TYPE_QWEN25O:
            {
                // <|audio_bos|> ... (embeddings) ... <|audio_eos|>
                aud_beg = "<|audio_bos|>";
                aud_end = "<|audio_eos|>";
                audio_preproc = std::make_unique<mtmd_audio_preprocessor_whisper>(ctx_a);
            } break;
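
The separation described in this comment could look roughly like the sketch below, in which the Qwen3A case gets its own preprocessor instead of sharing the whisper one. This is an illustration only, not code from the PR or from llama.cpp: the enum, the `audio_preprocessor` base class, both derived classes, the `audio_setup` struct, and `select_audio_preprocessor` are hypothetical stand-ins, and the sketch assumes the `<|audio_bos|>` / `<|audio_eos|>` markers stay the same for all three projector types, as in the excerpt above.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-ins for illustration; these are NOT the real llama.cpp types.
enum projector_type {
    PROJECTOR_TYPE_QWEN2A,
    PROJECTOR_TYPE_QWEN3A,
    PROJECTOR_TYPE_QWEN25O,
};

struct audio_preprocessor {
    virtual ~audio_preprocessor() = default;
    virtual std::string name() const = 0;
};

// Stand-in for the shared whisper-style preprocessor.
struct audio_preprocessor_whisper : audio_preprocessor {
    std::string name() const override { return "whisper"; }
};

// Stand-in for a dedicated Qwen3 audio preprocessor (hypothetical).
struct audio_preprocessor_qwen3a : audio_preprocessor {
    std::string name() const override { return "qwen3a"; }
};

struct audio_setup {
    std::string aud_beg;
    std::string aud_end;
    std::unique_ptr<audio_preprocessor> preproc;
};

// Pick the preprocessor per projector type; only QWEN3A is split out.
audio_setup select_audio_preprocessor(projector_type proj) {
    audio_setup s;
    // <|audio_bos|> ... (embeddings) ... <|audio_eos|>
    s.aud_beg = "<|audio_bos|>";
    s.aud_end = "<|audio_eos|>";
    switch (proj) {
        case PROJECTOR_TYPE_QWEN3A:
            s.preproc = std::make_unique<audio_preprocessor_qwen3a>();
            break;
        case PROJECTOR_TYPE_QWEN2A:
        case PROJECTOR_TYPE_QWEN25O:
            s.preproc = std::make_unique<audio_preprocessor_whisper>();
            break;
    }
    assert(s.preproc != nullptr); // every handled type must select a preprocessor
    return s;
}
```

Keeping model-specific behaviour in its own class leaves the shared whisper path untouched for the other projector types, which is the concern raised earlier in the thread about breaking other models.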
