mtmd: use causal attn for gemma 4 audio (+ small breaking change to mtmd) by ngxson · Pull Request #21824 · ggml-org/llama.cpp

ngxson · 2026-04-12T21:29:43Z

Overview

Continue #21421

Fix #21820

Fix #21816

For gemma 4, the text model only uses non-causal (aka bidirectional) attention for vision input.

Breaking change

mtmd_decode_use_non_causal now takes a second param, the current chunk.

The chunk can be nullptr, in which case the current chunk is assumed to be an image.

In the case of gemma 4:

Vision chunk requires non-causal attention
Audio chunk requires causal attention (same as text input)
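
The per-chunk rule above can be sketched as a small standalone function. This is only an illustration of the decision described in this PR, not the actual mtmd code: `sketch_chunk_type`, `sketch_chunk`, and `sketch_use_non_causal` are hypothetical stand-ins, and the sketch assumes a gemma 4 style model where only vision input is bidirectional.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration; these are NOT the real mtmd types.
enum class sketch_chunk_type { text, image, audio };

struct sketch_chunk {
    sketch_chunk_type type;
};

// Returns true when the chunk should be decoded with non-causal
// (bidirectional) attention, following the gemma 4 rule from the description:
//   - nullptr: treated as an image chunk (the documented default)
//   - image:   non-causal
//   - audio:   causal, same as text
bool sketch_use_non_causal(const sketch_chunk * chunk) {
    if (chunk == nullptr) {
        return true; // default: assume the current chunk is an image
    }
    return chunk->type == sketch_chunk_type::image;
}

// Small self-check of the rule, callable from any test harness.
void sketch_check_rules() {
    sketch_chunk image{sketch_chunk_type::image};
    sketch_chunk audio{sketch_chunk_type::audio};
    assert(sketch_use_non_causal(&image));
    assert(!sketch_use_non_causal(&audio));
}
```

A caller would presumably run this check once per chunk before decoding it, and switch the attention mode only when the result is true. Callers that previously passed no chunk keep the old image behaviour by passing nullptr.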

Requirements

I have read and agree with the contributing guidelines
AI usage disclosure: no

mtmd: use causal attn for gemma 4 audio

8895606

ngxson requested a review from a team as a code owner April 12, 2026 21:29

github-actions Bot added the examples label Apr 12, 2026

This was referenced Apr 12, 2026

mtmd: add Gemma 4 audio conformer encoder support #21421

Merged

Eval bug: Gemma4 E2B does not produce correct transcripts from audio #21820

Closed

Eval bug: Gemma4 Crash on mp3, wav uploaded in webui is skipped #21825

Closed

ngxson requested a review from a team April 12, 2026 22:44

CISC approved these changes Apr 13, 2026

danbev approved these changes Apr 13, 2026

ngxson merged commit 920b3e7 into ggml-org:master Apr 13, 2026
47 checks passed

cnsiva pushed a commit to saas-home/llama.cpp that referenced this pull request Apr 13, 2026

mtmd: use causal attn for gemma 4 audio (ggml-org#21824)

62421f6

HermestoAizales pushed a commit to HermestoAizales/llama.cpp that referenced this pull request Apr 13, 2026

mtmd: use causal attn for gemma 4 audio (ggml-org#21824)

6d8a7ca

ArberSephirotheca pushed a commit to ArberSephirotheca/llama.cpp that referenced this pull request Apr 21, 2026

mtmd: use causal attn for gemma 4 audio (ggml-org#21824)

d87ef8d

ngxson merged 1 commit into ggml-org:master from ngxson:xsn/g4_causal_audio
