convert : add Llama4ForCausalLM #16042
Usually there's a
It's the case for all Llama 4 models, but traditionally the KQ norm is something arch-specific, so I didn't add GGUF metadata in the first place. But yes, I think we should add it now.
Hmm, but as this may affect existing GGUF files for the larger Llama 4 MoE, I think it will be a bit messy to add it as GGUF metadata. In any case, the rule is currently as follows: all Llama 4 models use KQ norm except for the biggest, 17B_128E. So I think we can keep it as-is.
Absolutely, just wondering if it's worth adding it for the future.
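For illustration, the rule discussed above could be sketched as a small helper during conversion. This is a hypothetical sketch, not the actual `convert_hf_to_gguf.py` code: the field names `use_qk_norm` and `num_local_experts` mirror Hugging Face config conventions but are assumptions here.

```python
# Sketch of the rule: all Llama 4 models use KQ norm except the
# largest 17B_128E variant. Field names are assumptions, not the
# exact llama.cpp conversion code.

def uses_kq_norm(hf_config: dict) -> bool:
    # Prefer an explicit flag if the HF config provides one.
    if "use_qk_norm" in hf_config:
        return bool(hf_config["use_qk_norm"])
    # Fall back to the size rule: 17B_128E is identifiable by its
    # 128 experts; every other Llama 4 variant uses KQ norm.
    return hf_config.get("num_local_experts", 0) != 128

# Example: the 16-expert Scout config would use KQ norm,
# while a 128-expert config would not.
print(uses_kq_norm({"num_local_experts": 16}))   # True
print(uses_kq_norm({"num_local_experts": 128}))  # False
```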
* convert : add Llama4ForCausalLM
* handle swa
* half working version
* fix use_kq_norm
* fix use_kq_norm
Fix #16021
Tested with:
Very important note from model card:
That is the reason why the model cannot respond to a simple "hi".