Override Transformers defaults by GGUF defaults by a4lg · Pull Request #42770 · huggingface/transformers

a4lg · 2025-12-10T12:45:12Z

What does this PR do?

For some models, GGUF uses default or fixed values that differ from the defaults in this library.
To integrate GGUF-based models without additional configuration (and/or an individual config.json), we need some kind of compatibility layer, similar to the tensor processors in src/transformers/modeling_gguf_pytorch_utils.py that are applied while loading/converting GGUF-based model tensors.

This commit adds a mapping that supplies GGUF-specific default values when initializing parameters in this library.
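
As a rough illustration of the idea (not the code added by this PR), the mapping can be pictured as a per-architecture table of defaults that is merged underneath whatever the GGUF file itself provides. The names `GGUF_DEFAULTS_BY_ARCH` and `apply_gguf_defaults`, the key `"qwen3_moe"`, and the precedence rule (values parsed from the file win over the table) are all assumptions made for this sketch.

```python
# Hypothetical sketch: none of these names are taken from the PR itself.
# Per-architecture defaults that GGUF assumes but this library does not.
GGUF_DEFAULTS_BY_ARCH = {
    "qwen3_moe": {"norm_topk_prob": True},
}


def apply_gguf_defaults(model_type, parsed_config):
    """Return a new config dict with GGUF-side defaults filled in.

    Assumption: a value explicitly parsed from the GGUF file takes
    precedence over the table entry for the same key.
    """
    merged = dict(GGUF_DEFAULTS_BY_ARCH.get(model_type, {}))
    merged.update(parsed_config)
    return merged


config = apply_gguf_defaults("qwen3_moe", {"num_experts": 128})
print(config)
```

Architectures without an entry in the table pass through unchanged, so a mapping of this shape only affects models that are known to need it.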

Currently, only the fixed norm_topk_prob value of Qwen3 MoE (True) is defined, because:

(a) it differs from the default value of this library (False), and
(b) if this parameter is set incorrectly, the output is almost completely garbled (see [Model][Quantization] Override HF defaults to GGUF ones (incl. Qwen3 MoE) vllm-project/vllm#30118 for examples).
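
To see why this flag matters, the following sketch assumes the usual top-k expert routing scheme, in which `norm_topk_prob` controls whether the weights of the selected experts are rescaled to sum to 1. The function `route` and the example logits are hypothetical, and real implementations operate on tensors rather than Python lists.

```python
import math


def route(logits, top_k, norm_topk_prob):
    """Pick the top_k experts for one token and return (indices, weights)."""
    # Softmax over all experts (shifted by the max for numerical stability).
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Keep the top_k most probable experts.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    top = order[:top_k]
    weights = [probs[i] for i in top]

    # With the flag set, rescale the kept weights so they sum to 1.
    if norm_topk_prob:
        kept = sum(weights)
        weights = [w / kept for w in weights]
    return top, weights


logits = [2.0, 1.0, 0.5, -1.0]  # made-up router scores for four experts
print(route(logits, top_k=2, norm_topk_prob=True))
print(route(logits, top_k=2, norm_topk_prob=False))
```

The same experts are selected either way; only the scale of their combined output changes. A model whose weights expect one scale but is run with the other sees a mismatched MoE contribution in every layer, which is consistent with the garbled output described above.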

I'm sure the Qwen3 MoE + GGUF issue needs to be addressed somehow, but I'm not sure whether this PR is the right way to fix it.
So, feel free to leave a review!

I could not check all parameters, but I did check the important parameters for the MoE models that have GGUF support in this library.

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline, Pull Request section?
Was this discussed/approved via a Github issue or the forum? Please add a link to it if that's the case.
Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@Cyrilvallez @SunMarc @MekkCyber

MekkCyber

Thanks for looking into this @a4lg !

In some models, GGUF uses default or fixed values different from this library. To integrate GGUF-based models without additional configuration, we need some kind of compatibility layer.

This commit provides additional mapping to provide GGUF-specific default values to initialize parameters in this library.

Currently, only fixed "norm_topk_prob" value of Qwen3 MoE (True) is defined because (a) it differs from the default value of this library (False) and (b) if this parameter is incorrectly set, it results in almost completely garbled output.

Signed-off-by: Tsukasa OI <floss_llm@irq.a4lg.com>

MekkCyber

Thank you for iterating

HuggingFaceDocBuilderDev · 2025-12-10T13:33:54Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

SunMarc

Thanks !

* Override Transformers defaults by GGUF defaults

In some models, GGUF uses default or fixed values different from this library. To integrate GGUF-based models without additional configuration, we need some kind of compatibility layer.

This commit provides additional mapping to provide GGUF-specific default values to initialize parameters in this library.

Currently, only fixed "norm_topk_prob" value of Qwen3 MoE (True) is defined because (a) it differs from the default value of this library (False) and (b) if this parameter is incorrectly set, it results in almost completely garbled output.

Signed-off-by: Tsukasa OI <floss_llm@irq.a4lg.com>

* Apply suggestions from code review

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>

---------

Signed-off-by: Tsukasa OI <floss_llm@irq.a4lg.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>

MekkCyber reviewed Dec 10, 2025

Comment thread src/transformers/integrations/ggml.py

a4lg force-pushed the gguf-config-override-defaults branch from 39a970b to cfd48f7 on December 10, 2025 13:13

a4lg force-pushed the gguf-config-override-defaults branch from cfd48f7 to 68d2175 on December 10, 2025 13:16

MekkCyber approved these changes Dec 10, 2025

Comment thread src/transformers/integrations/ggml.py

a4lg and others added 2 commits December 10, 2025 22:42

Apply suggestions from code review

21012b0

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>

Merge branch 'main' into gguf-config-override-defaults

ca3b023

SunMarc approved these changes Dec 10, 2025

SunMarc enabled auto-merge (squash) December 10, 2025 14:26

SunMarc merged commit 51a6673 into huggingface:main Dec 10, 2025
25 checks passed

a4lg mentioned this pull request Dec 13, 2025

[Model][Quantization] Override HF defaults to GGUF ones (incl. Qwen3 MoE) vllm-project/vllm#30118

Merged

5 tasks

lucaspirola mentioned this pull request Apr 27, 2026

[GGUF] Add support for Qwen3.5 MoE (qwen35moe arch) #45668

Open

5 tasks

SunMarc merged 3 commits into huggingface:main from a4lg:gguf-config-override-defaults

4 participants
