Convert: Make NVFP4 and MXFP4 HF conversions say NVFP4/MXFP4 instead of BF16 #20730
Merged
ggerganov merged 6 commits into ggml-org:master on Mar 21, 2026
CISC (Member) reviewed on Mar 18, 2026 and left a comment:
This simplifies things while also fixing MXFP4.
Add the following at llama.cpp/convert_hf_to_gguf.py, line 714 in 5d41b4c:
    quant_method = (self.hparams.get("quantization_config") or {}).get("quant_method")
and add the following at llama.cpp/convert_hf_to_gguf.py, line 730 in 5d41b4c:
    self._is_mxfp4 = quant_method == "mxfp4"
then update llama.cpp/convert_hf_to_gguf.py, lines 11126 to 11131 in 5d41b4c.
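The lookup suggested in this comment can be sketched as a standalone function. This is only an illustration under stated assumptions: the `quantization_config` and `quant_method` keys and the `"mxfp4"` value come from the comment itself, while the function name `detect_mxfp4` and the plain-dict `hparams` argument are hypothetical and not part of the converter.

```python
from typing import Any


def detect_mxfp4(hparams: dict[str, Any]) -> bool:
    """Return True when the HF config declares the mxfp4 quantization method.

    Hypothetical helper: the real script stores the result on the model
    object as self._is_mxfp4 instead of returning it.
    """
    # `or {}` covers both a missing key and an explicit null value,
    # so the chained .get() never runs on None.
    quant_method = (hparams.get("quantization_config") or {}).get("quant_method")
    return quant_method == "mxfp4"
```

The `or {}` fallback is what lets the same expression handle unquantized models, whose config may omit `quantization_config` entirely or set it to null.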
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
CISC approved these changes on Mar 21, 2026
ggerganov (Member) reviewed on Mar 21, 2026
Comment on lines 732 to +733:
    self._is_nvfp4 = quant_algo == "NVFP4"
    self._is_mxfp4 = quant_method == "mxfp4"
Should these be case insensitive?
Member
Not as far as I've been able to tell: quant_method seems to always be lower-case, while quant_algo (at least for NVFP4) is always upper-case.
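For comparison, the case-insensitive check raised in the question could look like the sketch below. The merged change does not do this; it keeps the exact comparisons, based on the observation above. The helper name `matches_ignoring_case` is hypothetical.

```python
from typing import Optional


def matches_ignoring_case(value: Optional[str], expected: str) -> bool:
    """Compare a config value to an expected name without regard to case.

    Hypothetical helper, not present in the converter. Non-string values
    (including None for a missing key) never match.
    """
    return isinstance(value, str) and value.casefold() == expected.casefold()
```

With this helper, both `"NVFP4"` and `"nvfp4"` would be accepted for `quant_algo`, at the cost of also accepting spellings that no known checkpoint uses.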
Seunghhon pushed a commit to Seunghhon/llama.cpp that referenced this pull request on Apr 26, 2026
rsenthilkumar6 pushed a commit to rsenthilkumar6/llama.cpp that referenced this pull request on May 1, 2026
Both commits carry the same message:
…of BF16 (ggml-org#20730)
* Corrected convert script for NVFP4 naming and updated gguf constants
* Add mostly_MXFP4 to FileType
  Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* Update convert_hf_to_gguf.py
  Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* Update convert_hf_to_gguf.py
  Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* simplify
* set initial value [no ci]
---------
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
When converting an NVFP4 or MXFP4 HF model to GGUF, the default behavior is to keep the same type name unless one is specified, so the converted models are still named BF16:
Before:
After:
This script edit identifies the file type correctly for both NVFP4 and MXFP4, and adds the missing MOSTLY_NVFP4 and MOSTLY_MXFP4_MOE entries to the file type table in the Python constants file.
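The naming decision described here can be sketched roughly as follows. This is an illustration, not the converter's code: the `FileType` enum is a stand-in for the table in the constants file with placeholder values, and `pick_file_type` is a hypothetical function that takes the two config strings as arguments instead of reading them from the model.

```python
from enum import Enum, auto
from typing import Optional


class FileType(Enum):
    # Stand-in for the file type table; the numeric values are placeholders
    # and do not match the real constants.
    MOSTLY_BF16 = auto()
    MOSTLY_NVFP4 = auto()
    MOSTLY_MXFP4_MOE = auto()


def pick_file_type(
    quant_algo: Optional[str],
    quant_method: Optional[str],
    default: FileType = FileType.MOSTLY_BF16,
) -> FileType:
    """Choose the file type label from the quantization metadata.

    Uses the same exact, case-sensitive comparisons as the reviewed lines.
    """
    if quant_algo == "NVFP4":
        return FileType.MOSTLY_NVFP4
    if quant_method == "mxfp4":
        return FileType.MOSTLY_MXFP4_MOE
    # No recognised quantization metadata: keep the default label.
    return default
```

For example, `pick_file_type("NVFP4", None)` yields `FileType.MOSTLY_NVFP4`, while a model with neither marker keeps the default label.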