Convert: Make NVFP4 and MXFP4 HF conversions say NVFP4/MXFP4 instead of BF16#20730

Merged
ggerganov merged 6 commits into ggml-org:master from michaelw9999:fix-nvfp4-naming on Mar 21, 2026

Conversation

Contributor

@michaelw9999 michaelw9999 commented Mar 18, 2026

When converting an NVFP4 or MXFP4 HF model to GGUF, the default behavior when no output type is specified is to keep the existing file-type name, so the converted models are still labeled BF16:

Before:

| model             |
| ----------------- |
| qwen35 0.8B BF16  |

After:

| model             |
| ----------------- |
| qwen35 0.8B NVFP4 |

This change identifies the file type properly for both NVFP4 and MXFP4, and adds the missing `MOSTLY_NVFP4` and `MOSTLY_MXFP4_MOE` entries to the file-type table in `gguf-py/gguf/constants.py`.
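The detection described above can be sketched as follows. This is a minimal illustration, not the actual `convert_hf_to_gguf.py` code: the function name `detect_file_type` is hypothetical, and it returns a plain string rather than the real `LlamaFileType` enum. It assumes (per this PR's discussion) that NVFP4 checkpoints carry `quant_algo` with an upper-case value while MXFP4 checkpoints carry `quant_method` with a lower-case value.

```python
def detect_file_type(hparams: dict) -> str:
    # Hypothetical helper illustrating the PR's logic; not the real converter API.
    quant_config = hparams.get("quantization_config") or {}
    # NVFP4 checkpoints use "quant_algo" (observed upper-case),
    # MXFP4 checkpoints use "quant_method" (observed lower-case).
    if quant_config.get("quant_algo") == "NVFP4":
        return "NVFP4"
    if quant_config.get("quant_method") == "mxfp4":
        return "MXFP4"
    # Default when no recognized quantization config is present.
    return "BF16"
```

Without this kind of check, the converter falls through to the default branch and the GGUF metadata reports BF16.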

@michaelw9999 michaelw9999 requested a review from CISC as a code owner March 18, 2026 19:58
@michaelw9999 michaelw9999 changed the title Corrected convert script for NVFP4 naming and updated gguf constants Convert: Make NVFP4 HF conversions say NVFP4 instead of BF16 Mar 18, 2026
Member

@CISC CISC left a comment
This simplifies things while also fixing MXFP4.

Add `quant_method = (self.hparams.get("quantization_config") or {}).get("quant_method")` here:

```python
quant_algo = (self.hparams.get("quantization_config") or {}).get("quant_algo")
```

and add `self._is_mxfp4 = quant_method == "mxfp4"` here:

```python
self._is_nvfp4 = quant_algo == "NVFP4"
```

then update this:

llama.cpp/convert_hf_to_gguf.py (lines 11126 to 11131 in 5d41b4c):

```python
# TODO: remove once MXFP4 is supported more generally
def dequant_model(self):
    quant_config = self.hparams.get("quantization_config")
    if quant_config is not None and quant_config.get("quant_method") == "mxfp4":
        return
    return super().dequant_model()
```
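Taken together, the review suggestion reads both keys once during initialization and reuses the resulting flags, so `dequant_model` no longer re-parses `quantization_config`. A minimal sketch under those assumptions (the class here is a simplified stand-in for the converter's model class, and the `"dequantized"` return value is a placeholder for the real `super().dequant_model()` call):

```python
class ModelSketch:
    """Simplified stand-in showing only the quantization-detection logic."""

    def __init__(self, hparams: dict):
        self.hparams = hparams
        quant_config = self.hparams.get("quantization_config") or {}
        quant_algo = quant_config.get("quant_algo")
        quant_method = quant_config.get("quant_method")
        # Flags computed once, as suggested in the review.
        self._is_nvfp4 = quant_algo == "NVFP4"
        self._is_mxfp4 = quant_method == "mxfp4"

    def dequant_model(self):
        # Skip dequantization for MXFP4 using the precomputed flag
        # instead of re-reading quantization_config here.
        if self._is_mxfp4:
            return None
        return "dequantized"  # placeholder for super().dequant_model()
```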

michaelw9999 and others added 3 commits March 18, 2026 14:43
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
@michaelw9999 michaelw9999 changed the title Convert: Make NVFP4 HF conversions say NVFP4 instead of BF16 Convert: Make NVFP4 and MXFP4 HF conversions say NVFP4/MXFP4 instead of BF16 Mar 18, 2026
@github-actions github-actions Bot added the python python script changes label Mar 18, 2026
@CISC CISC requested a review from ggerganov March 21, 2026 10:55
Comment thread on convert_hf_to_gguf.py, lines 732 to +733:

```python
self._is_nvfp4 = quant_algo == "NVFP4"
self._is_mxfp4 = quant_method == "mxfp4"
```

Member
Should these be case insensitive?

Member

Not as far as I've been able to tell: `quant_method` seems to always be lower-case, while `quant_algo` (at least for NVFP4) is always upper-case.

@ggerganov ggerganov merged commit eac9c6e into ggml-org:master Mar 21, 2026
1 check passed
@michaelw9999 michaelw9999 deleted the fix-nvfp4-naming branch March 21, 2026 16:14
Seunghhon pushed a commit to Seunghhon/llama.cpp that referenced this pull request Apr 26, 2026
…of BF16 (ggml-org#20730)

* Corrected convert script for NVFP4 naming and updated gguf constants

* Add mostly_MXFP4 to FileType

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* Update convert_hf_to_gguf.py

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* Update convert_hf_to_gguf.py

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* simplify

* set initial value [no ci]

---------

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
rsenthilkumar6 pushed a commit to rsenthilkumar6/llama.cpp that referenced this pull request May 1, 2026
…of BF16 (ggml-org#20730)
Labels

python python script changes

3 participants