Convert: Make NVFP4 and MXFP4 HF conversions say NVFP4/MXFP4 instead of BF16#20730

Merged
ggerganov merged 6 commits into ggml-org:master from michaelw9999:fix-nvfp4-naming on Mar 21, 2026

Conversation

Contributor

@michaelw9999 michaelw9999 commented Mar 18, 2026

When converting an NVFP4 or MXFP4 HF model to GGUF, the default behavior when no output type is specified is to keep the existing file-type name, so the converted models are still labeled BF16:

Before:

| model             |
| ----------------- |
| qwen35 0.8B BF16  |

After:

| model             |
| ----------------- |
| qwen35 0.8B NVFP4 |

This change identifies the file type properly for both NVFP4 and MXFP4, and adds the missing `MOSTLY_NVFP4` and `MOSTLY_MXFP4_MOE` entries to the file-type table in `gguf-py/gguf/constants.py`.
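The detection described above can be sketched as follows. This is a minimal illustration, not the actual `convert_hf_to_gguf.py` code: the function name `detect_file_type` is hypothetical, and it returns a plain string rather than the real `LlamaFileType` enum. It assumes (per this PR's discussion) that NVFP4 checkpoints carry `quant_algo` with an upper-case value while MXFP4 checkpoints carry `quant_method` with a lower-case value.

```python
def detect_file_type(hparams: dict) -> str:
    # Hypothetical helper illustrating the PR's logic; not the real converter API.
    quant_config = hparams.get("quantization_config") or {}
    # NVFP4 checkpoints use "quant_algo" (observed upper-case),
    # MXFP4 checkpoints use "quant_method" (observed lower-case).
    if quant_config.get("quant_algo") == "NVFP4":
        return "NVFP4"
    if quant_config.get("quant_method") == "mxfp4":
        return "MXFP4"
    # Default when no recognized quantization config is present.
    return "BF16"
```

Without this kind of check, the converter falls through to the default branch and the GGUF metadata reports BF16.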

@michaelw9999 michaelw9999 requested a review from CISC as a code owner March 18, 2026 19:58
@michaelw9999 michaelw9999 changed the title Corrected convert script for NVFP4 naming and updated gguf constants Convert: Make NVFP4 HF conversions say NVFP4 instead of BF16 Mar 18, 2026
Member

@CISC CISC left a comment
This simplifies things while also fixing MXFP4.

Add `quant_method = (self.hparams.get("quantization_config") or {}).get("quant_method")` here:

```python
quant_algo = (self.hparams.get("quantization_config") or {}).get("quant_algo")
```

and add `self._is_mxfp4 = quant_method == "mxfp4"` here:

```python
self._is_nvfp4 = quant_algo == "NVFP4"
```

then update this:

llama.cpp/convert_hf_to_gguf.py (lines 11126 to 11131 in 5d41b4c):

```python
# TODO: remove once MXFP4 is supported more generally
def dequant_model(self):
    quant_config = self.hparams.get("quantization_config")
    if quant_config is not None and quant_config.get("quant_method") == "mxfp4":
        return
    return super().dequant_model()
```
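Taken together, the review suggestion reads both keys once during initialization and reuses the resulting flags, so `dequant_model` no longer re-parses `quantization_config`. A minimal sketch under those assumptions (the class here is a simplified stand-in for the converter's model class, and the `"dequantized"` return value is a placeholder for the real `super().dequant_model()` call):

```python
class ModelSketch:
    """Simplified stand-in showing only the quantization-detection logic."""

    def __init__(self, hparams: dict):
        self.hparams = hparams
        quant_config = self.hparams.get("quantization_config") or {}
        quant_algo = quant_config.get("quant_algo")
        quant_method = quant_config.get("quant_method")
        # Flags computed once, as suggested in the review.
        self._is_nvfp4 = quant_algo == "NVFP4"
        self._is_mxfp4 = quant_method == "mxfp4"

    def dequant_model(self):
        # Skip dequantization for MXFP4 using the precomputed flag
        # instead of re-reading quantization_config here.
        if self._is_mxfp4:
            return None
        return "dequantized"  # placeholder for super().dequant_model()
```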

michaelw9999 and others added 3 commits March 18, 2026 14:43
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
@michaelw9999 michaelw9999 changed the title Convert: Make NVFP4 HF conversions say NVFP4 instead of BF16 Convert: Make NVFP4 and MXFP4 HF conversions say NVFP4/MXFP4 instead of BF16 Mar 18, 2026
@github-actions github-actions Bot added the python python script changes label Mar 18, 2026
@CISC CISC requested a review from ggerganov March 21, 2026 10:55
Comment thread on convert_hf_to_gguf.py, lines 732 to +733:

```python
self._is_nvfp4 = quant_algo == "NVFP4"
self._is_mxfp4 = quant_method == "mxfp4"
```

Member
Should these be case insensitive?

Member

Not as far as I've been able to tell: `quant_method` seems to always be lower-case, while `quant_algo` (at least for NVFP4) is always upper-case.

@ggerganov ggerganov merged commit eac9c6e into ggml-org:master Mar 21, 2026
1 check passed
@michaelw9999 michaelw9999 deleted the fix-nvfp4-naming branch March 21, 2026 16:14
Seunghhon pushed a commit to Seunghhon/llama.cpp that referenced this pull request Apr 26, 2026
…of BF16 (ggml-org#20730)

* Corrected convert script for NVFP4 naming and updated gguf constants

* Add mostly_MXFP4 to FileType

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* Update convert_hf_to_gguf.py

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* Update convert_hf_to_gguf.py

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* simplify

* set initial value [no ci]

---------

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
rsenthilkumar6 pushed a commit to rsenthilkumar6/llama.cpp that referenced this pull request May 1, 2026
…of BF16 (ggml-org#20730)
Labels

python python script changes

3 participants