Fix flashattn wrt quantized models #43145

Merged
SunMarc merged 10 commits into main from fix-dtype-quant on Jan 12, 2026

@SunMarc (Member) commented Jan 7, 2026

What does this PR do?

This PR fixes an issue in how config and quantization_config interact. Because we don't propagate quantization_config into sub-configs, tensors are not cast to the correct dtype when using flash attention. I'd rather not propagate this information for now, so instead we just set an _is_quantized flag. Previously, we relied on _pre_quantization_dtype.
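
A rough sketch of the idea, for readers who have not looked at the diff. This is not the code from this PR: `ToyConfig`, `mark_quantized`, and `flash_attn_target_dtype` are hypothetical names, dtypes are plain strings so the snippet runs without torch, and setting the flag recursively on sub-configs is an assumption about where the flag lives. It only illustrates why a lightweight flag is enough to pick the cast dtype when quantization_config exists on the top-level config alone.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ToyConfig:
    """Hypothetical stand-in for a model config; not the transformers class."""

    dtype: str = "bfloat16"
    quantization_config: Optional[dict] = None  # assumed to exist only on the top-level config
    sub_configs: dict = field(default_factory=dict)
    _is_quantized: bool = False


def mark_quantized(config: ToyConfig) -> None:
    """Set the flag on the config and, recursively, on each sub-config (assumption)."""
    config._is_quantized = True
    for sub in config.sub_configs.values():
        mark_quantized(sub)


def flash_attn_target_dtype(query_dtype: str, config: ToyConfig, weight_dtype: str) -> Optional[str]:
    """Return the dtype float32 inputs should be cast to, or None if no cast is needed."""
    if query_dtype != "float32":
        return None  # input is not float32, so this sketch leaves it alone
    if config._is_quantized:
        return config.dtype  # packed weights (e.g. uint8) do not reflect the compute dtype
    return weight_dtype  # unquantized model: follow the layer weights


text_config = ToyConfig()
top_config = ToyConfig(
    quantization_config={"load_in_4bit": True},
    sub_configs={"text_config": text_config},
)
mark_quantized(top_config)

# The sub-config has no quantization_config, but the flag still selects the right dtype.
print(flash_attn_target_dtype("float32", text_config, "uint8"))
```

Without the flag, a check based on quantization_config would miss the sub-config and fall through to the packed weight dtype, which is the kind of wrong cast described above.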

Fixes #43001

Regression caused by https://github.com/huggingface/transformers/pull/42882/files

@SunMarc SunMarc requested a review from MekkCyber January 7, 2026 14:28
@HuggingFaceDocBuilderDev
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@MekkCyber (Contributor) left a comment

Thanks for fixing

@SunMarc SunMarc enabled auto-merge (squash) January 9, 2026 15:46
@github-actions (Contributor)

View the CircleCI Test Summary for this PR:

https://huggingface.co/spaces/transformers-community/circle-ci-viz?pr=43145&sha=3cc7ed

@github-actions (Contributor)

[For maintainers] Suggested jobs to run (before merge)

run-slow: diffllama, falcon, falcon_mamba, gpt_neo, gptj, kyutai_speech_to_text, mimi, moshi, nemotron

@SunMarc SunMarc merged commit 35fe341 into main Jan 12, 2026
25 of 26 checks passed
@SunMarc SunMarc deleted the fix-dtype-quant branch January 12, 2026 16:47
SangbumChoi pushed a commit to SangbumChoi/transformers that referenced this pull request Jan 23, 2026
* fix regression

* fix

* fix

* fix

---------

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>