Fix Qwen3Next dtype API usage#41735
Conversation
vasqu
left a comment
LGTM overall, let's revert the unrelated changes tho!
# Safety: if the model is sharded across multiple devices (hf_device_map/device_map) and we are
# doing sampling, enable `remove_invalid_values` by default to avoid NaN/Inf logits causing CUDA
# asserts during multinomial sampling. Users can still override this by passing the flag explicitly.
Unrelated changes? Also in logits_process.py
Yeah, if you could revert the unrelated changes this PR is good!
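For context on the quoted safety comment: the NaN/Inf scrubbing it describes is what transformers' `InfNanRemoveLogitsProcessor` (the processor behind the `remove_invalid_values` flag, defined in `logits_process.py`) performs. A minimal sketch of its effect on a batch of logits:

```python
import torch
from transformers.generation.logits_process import InfNanRemoveLogitsProcessor

# NaN or +Inf logits can trigger a device-side assert inside
# torch.multinomial during sampling; this processor rewrites them
# into finite values before the sampling step.
processor = InfNanRemoveLogitsProcessor()

scores = torch.tensor([[1.0, float("nan"), float("inf"), -2.0]])
clean = processor(input_ids=None, scores=scores)

print(torch.isfinite(clean).all())  # all values are now finite
```

Setting `remove_invalid_values=True` on `model.generate(...)` (or on a `GenerationConfig`) adds this processor to the logits-processing pipeline automatically.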
Replace torch.get_current_dtype() with torch.get_default_dtype() to fix FLA compatibility
Force-pushed 15485b7 to 6717030
[For maintainers] Suggested jobs to run (before merge): run-slow: qwen3_next
Thanks for pointing it out. I have cleaned up this PR and moved the other changes to #41734. Sorry for the delay. Please take a look and let me know if any further improvements are needed.
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
This might be a naive question, but how come this didn't make it into v4.57.3? When will this be in a release version?
This PR fixes an invalid PyTorch API usage in the Qwen3Next model.
Changes
Replaces torch.get_current_dtype() with torch.get_default_dtype() in both the modular and modeling files.
Technical Details
torch.get_current_dtype() is not a valid PyTorch API; the correct function is torch.get_default_dtype(), which returns the global default dtype setting.
Testing
The changes have been tested with make fixup and pass the repository consistency checks.
Fix #41732
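To illustrate the API in question: torch.get_default_dtype() reads the process-wide default floating-point dtype (which torch.set_default_dtype can change), whereas torch.get_current_dtype() does not exist in PyTorch. A minimal sketch:

```python
import torch

# torch.get_current_dtype() is not a PyTorch function; the supported
# API for reading the global default floating dtype is:
dtype = torch.get_default_dtype()
print(dtype)  # torch.float32 unless changed via torch.set_default_dtype

# The default can be changed globally, e.g. in mixed-precision setups,
# which is why model code should query it rather than hard-code float32:
torch.set_default_dtype(torch.float64)
print(torch.get_default_dtype())  # torch.float64
torch.set_default_dtype(torch.float32)  # restore
```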