
align xpu's autocast behavior w/ cuda by using device agnostic torch APIs #38284

Merged
ydshieh merged 25 commits into huggingface:main from yao-matrix:autocast-xpu
Jun 19, 2025

Conversation

@yao-matrix
Contributor

@ArthurZucker, please help review, thanks very much.

@Rocketknight1
Member

cc @IlyasMoutawwakil

@yao-matrix
Copy link
Copy Markdown
Contributor Author

The CI failure doesn't seem to be caused by my PR.


 # Upcast (turn off autocast) and reorder (Scale K by 1 / root(dk))
-with torch.amp.autocast(query.device.type, enabled=False):
+with torch.autocast(query.device.type, enabled=False):
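As a minimal sketch of what this device-agnostic call buys (tensor names and shapes here are illustrative, not taken from the PR): `torch.autocast(device_type, enabled=False)` accepts the string from `query.device.type` directly, so the same context manager works on "cpu", "cuda", or "xpu" without backend-specific branches.

```python
import torch

# Hypothetical stand-ins for attention inputs; shapes are illustrative only.
query = torch.randn(1, 4, 8)
key = torch.randn(1, 4, 8)

# query.device.type is "cpu", "cuda", or "xpu", so this line is
# device-agnostic. enabled=False turns autocast OFF, so the matmul
# below runs in full float32 precision.
with torch.autocast(query.device.type, enabled=False):
    # Upcast to float32 and scale K by 1 / sqrt(d_k) before the matmul
    scores = torch.matmul(
        query.float(), (key.float() / key.shape[-1] ** 0.5).transpose(-1, -2)
    )

print(scores.dtype)  # torch.float32
```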
Contributor Author


Aligned to match other modeling code in the models directory.

@yao-matrix
Contributor Author

@ArthurZucker @IlyasMoutawwakil, could you help review and comment? Thanks very much.

@yao-matrix yao-matrix changed the title align xpu's autocast behavior w/ cuda by using device agnostic torch.autocast align xpu's autocast behavior w/ cuda by using device agnostic torch APIs May 29, 2025
@yao-matrix
Contributor Author

The CI failure may be due to an unstable CI environment.

@Rocketknight1
Member

CI seems clear now! cc @IlyasMoutawwakil

@yao-matrix
Contributor Author

@IlyasMoutawwakil, could you help review? Thanks very much.

input_dtype = query_states.dtype
device_type = (
    query_states.device.type
    if isinstance(query_states.device.type, str) and query_states.device.type != "mps"
    else "cpu"
)
Member


Why would query_states.device.type be anything other than a str?

Member


And what's the problem with mps, exactly?

Contributor Author

@yao-matrix yao-matrix Jun 12, 2025


I don't know; it's an existing practice in the original code:

device_type = device_type if isinstance(device_type, str) and device_type != "mps" else "cpu"

I reused it because I don't have an mps device, so I just followed the existing behavior. I can also see it in the chameleon and recurrent_gemma modeling code since their first PR, so I cannot retrieve the history of why it's used. Maybe @ArthurZucker and @zucchini-nlp know the reason.

Member


Yeah, I think it comes from this comment: #29285 (comment). The changes were first added in that PR.

Contributor Author


Thanks @zucchini-nlp, that PR explains why disabling casting is needed.
And @IlyasMoutawwakil, this PR #29439 explains why mps is excluded: mps doesn't support amp.
As for when query_states.device.type is not a str, I guess it's backward-compatible behavior: before PT 2.0 there were only two values for torch.device.type, "cpu" and "cuda", so for devices like "mps" (supported since PT 1.12), the tensor device type may have returned None in PT 1.13, hence the guard. But I don't have a device to confirm; that's just my guess.
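The guard discussed above can be sketched as a plain-Python helper (the function name is my own, not from the PR; the expression mirrors the one-liner in the modeling code):

```python
def resolve_autocast_device_type(device_type):
    """Pick a device type string safe to pass to torch.autocast.

    Falls back to "cpu" when the value is not a plain str or is "mps",
    since mps does not support amp (see PR #29439). Hypothetical helper
    wrapping the guard expression from the modeling code.
    """
    return device_type if isinstance(device_type, str) and device_type != "mps" else "cpu"

print(resolve_autocast_device_type("xpu"))  # xpu
print(resolve_autocast_device_type("mps"))  # cpu
```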

Member


Thanks, so I guess we can remove the str check, since support for torch 1.x was dropped a while ago.

Contributor Author


@IlyasMoutawwakil, done, please help review again, thanks.

Member

@IlyasMoutawwakil IlyasMoutawwakil left a comment


LGTM

@yao-matrix
Contributor Author

@Rocketknight1, do you know who else needs to review this PR after Ilyas approved? Thanks.

@IlyasMoutawwakil IlyasMoutawwakil requested a review from ydshieh June 19, 2025 08:41
Collaborator

@ydshieh ydshieh left a comment


LVGTM, thank you

@ydshieh
Collaborator

ydshieh commented Jun 19, 2025

run-slow: qwen2_5_omni, gemma, phimoe, qwen2_moe, gpt2, distilbert

@github-actions
Contributor

This comment contains run-slow, running the specified jobs:

models: ['models/distilbert', 'models/gemma', 'models/gpt2', 'models/phimoe', 'models/qwen2_5_omni', 'models/qwen2_moe']
quantizations: [] ...

@ydshieh ydshieh enabled auto-merge (squash) June 19, 2025 11:36
@ydshieh ydshieh merged commit a9ce8c6 into huggingface:main Jun 19, 2025
20 of 21 checks passed
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@yao-matrix yao-matrix deleted the autocast-xpu branch June 19, 2025 23:57