[graph_trainer] Save CUDA-to-CPU copies in SAC pass to match core behavior by SherlockNoMad · Pull Request #2811 · pytorch/torchtitan

SherlockNoMad · 2026-04-03T07:30:58Z

Stack from ghstack (oldest at bottom):

Core's _apply_op_sac always marks aten._to_copy CUDA->CPU transfers as
MUST_SAVE to avoid wastefully recomputing device transfers during
backward (e.g., MoE D2H sync for all-to-all metadata).

Graph_trainer's apply_sac_pass was missing this check, causing these
transfers to fall through to PREFER_RECOMPUTE. Add the same logic at
the FX graph level by inspecting the source node's fake tensor device
and the target device kwarg.

…core behavior Port the fix from PR #2811 to graph_trainer's `apply_sac_pass`. Core's `_apply_op_sac` always marks `aten._to_copy` CUDA->CPU transfers as MUST_SAVE to avoid wastefully recomputing device transfers during backward (e.g., MoE D2H sync for all-to-all metadata). Graph_trainer's `apply_sac_pass` was missing this check.

pytorch-bot bot added the ciflow/8gpu label Apr 3, 2026

meta-cla bot added the CLA Signed label Apr 3, 2026

SherlockNoMad mentioned this pull request Apr 9, 2026

[GraphTrainer][AutoDev] Save CUDA-to-CPU copies in SAC pass to match core behavior #2908

Draft

4 tasks

SherlockNoMad closed this Apr 9, 2026

SherlockNoMad wants to merge 1 commit into gh/SherlockNoMad/16/base from gh/SherlockNoMad/16/head