Add MoE to Gemma4 TP plan by sywangyi · Pull Request #45219 · huggingface/transformers

sywangyi · 2026-04-03T13:59:01Z

What does this PR do?

google/gemma-4-26B-A4B-it
With tp 2, memory is 46G per rank without the change and drops to about 25G per rank with the change.

text models: @ArthurZucker @Cyrilvallez

Signed-off-by: Wang, Yi <yi.a.wang@intel.com>
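
A rough sketch of why the per-rank memory drops, assuming that modules not matched by any TP plan entry are kept whole on every rank while matched modules are split evenly across ranks. The module names, sizes, plan patterns, and style labels below are hypothetical and chosen only for illustration; they are not the actual Gemma4 plan entries or measured sizes from this PR.

```python
import fnmatch

# Hypothetical module sizes in GB (illustrative only, not measured from Gemma4).
MODULE_SIZES_GB = {
    "layers.0.self_attn.q_proj": 1.0,
    "layers.0.mlp.gate_proj": 1.0,
    "layers.0.experts.gate_up_proj": 20.0,
    "layers.0.experts.down_proj": 20.0,
}

# Hypothetical plan that covers only attention and dense MLP modules.
# The style strings are placeholder labels for this sketch.
DENSE_ONLY_PLAN = {
    "layers.*.self_attn.q_proj": "colwise",
    "layers.*.mlp.gate_proj": "colwise",
}

# Same plan with hypothetical entries for the MoE expert weights added.
WITH_MOE_PLAN = {
    **DENSE_ONLY_PLAN,
    "layers.*.experts.gate_up_proj": "sharded",
    "layers.*.experts.down_proj": "sharded",
}


def per_rank_memory_gb(sizes, plan, world_size):
    """Estimate per-rank weight memory under the assumption stated above:
    matched modules are split evenly, unmatched modules are replicated."""
    total = 0.0
    for name, size in sizes.items():
        matched = any(fnmatch.fnmatchcase(name, pattern) for pattern in plan)
        total += size / world_size if matched else size
    return total


print(per_rank_memory_gb(MODULE_SIZES_GB, DENSE_ONLY_PLAN, 2))  # 41.0
print(per_rank_memory_gb(MODULE_SIZES_GB, WITH_MOE_PLAN, 2))    # 21.0
```

In this toy setup the expert weights dominate the total, so leaving them out of the plan keeps per-rank memory close to the unsharded size, which matches the direction of the 46G to 25G change reported above.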

Rocketknight1

Happy to approve this one and people can yell at me later if there's any problem!

HuggingFaceDocBuilderDev · 2026-04-08T13:16:07Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

sywangyi · 2026-04-08T13:28:43Z

@Rocketknight1 thanks for the approval. I checked the CI failure; it has nothing to do with this PR.

TP plan not correct

Cyrilvallez

Thanks, indeed the MoE part was forgotten! Sorry @Rocketknight1, I dismissed your review as I saw it was in the merging queue and I panicked thinking the mlp part should be removed, but it should stay as well haha

github-actions · 2026-04-08T14:04:32Z

[For maintainers] Suggested jobs to run (before merge)

run-slow: gemma4

reduce memory for gemma4 moe model in tp

Signed-off-by: Wang, Yi <yi.a.wang@intel.com>
Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>

reduce memory for gemma4 moe model in tp

b6dc152

Signed-off-by: Wang, Yi <yi.a.wang@intel.com>

Rocketknight1 previously approved these changes Apr 8, 2026

Rocketknight1 enabled auto-merge April 8, 2026 13:06

Rocketknight1 added this pull request to the merge queue Apr 8, 2026

github-merge-queue Bot removed this pull request from the merge queue due to failed status checks Apr 8, 2026

Cyrilvallez changed the title ~~reduce memory for gemma4 moe model in tp~~ Improve Gemma4 TP plan Apr 8, 2026

Cyrilvallez changed the title ~~Improve Gemma4 TP plan~~ Add MoE to Gemma4 TP plan Apr 8, 2026

Cyrilvallez approved these changes Apr 8, 2026

Merge branch 'main' into reduce_memory_gemma4

83d4a9f

Cyrilvallez merged commit 7f6cc4b into huggingface:main Apr 8, 2026
15 of 18 checks passed

Cyrilvallez added a commit that referenced this pull request Apr 9, 2026

Add MoE to Gemma4 TP plan (#45219)

23c562c

reduce memory for gemma4 moe model in tp

Signed-off-by: Wang, Yi <yi.a.wang@intel.com>
Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>

evalstate mentioned this pull request Apr 28, 2026

Cumulative defect fixes from recent Transformers PRs evalstate/transformers#41

Open

Cyrilvallez merged 2 commits into huggingface:main from sywangyi:reduce_memory_gemma4

4 participants
