[docs] optimizations quickstart #42538

Merged
stevhliu merged 3 commits into huggingface:main from stevhliu:opt-overview
Dec 15, 2025

Conversation

@stevhliu stevhliu (Member) commented Dec 1, 2025

Adds an overview/quickstart of the optimization techniques available in Transformers. It provides a clear, centralized place where they're all documented and helps users select the right optimization to increase speed or reduce memory footprint.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@Cyrilvallez Cyrilvallez (Member) left a comment

Very nice! Sorry for the delay, added some comments!

Comment thread: docs/source/en/optimization_overview.md
Comment thread: docs/source/en/optimization_overview.md (Outdated)
Comment thread: docs/source/en/optimization_overview.md (Outdated)

@Cyrilvallez Cyrilvallez (Member) left a comment

Nice! cc @LysandreJik as well if you want to have a look/have more comments!

@LysandreJik LysandreJik (Member) left a comment

Super good start! Love it. Let's get it in and iterate on it from there.

cc @SunMarc as well, related to some topics we discussed recently.

Comment on lines +21 to +22
> [!NOTE]
> Memory and speed are closely related but not the same. Shrinking your memory footprint makes a model "faster" because there is less data to move around. Pure speed optimizations don't always reduce memory and sometimes increase usage. Choose the appropriate optimization based on your use case and hardware.

Member

Cool note
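
To make the tradeoff in the note concrete, here is a minimal sketch; the checkpoint name and settings are illustrative assumptions, not taken from this PR:

```py
import torch
from transformers import AutoModelForCausalLM

# Memory optimization: loading the weights in bfloat16 halves the footprint
# relative to float32, and moving less data around often speeds things up too.
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-0.5B",  # example checkpoint (assumption)
    torch_dtype=torch.bfloat16,
)

# Pure speed optimization: torch.compile accelerates the forward pass but does
# not shrink the weights, and compilation can itself use additional memory.
model = torch.compile(model)
```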

@SunMarc SunMarc (Member) left a comment

Thanks, that's really nice! Just left a minor comment.

Comment thread: docs/source/en/optimization_overview.md (Outdated)
Comment on lines +136 to +147
[Expert parallelism](./expert_parallelism) distributes experts across devices for mixture-of-experts (MoE) models. Set `enable_expert_parallel` in [`DistributedConfig`] to enable it.

```py
from transformers import AutoModelForCausalLM
from transformers.distributed.configuration_utils import DistributedConfig

# Shard the MoE experts across the available devices
distributed_config = DistributedConfig(enable_expert_parallel=True)
model = AutoModelForCausalLM.from_pretrained(
    "openai/gpt-oss-120b",
    distributed_config=distributed_config,
)
```
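As with the other parallelism APIs in Transformers, this snippet presumably needs to run under a distributed launcher such as `torchrun` so that each process can own a subset of the experts (an assumption; the launch setup isn't spelled out in this thread).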

Member

Not sure we want to promote this API for now, as it is not widely used yet. Since we are refactoring MoE and more, maybe it is a good time to fix how distributed_config will work (tp, ep, pp)? cc @ArthurZucker @3outeille. Right now, the only model that uses this feature is llama4.

@stevhliu stevhliu (Member Author)

Sounds good, I can update the docs for expert parallelism in #42409 once the API is more stable.

@stevhliu stevhliu merged commit 31de95e into huggingface:main Dec 15, 2025
15 checks passed
@stevhliu stevhliu deleted the opt-overview branch December 15, 2025 22:25
SangbumChoi pushed a commit to SangbumChoi/transformers that referenced this pull request Jan 23, 2026

* quickstart
* feedback
* feedback