
Remove unused parameters and improve add_tensor_parallel_hooks_to_module #44768

Merged
michaelbenayoun merged 4 commits into huggingface:main from michaelbenayoun:fix_add_tensor_parallel_hooks_to_module
Apr 9, 2026
Conversation

@michaelbenayoun
Member

The function add_tensor_parallel_hooks_to_module has unused parameters. In this PR we:

  • Remove tp_plan, which is not used.
  • Remove parameter_name, which is not used either.
  • Remove layer_name. This parameter is only used for logging, and we can infer it when that logging actually happens. Inferring it is a bit more costly, but since this path is not supposed to be hit, it is ok to proceed like that imo (see the sketch below).
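
For illustration, here is a minimal sketch (toy model and names, not the actual transformers code) of how a module's qualified name can be recovered on demand, only when the error message is actually emitted:

    import torch.nn as nn

    # Toy stand-in for a transformers model; any nn.Module works the same way.
    model = nn.Sequential(nn.Linear(4, 4), nn.ReLU(), nn.Linear(4, 2))
    module = model[2]  # pretend this submodule's TP plan failed to apply

    # Build the module -> name mapping only on the error path, instead of
    # threading layer_name through every call.
    modules2names = {mod: name for name, mod in model.named_modules()}
    layer_name = modules2names.get(module, module.__class__.__name__)
    print(layer_name)  # prints "2" for this toy model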

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

except NotImplementedError as e:
    print(
        f"Trying to prepare {layer_name}, but it's not supported. Corresponding module: {module} Fix it's TP plan: {e}"
modules2names = {v: k for k, v in dict(model.named_modules()).items()}
Member

@3outeille Mar 24, 2026


add_tensor_parallel_hooks_to_module is called inside for name, module in model.named_modules(): in distribute_model(), so isn't it a bit costly to create that dict at every iteration?
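
A toy illustration of the concern (not the actual distribute_model code): if the reverse dict were rebuilt inside the named_modules() loop, each rebuild would walk all modules again, making the whole pass quadratic in the number of modules.

    import torch.nn as nn

    # Toy model standing in for a real transformers model.
    model = nn.Sequential(*[nn.Linear(8, 8) for _ in range(100)])

    for name, module in model.named_modules():
        # Rebuilding this reverse mapping on every iteration costs O(n) per
        # module, i.e. O(n^2) for the whole loop.
        modules2names = {v: k for k, v in dict(model.named_modules()).items()}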

@michaelbenayoun (Member, Author)


It's ok imo, the dict is only built when we hit a broken TP plan, not at every call.
We can also add a cache for this if you prefer.
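
A hypothetical sketch of what such a cache could look like (the helper name is made up, and the commit history below shows the cache was ultimately replaced by an explicit parameter, so this is illustrative only):

    import torch.nn as nn
    from functools import lru_cache

    @lru_cache(maxsize=None)
    def _modules_to_names(model):
        # Built once per model, then reused by later error reports.
        return {mod: name for name, mod in model.named_modules()}

    model = nn.Sequential(nn.Linear(4, 4), nn.Linear(4, 2))
    print(_modules_to_names(model)[model[1]])  # prints "1"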

@michaelbenayoun (Member, Author)


Addressed it by caching

@michaelbenayoun added this pull request to the merge queue Apr 9, 2026
@github-merge-queue (bot) removed this pull request from the merge queue due to failed status checks Apr 9, 2026
@michaelbenayoun added this pull request to the merge queue Apr 9, 2026
Merged via the queue into huggingface:main with commit 9d840ea Apr 9, 2026
28 checks passed
@michaelbenayoun deleted the fix_add_tensor_parallel_hooks_to_module branch Apr 9, 2026 17:11
sirzechs66 pushed a commit to sirzechs66/transformers that referenced this pull request Apr 18, 2026
Remove unused parameters and improve add_tensor_parallel_hooks_to_module (huggingface#44768)

* fix: remove unused parameters and improve add_tensor_parallel_hooks_to_module

* feat: cache module -> name mappings in hooks

* fix: do not use cache, use explicit parameter

---------

Co-authored-by: Ferdinand Mom <47445085+3outeille@users.noreply.github.com>