fix: prevent accelerate from splitting vision encoder by setting _no_… #43047

Merged
ArthurZucker merged 3 commits into huggingface:main from CodersAcademy006:main
Apr 14, 2026

Conversation

@CodersAcademy006

This PR resolves the model-parallelism crash in the PeVideo and PeAudioVideo models by adding `_no_split_modules = ["TimmWrapperForImageClassification"]` to their configuration. Currently, accelerate naively splits the timm-based vision encoder layer by layer across devices, breaking its internal residual connections and causing a RuntimeError during distributed training. Explicitly marking the wrapper as a non-splittable module keeps the vision encoder atomic on a single device, restoring stability for FSDP and model-parallelism workflows, as verified by the now-passing test_model_parallelism unit tests.

Fixes #42918
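For context on why the fix works: when accelerate builds a device map, submodules whose class name appears in the no-split list are treated as atomic and are never sharded across devices. The sketch below illustrates that rule with a toy module tree; it is not accelerate's actual implementation, and the tree, helper name, and round-robin policy are all illustrative.

```python
# Toy illustration of the atomic-module rule this PR relies on.
# NOT accelerate's real device-map code; a minimal sketch only.

def assign_devices(module_tree, no_split, num_devices=2):
    """Assign leaves round-robin across devices, but place any module
    whose class name is in `no_split` wholly on one device."""
    device_map = {}
    next_dev = 0

    def walk(name, node):
        nonlocal next_dev
        cls, children = node
        if cls in no_split or not children:
            # Atomic: the whole subtree lives on a single device.
            device_map[name] = next_dev % num_devices
            next_dev += 1
        else:
            for child_name, child in children.items():
                walk(f"{name}.{child_name}", child)

    for top_name, top_node in module_tree.items():
        walk(top_name, top_node)
    return device_map

# Hypothetical PeVideo-like tree: each node is (class_name, children).
tree = {
    "vision_encoder": ("TimmWrapperForImageClassification", {
        "layer0": ("Block", {}),
        "layer1": ("Block", {}),
    }),
    "text_decoder": ("Decoder", {
        "layer0": ("Block", {}),
        "layer1": ("Block", {}),
    }),
}

# Without a no-split entry, the encoder's layers land on different
# devices, so residual connections cross devices mid-forward:
split = assign_devices(tree, no_split=set())

# With the entry, the wrapper stays atomic on one device:
atomic = assign_devices(tree, no_split={"TimmWrapperForImageClassification"})
```

In the toy run, `split` places `vision_encoder.layer0` and `vision_encoder.layer1` on different devices, while `atomic` assigns a single device to `vision_encoder` as a whole, which is the behavior the PR restores.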

Collaborator

@ArthurZucker ArthurZucker left a comment


Thanks

@ArthurZucker
Collaborator

can you fix the consistency? (`python utils/modular_model_converter.py`)

…dular files and regenerate

- Update modular_pe_audio_video.py and modular_pe_video.py (source of truth)
- Regenerate modeling_pe_audio_video.py and modeling_pe_video.py via modular_model_converter.py
- Remove @unittest.skip on test_model_parallelism now that the crash is resolved

Fixes huggingface#42918
@CodersAcademy006
Author

CodersAcademy006 commented Mar 2, 2026

@ArthurZucker, sorry for being late. Fixed, updated modular_pe_video.py and modular_pe_audio_video.py to declare _no_split_modules explicitly in the PreTrainedModel classes (needed because the converter resolves parent class from the installed package). Ran modular_model_converter.py to regenerate both modeling files and removed the @unittest.skip decorators from test_model_parallelism in both test files. All four files should now be consistent.
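The change described above amounts to a class attribute on each model's base class. A minimal sketch of that declaration is below; the `PreTrainedModel` stub stands in for transformers' real base class, and the exact model class names in the repo may differ from these illustrative ones.

```python
# Sketch of the declaration added to the modular (source-of-truth) files.
# The stub base class is illustrative; only the class attribute matters.

class PreTrainedModel:
    _no_split_modules = None  # base default: no restriction declared

class PeVideoPreTrainedModel(PreTrainedModel):
    # Explicit declaration, as described for modular_pe_video.py.
    _no_split_modules = ["TimmWrapperForImageClassification"]

class PeAudioVideoPreTrainedModel(PreTrainedModel):
    # Same declaration for modular_pe_audio_video.py.
    _no_split_modules = ["TimmWrapperForImageClassification"]
```

Declaring the attribute directly on each subclass (rather than relying on inheritance) matters here because, per the comment above, the modular converter resolves the parent class from the installed package when regenerating the modeling files.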

@ArthurZucker ArthurZucker enabled auto-merge April 13, 2026 08:20
@github-actions
Contributor

[For maintainers] Suggested jobs to run (before merge)

run-slow: pe_audio, pe_audio_video, pe_video

@ArthurZucker ArthurZucker added this pull request to the merge queue Apr 14, 2026
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Merged via the queue into huggingface:main with commit 27fbb51 Apr 14, 2026
21 checks passed
sirzechs66 pushed a commit to sirzechs66/transformers that referenced this pull request Apr 18, 2026 (huggingface#43047)

* fix: add TimmWrapperForImageClassification to _no_split_modules in modular files and regenerate

- Update modular_pe_audio_video.py and modular_pe_video.py (source of truth)
- Regenerate modeling_pe_audio_video.py and modeling_pe_video.py via modular_model_converter.py
- Remove @unittest.skip on test_model_parallelism now that the crash is resolved

Fixes huggingface#42918

* fix: add TimmWrapperForImageClassification to _no_split_modules in pe_audio, pe_video, pe_audio_video

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>


Development

Successfully merging this pull request may close these issues.

model parallelism unit test failed for modeling_pe_audio_video.py and modeling_pe_video.py

3 participants