Fix distributed inference #49

Merged
SimJeg merged 1 commit into main from simon/fix-distributed-inference on Feb 13, 2025
Collaborator

@SimJeg commented on Feb 13, 2025

The device_map="auto" option in the pipeline failed for ExpectedAttentionPress because, in every layer, this press calls the rotary embedding layer (to redo the same computations, btw), and that layer is stuck on cuda:0. Layers placed on other GPUs therefore ran into a device mismatch.
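For illustration, here is a minimal sketch of one way to avoid this kind of mismatch. It is not necessarily what the merged commit does. `ToyRotaryEmbedding` and `rotary_on_device` are hypothetical names, and the toy module only stands in for a model's real rotary embedding layer. The idea is to run the rotary layer on the device where it lives, then move its outputs to the device of the layer that needs them.

```python
import torch
from torch import nn


class ToyRotaryEmbedding(nn.Module):
    """Hypothetical stand-in for a model's rotary embedding layer."""

    def __init__(self, dim: int = 8, base: float = 10000.0):
        super().__init__()
        inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2).float() / dim))
        # The buffer lives on whatever device the module was placed on.
        self.register_buffer("inv_freq", inv_freq, persistent=False)

    def forward(self, position_ids: torch.Tensor):
        # position_ids has shape (batch, seq_len).
        freqs = position_ids[:, :, None].float() * self.inv_freq[None, None, :]
        emb = torch.cat((freqs, freqs), dim=-1)
        return emb.cos(), emb.sin()


def rotary_on_device(rotary: nn.Module, position_ids: torch.Tensor, device: torch.device):
    """Compute cos/sin on the rotary module's device, then move them to `device`."""
    rotary_device = rotary.inv_freq.device
    cos, sin = rotary(position_ids.to(rotary_device))
    return cos.to(device), sin.to(device)


# Assumption: the current layer sits on a second GPU when one is available.
layer_device = torch.device("cuda:1" if torch.cuda.device_count() > 1 else "cpu")
rotary = ToyRotaryEmbedding(dim=8)  # stays on its own device (CPU here)
position_ids = torch.arange(4, device=layer_device).unsqueeze(0)
cos, sin = rotary_on_device(rotary, position_ids, layer_device)
```

Moving only the small cos/sin tensors keeps the shared module where it is, so the same call works for layers on any device.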

I checked that this was the only press with an issue by running test_presses.py with the unit_test_model temporarily replaced by Llama 3 8B on 2 GPUs.

@SimJeg merged commit f98de1f into main on Feb 13, 2025
@SimJeg deleted the simon/fix-distributed-inference branch on February 13, 2025, 12:48
maxjeblick pushed a commit that referenced this pull request Aug 12, 2025
Signed-off-by: Max Jeblick <maximilianjeblick@gmail.com>