Fix Gemma2 synced multi-GPU generation #35232
@gante Could you please review this PR? |
ArthurZucker
left a comment
Thanks, could you additionally provide a reproducer? 🤗
|
@ArthurZucker Sure, here is an example script. I am running it with DeepSpeed. The following is the DeepSpeed config |
|
@ArthurZucker let's merge this and a related PR (#35893), seems like a real issue and has been reported by another user recently |
ArthurZucker
left a comment
Sounds good. I was waiting a bit to see more failure reports, but let's go.
|
@ManukyanD thank you for the fix 💛 cc @SunMarc: this PR (and #35893) copies the fix in #34095 into functions that are overwritten in specific models. There, you mentioned we had no tests for it [multigpu + generate] -- any chance you had a look at it, or would you be able to share pointers to add an appropriate lightweight test? 🙏 |
* Fix Gemma2 synced multi-GPU generation
* Fix import ordering in modular_gemma2.py
What does this PR do?
Generation with Gemma2ForCausalLM in synced multi-GPU settings crashes because the cache_position goes out of bounds. This PR addresses the issue.
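The failure mode can be illustrated with a minimal, framework-free sketch. It is not the code from this PR. It assumes that in synced generation a rank that has already finished keeps running forward passes until every rank is done, so its `cache_position` keeps advancing while its `input_ids` stop growing. The helper names `select_new_tokens` and `naive_select` are hypothetical, and plain Python lists stand in for tensors.

```python
def select_new_tokens(input_ids, cache_position):
    """Return the tokens the next forward pass should receive.

    input_ids: token ids produced so far for one sequence (hypothetical stand-in
        for a tensor row).
    cache_position: cache slots the next forward pass is expected to fill.
    """
    if cache_position[-1] >= len(input_ids):
        # Out-of-bounds case: this rank no longer appends tokens, so reuse the
        # trailing tokens as dummy input instead of indexing past the end.
        return input_ids[-len(cache_position):]
    if len(input_ids) != len(cache_position):
        # Regular decoding step: pick exactly the positions not yet cached.
        return [input_ids[p] for p in cache_position]
    # Prefill: every token is new, nothing to slice.
    return input_ids


def naive_select(input_ids, cache_position):
    # Unguarded indexing, included only to reproduce the crash.
    return [input_ids[p] for p in cache_position]


tokens = [10, 11, 12, 13]

# Regular step: position 3 is the only uncached token.
print(select_new_tokens(tokens, [3]))

# Finished rank in a synced run: position 5 is past the end of `tokens`.
print(select_new_tokens(tokens, [5]))

try:
    naive_select(tokens, [5])
except IndexError as err:
    print("naive indexing failed:", err)
```

The guard trades correctness of the dummy step's input for not crashing, which is acceptable under the stated assumption because the output of a finished rank is discarded.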
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.
@ArthurZucker
@gante