[bugfix] Update modeling_llama.py so it skips keys correctly#36289

Closed
HDCharles wants to merge 0 commits into huggingface:main from HDCharles:main

Conversation

HDCharles (Contributor) commented Feb 19, 2025

The Llama model was using past_key_value and past_key_values interchangeably, which caused issues because only one of them was actually listed in _skip_keys_device_placement when both needed to be skipped.

This PR changes the model so that only past_key_values is used.

Without this fix, Llama + torch.compile was failing, as in the issue linked below. Note that although the problem surfaced through a TorchAO recipe, the bug is unrelated to torchao.

I think #35763 might have introduced this bug, but I'm not sure.

Fixes pytorch/ao#1705
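
For context, a minimal sketch of the mismatch being fixed (simplified for illustration, not the actual modeling_llama.py source; _skip_keys_device_placement may be a string or a list depending on the transformers version):

```python
# Simplified sketch of the naming mismatch, not the actual transformers code.

class LlamaPreTrainedModel:
    # Kwarg names that accelerate's device-placement hooks should NOT move.
    # Only the plural form is listed here.
    _skip_keys_device_placement = ["past_key_values"]


class LlamaDecoderLayer:
    def forward(self, hidden_states, past_key_value=None, **kwargs):
        # Before this PR, the cache was passed around under the singular
        # name "past_key_value", which is missing from the skip list above,
        # so the device-placement hooks tried to move the cache object
        # along with the other kwargs.
        ...


class LlamaDecoderLayerAfterFix:
    def forward(self, hidden_states, past_key_values=None, **kwargs):
        # Using the plural name everywhere makes the kwarg match the skip
        # list, so the cache is left alone.
        ...
```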

jerryzh168 (Contributor)

cc @SunMarc, can you take a look?

HDCharles (Contributor, Author)

Are these test failures actually blocking? It seems like other models just have the same issue.

SunMarc (Member) commented Feb 20, 2025

This PR that was merged today actually fixes the issue you are having (Cache is not an nn.Module anymore, so it will be skipped automatically).
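
To illustrate why the skip becomes automatic once Cache stops subclassing nn.Module (a simplified sketch with a hypothetical move_kwargs_to_device helper, not accelerate's actual implementation): device-placement hooks only relocate objects they know how to move, so a plain Python cache object passes through untouched.

```python
import torch
import torch.nn as nn


def move_kwargs_to_device(obj, device):
    # Hypothetical helper standing in for what a device-placement hook does
    # to forward kwargs: move what it recognizes, leave everything else.
    if isinstance(obj, (torch.Tensor, nn.Module)):
        return obj.to(device)
    if isinstance(obj, (list, tuple)):
        return type(obj)(move_kwargs_to_device(x, device) for x in obj)
    if isinstance(obj, dict):
        return {k: move_kwargs_to_device(v, device) for k, v in obj.items()}
    # A Cache that is no longer an nn.Module falls through here untouched,
    # so it no longer needs an explicit entry in _skip_keys_device_placement.
    return obj
```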

SunMarc (Member) commented Feb 20, 2025

Still, it would indeed be better to keep only one of them. @ArthurZucker, why do we have both past_key_value and past_key_values?
