Fix Llama QnA from_pretrained #34061

Closed
vasqu wants to merge 2 commits into huggingface:main from vasqu:llama-for-qna

Conversation

@vasqu
Contributor

@vasqu vasqu commented Oct 10, 2024

What does this PR do?

Fixes issues with Llama for QnA when using from_pretrained to load any base model. Currently, when we load any Llama base checkpoint into the QnA class, we get a 100% weight mismatch (i.e. everything is randomly initialized). The workaround is to manually save the (base) model and then load it from disk (ref. #30381):

from transformers import AutoModel, AutoModelForQuestionAnswering
model = AutoModel.from_pretrained("meta-llama/llama-2-7b-hf")
model.save_pretrained("base_model")
model = AutoModelForQuestionAnswering.from_pretrained("base_model")

This is very unintuitive and goes against the intended usage of auto classes and from_pretrained, which should be quick and easy.
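For reference, a minimal sketch of the usage this PR is meant to enable (not part of the diff; the checkpoint name is just the one from the example above):

from transformers import AutoModelForQuestionAnswering

# After the fix, this should load the base-model weights directly from the checkpoint
# and only randomly initialize the small QA head on top, without the save/reload detour.
model = AutoModelForQuestionAnswering.from_pretrained("meta-llama/llama-2-7b-hf")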

I tried to use loading hooks in #34038 to keep it BC, but to no avail (I could fool the error messages, though :D). So it might break older versions, which I think is warranted to ensure an easy from_pretrained call instead. #29258 tried the same at first too, and then opted for a version which doesn't work as expected.

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@ArthurZucker

Contributor Author

Due to the copy mechanism, I expect the issue to be the same over here.

@vasqu vasqu mentioned this pull request Oct 10, 2024
Collaborator

@ArthurZucker ArthurZucker left a comment

I'll work on the from_pretrained issue that prevents the base_model_prefix from working here!

The only issue with this is that it will completely break BC for people who trained a LlamaForQuestionAnswering, as their checkpoints will have transformer instead of model as the prefix, no?
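Roughly, the concern is a prefix mismatch between saved and expected checkpoint keys; a hypothetical sketch (key names are illustrative, not taken from the actual modeling files):

# Keys in a checkpoint saved from the old LlamaForQuestionAnswering (illustrative)
old_keys = [
    "transformer.embed_tokens.weight",
    "transformer.layers.0.self_attn.q_proj.weight",
    "qa_outputs.weight",
]

# Keys the class would expect after renaming the base-model attribute to `model`
new_keys = [
    "model.embed_tokens.weight",
    "model.layers.0.self_attn.q_proj.weight",
    "qa_outputs.weight",
]

# Without a remapping/loading hook, the mismatched base-model keys would not be
# matched on load and would be re-initialized, which is the BC break discussed here.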

@ArthurZucker
Collaborator

I have seen issues with from_pretrained; a fix for them is long overdue 👀

@vasqu
Contributor Author

vasqu commented Oct 10, 2024

Yes, it would be completely breaking :/ I've tried using some loading hooks, which didn't work, but I don't have the time to properly look into this atm.

Should I close this then?

@vasqu
Contributor Author

vasqu commented Oct 24, 2024

@ArthurZucker Any updates on this? Should I close this? 👀

@ArthurZucker
Collaborator

Hey, sorry, no. It's on my TODO list, hope to get to it next week!

@vasqu
Contributor Author

vasqu commented Oct 30, 2024

No worries!

@ArthurZucker
Collaborator

BTW I think we can close this; #36033 should make it easier to update without breaking.

@vasqu
Contributor Author

vasqu commented Feb 11, 2025

Nice, will close then 👍

@vasqu vasqu closed this Feb 11, 2025
@vasqu vasqu deleted the llama-for-qna branch February 11, 2025 17:11