Skip to content

Fix Base Model Name of LlamaForQuestionAnswering#29258

Merged
younesbelkada merged 3 commits intohuggingface:mainfrom
lenglaender:llama_qa_model_base
Mar 1, 2024
Merged

Fix Base Model Name of LlamaForQuestionAnswering#29258
younesbelkada merged 3 commits intohuggingface:mainfrom
lenglaender:llama_qa_model_base

Conversation

@lenglaender
Copy link
Copy Markdown
Contributor

@lenglaender lenglaender commented Feb 23, 2024

What does this PR do?

The LlamaForQuestionAnswering currently has the LlamaModel in the transformer variable. This does not match the base_model_prefix set in LlamaPreTrainedModel, which is "model".

This Pull Request changes the name from transformer to model in LlamaForQuestionAnswering

Who can review?

@lenglaender lenglaender marked this pull request as ready for review February 23, 2024 23:20
Copy link
Copy Markdown
Contributor

@younesbelkada younesbelkada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR ! Unfortunately this is a breakign change - you could overwrite the base_model_prefix only for that class though, what do you think?

@lenglaender
Copy link
Copy Markdown
Contributor Author

lenglaender commented Feb 27, 2024

True, I didn't think about whether renaming the variable would be a breaking change. In this case, setting base_model_prefix to the right value is sufficient. I changed it.

Copy link
Copy Markdown
Collaborator

@ArthurZucker ArthurZucker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be "breaking" but since it was not reported it means it was not used as you mentioned you cannot save + load

@ArthurZucker
Copy link
Copy Markdown
Collaborator

@younesbelkada feel free to merge if it is alright with you

Copy link
Copy Markdown
Contributor

@younesbelkada younesbelkada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, thanks !

@younesbelkada younesbelkada merged commit 2858d6c into huggingface:main Mar 1, 2024
@HuggingFaceDocBuilderDev
Copy link
Copy Markdown

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

calpt added a commit to adapter-hub/adapters that referenced this pull request Apr 6, 2024
Changes:
- HF changed parts of the Llama model implementation
- HF added a `LlamaForQuestionAnswering`. However, this model has a
wrong base model name. I added a workaround that solves this problem
until this is fixed in Transformers
(huggingface/transformers#29258)

---------

Co-authored-by: calpt <calpt@mail.de>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants