Fix Base Model Name of LlamaForQuestionAnswering #29258
younesbelkada merged 3 commits into huggingface:main
Conversation
younesbelkada
left a comment
Thanks for the PR! Unfortunately this is a breaking change. You could overwrite the base_model_prefix only for that class though, what do you think?
True, I didn't think about whether renaming the variable would be a breaking change. In this case, setting
ArthurZucker
left a comment
Might be "breaking", but since it was not reported, it means it was not used: as you mentioned, you cannot save + load.
@younesbelkada feel free to merge if it is alright with you
younesbelkada
left a comment
Looks great, thanks!
Changes:
- HF changed parts of the Llama model implementation
- HF added a `LlamaForQuestionAnswering`. However, this model has a wrong base model name. I added a workaround that solves this problem until this is fixed in Transformers (huggingface/transformers#29258)

Co-authored-by: calpt <calpt@mail.de>
What does this PR do?
The `LlamaForQuestionAnswering` currently has the `LlamaModel` in the `transformer` variable. This does not match the `base_model_prefix` set in `LlamaPreTrainedModel`, which is "model".
This Pull Request changes the name from `transformer` to `model` in `LlamaForQuestionAnswering`.
Who can review?
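For readers who want to see why the attribute name matters, the mismatch described under "What does this PR do?" can be sketched with toy classes. This is a simplified illustration in plain Python, not the Transformers source: `ToyPreTrainedModel`, `ToyBackbone`, and the three QA classes are hypothetical names, and the `getattr`-based lookup is an assumption about how a base class might resolve its backbone from `base_model_prefix`. It compares the original state with the two fixes discussed in the conversation (renaming the attribute, or overriding the prefix for that one class).

```python
class ToyBackbone:
    """Hypothetical stand-in for the backbone (the role LlamaModel plays)."""


class ToyPreTrainedModel:
    """Hypothetical stand-in for a shared pretrained-model base class."""

    # Name of the attribute that is expected to hold the backbone.
    base_model_prefix = "model"

    @property
    def base_model(self):
        # Resolve the backbone via the prefix; fall back to the wrapper
        # itself when no attribute with that name exists.
        return getattr(self, self.base_model_prefix, self)


class MismatchedQA(ToyPreTrainedModel):
    """State before the PR: backbone stored under `transformer`."""

    def __init__(self):
        self.transformer = ToyBackbone()


class RenamedQA(ToyPreTrainedModel):
    """Fix chosen in the PR: store the backbone under `model`."""

    def __init__(self):
        self.model = ToyBackbone()


class PrefixOverrideQA(ToyPreTrainedModel):
    """Alternative raised in review: override the prefix for this class only."""

    base_model_prefix = "transformer"

    def __init__(self):
        self.transformer = ToyBackbone()


mismatched = MismatchedQA()
renamed = RenamedQA()
overridden = PrefixOverrideQA()

# With the mismatch, the lookup never finds the backbone.
print(mismatched.base_model is mismatched)
# Both fixes make the prefix and the attribute name agree.
print(isinstance(renamed.base_model, ToyBackbone))
print(isinstance(overridden.base_model, ToyBackbone))
```

In this sketch both fixes restore a consistent lookup. The trade-off raised in the review is that renaming the attribute changes the names under which the backbone's weights would be stored, while overriding the prefix leaves those names alone.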