[fix] PEFT integration fixes preventing save/load & integration #45428

zucchini-nlp merged 3 commits into huggingface:main
BenjaminBossan left a comment:
Thanks for fixing this, Tom. Changes LGTM.
Regarding your first issue, we fixed the corresponding code in PEFT in huggingface/peft#3127 but forgot to fix it here (for context, these functions will all be imported from PEFT eventually but we currently duplicate them in transformers for backwards compatibility).
Very glad to see that we found the same solution here. I'm looking forward to 0.19 when we can rely on peft fully for these. I'll wait for the tests to hopefully pass, and then I'll try to ping a maintainer 🤗
Test failures are due to 404s on this model repository: https://huggingface.co/AI-Sweden-Models/gpt-sw3-126m
CI was fixed yesterday, so rebasing will help. I don't see anything super core, and since Benjamin agrees, IMO this can be merged. For the second point, it is also related to
Merge commit (huggingface#45428):
* PEFT integration fixes preventing save/load & integration
* Rerun make style with newer ruff

Co-authored-by: Raushan Turganbay <raushan@huggingface.co>
What does this PR do?
Details
I fixed two issues in this PR and had an agent write some matching tests.
1. `KeyError` in PEFT loading for non-MoE conversion-mapped architectures

When integrating https://huggingface.co/nomic-ai/nomic-embed-multimodal-3b into Transformers / Sentence Transformers, I ran into the following issue:
Because we're loading a PEFT model, we go through `convert_peft_config_for_transformers`. This in turn calls `_convert_peft_config_moe`, which seemingly uses `_MODEL_TO_CONVERSION_PATTERN` to determine whether we need to go through with the MoE conversion:

transformers/src/transformers/integrations/peft.py, lines 1077 to 1082 in def5e68
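To make the failure mode concrete, here is a rough, self-contained sketch of the dispatch described above (the mapping names come from this description, but the entries and the function body are made-up stand-ins; the real code is at the permalink):

```python
# Hypothetical, self-contained sketch of the dispatch described in this PR;
# the real implementation lives in src/transformers/integrations/peft.py.

# Stand-ins for the real mappings: the conversion-pattern mapping now also
# contains non-MoE architectures, while the MoE mapping does not.
_MODEL_TO_CONVERSION_PATTERN = {"some_moe_arch": "...", "some_non_moe_arch": "..."}
_MOE_TARGET_MODULE_MAPPING = {"some_moe_arch": ["experts", "gate"]}

def _convert_peft_config_moe(peft_config: dict, base_model_type: str) -> dict:
    # Membership in _MODEL_TO_CONVERSION_PATTERN is treated as
    # "this model needs the MoE target-module conversion"...
    if base_model_type in _MODEL_TO_CONVERSION_PATTERN:
        # ...but the target modules are looked up in the MoE-only mapping,
        # so an architecture that only has a conversion pattern blows up.
        peft_config["target_modules"] = _MOE_TARGET_MODULE_MAPPING[base_model_type]
    return peft_config

# _convert_peft_config_moe({}, "some_non_moe_arch")  # -> KeyError: 'some_non_moe_arch'
```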
However, `_MODEL_TO_CONVERSION_PATTERN` has a bunch of non-MoE architectures in it nowadays too:

transformers/src/transformers/conversion_mapping.py, lines 63 to 77 in def5e68
And so `_MOE_TARGET_MODULE_MAPPING[base_model_type]` fails with a `KeyError`.
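One way to guard against this — a sketch, not necessarily the exact change merged in this PR — is to only run the MoE conversion when the architecture actually has an entry in the MoE mapping:

```python
# Sketch of a possible guard (not necessarily the exact fix merged here):
# only convert when a MoE target-module entry actually exists.
_MOE_TARGET_MODULE_MAPPING = {"some_moe_arch": ["experts", "gate"]}  # stand-in

def maybe_convert_moe(peft_config: dict, base_model_type: str) -> dict:
    target_modules = _MOE_TARGET_MODULE_MAPPING.get(base_model_type)
    if target_modules is None:
        # Non-MoE architecture that merely has a conversion pattern: nothing to do.
        return peft_config
    peft_config["target_modules"] = target_modules
    return peft_config

print(maybe_convert_moe({}, "some_non_moe_arch"))  # {} — no KeyError
```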
2. PEFT adapter weights dropped for base-only models

These lines will rename the weights from `base_model.model.model.` to `model.`:

transformers/src/transformers/integrations/peft.py, lines 227 to 238 in def5e68
But this assumes 2 levels of `model.`, i.e. something that only happens on models with heads; headless models (i.e. loaded with `AutoModel`) won't work with this anymore. I believe this is a regression introduced in v5 via #43261, and it's causing issues on Sentence Transformers: huggingface/sentence-transformers#3701

I extended this to also replace `base_model.model.` with `""`, so that `"base_model.model"` gets removed (as the comment already says) even if there aren't 2 levels of `model.`.
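A minimal sketch of the prefix handling described above, written as a hypothetical standalone helper for illustration (the actual change edits the renaming logic in `integrations/peft.py`):

```python
# Hypothetical helper illustrating the prefix handling described above; the
# real change lives in the renaming logic of src/transformers/integrations/peft.py.
def strip_peft_prefix(key: str) -> str:
    # Models with a head: "base_model.model.model.<...>" -> "model.<...>"
    if key.startswith("base_model.model.model."):
        return key.replace("base_model.model.model.", "model.", 1)
    # Headless models (AutoModel): "base_model.model.<...>" -> "<...>"
    if key.startswith("base_model.model."):
        return key.replace("base_model.model.", "", 1)
    return key

# Example:
# strip_peft_prefix("base_model.model.model.layers.0.q_proj.lora_A.weight")
#   -> "model.layers.0.q_proj.lora_A.weight"
# strip_peft_prefix("base_model.model.layers.0.q_proj.lora_A.weight")
#   -> "layers.0.q_proj.lora_A.weight"
```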
Who can review?

@ArthurZucker @BenjaminBossan