
bugfix: propage weight key_mapping to peft to fix 3.52 VLM renaming #38627

Merged
Cyrilvallez merged 4 commits into huggingface:main from ManuelFay:peft-propagate-mapping
Jun 16, 2025

Conversation

@ManuelFay (Contributor) commented on Jun 5, 2025

What does this PR do?

Fixes behavior introduced by the 4.52 VLM refactor (#37033), which renames VLM model weights but overlooked adapters.
(Strictly speaking, the problem is just that key_mapping was not propagated to PEFT, but it is made much more salient by the recent large changes in VLM naming, which effectively deprecate all existing adapters.)

@zucchini-nlp

Note that this is just for loading; I'm not sure we want to generalize it to saving... I rather feel we should just have people save new adapters in the correct format going forward (and slowly convert all models to the new format while ensuring backward compatibility).

@ManuelFay changed the title from "propage key mapping to peft" to "bugfix: propage weight key_mapping to peft to fix 3.52 VLM renaming" on Jun 5, 2025
@zucchini-nlp (Member) left a comment


Personally, it looks like this does the same key rename as when loading a base transformers model, though I am not very familiar with how PEFT works. cc @BenjaminBossan for that.

For saving, yes: if adapters could be saved in the new format and still be loaded back correctly, that would be the preferred way. However, transformers currently saves the base model back with the previous-style state dict; I am planning to gradually remove that.

@ManuelFay (Contributor, Author):

I mean, this is not even PEFT-specific; this code works without the peft library. Intuitively, it just renames the modules that the adapters attach to, matching the code you wrote to rename the parameters in the base model!

@BenjaminBossan (Member) left a comment


Just to ensure that my understanding is correct: the structure of some VLM architectures has changed, so we need to remap keys in order to allow old checkpoints to still be loaded. This requires a mapping dict to be applied, but this dict was not applied when using model.load_adapter. This PR fixes that.

Generally, this LGTM. Ideally, I'd also like to see a unit test to ensure this works and won't break in the future. That would require an "old style" checkpoint, so it may not be that easy to implement; perhaps @zucchini-nlp has some tips.

A question that came up: would this change not mean that we have to apply a similar mapping in PEFT too (and similarly in all external packages that create checkpoints on top of transformers VLMs)?
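
For readers unfamiliar with the mechanism being discussed, here is a minimal sketch of how such a regex-based key remap works. The mapping dict and the example key below are illustrative assumptions, not the exact `_checkpoint_conversion_mapping` of any particular model.

```python
import re

# Illustrative old-style -> new-style mapping (regex pattern -> replacement),
# in the spirit of _checkpoint_conversion_mapping; not taken from any real model.
key_mapping = {
    r"^model\.layers": "model.language_model.layers",
    r"^visual": "model.visual",
}

def remap_keys(state_dict, key_mapping):
    """Return a copy of state_dict with keys rewritten according to key_mapping."""
    remapped = {}
    for key, value in state_dict.items():
        new_key = key
        for pattern, replacement in key_mapping.items():
            new_key, n_subs = re.subn(pattern, replacement, new_key)
            if n_subs > 0:
                break  # first matching pattern wins in this sketch
        remapped[new_key] = value
    return remapped

old_style = {"model.layers.0.self_attn.q_proj.weight": "tensor..."}
print(remap_keys(old_style, key_mapping))
# {'model.language_model.layers.0.self_attn.q_proj.weight': 'tensor...'}
```

The fix in this PR essentially forwards such a mapping to the adapter-loading path as well, so adapter keys get the same treatment as base-model keys.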

Comment thread on src/transformers/modeling_utils.py (outdated)
@ManuelFay (Contributor, Author) commented on Jun 6, 2025

@BenjaminBossan yes, exactly correct! I made the requested changes.

I'll let @zucchini-nlp implement the test; I'm on vacation until mid-June and, as you say, I'm not sure how best to proceed.

As for your question, this is exactly my case. I maintain the colpali repository, which is largely based on inheriting from VLMs and often uses PEFT. All existing checkpoints are now incompatible with the new release. I am able to remap on my end by overriding the from_pretrained behaviour to pass the key_mapping argument, but this did not propagate to PEFT adapters.

The code I wrote here should enable all PEFT checkpoints linked to models in the VLMs list (for which an explicit mapping dictionary and remapping are specified) to work. I'm not sure you need to do anything on the PEFT side in your case.
However, the question of saving remains: do we save in the new or the old format?

By the way @zucchini-nlp, the checks performed for video preprocessor resolution also don't consider the case of adapters that don't have a config.json in the repo itself (but that's a different issue altogether).

In any case, I felt that fixing the loading was the most urgent part, because thousands of models have probably become unusable.

@ManuelFay (Contributor, Author):

One test failed: AttributeError -> 'NoneType' object has no attribute 'copy' (number of failed tests: 1).

Reverting back to the way it was

@BenjaminBossan (Member) commented on Jun 6, 2025

> The code I wrote here should enable all PEFT checkpoints linked to models in the VLMs list (for which an explicit mapping dictionary and remapping are specified) to work. I'm not sure you need to do anything on the PEFT side in your case.

Hmm, but PEFT itself does not go through transformers' model.load_adapter, so I'm unsure. Could you give an example of an (ideally small) model that was affected for you, so that I can test it in PEFT?

> However, the question of saving remains: do we save in the new or the old format?

Since PEFT will be applied on top of the new architecture, the PEFT checkpoint itself will also assume the new architecture. IIUC, this means that if a user saves a PEFT checkpoint with the new architecture, and another user with an older transformers version from before #37033 tries to load it, it will fail (no forward compatibility). Is that right @zucchini-nlp?

> Reverting back to the way it was

Ah yes, sorry, the copy would need to be conditioned on it being a dict.

@ManuelFay (Contributor, Author):

I don't know PEFT well, so you're surely right.

The mapping regexes are anchored (they check the beginning of the key, etc.), so checkpoints saved either way should load back correctly; we should be okay on that end.

I'm doing it with colqwen2-v1.0 personally, but I'm guessing some generative Qwen2-VL fine-tunes should be affected too, let me check!

@zucchini-nlp (Member):

Looks like this breaks more things than we expected. My expectation was that almost all external libraries that use transformers model classes would load checkpoints with from_pretrained(). Ideally, if a user loads and saves a checkpoint with transformers, it should be fully backward compatible: the checkpoint is mapped to the new keys when loading and remapped back to the old keys before saving.

> IIUC, this means that if a user saves a PEFT checkpoint with the new architecture, and another user with an older transformers version from before #37033 tries to load it, it will fail (no forward compatibility). Is that right @zucchini-nlp?

This therefore should not be a problem if we ignore the adapters for now. I believe the solution in this PR, which re-maps keys to their new style, makes it work for PEFT in the same way as in bare transformers.

Regarding the PEFT library, I am also not sure whether you need any changes, since I am not an expert. We can look at it together and check.

        model.hf_quantizer = hf_quantizer

        if _adapter_model_path is not None:
            adapter_kwargs["key_mapping"] = key_mapping
Member:

This will use any key_mapping provided by users, which we probably don't want yet. Let's check whether it is a VLM and apply it only in those cases.

@ManuelFay (Contributor, Author) commented on Jun 6, 2025

Please, let's not hardcode any more things :/ User-specified key mappings should be propagated as well, I believe.

It's the only way to propagate the mapping for models that inherit from VLM classes but are not in your hardcoded list. This is not a rare setting, and currently, passing a key_mapping to from_pretrained is the only viable workaround to the version bump.

Typically, in ColQwen2, I now have to do this:

class ColQwen2(Qwen2VLForConditionalGeneration):
    @classmethod
    def from_pretrained(cls, *args, **kwargs):
        key_mapping = kwargs.pop("key_mapping", None)
        if key_mapping is None:
            key_mapping = super()._checkpoint_conversion_mapping
        return super().from_pretrained(*args, **kwargs, key_mapping=key_mapping)

Member:

I agree that hardcoding is suboptimal; it was done for loading/saving to prevent users from using self._checkpoint_conversion_mapping as a hacky workaround in their own models. We don't aim to maintain self._checkpoint_conversion_mapping for all models and want to restrict it to VLMs only, for backward compatibility.

About applying key_mapping to all models, I am pinging core maintainer @Cyrilvallez. I'm not sure whether that was intended behavior when key_mapping was added; let's make sure it aligns with the other maintainers' plans :)

@Cyrilvallez (Member) commented on Jun 12, 2025

Hey! Indeed, key_mapping is public and is supposed to be used at the user's discretion. However, it's a niche argument, and we don't expect many people to use it. But it should still be propagated correctly in all cases, as is being done right now.

IMO, VLMs should have merged any mapping passed by the user into their own instead of overriding it, but again, since we don't expect people to use the argument except in very particular situations, it's not a big issue 👌
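
To illustrate the merge described here, a minimal sketch is shown below, with user-provided entries taking precedence over the class-level ones. The helper name is hypothetical; only the _checkpoint_conversion_mapping attribute comes from transformers.

```python
def merge_key_mappings(model_cls, user_key_mapping=None):
    """Merge a model class's default conversion mapping with a user-supplied one.

    Hypothetical helper for illustration: user-provided patterns override the
    class defaults on conflicts, instead of one mapping replacing the other entirely.
    """
    default_mapping = getattr(model_cls, "_checkpoint_conversion_mapping", None) or {}
    return {**default_mapping, **(user_key_mapping or {})}

# e.g. merged = merge_key_mappings(Qwen2VLForConditionalGeneration, {"^custom_head": "head"})
```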

@ManuelFay (Contributor, Author) commented on Jun 6, 2025

@BenjaminBossan Here's a minimal example with a stock, unmodified model architecture (and a PEFT adapter):

import torch
from transformers import AutoProcessor, Qwen2VLForConditionalGeneration

# Load processor and model; the adapter repo contains old-style PEFT weights
processor = AutoProcessor.from_pretrained("Qwen/Qwen2-VL-2B-Instruct")
model = Qwen2VLForConditionalGeneration.from_pretrained(
    "lightonai/MonoQwen2-VL-v0.1",
    device_map="cuda",
    # attn_implementation="flash_attention_2",
    # torch_dtype=torch.bfloat16,
)

Loading adapter weights from lightonai/MonoQwen2-VL-v0.1 led to unexpected keys not found in the model: model.layers.0.mlp.down_proj.lora_A.default.weight, model.layers.0.mlp.down_proj.lora_B.default.weight, model.layers.0.mlp.gate_proj.lora_A.default.weight, model.layers.0.mlp.gate_proj.lora_B.default.weight, model.layers.0.mlp.up_proj.lora_A.default.weight, model.layers.0.mlp.up_proj.lora_B.default.weight, ... model.language_model.layers.27.mlp.down_proj.lora_A.default.weight, model.language_model.layers.27.mlp.down_proj.lora_B.default.weight

@BenjaminBossan (Member):

Thanks for the pointer. I ran my own tests to check all directions, using transformers v4.49.0 as the old state (which should predate the mentioned PR) and transformers main (8c59cdb3f81f9f4b8e7b8d05b92d40c2e9413788) as the new state. This is the script I used:

import os

from transformers import Qwen2VLForConditionalGeneration
from peft import LoraConfig, PeftModel, get_peft_model


model_id = "Qwen/Qwen2-VL-2B-Instruct"
path = "/tmp/peft/38627"

# Load processor and model
model = Qwen2VLForConditionalGeneration.from_pretrained(model_id)

if not os.path.exists(path + "/adapter_model.safetensors"):
    # Define LoRA configuration
    lora_config = LoraConfig(target_modules=["q_proj", "v_proj"], init_lora_weights=False)
    # Create PEFT model
    peft_model = get_peft_model(model, lora_config)
    peft_model.save_pretrained(path)
    print("no existing PEFT model found, created and saved new one at", path)
else:
    # Load existing PEFT model
    peft_model = PeftModel.from_pretrained(model, path)
    print("loaded existing PEFT model from", path)

Indeed, there are now issues when the version used to save the PEFT checkpoint and to load it are not identical:

| test | create version | load version | outcome |
|------|----------------|--------------|---------|
| 1 | v4.49.0 | v4.49.0 | works |
| 2 | main | main | works |
| 3 | v4.49.0 | main | UserWarning: Found missing adapter keys while loading the checkpoint: ['base_model.model.model.language_model.layers.0.self_attn.q_proj.lora_A.default.weight', ...] |
| 4 | main | v4.49.0 | UserWarning: Found missing adapter keys while loading the checkpoint: ['base_model.model.model.layers.0.self_attn.q_proj.lora_A.default.weight', ...] |

To explain why this happens: PEFT adds new modules on top of the base model and also wraps it. So if the base model has a parameter at model.layers.0.self_attn.q_proj, in the PEFT model it lives at base_model.model.model.layers.0.self_attn.q_proj. When save_pretrained is called on a PEFT model, only the PEFT part is stored, with keys like base_model.model.model.layers.0.self_attn.q_proj.lora_A.default.weight. Now, when the model structure changes, these keys no longer fit, resulting in the warnings shown above.

I think what we would need to do in PEFT is apply the same mapping as in this PR to the PEFT checkpoint. I'm not sure whether we have access to key_mapping inside PeftModel.from_pretrained, though; help would be appreciated.
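
To make the idea concrete, here is a minimal sketch of remapping a PEFT state_dict with the base model's mapping. The "base_model.model." prefix handling and the example mapping are assumptions for illustration, not the actual PEFT implementation.

```python
import re

PEFT_PREFIX = "base_model.model."  # assumed LoRA-style wrapper prefix, for illustration

def remap_peft_state_dict(peft_state_dict, key_mapping):
    """Apply a base-model key_mapping (regex -> replacement) to PEFT adapter keys.

    Strips the PEFT wrapper prefix, rewrites the base-model part of the key,
    then re-adds the prefix. Sketch only, not the actual PEFT code.
    """
    remapped = {}
    for key, value in peft_state_dict.items():
        if key.startswith(PEFT_PREFIX):
            inner = key[len(PEFT_PREFIX):]
            for pattern, replacement in key_mapping.items():
                inner, n_subs = re.subn(pattern, replacement, inner)
                if n_subs > 0:
                    break
            key = PEFT_PREFIX + inner
        remapped[key] = value
    return remapped

# Example: an old-style LoRA key is rewritten to the new language_model path.
mapping = {r"^model\.layers": "model.language_model.layers"}  # illustrative only
old = {"base_model.model.model.layers.0.self_attn.q_proj.lora_A.default.weight": None}
print(remap_peft_state_dict(old, mapping))
```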

@ManuelFay (Contributor, Author) commented on Jun 6, 2025

Thanks for investigating!

In the main lib, conversion is done by hardcoding a list of models for which we want to use their specified mapping attribute (_checkpoint_conversion_mapping). I'm guessing you could take the same list (called VLMs) and apply the mapping from the base model in the same way automatically? Having a key_mapping argument might of course also help non-standard use cases.

This might be a bit independent from this PR, though, since it's in PEFT code?
The example I sent should be fixed just by merging this PR (backward compatibility, at least).

@BenjaminBossan (Member):

> I'm guessing you could take the same list (called VLMs) and apply the mapping from the base model in the same way automatically?

Yes, this is true. I prototyped a patch that does that, and it indeed works, making #37033 backwards compatible with older checkpoints. However, this still leaves the problem of using a checkpoint created after #37033 with a transformers version from before it (forward compatibility). This doesn't work, as the (inverse) mapping does not exist in older transformers versions. I also don't think there is a nice way to detect this so that we can instruct users to upgrade transformers; we could only try to make guesses. IMO this is not very nice, as I imagine it could be a real issue for users. @zucchini-nlp do you have an idea how to address this?
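
To illustrate what deriving the reverse mapping could mean, a naive inversion is sketched below, assuming the entries are plain, anchored prefix rewrites (real _checkpoint_conversion_mapping entries are regexes, which need more careful handling). As noted above, this only helps where the code has access to the mapping, which old transformers versions do not.

```python
def naive_inverse_mapping(key_mapping):
    """Swap old->new prefix rewrites into new->old ones (illustration only).

    Assumes every entry is a plain prefix rewrite anchored with '^'; regex
    features such as lookaheads or capture groups are not handled here.
    """
    return {"^" + new: old.lstrip("^") for old, new in key_mapping.items()}

forward = {"^model.layers": "model.language_model.layers"}  # illustrative only
print(naive_inverse_mapping(forward))
# {'^model.language_model.layers': 'model.layers'}
```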

> This might be a bit independent from this PR, though, since it's in PEFT code?

Yes, sorry for spamming this PR. I can create a separate issue.

@ManuelFay (Contributor, Author):

Very cool! Haha, no problem at all of course; I actually wasn't sure, since I hadn't looked closely at the scope of the "integrations.peft" transformers code. Thanks a ton for looking into all of this!

BenjaminBossan added a commit to BenjaminBossan/peft that referenced this pull request Jun 6, 2025
Follow up to huggingface#2554
See discussion in huggingface/transformers#38627

To quote:

> transformers PR #37033 re-arranges the way visual language models are
built by moving the LM head from the language model to the top-level
VLM (among other things).

A consequence of this is that the keys in the PEFT state_dict now also
follow the new architecture. This means that:

1. If a PEFT checkpoint was saved with the old architecture but is
   loaded with the new architecture, loading fails.
2. If a PEFT checkpoint was saved with the new architecture but is
   loaded with the old architecture, loading fails.

1. can be addressed by making use of the newly added
_checkpoint_conversion_mapping attribute for models with the new
architecture. In transformers, this is used to map old model state_dicts
to the new state_dict format. In PEFT, with some fiddling, we can use
the same mapping to make old PEFT state_dicts compatible with the new
architecture (backwards compatibility).

However, 2. is not easily addressed. We would need a reverse mapping for
this. This could be easily derived from _checkpoint_conversion_mapping,
but since this attribute doesn't exist on old models, we cannot do that.
Therefore, new checkpoints created with PEFT on these models won't load
successfully when users use old transformers (forward compatibility).

These cases are covered by the added unit tests, which means that the
test covering case 2 currently fails.

If we could reliably detect that we are in case 2, we could warn the
user and advise them to upgrade transformers, but I don't know if it's
possible to figure this out.

Note that we skip prompt learning methods when applying the mapping.
This is because they don't have the "base_model.model." prefix, which we
need to remove before mapping; instead they just use "base_model.". This
could be fine, and we could remove only "base_model.". However, the
subsequent sub-module could also be called "model", resulting in what
looks like "base_model.model.". To avoid this confusion, we skip prompt
learning. Since it should be applied to the language model part directly
and applies itself on the outer model (unlike LoRA et al.), skipping
should be fine.

We also allow users to pass their own key_mapping to from_pretrained and
load_adapter, though the documentation advises against it. This argument
could theoretically be used as a workaround in case there is indeed an
issue with prompt learning state_dicts.

Apart from these changes, I also made a small change to account for
huggingface/transformers#38017 (comment).

@BenjaminBossan (Member):

I created a PR to address this in PEFT: huggingface/peft#2574. I used fake mini models to mimic the old and new model architectures for testing. The tests are still failing because of the issue I mentioned above.

@ManuelFay (Contributor, Author):

Hello @zucchini-nlp, any news?

@zucchini-nlp (Member):

Let's wait for a core maintainer's review, ping @Cyrilvallez :)

@Cyrilvallez (Member) left a comment


Hey! I must say I really like this; it's true that I did not think of the adapters when introducing key_mapping!
However, this does not solve the issue at hand in all cases: if people first load the model and then use load_adapter (i.e. outside of from_pretrained), it won't work with VLMs. So load_adapter should have the same default key_mapping as from_pretrained in general, to keep everything consistent!
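
As a sketch of the default being asked for here, a standalone load_adapter call could fall back to the model's own conversion mapping when the caller passes nothing. The helper below is illustrative, not the actual transformers code; only the _checkpoint_conversion_mapping attribute is taken from the library.

```python
def resolve_adapter_key_mapping(model, key_mapping=None):
    """Pick the key_mapping to use for adapter loading (illustrative helper).

    Uses the caller-supplied mapping if given, otherwise falls back to the
    model's own _checkpoint_conversion_mapping (present on the affected VLMs).
    """
    if key_mapping is not None:
        return key_mapping
    return getattr(model, "_checkpoint_conversion_mapping", None)
```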


@ManuelFay (Contributor, Author):

Thanks a lot for taking a look @Cyrilvallez! Do you have any further comments @zucchini-nlp?

@zucchini-nlp (Member) left a comment


Nope, no comments. Thanks, let's merge!

@zucchini-nlp zucchini-nlp enabled auto-merge (squash) June 16, 2025 06:37
@Cyrilvallez Cyrilvallez disabled auto-merge June 16, 2025 08:09
@Cyrilvallez Cyrilvallez merged commit ce6ac53 into huggingface:main Jun 16, 2025
19 of 20 checks passed
BenjaminBossan added a commit to huggingface/peft that referenced this pull request Jun 23, 2025
FIX Transformers VLM architecture changes

yao-matrix pushed a commit to yao-matrix/peft that referenced this pull request Jun 25, 2025
FIX Transformers VLM architecture changes

efraimdahl pushed a commit to efraimdahl/peft that referenced this pull request Jul 12, 2025
FIX Transformers VLM architecture changes

Guy-Bilitski pushed a commit to Guy-Bilitski/UIOrthoLoRA that referenced this pull request Feb 5, 2026
FIX Transformers VLM architecture changes

AngledLuffa added a commit to stanfordnlp/stanza that referenced this pull request Mar 17, 2026