Add Qwen3.5 support for sequence classification #44406
- Introduced Qwen3_5ForSequenceClassification class in modeling_qwen3_5.py and modular_qwen3_5.py.
- Updated MODEL_FOR_SEQUENCE_CLASSIFICATION_MAPPING_NAMES to include new Qwen3.5 models.
[For maintainers] Suggested jobs to run (before merge)
run-slow: auto, qwen3_5
zucchini-nlp left a comment
The PR lgtm. We usually don't add a task class for a model if there are no weights on the Hub for that specific task. Though with the GenericSequenceClassifier I think we can even automatically create sequence classification tasks on top of base models
Just to confirm @zucchini-nlp, are you suggesting we enable sequence classification generically for base checkpoints like That makes sense regarding not expanding the task surface when no task-specific weights exist on the Hub.
Ah no, it's fine for this PR. I was just thinking out loud :) Let's merge then
I saw that the current sequence classifier is configured via
which seems to ignore the vision tower and is not ideal for VLM-type tasks. Is this intended? I'm not very familiar with the whole
nope, kinda related to #44625. Sorry, it got swept off as a lower prio task from my list, will do smth with it
Adds sequence-classification support for Qwen3.5 in AutoModelForSequenceClassification.
What does this PR do?
This PR enables loading Qwen3.5 checkpoints with AutoModelForSequenceClassification, which previously failed with:
ValueError: Unrecognized configuration class Qwen3_5Config for AutoModelForSequenceClassification.
Changes
- src/transformers/models/qwen3_5/modular_qwen3_5.py
- generated src/transformers/models/qwen3_5/modeling_qwen3_5.py
__all__.
- src/transformers/models/auto/modeling_auto.py
  - ("qwen3_5", "Qwen3_5ForSequenceClassification")
  - ("qwen3_5_text", "Qwen3_5ForSequenceClassification")
Why both mappings?
Qwen3.5 uses a composite VLM config (qwen3_5) with a text sub-config (qwen3_5_text).
Registering both ensures classification works for direct text config usage and composite config loading paths.
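To illustrate why both keys matter, here is a toy sketch of a model-type lookup. It is not the transformers implementation (the real auto classes resolve through config classes and lazy imports); SEQ_CLS_MAPPING_NAMES and resolve_class_name are hypothetical names made up for this example, and only the two entries listed above are taken from the PR.

```python
from collections import OrderedDict

# Simplified stand-in for MODEL_FOR_SEQUENCE_CLASSIFICATION_MAPPING_NAMES.
# Only the two entries named in this PR are included.
SEQ_CLS_MAPPING_NAMES = OrderedDict(
    [
        ("qwen3_5", "Qwen3_5ForSequenceClassification"),
        ("qwen3_5_text", "Qwen3_5ForSequenceClassification"),
    ]
)


def resolve_class_name(model_type, mapping=SEQ_CLS_MAPPING_NAMES):
    """Hypothetical helper: return the class name registered for a model type."""
    try:
        return mapping[model_type]
    except KeyError:
        # Toy error message, not the one transformers emits.
        raise ValueError(
            f"Unrecognized model type {model_type!r} for sequence classification."
        ) from None


# Both the composite config type and the text sub-config type resolve.
composite_cls = resolve_class_name("qwen3_5")
text_cls = resolve_class_name("qwen3_5_text")

# With only one key registered, the other loading path would fail.
composite_only = OrderedDict([("qwen3_5", "Qwen3_5ForSequenceClassification")])
try:
    resolve_class_name("qwen3_5_text", composite_only)
    text_path_ok = True
except ValueError:
    text_path_ok = False
```

In this toy model, dropping either entry breaks one of the two loading paths, which is the situation the double registration is meant to avoid.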
Before:
Loading from e.g. Qwen/Qwen3.5-0.8B raises ValueError: Unrecognized configuration class Qwen3_5Config for AutoModelForSequenceClassification.
After this PR:
Loading from Qwen/Qwen3.5-0.8B now resolves to Qwen3_5ForSequenceClassification.
@Cyrilvallez @zucchini-nlp @ArthurZucker
Fixes #44405
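For reference, a minimal usage sketch of the behavior described above, assuming a transformers build that includes this change, an installed torch, and network access to the Qwen/Qwen3.5-0.8B checkpoint. The classify function and its num_labels default are illustrative choices, not part of the PR. As noted in the review, the Hub has no task-specific weights, so the classification head would be newly initialized and its scores are not meaningful until fine-tuned.

```python
def classify(text, model_id="Qwen/Qwen3.5-0.8B", num_labels=2):
    """Sketch: load a checkpoint with a sequence-classification head and score one input."""
    # Imported lazily so that defining this function does not require
    # torch or transformers to be installed.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # Assumption: with this PR applied, the auto class resolves to
    # Qwen3_5ForSequenceClassification instead of raising ValueError.
    model = AutoModelForSequenceClassification.from_pretrained(
        model_id, num_labels=num_labels
    )
    model.eval()

    # A single, unpadded input avoids needing a pad token for pooling.
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    return logits


if __name__ == "__main__":
    print(classify("This change looks good to me."))
```

Batched inputs would additionally need a pad token configured on the model, which this sketch sidesteps by scoring one text at a time.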