🚨 Generic Sequence Classifier works for multimodal models#44664

Open
zucchini-nlp wants to merge 3 commits into huggingface:main from zucchini-nlp:sequence-clf

@zucchini-nlp zucchini-nlp commented Mar 13, 2026

Fixes #44625 and #44406 (comment)

After a long discussion, I guess this is the easiest and most straightforward way for us to use Generic classes with multimodal models and keep their signature. Overloading didn't work as I expected 😓

And I am quite reluctant to use TypedDicts, as we will have a lot of unrelated args even after filtering by modality.
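
To make the TypedDict concern concrete, here is a minimal sketch. It is not the signature used in transformers: `ImageKwargs`, its key set, and the helper `unrelated_keys` are hypothetical and only illustrate that a per-modality TypedDict can still list keys a given model never consumes.

```python
from typing import TypedDict


class ImageKwargs(TypedDict, total=False):
    # Hypothetical "image modality" kwargs shared by a generic classifier.
    # Different vision-language models consume different subsets of these.
    pixel_values: object
    image_grid_thw: object
    image_sizes: object
    pixel_attention_mask: object


def unrelated_keys(kwargs_type, model_uses):
    """Return the keys of `kwargs_type` that a given model does not use."""
    all_keys = kwargs_type.__required_keys__ | kwargs_type.__optional_keys__
    return sorted(all_keys - set(model_uses))


# Assumption for illustration: a model that only reads these two image inputs.
leftover = unrelated_keys(ImageKwargs, {"pixel_values", "image_grid_thw"})
print(leftover)
```

Even with kwargs already narrowed to one modality, the hypothetical model above is still advertised as accepting two keys it ignores.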

Working example:

from transformers import AutoConfig, AutoModelForSequenceClassification, Qwen3_5TextForSequenceClassification

model_name = "onnx-internal-testing/tiny-random-Qwen3_5ForConditionalGeneration"
config = AutoConfig.from_pretrained(model_name, num_labels=1)
model = AutoModelForSequenceClassification.from_pretrained(model_name, config=config)
print(model.config.num_labels == 1)

# suppose a text-only classifier is saved on the hub
text_model = Qwen3_5TextForSequenceClassification(config.text_config)
text_model.save_pretrained("tmp_dir")
text_config = AutoConfig.from_pretrained("tmp_dir", num_labels=5)
text_model = AutoModelForSequenceClassification.from_pretrained("tmp_dir", config=text_config)
print(text_model.config.num_labels == 5)

This should work for Sentence Transformers (ST).

@zucchini-nlp zucchini-nlp requested a review from tomaarsen March 13, 2026 13:39

class Qwen3_5ForSequenceClassification(GenericForSequenceClassification, Qwen3_5PreTrainedModel):
    config: Qwen3_5TextConfig
    config_class = AutoConfig
@zucchini-nlp (Member, Author) commented:

The main change: allow a single class to be loaded either as an LLM (text config) or as a multimodal LM (VLM config).

Not sure if it aligns well with transformers tho
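
As a rough illustration of the idea above (one class accepting either kind of config), here is a self-contained sketch. It does not use transformers: `ToyTextConfig`, `ToyVLMConfig`, and `ToySequenceClassifier` are hypothetical stand-ins, and the real dispatch in the PR may differ.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTextConfig:
    # Hypothetical stand-in for a text-only model config.
    hidden_size: int = 8
    num_labels: int = 2


@dataclass
class ToyVLMConfig:
    # Hypothetical stand-in for a composite config that nests a text config.
    text_config: ToyTextConfig = field(default_factory=ToyTextConfig)
    num_labels: int = 2


class ToySequenceClassifier:
    def __init__(self, config):
        self.config = config
        # Composite configs keep language-model settings under `text_config`;
        # a plain text config is used directly.
        text_config = getattr(config, "text_config", config)
        self.is_multimodal = text_config is not config
        self.hidden_size = text_config.hidden_size
        # The label count is read from the config the user passed in,
        # so it does not depend on the nested text config.
        self.num_labels = config.num_labels


text_clf = ToySequenceClassifier(ToyTextConfig(num_labels=5))
vlm_clf = ToySequenceClassifier(ToyVLMConfig(num_labels=1))
print(text_clf.is_multimodal, text_clf.num_labels)
print(vlm_clf.is_multimodal, vlm_clf.num_labels)
```

Reading `num_labels` from the top-level config in this sketch mirrors the behaviour the working example checks for, where the value set on the loaded config is the one the classifier ends up with.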

@vasqu vasqu left a comment

Some initial comments. I have some doubts because the signature does not properly reflect VLMs. Not sure whether it would make more sense to have a dedicated one instead.

Also, are these really the only ones that use it? A bit surprising but oh well
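
The signature concern can be sketched in plain Python. This is not the code in `modeling_layers.py`; `ToyVLMBackbone` and `ToyGenericClassifier` are hypothetical, and the example only assumes that a generic `forward` forwards extra keyword arguments to its backbone.

```python
import inspect


class ToyVLMBackbone:
    # Hypothetical multimodal backbone that accepts image inputs.
    def __call__(self, input_ids=None, attention_mask=None, pixel_values=None):
        return {"saw_pixels": pixel_values is not None}


class ToyGenericClassifier:
    # Hypothetical generic head with a text-style signature.
    def __init__(self, backbone):
        self.backbone = backbone

    def forward(self, input_ids=None, attention_mask=None, **kwargs):
        # Extra inputs such as pixel_values only travel through **kwargs.
        return self.backbone(input_ids=input_ids, attention_mask=attention_mask, **kwargs)


model = ToyGenericClassifier(ToyVLMBackbone())
visible = list(inspect.signature(model.forward).parameters)
output = model.forward(input_ids=[1, 2], pixel_values=[[0.0]])
print(visible)
print(output)
```

The call works, but `pixel_values` never appears in the visible signature, which is the trade-off being weighed against a dedicated multimodal class.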

Comment thread src/transformers/models/qwen3_5/modular_qwen3_5.py Outdated
Comment thread tests/test_pipeline_mixin.py Outdated
Comment thread src/transformers/modeling_layers.py
@zucchini-nlp zucchini-nlp changed the title Generic Sequence Classifier works for multimodal models [WIP] Generic Sequence Classifier works for multimodal models Apr 16, 2026
@github-actions

[For maintainers] Suggested jobs to run (before merge)

run-slow: auto, gemma3, qwen3_5

@zucchini-nlp zucchini-nlp changed the title [WIP] Generic Sequence Classifier works for multimodal models Generic Sequence Classifier works for multimodal models Apr 16, 2026
@github-actions

View the CircleCI Test Summary for this PR:

https://huggingface.co/spaces/transformers-community/circle-ci-viz?pr=44664&sha=277261

@zucchini-nlp

The failing test is unrelated and due to a bad regex, same as in #45313. The previous fix didn't cover all cases 😢

@vasqu vasqu left a comment

Let's add 🚨 to the title. I hope not too many users relied on the exact class name for Qwen 3.5 text here.

But I think this is the best solution for now. Sad that the override didn't work out 😢

@zucchini-nlp zucchini-nlp changed the title Generic Sequence Classifier works for multimodal models 🚨 Generic Sequence Classifier works for multimodal models Apr 23, 2026
Development

Successfully merging this pull request may close these issues.

Qwen3.5 num_labels not propagated from core config to text config

3 participants