🚨 Generic Sequence Classifier works for multimodal models#44664
zucchini-nlp wants to merge 3 commits into huggingface:main from
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
```python
class Qwen3_5ForSequenceClassification(GenericForSequenceClassification, Qwen3_5PreTrainedModel):
    config: Qwen3_5TextConfig
    config_class = AutoConfig
```
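To illustrate the idea behind the snippet above, here is a minimal, self-contained sketch (all class names besides `GenericForSequenceClassification` are illustrative, not the actual transformers API) of a generic classification head that works whether it is given a plain text config or a multimodal config that nests one under `text_config`:

```python
# Hypothetical sketch: a single generic class resolves the effective text
# config, so it can be loaded from an LLM config or a VLM config alike.

class TextConfig:
    """Stand-in for a plain LLM config."""
    def __init__(self, hidden_size, num_labels=2):
        self.hidden_size = hidden_size
        self.num_labels = num_labels

class MultimodalConfig:
    """Stand-in for a VLM config that nests the text config."""
    def __init__(self, text_config):
        self.text_config = text_config

class GenericForSequenceClassification:
    def __init__(self, config):
        # Multimodal configs nest the text config; plain configs are used as-is.
        text_config = getattr(config, "text_config", config)
        self.hidden_size = text_config.hidden_size
        self.num_labels = text_config.num_labels

# Works with a plain LLM config...
llm = GenericForSequenceClassification(TextConfig(hidden_size=64))
# ...and with a multimodal config wrapping the same text config.
vlm = GenericForSequenceClassification(MultimodalConfig(TextConfig(hidden_size=64)))
```

The `getattr(config, "text_config", config)` resolution is the key trick: the class signature stays the same, and only the config lookup branches on modality.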
The main change: allow a single class to be loaded as an LLM (text_config) or as a multimodal LM (VLM config).
Not sure if it aligns well with transformers, though.
Some initial comments. I have some doubts because the signature does not properly reflect VLMs. Not sure if it would make more sense to have a dedicated class instead.
Also, are these really the only ones that use it? A bit surprising, but oh well.
Force-pushed from 5968cdc to 19ab5fb
[For maintainers] Suggested jobs to run (before merge): run-slow: auto, gemma3, qwen3_5
View the CircleCI Test Summary for this PR: https://huggingface.co/spaces/transformers-community/circle-ci-viz?pr=44664&sha=277261
The failing test is unrelated and due to a bad regex, same as in #45313. The previous fix didn't cover all cases 😢
vasqu left a comment
Let's add 🚨 to the title; I hope not too many relied on the exact class name for Qwen 3.5 text here.
But I think this is the best solution for now. Sad that the override didn't work out 😢
Fixes #44625 and #44406 (comment)

After a long discussion, I guess this is the easiest and most straightforward way for us to use `Generic` classes with multimodals and keep their signature. Overloading didn't work as I expected 😓 And I am quite reluctant to use TypedDicts, as we would have a lot of unrelated args even after filtering by modality.

Working example:

This should work for ST
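For context on the rejected TypedDict alternative mentioned above, here is a hypothetical sketch (all names are illustrative, not transformers code) of why it is unattractive: even after filtering keys by modality, a per-model kwargs TypedDict still carries args that are irrelevant for any given model.

```python
# Hypothetical sketch of the rejected alternative: one TypedDict describing
# forward kwargs across modalities, filtered down at call time.
from typing import TypedDict

class MultimodalForwardKwargs(TypedDict, total=False):
    input_ids: list
    attention_mask: list
    pixel_values: list     # vision-only
    image_grid_thw: list   # vision-only
    input_features: list   # audio-only

def filter_by_modality(kwargs: dict, modality: str) -> dict:
    """Drop keys belonging to the other modality; text keys always pass."""
    vision_keys = {"pixel_values", "image_grid_thw"}
    audio_keys = {"input_features"}
    drop = audio_keys if modality == "vision" else vision_keys
    return {k: v for k, v in kwargs.items() if k not in drop}

filtered = filter_by_modality(
    {"input_ids": [1], "pixel_values": [2], "input_features": [3]},
    modality="vision",
)
```

Even after filtering, a vision model's signature would still advertise every vision key whether or not that particular architecture uses it, which is the "unrelated args" problem the description refers to.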