[processors] Unbloating simple processors#40377

Merged
zucchini-nlp merged 17 commits into huggingface:main from zucchini-nlp:clean-processors
Sep 10, 2025

Conversation

@zucchini-nlp
Member

@zucchini-nlp zucchini-nlp commented Aug 22, 2025

What does this PR do?

I think most processors don't have special functions beyond passing each modality to its own preprocessor and combining the outputs. This PR is an attempt to modularize the processors' __call__ method: we define a default __call__ and delete model-specific code in processor files whenever it is the same as the default one.

Currently we have a few patterns in processors, so imo we can have model-specific methods to handle input preparation etc., and keep the common code in the Mixin:

  1. Simply combine the outputs from each attribute
  2. Expand text sequences with special tokens and then combine the outputs
  3. Special handling for bboxes and other image-like inputs

I will split this into several PRs to make the review process easier and faster. This PR only starts with the easy processors.
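A minimal sketch of what such a shared default __call__ could look like for pattern 1 (the class and attribute names below are illustrative only, not the PR's actual implementation):

```python
# Hypothetical sketch of a shared default __call__ that dispatches each
# modality to its own preprocessor and merges the outputs (pattern 1).
# Names like SimpleProcessorMixin are made up for illustration.
class SimpleProcessorMixin:
    def __call__(self, text=None, images=None, **kwargs):
        outputs = {}
        if text is not None:
            # delegate text to the tokenizer and merge its features
            outputs.update(self.tokenizer(text, **kwargs))
        if images is not None:
            # delegate images to the image processor and merge its features
            outputs.update(self.image_processor(images, **kwargs))
        return outputs


class DummyProcessor(SimpleProcessorMixin):
    """Toy processor whose sub-processors are plain callables."""

    def __init__(self):
        self.tokenizer = lambda text, **kw: {"input_ids": [[1, 2, 3]]}
        self.image_processor = lambda images, **kw: {"pixel_values": [[0.5]]}
```

A model whose processor matches this default could then delete its own __call__ entirely; only processors needing extra behavior (token expansion, bbox handling) would keep an override.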

@molbap
Contributor

molbap commented Aug 22, 2025

Very nice initiative, it's something that has been bothering me for a while. It could also be the occasion to allow users to pipe in their custom processing, the same way we externalize attention classes. I know this is something users sometimes request, especially those concerned with processing in their training loop.
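One way custom processing could be piped in, by loose analogy with the attention-function registry, might look like the following (the registry name and decorator are hypothetical, not an existing transformers API):

```python
# Hypothetical registry for user-supplied processing callables, analogous
# in spirit to how transformers externalizes attention implementations.
# None of these names exist in transformers; this is only a sketch.
CUSTOM_PROCESSING_FUNCTIONS = {}


def register_processing(name):
    """Decorator that registers a custom processing function under `name`."""
    def decorator(fn):
        CUSTOM_PROCESSING_FUNCTIONS[name] = fn
        return fn
    return decorator


@register_processing("lowercase_text")
def lowercase_text(text):
    # example user hook: normalize text before tokenization
    return text.lower()


def apply_processing(name, *args, **kwargs):
    # a processor could look up and call the user's hook by name
    return CUSTOM_PROCESSING_FUNCTIONS[name](*args, **kwargs)
```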

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@zucchini-nlp
Member Author

It could also be the occasion to allow users to have their custom processing piped in, same way we externalize attention classes.

Yeah, 100%, would love to sort out processors. Let's break BC taking advantage of v5! 😆

@zucchini-nlp zucchini-nlp requested a review from qubvel August 25, 2025 10:34
@zucchini-nlp
Member Author

Failing tests are unrelated and are failing on main as well

@zucchini-nlp
Member Author

@bot /style

Contributor

@molbap molbap left a comment


👀 unbloat unbloat 👀

Comment thread src/transformers/models/bros/processing_bros.py
processor = self.processor_class.from_pretrained(
    "deepseek-community/Janus-Pro-1B",
    extra_special_tokens=special_image_tokens,
    **self.prepare_processor_dict(),
)
Contributor


maybe declare it above to avoid the nesting

@zucchini-nlp
Member Author

@qubvel gentle ping ;)

Contributor

@qubvel qubvel left a comment


Thanks! Very nice unbloating 🔥 🔥 🔥

The only thing that caught my eye is a changed signature; it would be perfect to keep it:

main: [screenshot of the __call__ signature on main]

PR: [screenshot of the __call__ signature in this PR]

Comment thread src/transformers/models/clap/processing_clap.py Outdated
@zucchini-nlp
Member Author

zucchini-nlp commented Sep 8, 2025

Hmm, I guess that was caused by a different PR, and it is already the second option on the main branch. But I get the idea that processor-specific kwargs (if any) will not be in the typing.

@qubvel
Contributor

qubvel commented Sep 8, 2025

Hmm ig that was caused by #40676 and it is as the second option in main branch. But I get the idea that processor specific kwargs (if any) will not be in typing

ahh, thanks for letting me know, it seems I didn't pull the latest changes 😄

@github-actions
Contributor

github-actions Bot commented Sep 9, 2025

[For maintainers] Suggested jobs to run (before merge)

run-slow: align, altclip, bridgetower, bros, chameleon, chinese_clip, clap, clip, clipseg, clvp, colpali, deepseek_vl, deepseek_vl_hybrid, donut, emu3, flava

@zucchini-nlp
Member Author

zucchini-nlp commented Sep 10, 2025

Since the typing hints are not changed in comparison to main, merging. I explored whether we could have different kwargs typing for specific models, but it seems that Generic doesn't work well with Unpack yet :(

I'll see if there are any better options

@zucchini-nlp zucchini-nlp merged commit 08edec9 into huggingface:main Sep 10, 2025
23 checks passed
vijayabhaskar-ev pushed a commit to vijayabhaskar-ev/transformers that referenced this pull request Oct 2, 2025
* modularize processor - step 1

* typos

* why raise error, super call check it also

* tiny update

* fix copies

* fix style and test

* lost an import / fix copies

* fix tests

* oops deleted accidentally
yuchenxie4645 pushed a commit to yuchenxie4645/transformers that referenced this pull request Oct 4, 2025

4 participants