[processors] Unbloating simple processors #40377
zucchini-nlp merged 17 commits into huggingface:main
|
Very nice initiative, it's something that has been bothering me for a while. It could also be the occasion to allow users to have their custom processing piped in, the same way we externalize attention classes. I know it's something users sometimes request, especially those concerned with processing in their training loop.
Yeah, 100%, would love to sort out processors. Let's break BC, taking advantage of v5! 😆
|
Failing tests are unrelated and are failing on main as well.
processor = self.processor_class.from_pretrained(
    "deepseek-community/Janus-Pro-1B",
    extra_special_tokens=special_image_tokens,
    **self.prepare_processor_dict(),
maybe declare it above to avoid the nesting
|
@qubvel gentle ping ;)
|
Hmm, I guess that was caused by a different PR, and it is like the second option on the main branch. But I get the idea that processor-specific kwargs (if any) will not be in the typing.
ahh, thanks for letting me know, it seems I didn't pull the latest changes 😄
|
[For maintainers] Suggested jobs to run (before merge) run-slow: align, altclip, bridgetower, bros, chameleon, chinese_clip, clap, clip, clipseg, clvp, colpali, deepseek_vl, deepseek_vl_hybrid, donut, emu3, flava |
|
Since the type hints are unchanged compared to main, merging. I explored a way if we want to have different typing in kwargs for specific models, but seems that I'll see if there are any better options
* modularize processor - step 1
* typos
* why raise error, super call check it also
* tiny update
* fix copies
* fix style and test
* lost an import / fix copies
* fix tests
* oops deleted accidentally


What does this PR do?
I think most processors don't do anything special beyond passing each modality to its own preprocessor and combining the outputs. This PR is an attempt to modularize the processor's `__call__` method: we define a default call and delete model-specific code in processor files if it is the same as the default one.

Currently we have a few patterns in processors, so imo we can have model-specific methods to handle preparing inputs etc., and keep common code in the Mixin

`bboxes` and other image-like inputs

I will split it into several PRs to make the review process easier and faster. This PR only starts with the easy processors.
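To make the idea concrete, here is a minimal sketch of a default `__call__` that lives in a mixin, with a model-specific hook for input preparation. It is a toy illustration, not the code from this PR and not the transformers API: `DefaultCallMixin`, `modality_to_attribute`, `prepare_inputs`, and the toy sub-processors are all hypothetical names.

```python
class DefaultCallMixin:
    """Hypothetical stand-in for a processor mixin; not the transformers API."""

    # Maps a __call__ argument name to the attribute holding its sub-processor.
    modality_to_attribute = {"text": "tokenizer", "images": "image_processor"}

    def __init__(self, **sub_processors):
        for name, fn in sub_processors.items():
            setattr(self, name, fn)

    def prepare_inputs(self, **inputs):
        """Hook for model-specific preparation; the default changes nothing."""
        return inputs

    def __call__(self, **inputs):
        inputs = self.prepare_inputs(**inputs)
        outputs = {}
        for modality, value in inputs.items():
            if value is None:
                continue  # modality not provided, skip it
            attribute = self.modality_to_attribute.get(modality)
            sub_processor = getattr(self, attribute, None) if attribute else None
            if sub_processor is None:
                raise ValueError(f"No sub-processor for modality {modality!r}")
            # Each sub-processor returns a dict; merge them into one output.
            outputs.update(sub_processor(value))
        return outputs


def toy_tokenizer(text):
    # Toy "tokenizer": one id per whitespace-separated word (its length).
    return {"input_ids": [len(word) for word in text.split()]}


def toy_image_processor(images):
    # Toy "image processor": rescale pixel values to the [0, 1] range.
    return {"pixel_values": [[p / 255 for p in image] for image in images]}


class SimpleProcessor(DefaultCallMixin):
    """A "simple" processor: nothing to override, the default call is enough."""


class PlaceholderProcessor(DefaultCallMixin):
    """A processor that only customizes input preparation."""

    def prepare_inputs(self, **inputs):
        if inputs.get("images") is not None and inputs.get("text") is not None:
            # Prepend one placeholder token per image to the text.
            inputs["text"] = "<image> " * len(inputs["images"]) + inputs["text"]
        return inputs
```

With this layout, a simple processor file shrinks to a class declaration, and a processor with extra logic overrides only the hook:

```python
simple = SimpleProcessor(tokenizer=toy_tokenizer, image_processor=toy_image_processor)
print(simple(text="hello big world", images=[[0, 255]]))
```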