Mask2former & Maskformer Fast Image Processor by SangbumChoi · Pull Request #35685 · huggingface/transformers

SangbumChoi · 2025-01-14T07:32:44Z

What does this PR do?

Fixes # (issue)

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline,
Pull Request section?
Was this discussed/approved via a Github issue or the forum? Please add a link
to it if that's the case.
Did you make sure to update the documentation with your changes? Here are the
documentation guidelines, and
here are tips on formatting docstrings.
Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

SangbumChoi · 2025-01-16T07:27:53Z

@yonigozlan @qubvel Hi, while working on this PR I had a question about pad_size. Normally the original processors in transformers have both do_pad and pad_size. There can be many reasons for pad_size to exist, but one is to enforce a fixed input size for inference with an ONNX or torch.jit.trace model, since those can only accept a static image size.

However, maskformer's current processing policy is to always pad to the maximum width and height in the given batch of images. So I was planning to add pad_size to MaskFormerImageProcessor, so that it can also be added to MaskFormerImageProcessorFast and to the mask2former modular file. Can I have some feedback on this idea?
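
To make the two padding policies in this question concrete, here is a minimal sketch in plain Python. The helper names `target_shape` and `bottom_right_padding` are hypothetical and are not part of transformers; the sketch also assumes padding is added on the bottom and right edges, and that `pad_size=None` keeps the current pad-to-batch-maximum behaviour.

```python
from typing import List, Optional, Tuple

Shape = Tuple[int, int]  # (height, width)


def target_shape(shapes: List[Shape], pad_size: Optional[Shape] = None) -> Shape:
    """Return the (height, width) every image in the batch would be padded to.

    Hypothetical helper: `pad_size=None` pads to the batch maximum, while an
    explicit `pad_size` gives a static shape regardless of batch contents.
    """
    max_h = max(h for h, _ in shapes)
    max_w = max(w for _, w in shapes)
    if pad_size is None:
        return (max_h, max_w)
    pad_h, pad_w = pad_size
    if pad_h < max_h or pad_w < max_w:
        raise ValueError(f"pad_size {pad_size} is smaller than the largest image {(max_h, max_w)}")
    return (pad_h, pad_w)


def bottom_right_padding(shape: Shape, target: Shape) -> Shape:
    """Rows to add at the bottom and columns to add on the right (an assumption)."""
    return (target[0] - shape[0], target[1] - shape[1])


batch = [(480, 640), (512, 384)]
dynamic = target_shape(batch)                      # depends on the batch
static = target_shape(batch, pad_size=(512, 1024))  # fixed, suitable for a traced model
print(dynamic, static, bottom_right_padding(batch[0], static))
```

With a fixed `pad_size`, every batch yields the same output shape, which is the property a statically shaped exported model would need.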

qubvel · 2025-01-16T14:27:28Z

Hi @SangbumChoi! Thanks for working on fast processors 🤗 I think it would be nice to have in case the change is backward compatible with current configs, e.g. pad_size=None by default

yonigozlan · 2025-01-16T15:13:31Z

Hi @SangbumChoi great to see more work with fast processors :). Agreed with @qubvel, I don't see any issues with adding functionalities to image processor as long as they are fully backward compatible and the default processing stays the same.


SangbumChoi · 2025-01-17T09:06:49Z

@qubvel @yonigozlan Requesting first round review.

yonigozlan

Overall looks good to me! It looks like there are a few things we can remove or simplify for maskformer and in the modular of mask2former

SangbumChoi · 2025-01-18T12:39:51Z

@yonigozlan Hi, I have pushed commits addressing everything except #35685 (comment).

The test failures seem unrelated!


yonigozlan · 2025-02-20T04:54:56Z

Thanks again @SangbumChoi for adding this! As Arthur said, since my last review, there was a big refactor of fast image processors. Now in most cases you should only have to add custom code in the _preprocess function, and add docstrings to init and preprocess for added image processing kwargs if needed. However, maskformer seems a bit more involved for backward compatibility because of the deprecated kwargs such as _maxsize, though the modifications needed should be very similar to the ones that were made for detr in #35069, so hopefully that can help!
If you can make these modifications that would be great, and as this is quite a recent refactor I’d be glad to have your feedback and I’m happy to answer any questions you may have 🤗
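
As an illustration of the backward-compatibility concern mentioned above, the sketch below folds a deprecated keyword argument into a newer `size` dictionary while emitting a warning. Everything here is an assumption made for illustration: the helper `resolve_size`, the kwarg name `max_size`, the `shortest_edge`/`longest_edge` keys, and the default values are hypothetical and are not taken from this PR or from #35069.

```python
import warnings
from typing import Any, Dict, Optional


def resolve_size(size: Optional[Dict[str, int]] = None, **kwargs: Any) -> Dict[str, int]:
    """Fold a deprecated `max_size` kwarg into the `size` dict (hypothetical helper)."""
    # Illustrative defaults only; real processors define their own.
    resolved = dict(size) if size is not None else {"shortest_edge": 800, "longest_edge": 1333}
    if "max_size" in kwargs:
        warnings.warn(
            "`max_size` is deprecated; set size['longest_edge'] instead.",
            FutureWarning,
        )
        # Old configs keep working: the deprecated value overrides the longest edge.
        resolved["longest_edge"] = kwargs.pop("max_size")
    return resolved


print(resolve_size())                 # new-style call, defaults untouched
print(resolve_size(max_size=1000))    # old-style call still works, with a warning
```

The point of the pattern is that configs saved with the old argument continue to load and produce the same processing, while new code only sees the consolidated `size` dictionary.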

SangbumChoi · 2025-02-21T01:27:15Z

@yonigozlan @ArthurZucker Hi team, I would also like to upload the benchmark. I think I have seen some code from @yonigozlan that tests these changes and visualizes them in a graph, but I think it can only be accessed internally. Can you share it? (Otherwise you can share it with me through Slack.)

yonigozlan · 2025-02-24T15:09:34Z

Hi @SangbumChoi! Could you run a simple benchmark like this one?

import time

import requests
from PIL import Image
from transformers import AutoImageProcessor

device = "cuda"
batch_size = 4


def benchmark_image_processor(image_processor, images, benchmark_it=10, warmup_it=10):
    # warm up so one-time setup costs are excluded from the measurement
    for _ in range(warmup_it):
        _ = image_processor(images=images, return_tensors="pt", device=device)
    # benchmark: average wall-clock time per call
    start_time = time.time()
    for _ in range(benchmark_it):
        _ = image_processor(images=images, return_tensors="pt", device=device)
    end_time = time.time()

    return (end_time - start_time) / benchmark_it


image = Image.open(requests.get("http://images.cocodataset.org/val2017/000000039769.jpg", stream=True).raw)
checkpoint = "maskformer/mask2former checkpoint"  # placeholder: replace with a real checkpoint name
image_processor_fast = AutoImageProcessor.from_pretrained(checkpoint, use_fast=True)
# use_fast=False makes the slow/fast comparison explicit
image_processor_slow = AutoImageProcessor.from_pretrained(checkpoint, use_fast=False)

slow_time_one = benchmark_image_processor(image_processor_slow, image, benchmark_it=10)
fast_time_one = benchmark_image_processor(image_processor_fast, image, benchmark_it=10)
slow_time_batch = benchmark_image_processor(image_processor_slow, [image] * batch_size, benchmark_it=10)
fast_time_batch = benchmark_image_processor(image_processor_fast, [image] * batch_size, benchmark_it=10)

print(f"slow_time_one: {slow_time_one}, fast_time_one: {fast_time_one}, speedup: {slow_time_one / fast_time_one}")
print(f"slow_time_batch: {slow_time_batch}, fast_time_batch: {fast_time_batch}, speedup: {slow_time_batch / fast_time_batch}")

Thanks!
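
One caveat when timing on a GPU: CUDA work is launched asynchronously, so a wall-clock timer can stop before the device has finished. The sketch below is a generic variant of the benchmark loop above with an optional synchronization hook. The helper name `mean_latency` and its `sync` parameter are hypothetical; the sketch is not the benchmark used for this PR.

```python
import time
from typing import Callable, Optional


def mean_latency(
    fn: Callable[[], object],
    iters: int = 10,
    warmup: int = 10,
    sync: Optional[Callable[[], None]] = None,
) -> float:
    """Average seconds per call of `fn`, optionally synchronizing a device."""
    for _ in range(warmup):
        fn()
    if sync is not None:
        sync()  # make sure warm-up work has finished before starting the clock
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    if sync is not None:
        sync()  # wait for queued device work before stopping the clock
    return (time.perf_counter() - start) / iters


# CPU-only demonstration with a trivial workload.
print(mean_latency(lambda: sum(range(1000)), iters=5, warmup=2))
```

For the GPU case, one would pass a closure around the image processor call as `fn` and `torch.cuda.synchronize` as `sync`; on CPU the hook can simply be left as `None`.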


ArthurZucker

Thanks 🤗

ArthurZucker · 2025-04-08T13:26:04Z

+        self.num_labels = num_labels
+        self.pad_size = pad_size
+
+        self._valid_processor_keys = [


cc @yonigozlan I don't understand why we have this

ArthurZucker · 2025-04-08T13:28:33Z

+        self.do_resize = do_resize
+        self.size = size
+        self.resample = resample
+        self.size_divisor = size_divisor
+        self.do_rescale = do_rescale
+        self.rescale_factor = rescale_factor


Same here: most, if not all, of these are basic attributes and should not appear here.
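
The review comments above suggest that common attributes need not be re-assigned in each processor's `__init__`. A minimal sketch of that idea, under the assumption that shared defaults live on a base class and a subclass only declares what is specific to it: the class names `BaseProcessorSketch` and `MaskProcessorSketch` and the default values are hypothetical and do not reproduce the transformers classes.

```python
class BaseProcessorSketch:
    # Shared defaults declared once on the base class (illustrative values).
    do_resize = True
    size = None
    do_rescale = True
    rescale_factor = 1 / 255

    def __init__(self, **kwargs):
        # Generic handling: any declared attribute can be overridden per instance.
        for name, value in kwargs.items():
            if not hasattr(type(self), name):
                raise TypeError(f"unexpected argument: {name}")
            setattr(self, name, value)


class MaskProcessorSketch(BaseProcessorSketch):
    # Only model-specific attributes and overrides appear here; no __init__ needed.
    size = {"shortest_edge": 800, "longest_edge": 1333}
    size_divisor = 32
    num_labels = None
    pad_size = None


processor = MaskProcessorSketch(pad_size=(512, 512))
print(processor.do_rescale, processor.size_divisor, processor.pad_size)
```

In this arrangement the subclass body stays short, and lines such as `self.do_resize = do_resize` become unnecessary because the base class already handles them.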

github-actions · 2025-07-23T02:30:30Z

[For maintainers] Suggested jobs to run (before merge)

run-slow: auto, mask2former, maskformer

yonigozlan

Hey @SangbumChoi ! Pushed some changes to finish up this PR and make it ready to merge. Thanks for your contribution!


yonigozlan · 2025-07-23T02:35:30Z

Also removed the zero shot examples as they seem out of scope for this PR, happy to review in another PR!

HuggingFaceDocBuilderDev · 2025-07-23T02:48:17Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

* add maskformerfast
* test
* revert do_reduce_labels and add testing
* make style & fix-copies
* add mask2former and make fix-copies

  TO DO: add test for mask2former
* make fix-copies
* fill docstring
* enable mask2former fast processor
* python utils/custom_init_isort.py
* make fix-copies
* fix PR's comments
* modular file update
* add license
* make style
* modular file
* make fix-copies
* merge
* temp commit
* finish up maskformer mask2former
* remove zero shot examples

---------

Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co>
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>

SangbumChoi added 2 commits January 14, 2025 06:32

add maskformerfast

e259083

test

0f4aa7e

SangbumChoi requested review from ArthurZucker, Rocketknight1, qubvel and yonigozlan as code owners January 14, 2025 07:32

SangbumChoi marked this pull request as draft January 14, 2025 07:34

qubvel added Vision Processing labels Jan 14, 2025

qubvel removed the request for review from Rocketknight1 January 14, 2025 11:12

SangbumChoi added 6 commits January 17, 2025 10:34

revert do_reduce_labels and add testing

9aef911

make style & fix-copies

a4b0a22

add mask2former and make fix-copies

c6a0a4e

TO DO: add test for mask2former

make fix-copies

943f83c

fill docstring

a903e01

enable mask2former fast processor

df88b16

SangbumChoi commented Jan 17, 2025


Comment thread src/transformers/models/mask2former/image_processing_mask2former.py

SangbumChoi added 2 commits January 17, 2025 16:04

python utils/custom_init_isort.py

b0a89b5

make fix-copies

92c40cb

SangbumChoi marked this pull request as ready for review January 17, 2025 08:57

SangbumChoi requested review from amyeroberts and stevhliu as code owners January 17, 2025 08:57

yonigozlan reviewed Jan 17, 2025


fix PR's comments

868f16f

SangbumChoi requested a review from yonigozlan January 18, 2025 12:40

SangbumChoi added 2 commits February 19, 2025 16:47

make fix-copies

e027e0d

Merge branch 'mask2former_fast' of https://github.com/SangbumChoi/transformers into mask2former_fast

7d150f4

SangbumChoi requested review from ArthurZucker and yonigozlan February 19, 2025 08:00

SangbumChoi added 3 commits March 7, 2025 15:45

merge

74449e2

Merge branch 'main' of https://github.com/SangbumChoi/transformers into mask2former_fast

3641885

temp commit

df2aeb0

yonigozlan mentioned this pull request Mar 25, 2025

[Contributions Welcome] Add Fast Image Processors #36978

Closed

81 tasks

ArthurZucker approved these changes Apr 8, 2025


yonigozlan added 2 commits July 22, 2025 20:21

Merge remote-tracking branch 'upstream/main' into mask2former_fast

349aa48

finish up maskformer mask2former

02edb24

yonigozlan approved these changes Jul 23, 2025


yonigozlan and others added 3 commits July 22, 2025 22:31

Merge branch 'main' into mask2former_fast

5b1db19

remove zero shot examples

d096f47

Merge branch 'mask2former_fast' of https://github.com/SangbumChoi/transformers into mask2former_fast

d46b4d9

yonigozlan enabled auto-merge (squash) July 23, 2025 02:35

yonigozlan merged commit d9b35c6 into huggingface:main Jul 23, 2025
25 checks passed
