Mask2former & Maskformer Fast Image Processor #35685

Merged
yonigozlan merged 28 commits into huggingface:main from SangbumChoi:mask2former_fast on Jul 23, 2025

Conversation

@SangbumChoi
Contributor

What does this PR do?

Fixes # (issue)

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@SangbumChoi
Contributor Author

@yonigozlan @qubvel Hi, while working on this PR I had a question about pad_size. The original processors in transformers normally have both do_pad and pad_size. There can be several reasons for pad_size to exist, but one is to enforce a strict input size for inference with an ONNX or torch.jit.trace model, since such a model typically accepts only a static image size.

However, MaskFormer's current processing policy is to always pad to the maximum width and height within the given batch of images. So I am planning to add pad_size to MaskFormerImageProcessor, so that it can also be added to MaskFormerImageProcessorFast and to the modular file of Mask2Former. Could I get some feedback on this idea?
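To make the two padding policies concrete, here is a minimal sketch of how the target size could be resolved. The helper `resolve_padded_size` and its tuple-based `pad_size` argument are hypothetical names used for illustration only; they are not the transformers API, and the real processors may represent sizes differently.

```python
from typing import Optional, Sequence, Tuple


def resolve_padded_size(
    image_sizes: Sequence[Tuple[int, int]],
    pad_size: Optional[Tuple[int, int]] = None,
) -> Tuple[int, int]:
    """Return the (height, width) that every image in the batch is padded to."""
    if not image_sizes:
        raise ValueError("image_sizes must not be empty")
    max_height = max(height for height, _ in image_sizes)
    max_width = max(width for _, width in image_sizes)
    if pad_size is None:
        # Batch-dependent policy: pad to the largest height and width in the batch.
        return max_height, max_width
    pad_height, pad_width = pad_size
    if pad_height < max_height or pad_width < max_width:
        raise ValueError(
            f"pad_size {pad_size} is smaller than the largest image "
            f"({max_height}, {max_width}) in the batch"
        )
    # Fixed policy: the output size no longer depends on the batch contents.
    return pad_height, pad_width


batch = [(480, 640), (512, 600)]
print(resolve_padded_size(batch))                       # (512, 640)
print(resolve_padded_size(batch, pad_size=(640, 640)))  # (640, 640)
```

With `pad_size=None` the result changes from batch to batch, which is what makes a statically shaped export awkward; a fixed `pad_size` gives the same output shape for every batch.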

@qubvel
Contributor

qubvel commented Jan 16, 2025

Hi @SangbumChoi! Thanks for working on fast processors 🤗 I think this would be nice to have, as long as the change is backward compatible with current configs, e.g. pad_size=None by default.

@yonigozlan
Member

Hi @SangbumChoi great to see more work with fast processors :). Agreed with @qubvel, I don't see any issues with adding functionalities to image processor as long as they are fully backward compatible and the default processing stays the same.
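The backward-compatibility requirement discussed above can be sketched as follows. `ProcessorConfig` is a hypothetical stand-in for a saved processor config, and the `{"height", "width"}` shape of `pad_size` is an assumption; the point is only that a config saved before `pad_size` existed still loads and keeps the previous behavior.

```python
from dataclasses import dataclass, fields
from typing import Any, Dict, Optional


@dataclass
class ProcessorConfig:
    """Hypothetical, simplified processor config (not the transformers class)."""

    do_pad: bool = True
    size_divisor: int = 32
    # New optional field: None means "keep padding to the batch maximum".
    pad_size: Optional[Dict[str, int]] = None

    @classmethod
    def from_dict(cls, raw: Dict[str, Any]) -> "ProcessorConfig":
        # Keep only known keys so older and newer config files both load.
        known = {field.name for field in fields(cls)}
        return cls(**{key: value for key, value in raw.items() if key in known})


# A config written before pad_size was introduced.
old_config = {"do_pad": True, "size_divisor": 32}
# A config that opts in to a fixed padded size.
new_config = {"do_pad": True, "size_divisor": 32, "pad_size": {"height": 640, "width": 640}}

print(ProcessorConfig.from_dict(old_config).pad_size)  # None
print(ProcessorConfig.from_dict(new_config).pad_size)  # {'height': 640, 'width': 640}
```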

Comment thread src/transformers/models/mask2former/image_processing_mask2former.py
@SangbumChoi SangbumChoi marked this pull request as ready for review January 17, 2025 08:57
@SangbumChoi
Contributor Author

@qubvel @yonigozlan Requesting first round review.

Member

@yonigozlan yonigozlan left a comment

Overall looks good to me! It looks like there are a few things we can remove or simplify for maskformer and in the modular of mask2former

Comment thread tests/models/mask2former/test_image_processing_mask2former.py Outdated
Comment thread tests/models/maskformer/test_image_processing_maskformer.py Outdated
Comment thread src/transformers/models/maskformer/image_processing_maskformer_fast.py Outdated
Comment thread src/transformers/models/maskformer/image_processing_maskformer_fast.py Outdated
Comment thread src/transformers/models/maskformer/image_processing_maskformer_fast.py Outdated
Comment thread src/transformers/models/maskformer/image_processing_maskformer_fast.py Outdated
Comment thread src/transformers/models/mask2former/modular_mask2former.py Outdated
Comment thread src/transformers/models/mask2former/modular_mask2former.py Outdated
Comment thread src/transformers/models/mask2former/modular_mask2former.py Outdated
@SangbumChoi
Contributor Author

@yonigozlan Hi, I have pushed commits addressing all the comments except for #35685 (comment).

The test failures seem unrelated!

@yonigozlan
Member

Thanks again @SangbumChoi for adding this! As Arthur said, there has been a big refactor of fast image processors since my last review. Now, in most cases, you should only have to add custom code in the _preprocess function, and add docstrings to init and preprocess for any added image processing kwargs. However, maskformer seems a bit more involved for backward compatibility because of deprecated kwargs such as _maxsize. The modifications needed should be very similar to the ones that were made for detr in #35069, so hopefully that can help!
If you can make these modifications, that would be great. As this is quite a recent refactor, I'd be glad to have your feedback, and I'm happy to answer any questions you may have 🤗
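One common way to keep deprecated kwargs working is to remap them to their replacements and emit a warning. The sketch below is a generic illustration under that assumption: `remap_deprecated_kwargs` and the `old_limit`/`new_limit` names are hypothetical, and this is not the code used for detr or maskformer.

```python
import warnings
from typing import Any, Dict


def remap_deprecated_kwargs(kwargs: Dict[str, Any], renames: Dict[str, str]) -> Dict[str, Any]:
    """Return a copy of kwargs with deprecated names replaced by their new names."""
    remapped = dict(kwargs)
    for old_name, new_name in renames.items():
        if old_name not in remapped:
            continue
        warnings.warn(
            f"`{old_name}` is deprecated; use `{new_name}` instead.",
            FutureWarning,
            stacklevel=2,
        )
        value = remapped.pop(old_name)
        # An explicitly passed new-style argument wins over the deprecated one.
        remapped.setdefault(new_name, value)
    return remapped


if __name__ == "__main__":
    # Illustrative names only; the caller still uses the old keyword.
    print(remap_deprecated_kwargs({"old_limit": 1333}, {"old_limit": "new_limit"}))
```

Doing the remapping once, before the arguments reach the shared preprocessing path, keeps the model-specific compatibility code in one place.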

@SangbumChoi
Contributor Author

SangbumChoi commented Feb 21, 2025

@yonigozlan @ArthurZucker Hi team, I would also like to upload benchmark results. I think I have seen some code from @yonigozlan that tests these changes and visualizes the results in a graph, but I think it can only be accessed internally. Could you share it? (Otherwise, you could share it with me through Slack.)

@yonigozlan
Member

Hi @SangbumChoi! Could you run a simple benchmark like this one?

import time

import requests
from PIL import Image
from transformers import AutoImageProcessor

device = "cuda"
batch_size = 4


def benchmark_image_processor(image_processor, images, benchmark_it=10, warmup_it=10):
    """Return the mean time in seconds of one preprocessing call."""
    # Warm up so one-time setup costs are excluded from the measurement
    for _ in range(warmup_it):
        _ = image_processor(images=images, return_tensors="pt", device=device)
    # Benchmark
    start_time = time.perf_counter()
    for _ in range(benchmark_it):
        _ = image_processor(images=images, return_tensors="pt", device=device)
    end_time = time.perf_counter()

    return (end_time - start_time) / benchmark_it


image = Image.open(requests.get("http://images.cocodataset.org/val2017/000000039769.jpg", stream=True).raw)
# Placeholder: replace with the name of a maskformer or mask2former checkpoint
checkpoint = "maskformer/mask2former checkpoint"
image_processor_fast = AutoImageProcessor.from_pretrained(checkpoint, use_fast=True)
image_processor_slow = AutoImageProcessor.from_pretrained(checkpoint)

# Time a single image and a batch with both processors
slow_time_one = benchmark_image_processor(image_processor_slow, image, benchmark_it=10)
fast_time_one = benchmark_image_processor(image_processor_fast, image, benchmark_it=10)
slow_time_batch = benchmark_image_processor(image_processor_slow, [image] * batch_size, benchmark_it=10)
fast_time_batch = benchmark_image_processor(image_processor_fast, [image] * batch_size, benchmark_it=10)

print(f"slow_time_one: {slow_time_one}, fast_time_one: {fast_time_one}, speedup: {slow_time_one/fast_time_one}")
print(f"slow_time_batch: {slow_time_batch}, fast_time_batch: {fast_time_batch}, speedup: {slow_time_batch/fast_time_batch}")

Thanks!
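Because CUDA work is launched asynchronously, wall-clock timing on a GPU can under-measure unless the device is synchronized before the clock is read. The helper below is a minimal, stdlib-only sketch of the same benchmark loop with an optional synchronization hook; `mean_runtime` and its `sync` parameter are hypothetical names, not part of the snippet above.

```python
import time
from typing import Callable, Optional


def mean_runtime(
    fn: Callable[[], object],
    iterations: int = 10,
    warmup: int = 10,
    sync: Optional[Callable[[], None]] = None,
) -> float:
    """Return the mean wall-clock time in seconds of one call to fn."""
    if iterations <= 0:
        raise ValueError("iterations must be positive")
    # Warm-up calls are executed but not timed.
    for _ in range(warmup):
        fn()
    if sync is not None:
        sync()  # make sure warm-up work has finished before starting the clock
    start = time.perf_counter()
    for _ in range(iterations):
        fn()
    if sync is not None:
        sync()  # wait for queued device work before stopping the clock
    return (time.perf_counter() - start) / iterations


# CPU-only usage example with a trivial workload.
print(mean_runtime(lambda: sum(range(1000)), iterations=5, warmup=2) >= 0.0)  # True
```

On a GPU, one could pass `sync=torch.cuda.synchronize` and wrap the processor call in a lambda, for example `lambda: image_processor(images=images, return_tensors="pt", device="cuda")`.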

Collaborator

@ArthurZucker ArthurZucker left a comment

Thanks 🤗

self.num_labels = num_labels
self.pad_size = pad_size

self._valid_processor_keys = [
Collaborator

cc @yonigozlan I don't understand why we have this

Comment on lines +119 to +124
self.do_resize = do_resize
self.size = size
self.resample = resample
self.size_divisor = size_divisor
self.do_rescale = do_rescale
self.rescale_factor = rescale_factor
Collaborator

Same here: most, if not all, of these are basic attributes and should not appear here.
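The review point above can be illustrated with a small sketch in which a base class owns the generic attributes and the model-specific subclass only declares what differs. `BaseFastProcessor` and `MaskFormerLikeProcessor` are hypothetical classes, and the default values shown are illustrative assumptions, not the values used in transformers.

```python
from typing import Any, Dict, Optional


class BaseFastProcessor:
    """Hypothetical base class that owns the generic processing attributes."""

    do_resize: bool = True
    size: Optional[Dict[str, int]] = None
    do_rescale: bool = True
    rescale_factor: float = 1 / 255

    def __init__(self, **kwargs: Any) -> None:
        # Instance-level overrides are accepted only for declared attributes.
        for name, value in kwargs.items():
            if not hasattr(type(self), name):
                raise TypeError(f"Unexpected argument: {name}")
            setattr(self, name, value)


class MaskFormerLikeProcessor(BaseFastProcessor):
    """Subclass declares only model-specific defaults; no repeated self.x = x lines."""

    size = {"shortest_edge": 800, "longest_edge": 1333}  # illustrative values
    size_divisor: int = 32
    pad_size: Optional[Dict[str, int]] = None


processor = MaskFormerLikeProcessor(size_divisor=16)
print(processor.size_divisor, processor.do_rescale, processor.pad_size)  # 16 True None
```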

@github-actions
Contributor

[For maintainers] Suggested jobs to run (before merge)

run-slow: auto, mask2former, maskformer

Member

@yonigozlan yonigozlan left a comment

Hey @SangbumChoi ! Pushed some changes to finish up this PR and make it ready to merge. Thanks for your contribution!

@yonigozlan
Member

Also removed the zero shot examples as they seem out of scope for this PR, happy to review in another PR!

@yonigozlan yonigozlan enabled auto-merge (squash) July 23, 2025 02:35
@yonigozlan yonigozlan merged commit d9b35c6 into huggingface:main Jul 23, 2025
25 checks passed
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

zaristei pushed a commit to zaristei/transformers that referenced this pull request Sep 9, 2025
* add maskformerfast

* test

* revert do_reduce_labels and add testing

* make style & fix-copies

* add mask2former and make fix-copies
TO DO:
	add test for mask2former

* make fix-copies

* fill docstring

* enable mask2former fast processor

* python utils/custom_init_isort.py

* make fix-copies

* fix PR's comments

* modular file update

* add license

* make style

* modular file

* make fix-copies

* merge

* temp commit

* finish up maskformer mask2former

* remove zero shot examples

---------

Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co>
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>