Add EfficientNet Image PreProcessor by zshn25 · Pull Request #37055 · huggingface/transformers

zshn25 · 2025-03-27T21:11:19Z

What does this PR do?

Add Fast Image Processor #36978 for EfficientNet

Fixes # (issue)

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline,
Pull Request section?
Was this discussed/approved via a Github issue or the forum? Please add a link
to it if that's the case.
Did you make sure to update the documentation with your changes? Here are the
documentation guidelines, and
here are tips on formatting docstrings.
Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

github-actions · 2025-03-27T21:11:32Z

Hi 👋, thank you for opening this pull request! The pull request is converted to draft by default. The CI will be paused while the PR is in draft mode. When it is ready for review, please click the Ready for review button (at the bottom of the PR page). This will assign reviewers and trigger CI.

zshn25 · 2025-03-29T20:02:38Z

3 tests fail but can't figure out why.

======================================================================= short test summary info =======================================================================
FAILED tests/models/efficientnet/test_image_processing_efficientnet.py::EfficientNetImageProcessorTest::test_rescale - AssertionError: False is not true
FAILED tests/models/efficientnet/test_image_processing_efficientnet.py::EfficientNetImageProcessorTest::test_slow_fast_equivalence - AssertionError: False is not true
FAILED tests/models/efficientnet/test_image_processing_efficientnet.py::EfficientNetImageProcessorTest::test_slow_fast_equivalence_batched - AssertionError: False is not true
==================================================================== 3 failed, 17 passed in 3.93s =====================================================================

- reshape test passes when casted to float64
- equivalence test doesn't pass

zshn25 · 2025-03-30T10:31:14Z

test_rescale passes when the inputs are cast to np.float64 rather than the default np.float32. Made the necessary change to the test method.
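To illustrate why the dtype can matter here, the following is a minimal sketch (not the test from this PR) that computes the same offset rescale in float32 and in float64 and reports how far the two results drift apart. The helper name `rescale_with_offset` is hypothetical. np.allclose defaults to rtol=1e-05 and atol=1e-08, so a float32 rounding error on values close to zero may be enough to fail a strict comparison.

```python
import numpy as np


def rescale_with_offset(image, scale, dtype):
    """Hypothetical helper: compute image * scale - 1 entirely in `dtype`."""
    return image.astype(dtype) * dtype(scale) - dtype(1)


# Every possible uint8 pixel value, arranged as a small 16x16 "image".
pixels = np.arange(256, dtype=np.uint8).reshape(16, 16)

out32 = rescale_with_offset(pixels, 1 / 127.5, np.float32)
out64 = rescale_with_offset(pixels, 1 / 127.5, np.float64)

# Largest absolute disagreement between the two precisions.
max_abs_diff = np.abs(out32.astype(np.float64) - out64).max()
print(out32.dtype, out64.dtype, max_abs_diff)
```

The printed difference is tiny in absolute terms, which is why the choice of reference dtype in the test, rather than the rescale logic itself, can decide whether the assertion passes.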

- changes order of rescale, normalize acc to slow
- rescale_offset defaults to False acc to slow
- resample was causing difference in fast and slow. Changing test to bilinear resolves this difference

zshn25 · 2025-03-30T11:17:26Z

Thanks to #37094 (comment), changing the test resampling to bilinear makes the equivalence tests pass.

yonigozlan

Hi @zshn25! Thanks a lot for working on this, looks great! The only thing to change is to use F.InterpolationMode.NEAREST_EXACT and see if the equivalence tests pass this way.

yonigozlan · 2025-03-31T17:49:24Z

        do_normalize=True,
        image_mean=[0.5, 0.5, 0.5],
        image_std=[0.5, 0.5, 0.5],
+        resample=PILImageResampling.BILINEAR,  # NEAREST is too different between PIL and torchvision


Could you try with F.InterpolationMode.NEAREST_EXACT?

Tried, both the equivalence tests fail with nearest but not with bilinear.

yonigozlan · 2025-03-31T17:49:57Z

+    BASE_IMAGE_PROCESSOR_FAST_DOCSTRING,
+)
+class EfficientNetImageProcessorFast(BaseImageProcessorFast):
+    resample = PILImageResampling.NEAREST


Could you try with F.InterpolationMode.NEAREST_EXACT?

Hi @yonigozlan, thanks for the review. Using F.InterpolationMode.NEAREST_EXACT directly raises TypeError: Object of type InterpolationMode is not JSON serializable while dumping the pretrained JSON file.

I now use the pil_torch_interpolation_mapping to map PILImageResampling.NEAREST to InterpolationMode.NEAREST_EXACT instead of InterpolationMode.NEAREST

Yes that's good!
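The exchange above can be sketched as follows. This is an illustration only: `PILResampling` and `TorchInterpolation` are stand-in enums defined here so the example runs without PIL or torchvision, and `PIL_TO_TORCH` is a hypothetical name standing in for pil_torch_interpolation_mapping. The idea is to keep the int-backed resampling value in the saved config, which serializes cleanly, and to translate it to the torch-side mode only when resizing.

```python
import json
from enum import Enum, IntEnum


class PILResampling(IntEnum):  # stand-in for PILImageResampling
    NEAREST = 0
    BILINEAR = 2


class TorchInterpolation(Enum):  # stand-in for torchvision's InterpolationMode
    NEAREST = "nearest"
    NEAREST_EXACT = "nearest-exact"
    BILINEAR = "bilinear"


# Hypothetical mapping: PIL-style NEAREST resolves to NEAREST_EXACT.
PIL_TO_TORCH = {
    PILResampling.NEAREST: TorchInterpolation.NEAREST_EXACT,
    PILResampling.BILINEAR: TorchInterpolation.BILINEAR,
}

# An int-backed enum is written to JSON as a plain integer.
serialized = json.dumps({"resample": PILResampling.NEAREST})

# A plain Enum member is rejected by the default JSON encoder.
try:
    json.dumps({"resample": TorchInterpolation.NEAREST_EXACT})
    plain_enum_serializes = True
except TypeError:
    plain_enum_serializes = False

# At call time, restore the config value and look up the torch-side mode.
restored = PILResampling(json.loads(serialized)["resample"])
mode = PIL_TO_TORCH[restored]
print(serialized, plain_enum_serializes, mode)
```

Storing the PIL-style value keeps saved configs unchanged, while the mapping decides which torch-side mode is actually used.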

yonigozlan · 2025-03-31T18:25:01Z

Hi @chewyuenrachael and @Yann-CV thank you for working on your PRs #37119 #37094 . I reviewed this one as it was the first to be posted, but it looks like your PR also helped here so thanks a lot!

Yann-CV · 2025-03-31T19:47:15Z

+            device=images.device,
+        )
+        # if/elif as we use fused rescale and normalize if both are set to True
+        if do_rescale:


I can be wrong but I guess here the offset will not be applied if the normalization and rescale are activated together (do_rescale becomes false from _fuse_mean_std_and_rescale_factor)

Hi @Yann-CV, thank you for the review. Nice catch there. I pushed a fix. Thank you.

won't there be a problem here if offset is true and do_normalize is True (because of the elif right after)?

True. nice catch @yonigozlan. Replaced elif by if.
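For readers following the offset discussion, here is a minimal NumPy sketch of how an offset can be folded into a fused rescale and normalize step. It is an assumption-laden illustration, not the code merged in this PR: both function names are hypothetical, and the offset is modelled as subtracting 1 after rescaling, as in the test shown elsewhere in this thread.

```python
import numpy as np


def rescale_then_normalize(image, scale, mean, std, offset=False):
    """Reference path: rescale, optionally subtract 1, then normalize."""
    out = image.astype(np.float64) * scale
    if offset:
        out = out - 1.0
    return (out - mean) / std


def fused_rescale_normalize(image, scale, mean, std, offset=False):
    """Fused path: fold scale (and the offset) into mean and std up front.

    (x * s - shift - m) / sd  ==  (x - (m + shift) / s) / (sd / s)
    """
    shift = 1.0 if offset else 0.0
    fused_mean = (mean + shift) / scale
    fused_std = std / scale
    return (image.astype(np.float64) - fused_mean) / fused_std


rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(4, 4, 3), dtype=np.uint8)
mean = np.array([0.5, 0.5, 0.5])
std = np.array([0.5, 0.5, 0.5])

for use_offset in (False, True):
    expected = rescale_then_normalize(image, 1 / 127.5, mean, std, use_offset)
    fused = fused_rescale_normalize(image, 1 / 127.5, mean, std, use_offset)
    print(use_offset, np.allclose(expected, fused))
```

If the fused path ignores the offset, the two results differ by a constant 1 / std per channel, which is the kind of silent mismatch the review comments above point at.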

Yann-CV · 2025-04-01T07:01:02Z

-        rescaled_image = image_processor.rescale(image, scale=1 / 127.5)
-        expected_image = (image * (1 / 127.5)).astype(np.float32) - 1
-        self.assertTrue(np.allclose(rescaled_image, expected_image))
+            rescaled_image = image_processor.rescale(image, scale=1 / 127.5, dtype=np.float64)


if we compare the resize methods in both classes, the slow one is converting data to float64 inside itself. if the goal is to be fully equivalent, it probably needs to be done as well in the fast version.

Also, the fast method uses torch tensors, so in my opinion it is better practice to test it with torch.Tensor objects.

added tests with torch.tensor objects for rescale

yonigozlan

Thanks for iterating and adding tests! some little things left to change/check

yonigozlan · 2025-04-04T03:32:06Z

+    BASE_IMAGE_PROCESSOR_FAST_DOCSTRING,
+)
+class EfficientNetImageProcessorFast(BaseImageProcessorFast):
+    resample = PILImageResampling.NEAREST


Yes that's good!

yonigozlan · 2025-04-04T03:45:43Z

+            device=images.device,
+        )
+        # if/elif as we use fused rescale and normalize if both are set to True
+        if do_rescale:


won't there be a problem here if offset is true and do_normalize is True (because of the elif right after)?

zshn25 · 2025-04-04T10:59:32Z

All tests pass except one, which is unrelated to this PR

FAILED tests/models/idefics2/test_modeling_idefics2.py::Idefics2ForConditionalGenerationModelTest::test_constrained_beam_search_generate_dict_output - RuntimeError: shape mismatch: value tensor of shape [32, 64] cannot be broadcast to indexing result of shape [34, 64]

yonigozlan

Thanks for iterating, looks great now! Let's wait for @ArthurZucker's final approval, then LGTM.

HuggingFaceDocBuilderDev · 2025-04-15T17:01:32Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

* added efficientnet image preprocessor but tests fail
* ruff checks pass
* ruff formatted
* properly pass rescale_offset through the functions
* - corrected indentation, ordering of methods
  - reshape test passes when casted to float64
  - equivalence test doesn't pass
* all tests now pass
  - changes order of rescale, normalize acc to slow
  - rescale_offset defaults to False acc to slow
  - resample was causing difference in fast and slow. Changing test to bilinear resolves this difference
* ruff reformat
* F.InterpolationMode.NEAREST_EXACT gives TypeError: Object of type InterpolationMode is not JSON serializable
* fixes offset not being applied when do_rescale and do_normalization are both true
* - using nearest_exact sampling
  - added tests for rescale + normalize
* resolving reviews

---------

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>

tomaarsen · 2025-08-05T07:51:24Z

Hello!

This change introduces a dependency of torchvision>=0.19.0, whereas previously I was able to use older versions. Even worse: when using older versions of torchvision, importing parts of transformers will quietly (!!!) fail, e.g.:

ImportError: cannot import name 'PreTrainedModel' from 'transformers'

when running

from transformers import PreTrainedModel

The underlying source only becomes apparent when importing with the full path:

from transformers.modeling_utils import PreTrainedModel

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "[sic]\lib\site-packages\transformers\modeling_utils.py", line 74, in <module>
    from .loss.loss_utils import LOSS_MAPPING
  File "[sic]\lib\site-packages\transformers\loss\loss_utils.py", line 21, in <module>
    from .loss_d_fine import DFineForObjectDetectionLoss
  File "[sic]\lib\site-packages\transformers\loss\loss_d_fine.py", line 21, in <module>
    from .loss_for_object_detection import (
  File "[sic]\lib\site-packages\transformers\loss\loss_for_object_detection.py", line 32, in <module>
    from transformers.image_transforms import center_to_corners_format
  File "[sic]\lib\site-packages\transformers\image_transforms.py", line 22, in <module>
    from .image_utils import (
  File "[sic]\lib\site-packages\transformers\image_utils.py", line 62, in <module>
    PILImageResampling.NEAREST: InterpolationMode.NEAREST_EXACT,
  File "C:\Users\tom\.conda\envs\setfit\lib\enum.py", line 429, in __getattr__
    raise AttributeError(name) from None
AttributeError: NEAREST_EXACT

I'm fine with upgrading my torchvision, but perhaps that should be required (via setup.py), and we really should be wary of hidden import errors.

Tom Aarsen
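As a hedged illustration of the concern above, one defensive pattern is to check for the enum member explicitly and raise an error that names the requirement, instead of letting a bare AttributeError surface deep inside an import chain. This is a sketch only, not what transformers does: the two enum classes are stand-ins for older and newer versions of torchvision's InterpolationMode, and `require_member` is a hypothetical helper.

```python
from enum import Enum


class OldInterpolationMode(Enum):  # stand-in: a version without NEAREST_EXACT
    NEAREST = "nearest"
    BILINEAR = "bilinear"


class NewInterpolationMode(Enum):  # stand-in: a version with NEAREST_EXACT
    NEAREST = "nearest"
    NEAREST_EXACT = "nearest-exact"
    BILINEAR = "bilinear"


def require_member(enum_cls, name, requirement):
    """Return enum_cls.<name>, or raise an ImportError that states the requirement."""
    member = getattr(enum_cls, name, None)
    if member is None:
        raise ImportError(
            f"{enum_cls.__name__}.{name} is missing; this code path needs {requirement}."
        )
    return member


print(require_member(NewInterpolationMode, "NEAREST_EXACT", "torchvision>=0.19.0"))

try:
    require_member(OldInterpolationMode, "NEAREST_EXACT", "torchvision>=0.19.0")
except ImportError as err:
    print(err)
```

An explicit message like this does not replace declaring the minimum version in the package metadata, but it makes the cause visible when an older install is present.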

added efficientnet image preprocessor but tests fail

bc38b3c

github-actions Bot marked this pull request as draft March 27, 2025 21:11

zshn25 mentioned this pull request Mar 27, 2025

[Contributions Welcome] Add Fast Image Processors #36978

Closed

81 tasks

ruff checks pass

304efe4

zshn25 marked this pull request as ready for review March 27, 2025 21:22

github-actions Bot requested review from ydshieh and yonigozlan March 27, 2025 21:22

zshn25 added 2 commits March 27, 2025 22:23

ruff formatted

9456d3a

properly pass rescale_offset through the functions

c914a4e

zshn25 and others added 2 commits March 30, 2025 10:55

Merge branch 'huggingface:main' into main

4ba3343

- corrected indentation, ordering of methods

2daf5d1

- reshape test passes when casted to float64 - equivalence test doesn't pass

zshn25 added 2 commits March 30, 2025 13:13

all tests now pass

e725d86

- changes order of rescale, normalize acc to slow - rescale_offset defaults to False acc to slow - resample was causing difference in fast and slow. Changing test to bilinear resolves this difference

ruff reformat

dd1b963

Merge branch 'main' into main

bc836bc

yonigozlan reviewed Mar 31, 2025

Yann-CV reviewed Mar 31, 2025

F.InterpolationMode.NEAREST_EXACT gives TypeError: Object of type InterpolationMode is not JSON serializable

229fa81

zshn25 marked this pull request as draft April 1, 2025 06:52

Yann-CV reviewed Apr 1, 2025

zshn25 added 2 commits April 1, 2025 10:26

Merge branch 'main' of https://github.com/huggingface/transformers

ba9b1d1

fixes offset not being applied when do_rescale and do_normalization are both true

240251d

Yann-CV reviewed Apr 1, 2025

Comment thread src/transformers/models/efficientnet/image_processing_efficientnet_fast.py

- using nearest_exact sampling

36446c5

- added tests for rescale + normalize

zshn25 marked this pull request as ready for review April 1, 2025 16:49

Merge branch 'main' into main

a578d70

Merge branch 'huggingface:main' into main

0e7ed7e

yonigozlan reviewed Apr 4, 2025

zshn25 added 3 commits April 4, 2025 08:36

resolving reviews

3e6af15

Merge branch 'main' of https://github.com/zshn25/transformers

916a1b3

Merge branch 'main' of https://github.com/huggingface/transformers

6979fed

Merge branch 'main' into main

61e582f

yonigozlan approved these changes Apr 7, 2025

yonigozlan added 2 commits April 15, 2025 13:28

Merge branch 'main' into main

acec645

Merge branch 'main' into main

8d1b44a

Merge branch 'main' into main

418e1ca

yonigozlan merged commit a7d2bba into huggingface:main Apr 16, 2025
20 checks passed

tomaarsen mentioned this pull request Aug 5, 2025

Hidden torchvision>=0.19.0 dependency results in quiet import failures of e.g. PreTrainedModel #39907

Closed
