Skip to content

Add EfficientNet Image PreProcessor#37055

Merged
yonigozlan merged 22 commits intohuggingface:mainfrom
zshn25:main
Apr 16, 2025
Merged

Add EfficientNet Image PreProcessor#37055
yonigozlan merged 22 commits intohuggingface:mainfrom
zshn25:main

Conversation

@zshn25
Copy link
Copy Markdown
Contributor

@zshn25 zshn25 commented Mar 27, 2025

What does this PR do?

Add Fast Image Processor #36978 for EfficientNet

Fixes # (issue)

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

@yonigozlan

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@github-actions github-actions Bot marked this pull request as draft March 27, 2025 21:11
@github-actions
Copy link
Copy Markdown
Contributor

Hi 👋, thank you for opening this pull request! The pull request is converted to draft by default. The CI will be paused while the PR is in draft mode. When it is ready for review, please click the Ready for review button (at the bottom of the PR page). This will assign reviewers and trigger CI.

@zshn25 zshn25 marked this pull request as ready for review March 27, 2025 21:22
@github-actions github-actions Bot requested review from ydshieh and yonigozlan March 27, 2025 21:22
@zshn25
Copy link
Copy Markdown
Contributor Author

zshn25 commented Mar 29, 2025

3 tests fail but can't figure out why.

======================================================================= short test summary info =======================================================================
FAILED tests/models/efficientnet/test_image_processing_efficientnet.py::EfficientNetImageProcessorTest::test_rescale - AssertionError: False is not true
FAILED tests/models/efficientnet/test_image_processing_efficientnet.py::EfficientNetImageProcessorTest::test_slow_fast_equivalence - AssertionError: False is not true
FAILED tests/models/efficientnet/test_image_processing_efficientnet.py::EfficientNetImageProcessorTest::test_slow_fast_equivalence_batched - AssertionError: False is not true
==================================================================== 3 failed, 17 passed in 3.93s =====================================================================

zshn25 and others added 2 commits March 30, 2025 10:55
- reshape test passes when casted to float64
- equivalence test doesn't pass
@zshn25
Copy link
Copy Markdown
Contributor Author

zshn25 commented Mar 30, 2025

test_rescale passes when inputs are casted to np.float64 rather than the default np.float32. Made necessary change to the test method.

zshn25 added 2 commits March 30, 2025 13:13
- changes order of rescale, normalize acc to slow
- rescale_offset defaults to False acc to slow
- resample was causing difference in fast and slow. Changing test to bilinear resolves this difference
@zshn25
Copy link
Copy Markdown
Contributor Author

zshn25 commented Mar 30, 2025

Thanks to #37094 (comment), changing the test resamping to bilinear passes the equivalence tests

Copy link
Copy Markdown
Member

@yonigozlan yonigozlan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @zshn25 ! Thanks a lot for working on this, looks great! Only things to change is using F.InterpolationMode.NEAREST_EXACT and see if the equivalence tests pass this way.

do_normalize=True,
image_mean=[0.5, 0.5, 0.5],
image_std=[0.5, 0.5, 0.5],
resample=PILImageResampling.BILINEAR, # NEAREST is too different between PIL and torchvision
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you try with F.InterpolationMode.NEAREST_EXACT?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tried, both the equivalence tests fail with nearest but not with bilinear.

BASE_IMAGE_PROCESSOR_FAST_DOCSTRING,
)
class EfficientNetImageProcessorFast(BaseImageProcessorFast):
resample = PILImageResampling.NEAREST
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you try with F.InterpolationMode.NEAREST_EXACT?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @yonigozlan, thanks for the review. Using F.InterpolationMode.NEAREST_EXACT, gives the TypeError: Object of type InterpolationMode is not JSON serializable error while dump the json pretrained file.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I now use the pil_torch_interpolation_mapping to map PILImageResampling.NEAREST to InterpolationMode.NEAREST_EXACT instead of InterpolationMode.NEAREST

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that's good!

@yonigozlan
Copy link
Copy Markdown
Member

Hi @chewyuenrachael and @Yann-CV thank you for working on your PRs #37119 #37094 . I reviewed this one as it was the first to be posted, but it looks like your PR also helped here so thanks a lot!

device=images.device,
)
# if/elif as we use fused rescale and normalize if both are set to True
if do_rescale:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can be wrong but I guess here the offset will not be applied if the normalization and rescale are activated together (do_rescale becomes false from _fuse_mean_std_and_rescale_factor)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @Yann-CV, thank you for the review. Nice catch there. I pushed a fix. Thank you.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

won't there be a problem here if offset is true and do_normalize is True (because of the elif right after)?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True. nice catch @yonigozlan. Replaced elif by if.

@zshn25 zshn25 marked this pull request as draft April 1, 2025 06:52
rescaled_image = image_processor.rescale(image, scale=1 / 127.5)
expected_image = (image * (1 / 127.5)).astype(np.float32) - 1
self.assertTrue(np.allclose(rescaled_image, expected_image))
rescaled_image = image_processor.rescale(image, scale=1 / 127.5, dtype=np.float64)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we compare the resize methods in both classes, the slow one is converting data to float64 inside itself. if the goal is to be fully equivalent, it probably needs to be done as well in the fast version.

Also the fast method is using torch tensors, in my opinion it is a better practice to test it torch.Tensor objects

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added tests with torch.tensor objects for rescale

- added tests for rescale + normalize
@zshn25 zshn25 marked this pull request as ready for review April 1, 2025 16:49
Copy link
Copy Markdown
Member

@yonigozlan yonigozlan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for iterating and adding tests! some little things left to change/check

BASE_IMAGE_PROCESSOR_FAST_DOCSTRING,
)
class EfficientNetImageProcessorFast(BaseImageProcessorFast):
resample = PILImageResampling.NEAREST
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that's good!

Comment thread src/transformers/models/efficientnet/image_processing_efficientnet_fast.py Outdated
device=images.device,
)
# if/elif as we use fused rescale and normalize if both are set to True
if do_rescale:
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

won't there be a problem here if offset is true and do_normalize is True (because of the elif right after)?

Comment thread src/transformers/models/efficientnet/image_processing_efficientnet_fast.py Outdated
@zshn25
Copy link
Copy Markdown
Contributor Author

zshn25 commented Apr 4, 2025

All tests pass except one, which is unrelated to this PR

FAILED tests/models/idefics2/test_modeling_idefics2.py::Idefics2ForConditionalGenerationModelTest::test_constrained_beam_search_generate_dict_output - RuntimeError: shape mismatch: value tensor of shape [32, 64] cannot be broadcast to indexing result of shape [34, 64]```

Copy link
Copy Markdown
Member

@yonigozlan yonigozlan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for iterating looks great now! Let's wait for @ArthurZucker final approval then LGTM

@HuggingFaceDocBuilderDev
Copy link
Copy Markdown

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@yonigozlan yonigozlan merged commit a7d2bba into huggingface:main Apr 16, 2025
20 checks passed
cyr0930 pushed a commit to cyr0930/transformers that referenced this pull request Apr 18, 2025
* added efficientnet image preprocessor but tests fail

* ruff checks pass

* ruff formatted

* properly pass rescale_offset through the functions

* - corrected indentation, ordering of methods
- reshape test passes when casted to float64
- equivalence test doesn't pass

* all tests now pass
- changes order of rescale, normalize acc to slow
- rescale_offset defaults to False acc to slow
- resample was causing difference in fast and slow. Changing test to bilinear resolves this difference

* ruff reformat

* F.InterpolationMode.NEAREST_EXACT gives TypeError: Object of type InterpolationMode is not JSON serializable

* fixes offset not being applied when do_rescale and do_normalization are both true

* - using nearest_exact sampling
- added tests for rescale + normalize

* resolving reviews

---------

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
zucchini-nlp pushed a commit to zucchini-nlp/transformers that referenced this pull request May 14, 2025
* added efficientnet image preprocessor but tests fail

* ruff checks pass

* ruff formatted

* properly pass rescale_offset through the functions

* - corrected indentation, ordering of methods
- reshape test passes when casted to float64
- equivalence test doesn't pass

* all tests now pass
- changes order of rescale, normalize acc to slow
- rescale_offset defaults to False acc to slow
- resample was causing difference in fast and slow. Changing test to bilinear resolves this difference

* ruff reformat

* F.InterpolationMode.NEAREST_EXACT gives TypeError: Object of type InterpolationMode is not JSON serializable

* fixes offset not being applied when do_rescale and do_normalization are both true

* - using nearest_exact sampling
- added tests for rescale + normalize

* resolving reviews

---------

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
@tomaarsen
Copy link
Copy Markdown
Member

Hello!

This change introduces a dependency of torchvision>=0.19.0, whereas previously I was able to use older versions. Even worse: when using older versions of torchvision, importing parts of transformers will quietly (!!!) fail, e.g.:

ImportError: cannot import name 'PreTrainedModel' from 'transformers'

when running

from transformers import PreTrainedModel

The underlying source only becomes apparent when importing with the full path:

from transformers.modeling_utils import PreTrainedModel
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "[sic]\lib\site-packages\transformers\modeling_utils.py", line 74, in <module>
    from .loss.loss_utils import LOSS_MAPPING
  File "[sic]\lib\site-packages\transformers\loss\loss_utils.py", line 21, in <module>
    from .loss_d_fine import DFineForObjectDetectionLoss
  File "[sic]\lib\site-packages\transformers\loss\loss_d_fine.py", line 21, in <module>
    from .loss_for_object_detection import (
  File "[sic]\lib\site-packages\transformers\loss\loss_for_object_detection.py", line 32, in <module>
    from transformers.image_transforms import center_to_corners_format
  File "[sic]\lib\site-packages\transformers\image_transforms.py", line 22, in <module>
    from .image_utils import (
  File "[sic]\lib\site-packages\transformers\image_utils.py", line 62, in <module>
    PILImageResampling.NEAREST: InterpolationMode.NEAREST_EXACT,
  File "C:\Users\tom\.conda\envs\setfit\lib\enum.py", line 429, in __getattr__
    raise AttributeError(name) from None
AttributeError: NEAREST_EXACT

I'm fine with upgrading my torchvision, but perhaps that should be required (via setup.py), and we really should be wary with hidden import errors.

  • Tom Aarsen

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants