Fix ONNX exports for Optimum compatible models #31311

amyeroberts merged 13 commits into huggingface:main from
Conversation
```python
def safe_int(x):
    return x.to(torch.int64) if torch.jit.is_tracing() else int(x)

old_grid_size = safe_int(posemb_grid.size(0) ** 0.5)
```
```python
new_height = (torch.ceil(orig_height / patch_height) * patch_height).to(torch.int64)
new_width = (torch.ceil(orig_width / patch_width) * patch_width).to(torch.int64)
```
Same comment as above - doesn't interpolate require (int, int) when not tracing?
I'll check tracing, thanks for the heads up
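For context on the question above: in eager (non-tracing) mode, `nn.functional.interpolate` expects plain Python ints in `size`, while under tracing the equivalent values can be kept as int64 tensors so they stay symbolic in the graph. A minimal sketch (the tensor shapes here are illustrative, not from the PR):

```python
import torch
import torch.nn.functional as F

x = torch.randn(1, 3, 8, 8)  # dummy feature map, just for illustration

# Eager mode: `size` is given as plain Python ints.
out = F.interpolate(x, size=(16, 16), mode="bicubic")
print(tuple(out.shape))  # (1, 3, 16, 16)
```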
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
amyeroberts
left a comment
Very nice! Thanks for fixing this for all these models ❤️
Just a few small comments
```python
raise TypeError(f"Could not infer framework from class {model_class}.")

def safe_int(x):
```
Docstrings would be helpful here e.g. for inspecting in IDEs: what does it mean for an int to be safe?
Indeed a better name is probably a good idea 😅 I called it safe_int in a way to "safely cast some value (which could be a python number or tensor) to an integer in a way that respects tracing"
I'll swap with torch_int and torch_float
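A minimal sketch of what such helpers could look like (the exact implementation that landed in `transformers` may differ):

```python
import torch

def torch_int(x):
    # Cast to an int64 tensor while tracing (keeps the value symbolic in
    # the exported graph), and to a plain Python int otherwise.
    return x.to(torch.int64) if torch.jit.is_tracing() and isinstance(x, torch.Tensor) else int(x)

def torch_float(x):
    # Same idea for floats.
    return x.to(torch.float32) if torch.jit.is_tracing() and isinstance(x, torch.Tensor) else float(x)

# Outside of tracing, both behave like the Python builtins:
print(torch_int(7.9))   # 7
print(torch_float(3))   # 3.0
```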
```python
new_width = int(math.ceil(orig_width / patch_width) * patch_width)
new_height = (
    safe_float(torch.ceil(orig_height / patch_height) * patch_height)
    if torch.jit.is_tracing()
```
Do we need the conditional here? This is already handled in the safe_float and safe_int functions
I think it's required for torch.ceil no?
tbh, I don't know, is there a reason we couldn't use torch.ceil directly?
if I'm passing an int or float, torch.ceil will be called first and it will fail because torch.ceil can only be called with tensors AFAIK
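This is easy to verify: `torch.ceil` only accepts tensors, which is why a plain Python number has to go through `math.ceil` in the non-tracing branch. A quick illustration:

```python
import math
import torch

# torch.ceil rejects plain Python numbers with a TypeError.
try:
    torch.ceil(3.5)
except TypeError:
    print("torch.ceil rejects Python floats")

print(math.ceil(3.5))                        # 4
print(torch.ceil(torch.tensor(3.5)).item())  # 4.0
```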
Only other Q here then is why do we use a float when tracing and int otherwise?
sorry I think I was mistaken with that one, you're right, I fixed it :)
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
@amyeroberts the failing tests seem irrelevant to this PR, I can't re-run them, can you re-run?
@merveenoyan yes yes - done!
amyeroberts
left a comment
Thanks for fixing all of these!
Just for my own understanding - is there any reason to not use the torch compatible float/int when we're not tracing?
@amyeroberts to my understanding, torch ONNX export internally calls
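A toy illustration of why the tensor path matters during export: calling Python `int()` on a traced value bakes it into the graph as a constant, so the trace only stays correct for the example input's shape (hypothetical module, just for demonstration):

```python
import torch

class Reshaper(torch.nn.Module):
    def forward(self, x):
        # int() collapses the traced shape to a constant, so the traced
        # graph silently reuses the width of the example input.
        w = int(x.shape[-1])
        return x.reshape(-1, w)

traced = torch.jit.trace(Reshaper(), torch.zeros(2, 4))
print(tuple(traced(torch.zeros(2, 4)).shape))  # (2, 4) - matches the trace input
print(tuple(traced(torch.zeros(2, 8)).shape))  # (4, 4) - silently wrong: 4 was baked in
```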
@amyeroberts can you merge if you think it's ok?
Right, I see why we need to do it for the onnx export, but for day-to-day use could we just use torch primitives instead of a python
@amyeroberts I guess if it's just torch modelling code then yes. Would you like me to swap everything?
also asking the same question to @xenova
@merveenoyan Yes please! This will be cleaner and easier to follow in the code :)
I agree with @amyeroberts - if there is a way to "do everything in torch land", that's the best solution! However, there are cases where I'm not entirely sure how to do this. For example, with

See here for example code (DinoV2 backbone):

```python
if torch.jit.is_tracing():
    sqrt_N = N ** 0.5
    patch_pos_embed = nn.functional.interpolate(
        patch_pos_embed.reshape(1, (sqrt_N).to(torch.int64), (sqrt_N).to(torch.int64), dim).permute(0, 3, 1, 2),
        size=(w0, h0),
        mode="bicubic",
        antialias=self.interpolate_antialias,
    )
else:
    sqrt_N = math.sqrt(N)
    sx, sy = float(w0) / sqrt_N, float(h0) / sqrt_N
    patch_pos_embed = nn.functional.interpolate(
        patch_pos_embed.reshape(1, int(sqrt_N), int(sqrt_N), dim).permute(0, 3, 1, 2),
        scale_factor=(sx, sy),
        mode="bicubic",
        antialias=self.interpolate_antialias,
    )
```

Very ugly... I know :/
@xenova sounds good, very glad to work with you! tbh I didn't know that it would be required in inference.
@merveenoyan My understanding from above was that the PR would be updated to remove all the if/else structures wherever possible (though, as @xenova points out, that unfortunately isn't possible everywhere)
@amyeroberts from what I understood we should still keep the if/else so as not to break inference (I'm also scared of edge cases, if there are any), so I'd rather keep them. What I can do is test all of them to see whether they break when everything is a tensor, and remove the conditional where it doesn't have to be a python type
@merveenoyan OK. Let's just merge then and we can follow up in future PRs 👍

@amyeroberts as discussed and also pinging @xenova for review :') (who also fixed DPT)
I prioritized Optimum compatible ones because I'm launching a project where there are Optimum examples for vision models. I will have a separate PR for the models that aren't compatible with Optimum. The rest of the Optimum compatible models export well without a problem.