
Export TensorFlow models to ONNX with dynamic input shapes#19255

Merged
sgugger merged 8 commits into huggingface:main from dwyatte:tensorflow_onnx_dynamic_shape
Oct 7, 2022

Conversation

@dwyatte
Contributor

@dwyatte dwyatte commented Sep 30, 2022

What does this PR do?

This PR exports TensorFlow models to ONNX with dynamic input shapes. Previously, they were exported with static input shapes (a batch size of 2 and a sequence length of 8). This should bring the TensorFlow-to-ONNX export mostly into parity with the PyTorch export.

Fixes #19238

  • While fixing this, I noticed the TensorFlow-to-ONNX export tests weren't actually exporting TensorFlow models, because FeaturesManager.get_model_class_for_feature returns a PyTorch model class by default. I've exposed a framework argument on these tests so that FeaturesManager.get_model_class_for_feature can return TensorFlow models. NOTE: Exporting TensorFlow to ONNX seems to be much slower than exporting PyTorch to ONNX, so CI duration will increase.
  • I've changed validate_model_outputs to check with a batch size/sequence length different from those used during export (now 3 and 9, respectively). There was a TODO about this, but it surfaced an error for BERT, CamemBERT, and RoBERTa multiple-choice tasks:

        onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Add node. Name:'tf_bert_for_multiple_choice/bert/encoder/layer_._0/attention/self/add_1' Status Message: /Users/runner/work/1/s/onnxruntime/core/providers/cpu/math/element_wise_ops.h:503 void onnxruntime::BroadcastIterator::Init(ptrdiff_t, ptrdiff_t) axis == 1 || axis == largest was false. Attempting to broadcast an axis by a dimension other than 1

    I suspect this is due to the way these models are defined (tracing fails to properly infer a shape somewhere). IMO this is still a net improvement, since the ONNX models exported under TensorFlow were previously non-functional except with their static input shapes. I'm skipping these specific configurations during testing for now, but someone should look into this.
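The difference between the old static export and the new dynamic one can be pictured as a shape-compatibility check: an integer fixes an axis at export time, while None leaves it dynamic. This is a toy illustration only (the helper below is hypothetical, not part of transformers):

```python
# Toy model of ONNX input-shape declarations:
# an int fixes an axis at export time, None leaves it dynamic.
def shape_compatible(declared, actual):
    """True if a runtime shape satisfies the declared input signature."""
    return len(declared) == len(actual) and all(
        d is None or d == a for d, a in zip(declared, actual)
    )

static_sig = (2, 8)         # old TF export: batch_size=2, seq_len=8 baked in
dynamic_sig = (None, None)  # this PR: both axes left dynamic

print(shape_compatible(static_sig, (3, 9)))   # False: static export rejects other shapes
print(shape_compatible(dynamic_sig, (3, 9)))  # True: dynamic export accepts them
```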

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

If you know how to use git blame, that is the easiest way, otherwise, here is a rough guide of who to tag.
Please tag fewer than 3 people.

@Rocketknight1, @LysandreJik, @lewtun

@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Sep 30, 2022

The documentation is not available anymore as the PR was closed or merged.

Member

@lewtun lewtun left a comment


Thanks a lot for enabling dynamic input shapes and ensuring our tests actually test the TF exports @dwyatte !

Can you please confirm that the slow tests pass by running:

RUN_SLOW=1 pytest tests/onnx/test_onnx_v2.py

It would also be interesting to know how much slower the TF exports are compared to the PyTorch ones, e.g. can you share some timings for a few models?

    -reference_model_inputs = config.generate_dummy_inputs(preprocessor, framework=TensorType.PYTORCH)
    +reference_model_inputs = config.generate_dummy_inputs(
    +    preprocessor,
    +    batch_size=config.default_fixed_batch + 1,
Member


Nice, simple idea!

@dwyatte
Contributor Author

dwyatte commented Oct 1, 2022

Can you please confirm that the slow tests pass by running RUN_SLOW=1 pytest tests/onnx/test_onnx_v2.py

There were a few failures here (16 failed, 400 passed, 16 skipped, 72972 warnings):

FAILED tests/onnx/test_onnx_v2.py::OnnxExportTestCaseV2::test_pytorch_export_029_clip_default - TypeError: generate_dummy_inputs() got an unexpected keyword argument 'batch_size'
FAILED tests/onnx/test_onnx_v2.py::OnnxExportTestCaseV2::test_pytorch_export_051_deberta_v2_question_answering - onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : Deserialize tensor onn...
FAILED tests/onnx/test_onnx_v2.py::OnnxExportTestCaseV2::test_pytorch_export_050_deberta_v2_multiple_choice - AssertionError: deberta-v2, multiple-choice -> Outputs values doesn't match between reference model and ONN...
FAILED tests/onnx/test_onnx_v2.py::OnnxExportTestCaseV2::test_pytorch_export_078_groupvit_default - TypeError: generate_dummy_inputs() got an unexpected keyword argument 'batch_size'
FAILED tests/onnx/test_onnx_v2.py::OnnxExportTestCaseV2::test_pytorch_export_109_owlvit_default - TypeError: generate_dummy_inputs() got an unexpected keyword argument 'batch_size'
FAILED tests/onnx/test_onnx_v2.py::OnnxExportTestCaseV2::test_pytorch_export_110_perceiver_image_classification - onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTIO...
FAILED tests/onnx/test_onnx_v2.py::OnnxExportTestCaseV2::test_pytorch_export_111_perceiver_masked_lm - onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zer...
FAILED tests/onnx/test_onnx_v2.py::OnnxExportTestCaseV2::test_pytorch_export_112_perceiver_sequence_classification - onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEP...
FAILED tests/onnx/test_onnx_v2.py::OnnxExportTestCaseV2::test_pytorch_export_on_cuda_078_groupvit_default - TypeError: generate_dummy_inputs() got an unexpected keyword argument 'batch_size'
FAILED tests/onnx/test_onnx_v2.py::OnnxExportTestCaseV2::test_pytorch_export_125_roformer_multiple_choice - onnxruntime.capi.onnxruntime_pybind11_state.InvalidArgument: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Got ...
FAILED tests/onnx/test_onnx_v2.py::OnnxExportTestCaseV2::test_pytorch_export_on_cuda_029_clip_default - TypeError: generate_dummy_inputs() got an unexpected keyword argument 'batch_size'
FAILED tests/onnx/test_onnx_v2.py::OnnxExportTestCaseV2::test_pytorch_export_on_cuda_125_roformer_multiple_choice - onnxruntime.capi.onnxruntime_pybind11_state.InvalidArgument: [ONNXRuntimeError] : 2 : INVALID_ARGUMEN...
FAILED tests/onnx/test_onnx_v2.py::OnnxExportTestCaseV2::test_pytorch_export_on_cuda_109_owlvit_default - TypeError: generate_dummy_inputs() got an unexpected keyword argument 'batch_size'
FAILED tests/onnx/test_onnx_v2.py::OnnxExportTestCaseV2::test_pytorch_export_on_cuda_110_perceiver_image_classification - onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_...
FAILED tests/onnx/test_onnx_v2.py::OnnxExportTestCaseV2::test_pytorch_export_on_cuda_111_perceiver_masked_lm - onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION :...
FAILED tests/onnx/test_onnx_v2.py::OnnxExportTestCaseV2::test_pytorch_export_on_cuda_112_perceiver_sequence_classification - onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTI...
  • clip, groupvit, and owlvit should be easy fixes to expose the relevant args (or consume via **kwargs) in their generate_dummy_inputs
  • deberta is failing in my environment even on [49d62b0](https://github.com/dwyatte/transformers/commit/49d62b01783416a89acc0b865f7cb8dbab87cd6b), the commit I branched from
  • perceiver and roformer are real errors, but they seem to be due to static input shapes, e.g.:
E           onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Reshape node. Name:'Reshape_57' Status Message: /Users/runner/work/1/s/onnxruntime/core/providers/cpu/tensor/reshape_helper.h:41 onnxruntime::ReshapeHelper::ReshapeHelper(const onnxruntime::TensorShape &, onnxruntime::TensorShapeVector &, bool) gsl::narrow_cast<int64_t>(input_shape.Size()) == size was false. The input tensor cannot be reshaped to the requested shape. Input shape:{3,256,256}, requested shape:{2,256,8,32}
E           onnxruntime.capi.onnxruntime_pybind11_state.InvalidArgument: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Got invalid dimensions for input: token_type_ids for the following indices
E            index: 2 Got: 17 Expected: 15
E            Please fix either the inputs or the model.
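The Reshape failure above is an element-count mismatch: the traced graph baked in the export-time batch size of 2, so the requested shape no longer holds the same number of elements as the runtime input built with batch size 3. The arithmetic checks out directly:

```python
from math import prod

input_shape = (3, 256, 256)        # runtime input, batch_size=3
requested_shape = (2, 256, 8, 32)  # shape baked into the Reshape node at export, batch_size=2

# A reshape is only valid when the element counts match.
print(prod(input_shape))      # 196608
print(prod(requested_shape))  # 131072 -> mismatch, so onnxruntime raises
```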

A couple of options:

  • Disable tests for deberta, perceiver, and roformer for this PR while we figure out what's going on there
  • Don't include the code that automatically adds 1 to the batch size and sequence length during validation in this PR
  • Refactor the code to pass in batch_size and seq_length to validate_model_outputs to give more control over which models are tested with dynamic input shapes

What do you think / any other ideas?

@dwyatte
Contributor Author

dwyatte commented Oct 1, 2022

It would also be interesting to know how much slower the TF exports are compared to the PyTorch ones, e.g. can you share some timings for a few models?

  • bert-base-cased

    • TensorFlow: 7 passed, 6 skipped, 19031 warnings in 525.40s (0:08:45)
    • PyTorch: 7 passed, 6 skipped, 9 warnings in 83.50s (0:01:23)
  • hf-internal-testing/tiny-albert

    • TensorFlow: 6 passed, 6 skipped, 4059 warnings in 28.33s
    • PyTorch: 6 passed, 6 skipped, 9 warnings in 14.30s
  • distilbert-base-cased

    • TensorFlow: 6 passed, 6 skipped, 10241 warnings in 293.32s (0:04:53)
    • PyTorch: 6 passed, 6 skipped, 15 warnings in 40.01s

So TF is around 2-8x slower on my machine (2.3 GHz 8-Core Intel Core i9). The warnings are mainly deprecation warnings from tf2onnx
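The quoted 2-8x range follows directly from the wall-clock times above:

```python
# TF vs. PyTorch export/test wall-clock times from the runs above, in seconds.
timings = {
    "bert-base-cased": (525.40, 83.50),
    "hf-internal-testing/tiny-albert": (28.33, 14.30),
    "distilbert-base-cased": (293.32, 40.01),
}
for model, (tf_s, pt_s) in timings.items():
    print(f"{model}: TF {tf_s / pt_s:.1f}x slower")
# bert-base-cased: TF 6.3x slower
# hf-internal-testing/tiny-albert: TF 2.0x slower
# distilbert-base-cased: TF 7.3x slower
```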

@dwyatte
Contributor Author

dwyatte commented Oct 3, 2022

@lewtun any further thoughts on this PR with the goal of supporting dynamic input shapes in ONNX models exported from TensorFlow?

It's not clear to me how tests/onnx/test_onnx_v2.py is used since it doesn't block checks here. Should we skip model/task/framework configurations known to fail a la

    # ONNX inference fails on bert, camembert, and roberta multiple-choice when exported with TensorFlow.
    # Skip for now
    if name in ("bert", "camembert", "roberta") and feature == "multiple-choice" and framework == "tf":
        return
Or is it ok to leave the failures if they don't block anything? Is the increase in test time for TF models a concern if it doesn't run regularly?

I suppose part of the answer is whether we want users to experience export failures related to dynamic shapes (which the current code in this PR would do) vs removing explicit dynamic shape validation from the user experience and limiting it to tests.
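One way to keep such skips manageable as more failing combinations turn up is a data-driven deny-list instead of a growing chain of conditionals. This is a sketch only, with hypothetical names, not the actual test-suite code:

```python
# Known-failing (model, feature, framework) combinations, kept as data
# so adding a new entry is a one-line change. Names are illustrative.
SKIP_CONFIGS = {
    ("bert", "multiple-choice", "tf"),
    ("camembert", "multiple-choice", "tf"),
    ("roberta", "multiple-choice", "tf"),
}

def should_skip(name: str, feature: str, framework: str) -> bool:
    """Return True for configurations with known ONNX inference failures."""
    return (name, feature, framework) in SKIP_CONFIGS

print(should_skip("bert", "multiple-choice", "tf"))  # True
print(should_skip("bert", "multiple-choice", "pt"))  # False
```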

@lewtun
Member

lewtun commented Oct 5, 2022

Hey @dwyatte, thanks for sharing the timings! I'm currently working on dramatically shrinking all the ONNX models we use for internal testing, so a 2-8x slowdown for some models is probably OK.

Regarding how to handle the model validation:

A couple of options:

  • Disable tests for deberta, perceiver, and roformer for this PR while we figure out what's going on there
  • Don't include the code that automatically adds 1 to the batch size and sequence length during validation in this PR
  • Refactor the code to pass in batch_size and seq_length to validate_model_outputs to give more control over which models are tested with dynamic input shapes

I am in favour of option (1) and creating a separate issue to figure out what's wrong in the ONNX export of these 3 models. You can skip these tests by following the same logic you linked to above :)

@dwyatte
Contributor Author

dwyatte commented Oct 5, 2022

I am in favour of option (1) and creating a separate issue to figure out what's wrong in the ONNX export of these 3 models. You can skip these tests by following the same logic you linked to above

Created #19357 to track this. tests/onnx/test_onnx_v2.py should now be 100% passing/skipped (416 passed, 16 skipped in my env)

@dwyatte dwyatte requested a review from lewtun October 5, 2022 18:30
Member

@lewtun lewtun left a comment


Thanks a lot for opening an issue to track the problematic ONNX models @dwyatte 🔥 !

This PR LGTM, so gently pinging @sgugger for final approval.

For context in the review: @dwyatte uncovered some edge cases that our ONNX tests didn't cover. This PR currently skips the problematic model heads and we decided to tackle them in a separate issue, since this one is focused on enabling dynamic shapes for TF models

Collaborator

@sgugger sgugger left a comment


LGTM, thanks a lot for working on this!

@sgugger
Collaborator

sgugger commented Oct 7, 2022

There was some problem with CircleCI which only ran part of the test suite (and I can't manually re-run it). Could you push an empty commit on your branch (git commit -m "Trigger CI" --allow-empty)?

@dwyatte
Contributor Author

dwyatte commented Oct 7, 2022

There was some problem with CircleCI which only ran part of the test suite (and I can't manually re-run it). Could you push an empty commit on your branch (git commit -m "Trigger CI" --allow-empty)?

I think I was having the same problem described here #18351 (comment)

9496836 ran the CI under the huggingface org, so should be good to go now

@sgugger sgugger merged commit a26d71d into huggingface:main Oct 7, 2022
@sgugger
Collaborator

sgugger commented Oct 7, 2022

Thanks!

@dwyatte dwyatte deleted the tensorflow_onnx_dynamic_shape branch January 31, 2023 18:01


Development

Successfully merging this pull request may close these issues.

Exporting TensorFlow models to ONNX exports with a static batch size of 2 and sequence length of 8
