[poc][v1.x] ONNX Add an option to un-interleave BERT #20249
base: v1.x
Conversation
Hey @Zha0q1, thanks for submitting the PR.
CI supported jobs: [clang, miscellaneous, unix-gpu, edge, centos-cpu, unix-cpu, centos-gpu, windows-cpu, website, windows-gpu, sanity]
# coding: utf-8
"""ONNX export op translation"""

from . import _gluonnlp_bert
Would this logic benefit TRT specifically? If so, shall we consider naming it something like '_gluonnlp_bert_trt'?
It probably makes more sense to have gluonnlp export a BERT graph that doesn't involve interleaving than to do it in mxnet. The framework shouldn't have knowledge about, or rely on, the implementation of its ecosystem packages.
model_specific_logics : str
    Specifies if model-specific conversion logic should be used. Refer to ./_op_translations/
cheat_sheet : dict of str to str
    This is a dict that stores some hyperparameter values or additional info about the model that
    would be used in model-specific conversion functions
these options are semantically unclear and hard to maintain
I agree. Alternatively, we might be able to make mx2onnx support loading custom conversion functions dynamically and put those functions in gluonnlp. Or, if even that is too hacky, this PR can serve as a PoC to evaluate the performance benefit with TRT 8.0 by un-interleaving the matrix multiplication.
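One hedged sketch of that alternative, without assuming anything about mx2onnx internals: the exporter could keep a small registry of conversion functions and dynamically import a module (e.g. one shipped by gluonnlp) that populates it. All names below are hypothetical, not existing mx2onnx API.

```python
# Hypothetical plugin-style registry; none of these names exist in mx2onnx today.
import importlib

CUSTOM_CONVERTERS = {}

def register_converter(op_name):
    """Decorator an external package would use to override one op's conversion."""
    def wrapper(func):
        CUSTOM_CONVERTERS[op_name] = func
        return func
    return wrapper

def load_custom_converters(module_path):
    """Import a module whose top-level code calls register_converter(...)."""
    importlib.import_module(module_path)

# During export the graph builder would check CUSTOM_CONVERTERS before its
# built-in translation table, e.g.:
#   load_custom_converters("gluonnlp.onnx_conversions")   # hypothetical module
```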
@Zha0q1 What is the difficulty in doing this transformation after the export? You should be able to modify the ONNX graph itself, right?
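For context, a post-export rewrite along those lines would edit the exported ONNX file directly. Below is only a minimal sketch of the weight-reordering part, assuming a gluonnlp-style per-head interleaved [q, k, v] row layout; the initializer naming, num_heads, and the fact that downstream Reshape/Transpose nodes would also need rewriting are assumptions, not taken from this PR.

```python
import onnx
from onnx import numpy_helper

# Sketch of an after-export rewrite: regroup a QKV projection weight from
# per-head interleaved [q, k, v] rows into [Q; K; V] blocks. The initializer
# name, num_heads, and the layout are placeholders; a full rewrite would also
# have to adjust the surrounding Reshape/Transpose nodes.
model = onnx.load('bert.onnx')
num_heads = 12

for init in model.graph.initializer:
    if 'qkv_weight' not in init.name:                 # placeholder naming heuristic
        continue
    w = numpy_helper.to_array(init)                   # shape (3 * hidden, hidden)
    rows, cols = w.shape
    head_dim = rows // (3 * num_heads)
    w = w.reshape(num_heads, 3, head_dim, cols)       # per-head interleaved rows
    w = w.transpose(1, 0, 2, 3).reshape(rows, cols)   # now [Q; K; V] blocks
    init.CopyFrom(numpy_helper.from_array(w, init.name))

onnx.save(model, 'bert_uninterleaved.onnx')
```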
Add model-specific (BERT) logic to un-interleave the self-attention matmul. This can potentially speed up inference with TRT 8.0, whose compiler can recognize the new pattern.
Default usage:
model_specific_logics='gluonnlp_bert'
When the model is not BERT base (meaning hidden != 768 or num_heads != 12), e.g. BERT large, the usage is:
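Purely as a hypothetical sketch (the cheat_sheet keys, input shapes, and file names below are assumptions; only the model_specific_logics and cheat_sheet parameters come from this PR, and the export entry point is assumed to be the v1.x mx2onnx exporter), the export call might look roughly like:

```python
import numpy as np
import mxnet as mx

# Hypothetical sketch of exporting BERT large with the new options; the
# cheat_sheet keys, input shapes, and file names are assumptions.
mx.onnx.export_model(
    'bert_large-symbol.json', 'bert_large-0000.params',
    in_shapes=[(1, 128), (1, 128), (1,)],
    in_types=[np.float32, np.float32, np.float32],
    onnx_file_path='bert_large.onnx',
    model_specific_logics='gluonnlp_bert',
    cheat_sheet={'hidden_size': '1024', 'num_heads': '16'},  # assumed key names
)
```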
This option to un-interleave self-attention would also work with BERT variants such as RoBERTa, DistilBERT, and ERNIE.
The first screenshot is the old graph, the second is the new graph.


Note that use of onnx-sim is required.
@TristonC @MoisesHer @waytrue17 @josephevans
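Assuming onnx-sim refers to the onnx-simplifier package, that post-export pass could look like this minimal sketch (file names are placeholders):

```python
import onnx
from onnxsim import simplify   # pip install onnx-simplifier

# Run onnx-simplifier on the exported graph; file names are placeholders.
model = onnx.load('bert.onnx')
simplified, ok = simplify(model)
assert ok, "onnx-simplifier failed to validate the simplified model"
onnx.save(simplified, 'bert.sim.onnx')
```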