[Tracking Issue][ONNX] Quantized operator support in ONNX importer #8838

### Summary

This issue tracks the ONNX importer's coverage of standard and non-standard quantized ops in TVM. It can also be used to coordinate work on quantized importer coverage across organizations.

### Status

As of Aug 24th 2021, we track both standard ONNX quantized ops and the non-standard quantized contrib ops introduced by ONNXRT, as shown in the tables below.

Shortlist of ops that are emitted by ONNXRT static quantization (higher priority), based on https://github.com/microsoft/onnxruntime/blob/master/onnxruntime/python/tools/quantization/registry.py:

| Operator | Standard | Opset | Status | Owner | PR |
|---|---|---|---|---|---|
| [ConvInteger](https://github.com/onnx/onnx/blob/master/docs/Operators.md#ConvInteger) | Y | 10 | Supported | @jwfromm | https://github.com/apache/tvm/pull/8456 |
| [DequantizeLinear](https://github.com/onnx/onnx/blob/master/docs/Operators.md#DequantizeLinear) | Y | 13, 10 | Supported | @mbrookhart | https://github.com/apache/tvm/pull/7802 |
| [MatMulInteger](https://github.com/onnx/onnx/blob/master/docs/Operators.md#MatMulInteger) | Y | 10 | WIP | @WenheLI |  |
| [QLinearConv](https://github.com/onnx/onnx/blob/master/docs/Operators.md#QLinearConv) | Y | 10 | Supported | @huochaitiantang | https://github.com/apache/tvm/pull/8007 |
| [QLinearMatMul](https://github.com/onnx/onnx/blob/master/docs/Operators.md#QLinearMatMul) | Y | 10 | Supported | @cconvey  | https://github.com/apache/tvm/pull/8952 |
| [QuantizeLinear](https://github.com/onnx/onnx/blob/master/docs/Operators.md#QuantizeLinear) | Y | 13, 10 | Supported | @mbrookhart | https://github.com/apache/tvm/pull/7802 |
| [com.microsoft.QAttention](https://github.com/microsoft/onnxruntime/blob/master/docs/ContribOperators.md#com.microsoft.QAttention) | N | n/a | TODO |  |  |
| [com.microsoft.QLinearAdd](https://github.com/microsoft/onnxruntime/blob/master/docs/ContribOperators.md#com.microsoft.QLinearAdd) | N | n/a | Supported | @mbrookhart | https://github.com/apache/tvm/pull/8305 |
| [com.microsoft.QLinearAveragePool](https://github.com/microsoft/onnxruntime/blob/master/docs/ContribOperators.md#com.microsoft.QLinearAveragePool) | N | n/a | Supported | @quic-sanirudh | https://github.com/apache/tvm/pull/9017 |
| [com.microsoft.QLinearConcat](https://github.com/microsoft/onnxruntime/blob/master/docs/ContribOperators.md#com.microsoft.QLinearConcat) | N | n/a | Supported | @anwang2009 | https://github.com/apache/tvm/pull/8907 |
| [com.microsoft.QLinearGlobalAveragePool](https://github.com/microsoft/onnxruntime/blob/master/docs/ContribOperators.md#com.microsoft.QLinearGlobalAveragePool) | N | n/a | Supported | @quic-sanirudh | https://github.com/apache/tvm/pull/9017 |
| [com.microsoft.QLinearLeakyRelu](https://github.com/microsoft/onnxruntime/blob/master/docs/ContribOperators.md#com.microsoft.QLinearLeakyRelu) | N | n/a | Supported | @gayatripk1  | https://github.com/apache/tvm/pull/9063 |
| [com.microsoft.QLinearMul](https://github.com/microsoft/onnxruntime/blob/master/docs/ContribOperators.md#com.microsoft.QLinearMul) | N | n/a | Supported | @anwang2009 | https://github.com/apache/tvm/pull/8773 |
| [com.microsoft.QLinearSigmoid](https://github.com/microsoft/onnxruntime/blob/master/docs/ContribOperators.md#com.microsoft.QLinearSigmoid) | N | n/a | Supported | @arangasa | https://github.com/apache/tvm/pull/9028 |
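
For readers new to these ops, the standard QuantizeLinear / DequantizeLinear pair in the table above follows the affine scheme described in the ONNX operator docs: `y = saturate(round(x / scale) + zero_point)` with ties rounded to even, and `x = (y - zero_point) * scale` on the way back. The following is a minimal pure-Python sketch of the per-tensor uint8 case, intended only as a reference when writing importer tests. The function names are hypothetical and are not part of TVM or ONNX.

```python
def quantize_linear(xs, scale, zero_point, qmin=0, qmax=255):
    """Per-tensor QuantizeLinear sketch (uint8 range by default)."""
    out = []
    for x in xs:
        # Python's round() rounds halves to the nearest even integer,
        # which matches the rounding mode described in the ONNX docs.
        q = round(x / scale) + zero_point
        # Saturate to the representable integer range.
        out.append(max(qmin, min(qmax, q)))
    return out


def dequantize_linear(qs, scale, zero_point):
    """Per-tensor DequantizeLinear sketch."""
    return [(q - zero_point) * scale for q in qs]


quantized = quantize_linear([0.0, 1.0, -1.0, 1000.0], scale=0.5, zero_point=128)
restored = dequantize_linear(quantized, scale=0.5, zero_point=128)
print(quantized, restored)
```

The last input saturates at 255, so it does not round-trip. Values inside the representable range round-trip to within half a quantization step.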

Ops for supporting ORT dynamic quantization:

| Operator | Standard | Opset | Status | Owner | PR |
|---|---|---|---|---|---|
| [DynamicQuantizeLinear](https://github.com/onnx/onnx/blob/master/docs/Operators.md#DynamicQuantizeLinear) | Y | 11 | Supported | @mbrookhart | https://github.com/apache/tvm/pull/7802 |
| [com.microsoft.DynamicQuantizeLSTM](https://github.com/microsoft/onnxruntime/blob/master/docs/ContribOperators.md#com.microsoft.DynamicQuantizeLSTM) | N | n/a | TODO |  |  |
| [com.microsoft.DynamicQuantizeMatMul](https://github.com/microsoft/onnxruntime/blob/master/docs/ContribOperators.md#com.microsoft.DynamicQuantizeMatMul) | N | n/a | WIP | @quic-sanirudh |  |
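
Dynamic quantization differs from the static case in that the scale and zero point are computed from the input at run time. As described in the ONNX operator docs, DynamicQuantizeLinear widens the input range to include 0, derives the scale from that range, and then quantizes to uint8. Below is a minimal pure-Python sketch of that computation. The function name is hypothetical, and the handling of an all-zero input is an assumption made here to avoid a division by zero, not something taken from the spec.

```python
def dynamic_quantize_linear(xs, qmin=0, qmax=255):
    """DynamicQuantizeLinear sketch: returns (y, y_scale, y_zero_point)."""
    # The data range is adjusted to include 0 so that 0.0 maps to an integer.
    x_min = min(0.0, min(xs))
    x_max = max(0.0, max(xs))
    scale = (x_max - x_min) / (qmax - qmin)
    if scale == 0.0:
        # Assumption for this sketch only: an all-zero input maps to qmin.
        return [qmin] * len(xs), 0.0, qmin

    def saturate(v):
        return max(qmin, min(qmax, v))

    zero_point = round(saturate(qmin - x_min / scale))
    ys = [saturate(round(x / scale) + zero_point) for x in xs]
    return ys, scale, zero_point


ys, scale, zero_point = dynamic_quantize_linear([0.0, 2.0, -3.0])
print(ys, scale, zero_point)
```

This sketch uses Python floats (float64), whereas the operator works on float32, so values that land exactly on a rounding boundary may differ by one quantization step from a runtime's output.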

Other integer ops that might be relevant:

| Operator | Standard | Opset | Status | Owner | PR |
|---|---|---|---|---|---|
| [com.microsoft.MatMulInteger16](https://github.com/microsoft/onnxruntime/blob/master/docs/ContribOperators.md#com.microsoft.MatMulInteger16) | N | n/a | Supported | @abhikran-quic | https://github.com/apache/tvm/pull/9186 |
| [com.microsoft.MatMulIntegerToFloat](https://github.com/microsoft/onnxruntime/blob/master/docs/ContribOperators.md#com.microsoft.MatMulIntegerToFloat) | N | n/a  | WIP | @onkar-sima-ai |  |
| [com.microsoft.MulInteger](https://github.com/microsoft/onnxruntime/blob/master/docs/ContribOperators.md#com.microsoft.MulInteger) | N | n/a | WIP | @tasmia-rahman  |  |
| [com.microsoft.ReduceSumInteger](https://github.com/microsoft/onnxruntime/blob/master/docs/ContribOperators.md#com.microsoft.ReduceSumInteger) | N | n/a | WIP |  @FranckQC  |  |

Other ops:

| Operator | Standard | Opset | Status | Owner | PR |
|---|---|---|---|---|---|
| [com.microsoft.QGemm](https://github.com/microsoft/onnxruntime/blob/master/docs/ContribOperators.md#com.microsoft.QGemm) | N | n/a | WIP | @rasagna-quic |  |
| [com.microsoft.QLinearReduceMean](https://github.com/microsoft/onnxruntime/blob/master/docs/ContribOperators.md#com.microsoft.QLinearReduceMean) | N | n/a | WIP | @avquicinc |  |



### Coordination

Improving the importers can be a good onboarding task for engineers who would like more in-depth exposure to the TVM stack. The goal is that anyone who claims an operator can be confident their work won't be duplicated or superseded by work that is already in flight.

We provide reference PRs that can serve as templates: https://github.com/apache/tvm/pull/7802 by @mbrookhart for adding a standard quantized op, and https://github.com/apache/tvm/pull/8773 by @anwang2009 for adding a non-standard op from the Microsoft ContribOperators set in ONNXRT.
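
The reference PRs remain the authoritative templates. As a rough mental model of what a QLinear-style contrib op computes, one can think of it as dequantizing the inputs, applying the float op, and requantizing the result. The pure-Python sketch below applies that idea to QLinearMul for per-tensor uint8 inputs. It is only a semantic reference for checking expected outputs: the function name is hypothetical and it does not use or mirror the TVM importer API.

```python
def qlinear_mul(a, a_scale, a_zp, b, b_scale, b_zp, c_scale, c_zp,
                qmin=0, qmax=255):
    """QLinearMul sketch: dequantize, multiply in float, requantize."""
    out = []
    for qa, qb in zip(a, b):
        # Dequantize both operands to real values.
        fa = (qa - a_zp) * a_scale
        fb = (qb - b_zp) * b_scale
        # Requantize the float product with the output scale and zero point.
        q = round(fa * fb / c_scale) + c_zp
        # Saturate to the output integer range.
        out.append(max(qmin, min(qmax, q)))
    return out


result = qlinear_mul([130, 126], 0.5, 128, [4, 8], 0.25, 0, c_scale=0.5, c_zp=10)
print(result)
```

Here the inputs dequantize to `[1.0, -1.0]` and `[1.0, 2.0]`, so the float products `[1.0, -2.0]` requantize to `[12, 6]` with the chosen output parameters.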

Please comment in this issue if you'd like to add to Relay ONNX importer coverage so I can update the table.



cc @KJlaccHoeUM9l @ehsanmok
