[ONNX] Collect quant params of pre-quantized ONNX and generate qnn op #7937
When importing pre-quantized ONNX models, nodes (such as Conv, Gemm, Add, Mul, Sub) that sit between QuantizeLinear and DequantizeLinear should be quantized. This PR does the following:
1. Collect quantization parameters for the nodes to be quantized, i.e. those located between QuantizeLinear and DequantizeLinear.
2. Generate the corresponding qnn ops (such as qnn.conv2d, qnn.dense, qnn.add, qnn.mul, qnn.subtract) for the nodes that can be quantized.
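
The two steps can be illustrated with a minimal, framework-free sketch. This is not the code in this PR: the `Node` class, `collect_quant_params`, `plan_qnn_ops`, and the `consts` dictionary are hypothetical stand-ins for the ONNX graph and the frontend's internals, and the op mapping only mirrors the names listed above. It assumes per-tensor scale and zero point and treats a missing zero point as 0.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Mapping taken from the op names in the PR description (illustrative only).
ONNX_TO_QNN: Dict[str, str] = {
    "Conv": "qnn.conv2d",
    "Gemm": "qnn.dense",
    "Add": "qnn.add",
    "Mul": "qnn.mul",
    "Sub": "qnn.subtract",
}


@dataclass
class Node:
    """Hypothetical stand-in for an ONNX node: op type plus tensor names."""
    op_type: str
    inputs: List[str]
    outputs: List[str]


QuantParams = Tuple[float, int]  # (scale, zero_point)


def _scale_and_zero_point(node: Node, consts: Dict[str, float]) -> QuantParams:
    # Inputs are (x, scale[, zero_point]); the zero point is optional.
    scale = consts[node.inputs[1]]
    zero_point = int(consts[node.inputs[2]]) if len(node.inputs) > 2 else 0
    return scale, zero_point


def collect_quant_params(nodes: List[Node],
                         consts: Dict[str, float]) -> Dict[str, QuantParams]:
    """Step 1: record (scale, zero_point) for every quantized tensor."""
    params: Dict[str, QuantParams] = {}
    for node in nodes:
        if node.op_type == "QuantizeLinear":
            # The output of QuantizeLinear is a quantized tensor.
            params[node.outputs[0]] = _scale_and_zero_point(node, consts)
        elif node.op_type == "DequantizeLinear":
            # The data input of DequantizeLinear is a quantized tensor.
            params[node.inputs[0]] = _scale_and_zero_point(node, consts)
    return params


def plan_qnn_ops(nodes: List[Node],
                 params: Dict[str, QuantParams]) -> Dict[str, str]:
    """Step 2: pick a qnn op for each supported node whose first input
    and first output both have quantization parameters."""
    plan: Dict[str, str] = {}
    for node in nodes:
        qnn_op = ONNX_TO_QNN.get(node.op_type)
        if qnn_op is None:
            continue
        if node.inputs[0] in params and node.outputs[0] in params:
            plan[node.outputs[0]] = qnn_op
    return plan


# Toy graph: x -> QuantizeLinear -> Conv -> DequantizeLinear -> Add (float)
consts = {"s_in": 0.5, "zp_in": 0, "s_out": 0.25, "zp_out": 128}
nodes = [
    Node("QuantizeLinear", ["x", "s_in", "zp_in"], ["xq"]),
    Node("Conv", ["xq", "w"], ["yq"]),
    Node("DequantizeLinear", ["yq", "s_out", "zp_out"], ["y"]),
    Node("Add", ["y", "b"], ["z"]),  # float inputs, so left unquantized
]
params = collect_quant_params(nodes, consts)
plan = plan_qnn_ops(nodes, params)
```

In this toy graph only the Conv node is planned as `qnn.conv2d`; the trailing Add consumes a dequantized tensor and stays a regular float op.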