I am getting the following error while compiling a quantized ONNX model:
File "python/tvm/relay/frontend/onnx.py" -
in _impl_v10
assert (a_rank == 2) and (b_rank == 2), (AssertionError: QLinearMatMul importer currently requires both 'a' and 'b' tensors to be 2D, but ran(a)=3, rank(b)=3
It looks like this is occurring because the quantized matmul (QLinearMatMul) operation is not supported for inputs with rank greater than 2.
The TODO comments in the importer mention that this limitation would be removed in a future update.
Is there currently any plan to enable this feature?
We need this to deploy a model. If it's not a priority, is there a workaround for this problem, or something we can do on our side to resolve it?
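
One workaround I am considering is rewriting each QLinearMatMul node into DequantizeLinear -> MatMul -> QuantizeLinear before handing the model to TVM, since I believe the plain MatMul importer handles N-D inputs. Below is only a rough sketch under my own assumptions (the file paths are placeholders, and it assumes the standard 8-input QLinearMatMul form); it also gives up the fused int8 matmul in exchange for importability:

```python
# Sketch: rewrite QLinearMatMul -> DequantizeLinear/MatMul/QuantizeLinear
# so that the matmul itself is a regular N-D float MatMul.
# "model.onnx" / "model_rewritten.onnx" are placeholder paths.
import onnx
from onnx import helper

model = onnx.load("model.onnx")
graph = model.graph

new_nodes = []
for node in graph.node:
    if node.op_type != "QLinearMatMul":
        new_nodes.append(node)
        continue

    # QLinearMatMul inputs: a, a_scale, a_zero_point, b, b_scale,
    # b_zero_point, y_scale, y_zero_point; single output y.
    a, a_scale, a_zp, b, b_scale, b_zp, y_scale, y_zp = node.input
    y = node.output[0]

    # Dequantize both operands, run a regular MatMul, then re-quantize
    # the result with the original output scale/zero-point.
    new_nodes.append(helper.make_node(
        "DequantizeLinear", [a, a_scale, a_zp], [y + "_a_deq"]))
    new_nodes.append(helper.make_node(
        "DequantizeLinear", [b, b_scale, b_zp], [y + "_b_deq"]))
    new_nodes.append(helper.make_node(
        "MatMul", [y + "_a_deq", y + "_b_deq"], [y + "_mm"]))
    new_nodes.append(helper.make_node(
        "QuantizeLinear", [y + "_mm", y_scale, y_zp], [y]))

del graph.node[:]
graph.node.extend(new_nodes)
onnx.checker.check_model(model)
onnx.save(model, "model_rewritten.onnx")
```

This keeps the original quantization parameters but executes the matmul in float, so there may be an accuracy/performance difference versus a true int8 kernel. The other option I can think of is patching _impl_v10 in onnx.py to reshape the inputs to 2D and back, but I have not tried that yet.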