
[Feature Request] Support for quantized matmul for > 2d inputs for ONNX models.  #12701

@ramanps05

Description


I am getting the following error while compiling a quantized ONNX model:

```
File "python/tvm/relay/frontend/onnx.py", in _impl_v10
    assert (a_rank == 2) and (b_rank == 2)
AssertionError: QLinearMatMul importer currently requires both 'a' and 'b' tensors to be 2D, but rank(a)=3, rank(b)=3
```

It looks like this occurs because the quantized matmul operation is not supported for inputs with rank greater than 2. A TODO comment in the code mentions that this limitation would be removed in a future update.

Is there currently any plan to enable this feature?

We need this to deploy a model. If it's not a priority, is there a workaround for this problem, or something we can do to resolve it ourselves?
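For context, here is a minimal numpy sketch of one workaround idea I am considering: flattening the batch dimension of the 3D activation so the matmul itself is 2D, then restoring the shape afterwards. This only applies when the weight tensor `b` is effectively 2D (i.e. shared across the batch); the shapes and names below are illustrative, not taken from the actual model.

```python
import numpy as np

# Illustrative shapes: activation a is (batch, m, k); weight b is (k, n)
# and shared across the batch (the case where b's leading dim broadcasts).
batch, m, k, n = 4, 8, 16, 32
rng = np.random.default_rng(0)
a = rng.integers(0, 255, size=(batch, m, k), dtype=np.uint8)
b = rng.integers(0, 255, size=(k, n), dtype=np.uint8)

# Reference: batched matmul on the 3D input, with int32 accumulation
# as quantized matmul kernels typically use.
ref = np.matmul(a.astype(np.int32), b.astype(np.int32))

# Workaround: collapse the batch dim so the matmul is 2D, then reshape back.
flat = a.reshape(batch * m, k).astype(np.int32) @ b.astype(np.int32)
out = flat.reshape(batch, m, n)

assert np.array_equal(ref, out)
```

If this equivalence holds for the model, the same reshape/matmul/reshape pattern could in principle be applied to the ONNX graph (or the Relay graph) around the QLinearMatMul node, so that the importer only ever sees 2D operands. Whether that is practical for this model, I am not sure.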
