PR #7348 removed `broadcast_to` before `batch_matmul` because `batch_matmul` already supports implicit broadcasting. However, the cuBLAS implementation was not updated accordingly, which causes the following case to fail:
```python
import numpy as np
import tvm
from tvm import relay
from tvm.contrib import graph_runtime

sa = (4, 128, 768)
sb = (1, 768, 768)
a = relay.var("a", shape=sa)
b = relay.var("b", shape=sb)
c = relay.nn.batch_matmul(a, b)
f = relay.Function([a, b], c)
mod = tvm.ir.IRModule.from_expr(f)
mod = relay.transform.InferType()(mod)
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target="cuda")  # changing target to "cuda -libs=cublas" fails
ctx = tvm.gpu(0)
m = graph_runtime.GraphModule(lib["default"](ctx))
p = np.random.uniform(0, 1, sa)
q = np.random.uniform(0, 1, sb)
m.set_input("a", p)
m.set_input("b", q)
ftimer = m.module.time_evaluator("run", ctx, number=1, repeat=10)
prof_res = np.array(ftimer().results) * 1000
print(np.mean(prof_res))
```

I guess we need to either add the `broadcast_to` back or support implicit broadcasting in the cuBLAS implementation.
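For reference, the implicit broadcasting the repro relies on can be sketched in plain NumPy (a hedged illustration of the intended semantics, not TVM's implementation; it assumes `nn.batch_matmul`'s convention of computing `A @ B^T` per batch, with a batch dimension of 1 broadcast against the other operand):

```python
import numpy as np

# Shapes from the repro above: batch 4 vs. batch 1.
a = np.random.uniform(0, 1, (4, 128, 768)).astype("float32")
b = np.random.uniform(0, 1, (1, 768, 768)).astype("float32")

# Explicit broadcast_to: what the Relay graph contained before PR #7348.
b_explicit = np.broadcast_to(b, (4, 768, 768))
out_explicit = np.matmul(a, b_explicit.transpose(0, 2, 1))

# Implicit broadcasting: np.matmul broadcasts the batch dimension itself,
# which is what batch_matmul is expected to do after the PR.
out_implicit = np.matmul(a, b.transpose(0, 2, 1))

assert out_implicit.shape == (4, 128, 768)
assert np.allclose(out_explicit, out_implicit)
```

Both paths produce identical results, which is why dropping the explicit `broadcast_to` is valid for backends that honor the broadcast; the cuBLAS path currently does not.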