Closed
Description
If we run the auto-scheduler on BERT, e.g. with these scripts (https://github.com/octoml/Apple-M1-BERT), we see the following messages during compilation:
Extract tasks...
Compile...
-----------------------------------
Cannot find tuned schedules for target=metal -keys=metal,gpu -max_num_threads=256, workload_key=["ec4f7d9b3c9680b55f74f8646223586b"]. A fallback TOPI schedule is used, which may bring great performance regression or even compilation failure. Compute DAG info:
placeholder = PLACEHOLDER [1, 768]
placeholder = PLACEHOLDER [768, 768]
T_dense(i, j) += (placeholder[i, k]*placeholder[j, k])
However, with the codebase from July, this message does not appear. The performance impact of this bug is significant:
On an NVIDIA T4, BERT inference takes 9 ms with the July codebase but 13 ms on the current main branch, roughly 44% slower (with similar estimated times from the auto-scheduler).
Unfortunately, I don't have the bandwidth to fix this bug in the coming weeks.
Contributions are welcome.
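For anyone picking this up, one possible first step is to check whether the workload key from the message above appears in the tuning log at all, or only in a different form. The sketch below is a rough diagnostic, not part of TVM: it assumes each log line is a JSON record whose workload key is the string at `record["i"][0][0]`, and the helper names and the sample record (including the extra numbers in its key) are made up for illustration.

```python
import json


def workload_keys_in_log(lines):
    """Collect workload keys from tuning-log lines (assumed JSON, one record per line)."""
    keys = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        # Assumption: the workload key is stored as a string at record["i"][0][0].
        keys.add(record["i"][0][0])
    return keys


def match_key(missing_key, logged_keys):
    """Return ("exact", key), ("same_hash", key), or ("absent", None)."""
    if missing_key in logged_keys:
        return "exact", missing_key
    # The key itself is a JSON list whose first element is the hash.
    missing_hash = json.loads(missing_key)[0]
    for key in sorted(logged_keys):
        if json.loads(key)[0] == missing_hash:
            return "same_hash", key
    return "absent", None


# Hypothetical sample record: same hash as in the message, but with extra
# arguments in the key. Real logs may look different.
sample_log = [
    json.dumps({
        "i": [[json.dumps(["ec4f7d9b3c9680b55f74f8646223586b", 1, 768, 768, 768]),
               "metal -keys=metal,gpu -max_num_threads=256"],
              [[], []]],
        "r": [[0.001], 0, 0.5, 0],
        "v": "v0.6",
    })
]

missing = '["ec4f7d9b3c9680b55f74f8646223586b"]'
status, found = match_key(missing, workload_keys_in_log(sample_log))
print(status, found)
```

A `same_hash` result would suggest the record exists but is stored under a differently formatted key than the one used at compile time; `absent` would point to task extraction or tuning instead.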