[Bug] AutoScheduler / Fuse Pass bug #7135

@antinucleon

If we run the auto-scheduler on BERT, e.g. with these scripts (https://github.com/octoml/Apple-M1-BERT), we see the following messages during compilation:

Extract tasks...
Compile...
-----------------------------------
Cannot find tuned schedules for target=metal -keys=metal,gpu -max_num_threads=256, workload_key=["ec4f7d9b3c9680b55f74f8646223586b"]. A fallback TOPI schedule is used, which may bring great performance regression or even compilation failure. Compute DAG info:
placeholder = PLACEHOLDER [1, 768]
placeholder = PLACEHOLDER [768, 768]
T_dense(i, j) += (placeholder[i, k]*placeholder[j, k])
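
One way to narrow this down is to check whether the workload key from the warning appears in the tuning log at all. The sketch below is only an illustration and does not use TVM itself: it assumes the log is one JSON object per line with the workload key string stored at `record["i"][0][0]`, which should be verified against the TVM version in use. The helper names and the sample record are hypothetical.

```python
import json


def tuned_workload_keys(log_lines):
    """Collect the workload keys found in auto-scheduler tuning records.

    Assumption: each non-empty line is a JSON object and the workload key
    string lives at record["i"][0][0].
    """
    keys = set()
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        keys.add(record["i"][0][0])
    return keys


def has_tuned_schedule(workload_key, log_lines):
    """Return True if the given workload key has at least one record."""
    return workload_key in tuned_workload_keys(log_lines)


# The key reported in the warning above.
missing_key = '["ec4f7d9b3c9680b55f74f8646223586b"]'

# Hypothetical record shaped like the assumed layout, for illustration only.
sample_key = '["0123456789abcdef0123456789abcdef", 1, 768]'
sample_log = [json.dumps({"i": [[sample_key, "metal"], []]})]

print(has_tuned_schedule(missing_key, sample_log))
```

With a real log, an open file object can be passed in place of `sample_log`, since iterating over a file yields its lines. If the key from the warning is absent from the log while the tasks were tuned, that would point to a mismatch between the keys produced at task extraction and at compile time.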

With the codebase from July, however, this message does not appear. The performance impact of this bug is significant:

On an NVIDIA T4, BERT inference takes 9 ms with the July codebase but 13 ms on the current main branch (the auto-scheduler reports similar estimated times for both).

Unfortunately, I don't have the bandwidth to fix this bug in the coming weeks.

Contributions are welcome.
