Closed
Description
Once #13242 lands, the convolution schedules for microTVM will have a tunable parameter num_outputs (named num_sums in the snippet below):
tvm/python/tvm/topi/arm_cpu/qnn.py
Lines 235 to 239 in f11243a
    # Decide how many sums our function should have running at the same time. Doing
    # this lets us do "more work" for each memory load, but doing too many of them causes us to run
    # out of registers. Currently this is set to either 1 or 2, but autotuning this value would
    # improve performance a lot.
    num_sums = 2
As the comment states, picking this value is important for performance, and it would be awesome to be able to autotune it. The correct value depends heavily on the exact parameters of the convolution, picking it correctly has a >10% impact on performance, and predicting it without autotuning would be challenging (though theoretically possible).
Note that this value is used in the compute function, not in the scheduling function, which makes autotuning harder.
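To illustrate why a knob that lives in the compute function is more awkward to tune, here is a minimal sketch in plain Python. It does not use the TVM or AutoTVM APIs, and `make_dot_rows` and `tune_num_sums` are hypothetical names invented for this example. The point it tries to show is that every candidate value of `num_sums` requires rebuilding the compute itself before it can be timed, rather than re-scheduling one fixed compute.

```python
import timeit


def make_dot_rows(num_sums):
    """Hypothetical stand-in for a compute function parameterized by num_sums.

    The returned function computes one dot product per weight row against a
    shared input vector, keeping `num_sums` accumulators running per pass.
    """

    def dot_rows(data, weights):
        if len(weights) % num_sums != 0:
            raise ValueError("num_sums must evenly divide the number of rows")
        out = []
        for base in range(0, len(weights), num_sums):
            sums = [0] * num_sums
            # Each element of `data` is read once and feeds num_sums sums.
            for i, x in enumerate(data):
                for s in range(num_sums):
                    sums[s] += x * weights[base + s][i]
            out.extend(sums)
        return out

    return dot_rows


def tune_num_sums(candidates, data, weights, repeats=3):
    """Brute-force search: rebuild the compute for each candidate and time it."""
    timings = {}
    for n in candidates:
        if len(weights) % n != 0:
            continue  # skip values that do not divide the workload evenly
        kernel = make_dot_rows(n)  # the compute is regenerated per candidate
        timings[n] = min(
            timeit.repeat(lambda: kernel(data, weights), number=1, repeat=repeats)
        )
    best = min(timings, key=timings.get)
    return best, timings
```

In this toy setup the tuner has to own the construction of the kernel, which is the same structural constraint the issue describes: the search space cannot be expressed purely as scheduling decisions applied to an already-built compute.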