Looking at the autotuning decorators in evoattention, there is a dependence on SEQ_LEN, which means autotuning is re-triggered for every distinct input sequence length.
Since this is impractical, especially for longer sequences, is there a better strategy for coarser autotuning?
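One common pattern for coarser autotuning (a sketch, not evoattention's actual API) is to key the autotuner on a coarse bucket of the sequence length instead of its exact value, e.g. rounding up to the next power of two so that all lengths in (512, 1024] reuse one tuned configuration:

```python
# Hypothetical sketch: replace key=["SEQ_LEN"] in the @triton.autotune
# decorator with key=["SEQ_LEN_BUCKET"], where the bucket is computed on
# the host before launching the kernel.

def seq_len_bucket(seq_len: int) -> int:
    """Round up to the next power of two (mirrors triton.next_power_of_2),
    so nearby sequence lengths map to the same autotune cache entry."""
    if seq_len <= 1:
        return 1
    return 1 << (seq_len - 1).bit_length()

# All lengths in (512, 1024] share one tuned config:
# seq_len_bucket(600) == seq_len_bucket(1024) == 1024
```

This caps the number of autotuning runs at O(log(max SEQ_LEN)) instead of one per distinct length, at the cost of slightly suboptimal configs for lengths far from a bucket boundary.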