[kernels] exception handling for fa kernels #43549
danieldk
left a comment
Looks good to me, but I am not very familiar with this code, so someone else should review it too.
    user_kwargs = {
        "dropout_p": dropout,
        "window_size": sliding_window,
        "deterministic": deterministic,
I am not 100% sure about this one. If someone requests deterministic output, it might still be ok to emit non-deterministic output with a warning?
Yes, it depends, I think! I will wait for input from @ArthurZucker and @vasqu, since they maintain the fa implementation.
Don't feel super confident about this
- dropout is fair
- sliding window might still unintentionally work if `key_length <= sliding_window`
- deterministic has the env variable as well
- s_aux is only supported in gpt oss and we check that people use it
Additionally, I'm also for being more lenient at least on deterministic (warning)
TL;DR: The conditions below should be taken into account when we raise the error + a small test would be nice
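To make the points in this comment concrete, here is a rough sketch of a filter that only reports an unsupported kwarg when it would actually change the result, and that downgrades `deterministic` to a warning. Everything in it is an assumption for illustration: `effective_unsupported`, the `supported` set, and treating `window_size` as a plain integer are hypothetical and not taken from the PR.

```python
import warnings


def effective_unsupported(user_kwargs, supported, key_length):
    """Return unsupported kwargs that would actually affect the output.

    Hypothetical helper: `supported` is the set of kwarg names the chosen
    backend accepts, `key_length` is the current key sequence length.
    """
    offending = []
    for name, value in user_kwargs.items():
        if name in supported or value is None:
            continue
        # Dropout of 0.0 is a no-op, so dropping it is harmless.
        if name == "dropout_p" and value == 0.0:
            continue
        # A window that covers every key behaves like full attention.
        if name == "window_size" and key_length <= value:
            continue
        # Lenient on determinism: warn instead of failing.
        if name == "deterministic":
            if value:
                warnings.warn(
                    "deterministic=True is not supported by this backend; "
                    "output may be non-deterministic."
                )
            continue
        offending.append(name)
    return offending


# Only s_aux is reported: dropout is 0.0 and the window covers all 64 keys.
print(effective_unsupported(
    {"dropout_p": 0.0, "window_size": 128, "s_aux": 1.0},
    supported=set(),
    key_length=64,
))
```

A caller could raise an exception whenever the returned list is non-empty, which keeps the strict behavior for kwargs that matter while staying quiet about ones that have no effect.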
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
What does this PR do?
Before, we were just silently skipping parameters passed by the user, like `s_aux`, when they were not supported by the specified attention backend. It would be better to raise an exception instead.

cc @danieldk
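As a minimal sketch of the idea (not the code in this PR), one way to stop silently dropping parameters is to compare the user's kwargs against the kernel function's signature and raise on anything it cannot accept. The names `check_supported_kwargs` and `toy_kernel` are hypothetical and exist only for this example.

```python
import inspect


def check_supported_kwargs(kernel_fn, user_kwargs):
    """Raise ValueError if a user-set kwarg is not accepted by kernel_fn."""
    params = inspect.signature(kernel_fn).parameters
    # A kernel that declares **kwargs accepts any keyword argument.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return
    # Only kwargs the user actually set (not None) count as requested.
    unsupported = sorted(
        name
        for name, value in user_kwargs.items()
        if value is not None and name not in params
    )
    if unsupported:
        raise ValueError(
            f"{kernel_fn.__name__} does not support: {', '.join(unsupported)}"
        )


def toy_kernel(q, k, v, dropout_p=0.0, window_size=None):
    # Stand-in for a real attention kernel; it has no s_aux parameter.
    return q


check_supported_kwargs(toy_kernel, {"dropout_p": 0.1})  # passes silently
try:
    check_supported_kwargs(toy_kernel, {"dropout_p": 0.1, "s_aux": 1.0})
except ValueError as err:
    print(err)
```

Checking the signature rather than keeping a hand-written allow list means the check follows whatever kernel is loaded, at the cost of trusting that parameter names match across backends.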