Generate: assisted decoding now uses generate for the assistant #28030

amyeroberts merged 1 commit into huggingface:main

Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
@amyeroberts the failing test is also failing in the daily CI (i.e. unrelated to this PR, as it doesn't depend on assisted generation), and I can't reproduce it on my end 🤔
amyeroberts left a comment:
Thanks for the refactor and running the slow tests! Looks a lot cleaner ❤️
@amyeroberts I can't merge due to the failing test (which is also failing on `main`).
(the test is flaky 👉 #28035)
@gante In this case we can merge :) Edit: this was discussed offline; the cause of the failing tests was identified and confirmed to be independent of this PR.
What does this PR do?
Subset of the original changes in #27979
"Reworks assisted candidate generation to call .generate(), instead of having its own custom generation loop. For most models this is nothing more than a nice abstraction. However, for models with a custom generate() function, this means the assistant model will now make use of it! (🤔 does this mean that DistilWhisper gets better numbers with this refactor?)"
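The idea quoted above can be illustrated with a toy sketch. This is pure Python, not the transformers implementation or API: `toy_generate`, `assisted_decode`, and the callable "models" are all hypothetical stand-ins. The point is that the assistant proposes candidate tokens through the same generate() abstraction the main model uses, so an assistant with its own custom generation logic is picked up automatically; the main model then verifies the candidates and accepts the longest matching prefix.

```python
def toy_generate(model_fn, prompt, max_new_tokens):
    """Greedy decoding loop shared by the main and assistant models.

    model_fn maps a token list to the next token (a stand-in for a real model).
    """
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        tokens.append(model_fn(tokens))
    return tokens


def assisted_decode(main_fn, assistant_fn, prompt, max_new_tokens, num_candidates=4):
    """Speculative decoding sketch: the assistant drafts via toy_generate(),
    instead of a hand-rolled candidate loop inside assisted_decode itself."""
    tokens = list(prompt)
    while len(tokens) - len(prompt) < max_new_tokens:
        # Assistant proposes candidates through the SAME generate() abstraction.
        candidates = toy_generate(assistant_fn, tokens, num_candidates)[len(tokens):]
        # Main model verifies greedily: keep matches, stop at the first mismatch.
        for cand in candidates:
            target = main_fn(tokens)
            tokens.append(target)
            if target != cand or len(tokens) - len(prompt) >= max_new_tokens:
                break
    return tokens
```

Note the invariant that makes the abstraction safe: because every accepted token is the main model's own greedy choice, the output is identical to plain greedy decoding with the main model alone; a better assistant only changes how many verification rounds are needed.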
The following tests were run locally and are passing:
RUN_SLOW=1 py.test tests/models/whisper/ -k speculative
py.test tests/ -k test_assisted