Test: generate with torch.compile(model.forward) as a fast test #34544
gante merged 13 commits into huggingface:main
ydshieh
left a comment
Love this!
Q: Is it really fast ...?
Remark: I feel get_max_cache_length is a better name than get_max_cache_shape, but OK, I know it's not great to change names all the time.
@ydshieh yes :D
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
ArthurZucker
left a comment
I don't mind, tho I don't think our priority should be this (full compile vs compile forward in generate!) + I don't see the test being run in the CI! 🤗
Could you just make sure it's run
We need to remove
Before merge, feel free to ping me to check for flakiness (if there is any) :-) or anything else you think I can double-check.
ArthurZucker
left a comment
Thanks, you can ignore my comments and merge 🤗
Is it possible for HybridCache to inherit from StaticCache?
We might just need an extra class that says CompileCompatible; someone wanted an is_static attr!
(sorry, the PR is not ready yet, a few cases are still failing 👀 I didn't mean to request a review)


What does this PR do?
Follow-up to #34464
This PR:
- Moves test_generate_compile_model_forward to a fast test. This means we will check generate with torch.compile(model.forward) at each commit on ALL models that support StaticCache 💛
- Fixes test_generate_compile_model_forward whenever possible
- Adds _supports_static_cache = False # Reason when the model doesn't support torch.compile(model.forward)

✅ py.test tests/models/ -k test_generate_compile is all green, takes ~2 mins to run on all models on my machine
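
For readers unfamiliar with the opt-out flag, here is a minimal sketch of how a class-level attribute such as _supports_static_cache can decide which model classes a compile test exercises. The two model classes and the helper eligible_model_classes are hypothetical names made up for illustration; this is not the actual test code in this PR, only the general pattern under those assumptions.

```python
# Hypothetical stand-ins for real model classes (assumption, not from the PR).
class ModelWithStaticCache:
    _supports_static_cache = True


class ModelWithoutStaticCache:
    # The trailing comment records why the model opts out.
    _supports_static_cache = False  # Reason: placeholder explanation


def eligible_model_classes(model_classes):
    """Return only the classes that declare static-cache support.

    Classes that do not define the flag are treated as unsupported,
    which is an assumption of this sketch.
    """
    return [
        cls
        for cls in model_classes
        if getattr(cls, "_supports_static_cache", False)
    ]


if __name__ == "__main__":
    selected = eligible_model_classes(
        [ModelWithStaticCache, ModelWithoutStaticCache]
    )
    for cls in selected:
        # A real test would compile the forward pass and call generate here.
        print(f"would run the compile test on {cls.__name__}")
```

Keeping the flag on the class, with the reason next to it, means a model can be excluded from the test in one line while still documenting why.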