[Cache] rename max_batch_size -> batch_size in compilable caches #37389
gante wants to merge 1 commit into huggingface:main
Oh wow, I even forgot we were deprecating the other way. Thanks for digging into it!
Since we're talking about batch sizes, I remember this issue (#35444) where a user wanted to contribute an actual max_batch_size (especially for enc-dec model cases). Similar to seq length, unused batches would be all zeros. I think the feature is nice to have, but I also see that we could break things for users who directly manipulate cache._key_cache. WDYT, is it worth supporting?
@zucchini-nlp I see, the export use case of a cache with batch size > input batch size makes sense (export once, reuse with any batch size). It's feasible; the questions are a) code complexity and b) throughput. I'm going to give it a go 🤞
Yeah, I am also concerned about whether it will add too much complexity. Thanks, it would be cool if it's doable with minimal maintenance cost :)
It's actually quite clean and doesn't seem to have throughput disadvantages 👀 closing this PR in favor of expanding capabilities
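For readers following the thread, here is a rough sketch of what a true max_batch_size would mean: the cache is preallocated for the largest batch, smaller inputs write into the first rows, and the unused rows stay zero. `PaddedToyCache` and its `update` method are hypothetical names made up for illustration, they use plain Python lists instead of tensors, and they are not the implementation that replaced this PR.

```python
class PaddedToyCache:
    """Toy cache preallocated for `max_batch_size` rows (hypothetical, not the transformers API)."""

    def __init__(self, max_batch_size, max_cache_len):
        self.max_batch_size = max_batch_size
        # One row per batch slot, zero-filled until something is written to it.
        self.keys = [[0.0] * max_cache_len for _ in range(max_batch_size)]

    def update(self, new_keys, position):
        """Write one value per active batch row at `position` and return only the active rows."""
        batch_size = len(new_keys)
        if batch_size > self.max_batch_size:
            raise ValueError("input batch exceeds the preallocated batch size")
        for row, value in zip(self.keys, new_keys):
            row[position] = value
        # Rows beyond the input batch are never touched, so they remain all zeros.
        return self.keys[:batch_size]


# Preallocate for 4 sequences, then run with a batch of 2.
cache = PaddedToyCache(max_batch_size=4, max_cache_len=3)
active = cache.update([1.0, 2.0], position=0)
```

The point of the sketch is only the shape contract: the allocation is sized once, and any input batch up to that size can reuse it.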
What does this PR do?
WIP, see this comment
Uses deprecate_kwarg to rename max_batch_size to batch_size in all compilable caches. max_batch_size is a bad arg name: it implies that batch sizes smaller than max_batch_size can use the cache too, which is not the case.
Note that this deprecation was started before, but we messed it up along the way:
batch_size instead of max_batch_size #32657
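To illustrate the general mechanism of a keyword rename with a deprecation cycle, here is a minimal, self-contained sketch. `rename_kwarg` and `ToyStaticCache` are hypothetical stand-ins written for this example; they are not the `deprecate_kwarg` helper or the cache classes from transformers, whose exact signatures may differ.

```python
import functools
import warnings


def rename_kwarg(old_name, new_name):
    """Hypothetical helper: accept `old_name` as a deprecated alias of `new_name`."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if old_name in kwargs:
                if new_name in kwargs:
                    raise TypeError(f"Pass either `{old_name}` or `{new_name}`, not both.")
                warnings.warn(
                    f"`{old_name}` is deprecated, use `{new_name}` instead.",
                    FutureWarning,
                    stacklevel=2,
                )
                # Forward the value under the new keyword name.
                kwargs[new_name] = kwargs.pop(old_name)
            return func(*args, **kwargs)

        return wrapper

    return decorator


class ToyStaticCache:
    """Stand-in for a compilable cache (assumption, not the transformers class)."""

    @rename_kwarg("max_batch_size", "batch_size")
    def __init__(self, batch_size, max_cache_len):
        self.batch_size = batch_size
        self.max_cache_len = max_cache_len


# The old keyword still works during the deprecation window, but emits a FutureWarning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    cache = ToyStaticCache(max_batch_size=2, max_cache_len=16)
```

Callers that already pass batch_size see no warning, so the rename can ship without breaking existing code until the alias is removed.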