[CB] Persistent manager #44435

Merged
remi-or merged 9 commits into main from cb-persistent on Mar 26, 2026

remi-or (Collaborator) commented Mar 4, 2026

This PR adds an option to keep a ContinuousBatchingManager alive after generation is over, instead of destroying it.
This lets users re-use the manager without having to know any continuous batching (CB) entry point other than generate_batch or the context manager.
It will also be necessary if we ever want a sleep function, where graphs are kept, weights are either kept or offloaded, and the cache is released.
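
To illustrate the idea, here is a minimal sketch of the persistence pattern described above. It is not the code from this PR: `ToyManager`, `ToyModel`, and the `persistent` flag are hypothetical stand-ins, and the real option name and manager lifecycle in transformers may differ.

```python
class ToyManager:
    """Hypothetical stand-in for a batching manager; counts how often it is built."""

    instances_created = 0

    def __init__(self):
        ToyManager.instances_created += 1
        self.running = True

    def run(self, prompts):
        # Placeholder for actual generation work.
        return [p.upper() for p in prompts]

    def stop(self):
        self.running = False


class ToyModel:
    """Hypothetical model wrapper exposing a single generate_batch entry point."""

    def __init__(self):
        self._manager = None

    def generate_batch(self, prompts, persistent=False):
        # Reuse a manager left alive by a previous persistent call, if any.
        manager = self._manager if self._manager is not None else ToyManager()
        outputs = manager.run(prompts)
        if persistent:
            # Keep the manager (and whatever resources it holds) for the next round.
            self._manager = manager
        else:
            # Tear the manager down so its resources can be released.
            manager.stop()
            self._manager = None
        return outputs


model = ToyModel()
first = model.generate_batch(["round one"], persistent=True)
second = model.generate_batch(["round two"], persistent=True)
```

In this sketch, both rounds share one manager, and a later call with `persistent=False` stops it and drops the reference, so the caller never needs an entry point other than `generate_batch`.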

@remi-or remi-or marked this pull request as draft March 4, 2026 14:17
@remi-or remi-or marked this pull request as ready for review March 26, 2026 10:56
@remi-or remi-or requested a review from ArthurZucker March 26, 2026 11:00
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

ArthurZucker (Collaborator) left a comment

Not defaulting to True makes sense: it's per-model and for specific use cases, I guess, no?

Comment thread src/transformers/generation/continuous_batching/scheduler.py Outdated
remi-or (Collaborator, Author) commented Mar 26, 2026

> Not defaulting to True makes sense: it's per-model and for specific use cases, I guess, no?

It's less about the model and more about the workflow: persistence is useful when there are several rounds of generate_batch, but not so great if you want to do batch generation and then run something else that takes up space on the GPU.

@remi-or remi-or enabled auto-merge March 26, 2026 16:30
@remi-or remi-or added this pull request to the merge queue Mar 26, 2026
Merged via the queue into main with commit 67100cc Mar 26, 2026
30 checks passed
@remi-or remi-or deleted the cb-persistent branch March 26, 2026 22:02
zucchini-nlp pushed a commit to zucchini-nlp/transformers that referenced this pull request Mar 27, 2026
* Stacked commits cb-persistent

* Rebase fixes

* style

* ty compliance

* Fix

* nit
NielsRogge pushed a commit to NielsRogge/transformers that referenced this pull request Mar 30, 2026
* Stacked commits cb-persistent

* Rebase fixes

* style

* ty compliance

* Fix

* nit