[docs] continuous batching #44896

Merged
stevhliu merged 2 commits into huggingface:main from stevhliu:cb on Mar 30, 2026

@stevhliu
Member

updates the continuous batching docs

  • new page for the API reference
  • adds sections for new features like CUDA graphs, async batching, prefix caching, and logprobs (depending on when it's merged)
  • clearer example of generation with varying loads using continuous_batching_context_manager
  • new page explaining how the system works underneath. it explains the scheduling half of the system, but doesn't cover the memory side yet. i'm thinking it may make more sense to move the paged attention doc in here to fill the gap. what do you think @remi-or ?

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

stevhliu requested a review from remi-or on March 20, 2026 19:31
Collaborator

remi-or left a comment

LGTM! I agree that the memory side needs expanding, but we can leave that for another PR. This is a very good starting point for keeping the docs more up to date. Thanks!

Comment thread docs/source/en/continuous_batching.md Outdated
Comment thread docs/source/en/continuous_batching.md Outdated
@remi-or
Collaborator

remi-or commented Mar 30, 2026

> i'm thinking it may make more sense to move the paged attention doc in here to fill the gap. what do you think @remi-or ?

Not really. The paged attention doc is more about the kernel than the actual page-management system. I can draft something up for that.

stevhliu added this pull request to the merge queue on Mar 30, 2026
Merged via the queue into huggingface:main with commit b7074b1 Mar 30, 2026
15 of 19 checks passed
stevhliu deleted the cb branch on March 30, 2026 17:17
sirzechs66 pushed a commit to sirzechs66/transformers that referenced this pull request Mar 31, 2026
SangbumChoi pushed a commit to SangbumChoi/transformers that referenced this pull request Apr 4, 2026