Skip to content

generation/stopping_criteria: short-circuit StoppingCriteriaList when all sequences are done#45384

Closed
GitGlimpse895 wants to merge 3 commits intohuggingface:mainfrom
GitGlimpse895:feat/stopping-criteria-short-circuit
Closed

generation/stopping_criteria: short-circuit StoppingCriteriaList when all sequences are done#45384
GitGlimpse895 wants to merge 3 commits intohuggingface:mainfrom
GitGlimpse895:feat/stopping-criteria-short-circuit

Conversation

@GitGlimpse895
Copy link
Copy Markdown

@GitGlimpse895 GitGlimpse895 commented Apr 12, 2026

What does this PR do?

StoppingCriteriaList.__call__ previously evaluated every registered criterion
unconditionally on every generation step, even after is_done was already True
for all sequences in the batch. This adds a single if is_done.all(): break guard
inside the loop to skip remaining criteria once the entire batch is finished.

This is semantically safe because OR-ing any further True values into an
all-True tensor cannot change the result. The saving scales with
(number_of_criteria − 1) × remaining_steps_after_completion, and is most
impactful when StopStringCriteria (which runs a full vocab embedding lookup via
F.embedding each call) appears after a cheaper criterion like MaxLengthCriteria.

A new test test_list_criteria_short_circuits_when_all_done is added to verify
the behaviour using a CountingCriteria sentinel.

  • I confirm that this is not a pure code agent PR.

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline, Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link to it if that's the case.
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

@GitGlimpse895 GitGlimpse895 force-pushed the feat/stopping-criteria-short-circuit branch from 78ed29b to b400828 Compare April 13, 2026 12:00
@Rocketknight1
Copy link
Copy Markdown
Member

Does this not create data-dependent control flow that requires a compiler break?

@GitGlimpse895
Copy link
Copy Markdown
Author

You're right that calling .all() on a tensor and using it in a Python if/break creates data-dependent control flow. Under torch.compile this would cause a graph break. However, StoppingCriteriaList.call is invoked from the Python-level generation loop in utils.py, which itself runs outside any compiled region — the model forward pass is compiled, but the decoding loop that calls stopping criteria is not. So no graph break is introduced in practice. That said, I'm happy to add a # pragma: no compile note or restructure if you'd prefer a different approach.

@GitGlimpse895 GitGlimpse895 force-pushed the feat/stopping-criteria-short-circuit branch from b400828 to e4613da Compare April 14, 2026 03:10
@GitGlimpse895 GitGlimpse895 force-pushed the feat/stopping-criteria-short-circuit branch from e4613da to 5fe79c0 Compare April 16, 2026 03:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants