Require input_ids for repetition penalty#45389

Draft
ruben-aghayan wants to merge 3 commits into huggingface:main from ruben-aghayan:fix-repetition-penalty-inputs-embeds

Conversation

@ruben-aghayan

@ruben-aghayan ruben-aghayan commented Apr 13, 2026

What does this PR do?

This PR adds a warning when the repetition penalty or the n-gram repetition penalty is used with decoder models that receive `inputs_embeds` without `input_ids`.

Previously, users could call `generate` with a repetition penalty while passing only `inputs_embeds`. Since no prompt tokens are available in that case, the penalty was applied only to the generated tokens, not to the prompt. An equivalent call that passed the tokens corresponding to those embeddings as `input_ids` would behave differently, applying the repetition penalty to the prompt tokens as well.
With this change, the repetition penalty is not applied in the embeddings-only case and a warning is shown.
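To illustrate the behavior difference the PR describes, here is a minimal pure-Python sketch of the repetition-penalty rule (mirroring the rule used by transformers' `RepetitionPenaltyLogitsProcessor`: positive scores are divided by the penalty, negative ones multiplied). The function and values are illustrative, not the library's actual implementation.

```python
def apply_repetition_penalty(scores, seen_token_ids, penalty):
    """Penalize the logits of tokens that already appear in the sequence.

    Positive scores are divided by the penalty, negative scores multiplied,
    so repeated tokens always become less likely.
    """
    out = list(scores)
    for t in set(seen_token_ids):
        out[t] = out[t] / penalty if out[t] > 0 else out[t] * penalty
    return out

# With input_ids available, the prompt tokens (here ids 0 and 2) are penalized.
with_prompt = apply_repetition_penalty([1.0, 1.0, -1.0], seen_token_ids=[0, 2], penalty=2.0)

# With inputs_embeds only, there are no prompt token ids, so nothing is penalized
# until generation produces new tokens.
without_prompt = apply_repetition_penalty([1.0, 1.0, -1.0], seen_token_ids=[], penalty=2.0)
```

The two calls return different scores for the same underlying prompt, which is exactly the inconsistency the PR warns about.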

Testing

pytest tests/generation -vv -k 'not test_text_streamer_decode_kwargs'
`test_text_streamer_decode_kwargs` was giving an unrelated failure.

Code Agent Policy

  • I confirm that this is not a pure code agent PR.

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

@gante

@afurm

afurm commented Apr 13, 2026

Does prompt_input_ids.get() (or a follow-up check) need to handle the case where it's a list rather than a tensor? If input_ids is passed as a plain Python list, isinstance(..., torch.Tensor) would be False and this would raise the error even for valid input.

@ruben-aghayan ruben-aghayan force-pushed the fix-repetition-penalty-inputs-embeds branch from d1d35b7 to 3a4294c on April 13, 2026 05:46
@ruben-aghayan ruben-aghayan changed the title from "Guard repetition penalty for inputs_embeds" to "Require input_ids for repetition penalty" on Apr 13, 2026
@ruben-aghayan
Author

Does prompt_input_ids.get() (or a follow-up check) need to handle the case where it's a list rather than a tensor? If input_ids is passed as a plain Python list, isinstance(..., torch.Tensor) would be False and this would raise the error even for valid input.

Thank you for your comment!

Is a list considered valid input? The `generate` args are all tensors. Admittedly, `input_ids` goes in through kwargs, so it is not explicitly typed. But even today, such an input would fail, e.g.

  from transformers import AutoTokenizer, AutoModelForCausalLM

  tok = AutoTokenizer.from_pretrained("hf-internal-testing/tiny-random-gpt2")
  model = AutoModelForCausalLM.from_pretrained("hf-internal-testing/tiny-random-gpt2")

  ids = tok("Hello world", return_tensors="pt").input_ids[0].tolist()
  model.generate(input_ids=ids)

produces `AttributeError: 'list' object has no attribute 'shape'`, since the list is treated as a tensor.
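A minimal sketch of why the list fails: `generate` reads tensor attributes such as `.shape`, which a plain list does not provide. The conventional fix (wrapping the list in a batched tensor, assuming torch is available as it is wherever transformers runs) is noted in a comment rather than executed here.

```python
ids = [101, 2023, 102]  # a plain Python list of token ids, as in the failing call above

# generate() treats input_ids as a tensor, e.g. reading input_ids.shape,
# which a list does not have:
print(hasattr(ids, "shape"))  # False

# The conventional fix is to wrap the list in a batched tensor first, e.g.
#   input_ids = torch.tensor([ids])
# before calling model.generate(input_ids=input_ids).
```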

@Rocketknight1
Member

@remi-or @McPatate for generation/CB, but feel free to pass it on to someone else if you're not comfortable reviewing it!

@remi-or
Collaborator

remi-or commented Apr 14, 2026

I am unfamiliar with encoders and `inputs_embeds`, so I would prefer that this be passed on. If no one picks it up, I will when I have some room!

@Rocketknight1
Member

cc @Cyrilvallez for generation maybe, but if you're overloaded we might need to find someone to own generation code!

@Cyrilvallez
Member

Hey @ruben-aghayan! We can indeed raise in such cases, but this code should live inside _get_logits_processor!

@ruben-aghayan ruben-aghayan marked this pull request as draft April 25, 2026 01:44
@ruben-aghayan ruben-aghayan force-pushed the fix-repetition-penalty-inputs-embeds branch 3 times, most recently from b84bcc7 to 1721159 on April 25, 2026 04:07
@ruben-aghayan ruben-aghayan force-pushed the fix-repetition-penalty-inputs-embeds branch from 1721159 to 08ac3d8 on April 25, 2026 04:11
@ruben-aghayan
Author

Hey @ruben-aghayan! We can indeed raise in such cases, but this code should live inside _get_logits_processor!

thanks + done

I noticed that `EncoderRepetitionPenaltyLogitsProcessor` above just warns, so I switched to a warning (this would have been enough for my use case). I also extended it to `NoRepeatNGramLogitsProcessor`.

@ruben-aghayan ruben-aghayan marked this pull request as ready for review April 25, 2026 04:15
@github-actions
Contributor

View the CircleCI Test Summary for this PR:

https://huggingface.co/spaces/transformers-community/circle-ci-viz?pr=45389&sha=036192

@ruben-aghayan
Author

View the CircleCI Test Summary for this PR:

https://huggingface.co/spaces/transformers-community/circle-ci-viz?pr=45389&sha=036192

looks unrelated?

Comment on lines +1092 to +1100
        inputs_embeds = model_kwargs.get("inputs_embeds") if model_kwargs is not None else None
        if inputs_embeds is not None and (input_ids_seq_length is None or input_ids_seq_length == 0):
            warnings.warn(
                "Passing `repetition_penalty` requires some form of `input_ids` to be passed to "
                "`generate`, ignoring the argument.",
                UserWarning,
            )
        else:
            processors.append(RepetitionPenaltyLogitsProcessor(penalty=generation_config.repetition_penalty))
Member


We don't want to skip the processor, only warn that the repetition penalty will apply to the new tokens only, rather than to the full sequence including the prompt.
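The reviewer's preferred behavior can be sketched in plain Python: keep penalizing whatever tokens are known (the generated ones), and warn that the unknown prompt tokens are not covered, instead of dropping the processor entirely. The helper function and values below are illustrative, not the library's code.

```python
import warnings

def penalize(scores, token_ids, penalty):
    # Repetition-penalty rule: divide positive scores, multiply negative ones.
    out = list(scores)
    for t in set(token_ids):
        out[t] = out[t] / penalty if out[t] > 0 else out[t] * penalty
    return out

prompt_ids = []        # unknown: the prompt came in as inputs_embeds
generated_ids = [1]    # tokens produced so far during decoding
scores = [2.0, 2.0, 2.0]

# Warn, but still penalize the tokens we do know about (the generated ones).
warnings.warn("prompt tokens unknown; repetition penalty covers generated tokens only")
scores = penalize(scores, prompt_ids + generated_ids, penalty=2.0)
```

Only index 1 (the generated token) is penalized; the prompt's contribution is silently absent, which is what the warning communicates to the user.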
