
Drop noisy generate warnings when do_sample=False (or num_beams=1) #45559

Closed
ArthurZucker wants to merge 1 commit into huggingface:main from ArthurZucker:drop-sampling-flag-warnings

Conversation

Collaborator

@ArthurZucker ArthurZucker commented Apr 22, 2026

Summary

GenerationConfig.validate() was warning about sampling-only flags (temperature, top_p, top_k, min_p, top_h, typical_p, epsilon_cutoff, eta_cutoff) whenever do_sample was not True, and about beam-only flags (early_stopping, length_penalty) whenever num_beams == 1 -- even when those values were inherited from the model's generation_config.json. In practice, nearly every popular Hub model ships with a non-default temperature/top_p, so users got a warning for every generate(do_sample=False) call.

This PR threads a user_set_attributes set through GenerationConfig.__init__ and update(), so validate() can distinguish attributes the caller explicitly provided from values inherited from the model's default config. The sampling-only and beam-only warnings now only fire for user-set attributes.

Behavior

  • generate(do_sample=False) on a model whose hub config has temperature=0.6, top_p=0.9: silent (values inherited, not user intent).
  • generate(do_sample=False, top_p=0.8): warns about top_p (user explicitly set both -- almost certainly a mistake).
  • GenerationConfig(do_sample=False, temperature=0.5): warns (both explicit).
  • generation_config.validate(strict=True) (e.g. from save_pretrained) with no user_set_attributes preserves the original "refuse to save bad configs" behavior.
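The provenance logic above can be sketched with a tiny stand-in class. `TinyGenerationConfig` and its default values are illustrative inventions (not the real transformers implementation); only the `user_set_attributes` / `_is_user_set` names follow the PR description.

```python
# Minimal sketch of the user-set provenance tracking described in this PR.
# NOT the actual transformers code; defaults mimic a hub generation_config.json.
import warnings


class TinyGenerationConfig:
    SAMPLING_ONLY_FLAGS = ("temperature", "top_p", "top_k")

    def __init__(self, **kwargs):
        # Pretend these were loaded from the model's generation_config.json.
        self.do_sample = False
        self.temperature = 0.6
        self.top_p = 0.9
        self.top_k = 50
        # Provenance: only attributes the caller passed explicitly.
        self.user_set_attributes = set(kwargs)
        for name, value in kwargs.items():
            setattr(self, name, value)
        self.validate()

    def update(self, **kwargs):
        # kwargs forwarded from a generate() call also count as user-set.
        self.user_set_attributes |= set(kwargs)
        for name, value in kwargs.items():
            setattr(self, name, value)
        self.validate()

    def validate(self):
        def _is_user_set(name):
            return name in self.user_set_attributes

        # Warn only when `do_sample` AND the sampling flag are both user-set.
        if self.do_sample is not True and _is_user_set("do_sample"):
            for flag in self.SAMPLING_ONLY_FLAGS:
                if _is_user_set(flag):
                    warnings.warn(
                        f"`do_sample` is set to `{self.do_sample}`. However, `{flag}` is "
                        "set -- this flag is only used in sample-based generation modes."
                    )
```

Under this sketch, `TinyGenerationConfig(do_sample=False)` is silent (sampling values inherited), `TinyGenerationConfig(do_sample=False, top_p=0.8)` warns, and `TinyGenerationConfig().update(top_p=0.8)` is silent because `do_sample=False` was never user-set.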

cc @Cyrilvallez

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Member

@zucchini-nlp zucchini-nlp left a comment


In general I agree, users don't want to see warnings if they just turn off sampling with one flag. Though I am not sure about the reverse case, when they pass sampling params and the default config has do_sample=False. Kinda misleading if we silently fall back to greedy generation, no?

Comment on lines -639 to +660
-        if self.do_sample is not True:
+        #
+        # The warning is suppressed for flags that weren't explicitly set by the caller (see `_is_user_set`): values
+        # inherited from a model's `generation_config.json` are harmless when the user opts into greedy decoding.
+        # We also require `do_sample` itself to be user-set -- otherwise the non-sampling mode was inherited and the
+        # user never expressed intent to skip sampling, so flagging their sampling kwargs would be misleading.
+        if self.do_sample is not True and _is_user_set("do_sample"):
             greedy_wrong_parameter_msg = (
-                "`do_sample` is set not to set `True`. However, `{flag_name}` is set to `{flag_value}` -- this flag is only "
-                "used in sample-based generation modes. You should set `do_sample=True` or unset `{flag_name}`."
+                "`do_sample` is set to `{do_sample}`. However, `{flag_name}` is set to `{flag_value}` -- this flag is "
+                "only used in sample-based generation modes. You should set `do_sample=True` or unset `{flag_name}`."
Member


great opportunity to move to hub-dataclass validation 😄

Comment on lines +182 to +188
+        # Inverse provenance case: `do_sample` inherited from a model's config (so not user-set this call), user only
+        # sets a sampling flag. The conflict shouldn't produce noise because the user never asked for greedy.
+        logger.warning_once.cache_clear()
+        greedy_hub_config = GenerationConfig(do_sample=False)  # mimics a model's default config forcing greedy
+        with CaptureLogger(logger) as captured_logs:
+            greedy_hub_config.update(top_p=0.8)
+        self.assertEqual(len(captured_logs.out), 0)
Member


I am not sure about this one. Beginner users might expect this to just work and sample with top-p, while we silently fall back to greedy.

Comment on lines 711 to 723
        # 2.4. check `num_return_sequences`
        if self.num_return_sequences is not None and self.num_return_sequences > 1:
            if self.num_beams is None or self.num_beams == 1:
                if not self.do_sample:
                    raise ValueError(
                        "Greedy methods (do_sample != True) without beam search do not support "
                        f"`num_return_sequences` different than 1 (got {self.num_return_sequences})."
                    )
            elif (
                self.num_beams is not None
                and self.num_return_sequences is not None
                and self.num_return_sequences > self.num_beams
            ):
Member


So the is-user-set check only applies to greedy decoding, with no changes for the other params?

@Cyrilvallez
Member

Completely messed up this branch by mistake 😅 Opened #45619 with the correct diffs



4 participants