[Generate] Allow custom config values in generate config #43181

Merged
vasqu merged 4 commits into huggingface:main from vasqu:custom-generate-config-values
Jan 12, 2026

Contributor

@vasqu vasqu commented Jan 8, 2026

As per the title, this is a follow-up to a series of PRs that address the new config logic.

tl;dr: custom values in an uploaded generation config are currently ignored.

Co-authored-by: Eric Bezzam <ebezzam@users.noreply.github.com>
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Comment thread src/transformers/generation/utils.py Outdated
@vasqu vasqu marked this pull request as ready for review January 9, 2026 15:31
@vasqu vasqu requested review from ebezzam and zucchini-nlp January 9, 2026 15:31
  # (custom entries are carried over).
  global_defaults = self.generation_config._get_default_generation_params()
- generation_config.update(**self.generation_config.to_dict(), defaults_only=True)
+ generation_config.update(**self.generation_config.to_dict(), defaults_only=True, allow_custom_entries=True)
Member

Ah, that is what you meant earlier. I assumed that generation_config would already be self.generation_config at this point.

Btw, if a few lines above we set generation_config = self.generation_config instead of creating generation_config = GenerationConfig(), this would be much easier. I can't think of why that would be a bad idea. WDYT?

Contributor Author

Nope, not the case, e.g.

from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-3B-Instruct",
    device_map="auto",
    torch_dtype="auto",
    attn_implementation="flash_attention_2",
).eval()
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-3B-Instruct")
tokenizer.pad_token = tokenizer.eos_token
inputs = tokenizer(
    ["Hello, how are you?", "is this life?"],
    padding=True,
    padding_side="left",
    return_tensors="pt",
).to(model.device)

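# A fresh GenerationConfig is passed at call time, so inside generate() it is
# not the same object as model.generation_config.
outputs = model.generate(**inputs, max_new_tokens=50, generation_config=GenerationConfig(do_sample=False))
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))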

Here the config passed to generate() differs from self.generation_config, so it's not guaranteed that the two are the same.

Member

@zucchini-nlp zucchini-nlp Jan 9, 2026

Indeed 🥲
I guess there is no cleaner option then
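
For readers following the thread, here is a minimal sketch of the merge step under discussion: the model's config fills in only the fields the caller left at their defaults, and unknown (custom) keys are copied over when explicitly allowed. `ToyGenerationConfig`, its `DEFAULTS`, and the custom key `noise_scheduler_steps` are hypothetical names made up for this illustration. The keyword names `defaults_only` and `allow_custom_entries` are taken from the diff above, but the behavior shown is an assumption and may differ from the actual transformers implementation.

```python
# Toy stand-in for a generation config; NOT the transformers implementation.
class ToyGenerationConfig:
    # Hypothetical defaults, chosen only for this illustration.
    DEFAULTS = {"do_sample": False, "temperature": 1.0, "max_new_tokens": 20}

    def __init__(self, **kwargs):
        # Start from the defaults, then apply whatever the caller passed.
        for key, value in {**self.DEFAULTS, **kwargs}.items():
            setattr(self, key, value)

    def to_dict(self):
        return dict(vars(self))

    def update(self, defaults_only=False, allow_custom_entries=False, **kwargs):
        for key, value in kwargs.items():
            if key not in self.DEFAULTS:
                # Unknown (custom) key: copy it only when explicitly allowed,
                # and never overwrite a custom value that is already set.
                if allow_custom_entries and not hasattr(self, key):
                    setattr(self, key, value)
                continue
            if defaults_only and getattr(self, key) != self.DEFAULTS[key]:
                # This field was changed from its default, so keep it.
                continue
            setattr(self, key, value)


# Config stored with the model, including one custom entry.
model_config = ToyGenerationConfig(temperature=0.7, noise_scheduler_steps=10)
# Fresh config created for a single generate() call.
call_config = ToyGenerationConfig(max_new_tokens=50)

call_config.update(
    **model_config.to_dict(), defaults_only=True, allow_custom_entries=True
)
print(call_config.to_dict())
```

In this sketch the call-time `max_new_tokens=50` survives, `temperature` is filled in from the model's config, and the custom entry is carried over. Without `allow_custom_entries=True` the custom entry would be dropped.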

Member

@zucchini-nlp zucchini-nlp left a comment

Also, let's allow loading custom fields when running generation_config.from_pretrained. After the recent changes, there is no need for a _from_model_config private attr, and we are guaranteed to get only generation attributes at init time. So we could just save all kwargs at __init__ as attributes, cmiiw
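
The suggestion above, storing every leftover keyword argument as an attribute at init time, can be sketched roughly as follows. `ToyConfig`, `to_json`, `from_json`, and the custom field `cfg_scale` are hypothetical names used only for illustration. This is a sketch of the idea under those assumptions, not the `GenerationConfig` code.

```python
import json


# Toy config class; NOT the transformers GenerationConfig.
class ToyConfig:
    def __init__(self, **kwargs):
        # Known fields (hypothetical) get their defaults first.
        self.do_sample = kwargs.pop("do_sample", False)
        self.max_new_tokens = kwargs.pop("max_new_tokens", 20)
        # Anything left over is treated as a custom field and kept as-is.
        for key, value in kwargs.items():
            setattr(self, key, value)

    def to_json(self):
        return json.dumps(vars(self), sort_keys=True)

    @classmethod
    def from_json(cls, text):
        # Loading goes through __init__, so custom fields survive the round trip.
        return cls(**json.loads(text))


saved = ToyConfig(max_new_tokens=50, cfg_scale=1.3).to_json()
loaded = ToyConfig.from_json(saved)
print(vars(loaded))
```

Because loading reuses `__init__`, no separate code path is needed for custom fields in this sketch.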

Comment thread src/transformers/generation/utils.py Outdated
Contributor

@ebezzam ebezzam left a comment

thanks @vasqu and @zucchini-nlp for iterating! I tried these changes in my VibeVoice branch, and they work 👍

Member

@zucchini-nlp zucchini-nlp left a comment

Seems like the CI is flaky again 🥲

@github-actions
Contributor

View the CircleCI Test Summary for this PR:

https://huggingface.co/spaces/transformers-community/circle-ci-viz?pr=43181&sha=cb4bda

@vasqu vasqu merged commit 1b2adc0 into huggingface:main Jan 12, 2026
25 checks passed
@vasqu vasqu deleted the custom-generate-config-values branch January 12, 2026 14:03
Contributor Author

vasqu commented Jan 12, 2026

Finally got it

SangbumChoi pushed a commit to SangbumChoi/transformers that referenced this pull request Jan 23, 2026
…ce#43181)

* allow custom values to carry over

Co-authored-by: Eric Bezzam <ebezzam@users.noreply.github.com>

* slightly different version, moving everything to update itself

* fix typo

---------

Co-authored-by: Eric Bezzam <ebezzam@users.noreply.github.com>
4 participants