🚨 Generation config defaults are now None (#42702)

zucchini-nlp merged 24 commits into huggingface:main

Conversation
```python
generation_params = {}
default_config = self.__class__().to_dict() if not self.has_no_defaults_at_init else {}
for key in GenerationConfig._get_default_generation_params().keys():
    if hasattr(self, key) and getattr(self, key) is not None and key not in default_config:
        generation_params[key] = getattr(self, key)
```
This could have been simplified, because we no longer have any generation params in `model.config`.

Just for my understanding: the generation config is top-level either way, hence we no longer need to discern between submodels etc. in composite models (which could possibly have different config values here)?

Yeah, that, and also because these lines are more of a workaround for old models (e.g. BART). New models don't have any generation params in the model config anyway; we haven't allowed it for quite a long time.
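For intuition, a minimal self-contained sketch of what that loop achieves, with plain dicts as hypothetical stand-ins for the config objects: only generation params that a legacy model config explicitly set (i.e. non-None) are carried over into `generation_params`.

```python
# Stand-ins: a legacy model config that still carries generation params, and the
# keys (all defaulting to None now) reported by _get_default_generation_params().
legacy_model_config = {"num_beams": 4, "max_length": 20, "vocab_size": 50265}
default_generation_params = {"num_beams": None, "max_length": None, "do_sample": None}

generation_params = {
    key: legacy_model_config[key]
    for key in default_generation_params
    if legacy_model_config.get(key) is not None
}
print(generation_params)  # {'num_beams': 4, 'max_length': 20}
```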
Ready for review!
vasqu left a comment:

Let's still add an 🚨 even if it's not completely breaking; I'd rather be safe than sorry here. We never know.

First round of comments. My biggest issue would be the kwargs vs generation config passing, but you also left a note there.
run-slow: bart, csm, dia, encoder_decoder, musicgen, rag, reformer, speech_encoder_decoder, vision_encoder_decoder, whisper

This comment contains models: ["models/bart", "models/csm", "models/dia", "models/encoder_decoder", "models/musicgen", "models/rag", "models/reformer", "models/speech_encoder_decoder", "models/vision_encoder_decoder", "models/whisper"]
CI Results: ✅ No failing test specific to this PR 🎉!

@vasqu requesting another review :)
vasqu left a comment:

LGTM, just a few last nits.
```diff
-        elif self.num_beams == 1:
-            if self.do_sample is False:
+        elif self.num_beams is None or self.num_beams == 1:
+            if self.do_sample is not True:
```
Looks like it was missed here?
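A tiny stand-alone sketch of the pattern this diff enforces (the function name and returned labels are hypothetical, not the actual mode-selection code): every comparison must treat None as "fall back to the default", i.e. single-beam greedy decoding.

```python
# Hypothetical mode selection: None and the old concrete defaults must coincide.
def get_generation_mode(num_beams, do_sample):
    if num_beams is None or num_beams == 1:
        # do_sample may be None (unset) or False: both mean no sampling
        return "sample" if do_sample is True else "greedy"
    return "beam_sample" if do_sample is True else "beam_search"

assert get_generation_mode(None, None) == "greedy"   # all-None config -> old behavior
assert get_generation_mode(1, True) == "sample"
assert get_generation_mode(4, None) == "beam_search"
```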
```python
# user-defined kwargs or `generation_config` > `self.generation_config` > global default values
# NOTE: doesn't make sense to allow kwargs and `generation_config`. Might be strict and make them mutually exclusive?
```
Fair enough, but maybe we can break this for v5? Not super important, but it gives us a good opportunity to do so. Either way, let's upgrade this to a TODO (as well).
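For reference, a rough hypothetical sketch of that precedence chain (the helper name is invented; this is not the actual `generate` code): direct kwargs beat an explicitly passed `generation_config`, which beats `self.generation_config`, which beats the global, now all-None, defaults.

```python
from transformers import GenerationConfig

def resolve_generation_config(model_generation_config, generation_config=None, **kwargs):
    # start from the global defaults (now all None)
    resolved = GenerationConfig().to_dict()
    # the model's saved config wins over globals, but only where it was explicitly set
    resolved.update({k: v for k, v in model_generation_config.to_dict().items() if v is not None})
    # a user-passed config wins over the model's config
    if generation_config is not None:
        resolved.update({k: v for k, v in generation_config.to_dict().items() if v is not None})
    # direct kwargs win over everything
    resolved.update(kwargs)
    return GenerationConfig(**resolved)
```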
Will merge when CI is fixed on main.
Hi, first of all, thanks for your reactivity and for addressing the underlying issue. On the other hand, I did a quick pass over the generation code, and I think there is a subtle semantic point worth double-checking: None is not always just "unset", it can also mean "disable this behavior". Concretely:

Therefore, a situation can occur during training where:

In that scenario, I'll continue digging into this, but flagging it early so we can discuss it before merging.
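A toy illustration of that ambiguity (the helper below is made up, loosely modeled on how `generate` assembles logits warpers): under the old defaults every call applied top-k filtering out of the box, while with None defaults the same call disables the warper entirely.

```python
# Hypothetical stand-in for the warper-assembly logic: for some parameters,
# None historically meant "feature off", not merely "unset".
old_defaults = {"top_k": 50, "temperature": 1.0}

def build_warpers(top_k=None, temperature=None):
    warpers = []
    if temperature is not None and temperature != 1.0:
        warpers.append(("temperature", temperature))
    if top_k is not None and top_k > 0:
        warpers.append(("top_k", top_k))  # top_k=None -> no top-k warper at all
    return warpers

print(build_warpers(**old_defaults))  # [('top_k', 50)]
print(build_warpers())                # [] -- a behavior change if None meant "unset"
```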
@albertvillanova I think if

Unfortunately we have no way to 100% know what users want when they set values to `None`.

For example, if everyone prepared custom configs as below, we could fix your issue. I'm afraid it's not the case for most users:

```python
my_generation_config = model.generation_config
my_generation_config.top_k = None
model.generate(inputs, generation_config=my_generation_config)
```
[For maintainers] Suggested jobs to run (before merge):

run-slow: bart, csm, dia, encoder_decoder, musicgen, rag, reformer, speech_encoder_decoder, vision_encoder_decoder, whisper
I think we can merge? @zucchini-nlp
```python
modified_values = {}
global_default_generation_config = GenerationConfig()
model_generation_config = self.generation_config
# we iterate over the model's generation config: it may hold custom keys, which we'll want to copy
```
@vasqu, @zucchini-nlp are custom keys no longer copied over? It seems like only the ones here are defaulted at this line, which wouldn't copy custom keys in `self.generation_config` anymore.
```python
global_defaults = self.generation_config._get_default_generation_params()
generation_config.update(**self.generation_config.to_dict(), defaults_only=True)
generation_config.update(**global_defaults, defaults_only=True)
```
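A hedged reading of the `defaults_only=True` flag used above (the stand-in below only illustrates the assumed semantics; the real method is `GenerationConfig.update`): a value is written only where the config still holds its unset None default, so explicitly set values are never clobbered.

```python
# Assumed semantics of update(..., defaults_only=True), sketched on a plain dict.
def update_defaults_only(config: dict, **kwargs):
    for key, value in kwargs.items():
        if config.get(key) is None:  # still at the unset default -> fill it in
            config[key] = value

config = {"num_beams": 8, "do_sample": None}
update_defaults_only(config, num_beams=1, do_sample=False)
print(config)  # {'num_beams': 8, 'do_sample': False}
```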
Something like this just after?
```python
# add custom keys not in global defaults
for key, value in self.generation_config.to_dict().items():
    if not hasattr(generation_config, key):
        setattr(generation_config, key, value)
```

Squashed commit messages:

* this way better maybe?
* delete legacy from bart and mvp
* import not found
* fix some tests
* fix more tests
* revert smth to run tests again
* I thought I fixed it already, but there were more models
* commit and check tests, clean-up later
* assisted decoding should work now
* docs and whisper
* fix a few more tests
* no circular import errors pls
* wording
* add a test for defaults following TRL example
* nit
* Update src/transformers/configuration_utils.py
* Update src/transformers/generation/candidate_generator.py
* Update src/transformers/generation/utils.py
* comments
* final fix tests
* more comments

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>
What does this PR do?

As per title; this must have been done a long time ago, but we couldn't break BC. The current implementation breaks BC only half-way, i.e. the generation loop is not affected and will keep using the old defaults. The biggest difference is for users who init, access, or modify the model's generation config directly:
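A sketch of that user-facing difference (the pre-PR default values are recalled from the old `GenerationConfig` and may not be exhaustive; check the diff for the authoritative list):

```python
from transformers import GenerationConfig

config = GenerationConfig()

# Before this PR, a fresh config carried concrete global defaults, e.g.
# config.max_length == 20, config.num_beams == 1, config.do_sample == False.
# After this PR, unset parameters are simply None:
print(config.max_length, config.num_beams, config.do_sample)  # None None None

# Code that read defaults off a fresh config must now handle None explicitly:
num_beams = config.num_beams if config.num_beams is not None else 1
```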