Add generate kwargs to AutomaticSpeechRecognitionPipeline #20935

bofenghuang wants to merge 15 commits into huggingface:main

Conversation
The documentation is not available anymore as the PR was closed or merged.
Thanks for the addition.
I proposed several changes:
- Using `kwargs` directly is indeed done in the `text-generation` pipeline, but it actually showed limits because `max_length` is used both for `generate` and the tokenization part, so I now try to avoid it.
- `pipeline(..., generate_kwargs={"num_beams": 5})` is not exactly as elegant, but at least it enables ALL parameters without risking clashes later on with another parameter.
- Since `max_new_tokens` is likely to be the most used parameter, I think we can definitely lift it as a first-class parameter: `pipeline(..., max_new_tokens=5)` works. That way we don't risk more obscure parameter clashes and we can still enable complex use cases.
- I added an error in the odd case where a user would do `pipeline(..., max_new_tokens=5, generate_kwargs={"max_new_tokens": 10})`.
Wdyt?
Do you mind adding a small test for this feature, and updating the docstring?
Otherwise LGTM
@Narsil it's indeed better this way. Thanks for the explanation!
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
Hi @Narsil, Some tests of
Personally I prefer the 1st one since the other one may introduce some silent errors. What's your opinion?
In general I would agree with you. With pipelines accepting so many parameters I would tend to keep it simple, and maybe just change line 183:

```diff
- self.decoder = kwargs["decoder"]
+ self.decoder = kwargs.pop("decoder")
```

This would be just so the signature is kept at a minimum (the docstring should be good) and avoiding accepting
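A small sketch of why `pop` matters here; the class and the `decoder` key are the only details carried over from the review comment, and the real pipeline `__init__` is of course more involved:

```python
class PipelineSketch:
    """Illustration of indexing vs. popping a kwarg in __init__ (not the real class)."""

    def __init__(self, **kwargs):
        # kwargs["decoder"] would read the value but leave "decoder" inside
        # kwargs, so it would still be forwarded to the parent class and could
        # be rejected there. pop() reads the value and removes the key in one step.
        self.decoder = kwargs.pop("decoder", None)
        self.remaining_kwargs = kwargs  # what would reach super().__init__
```

After `pop`, the parent constructor never sees `decoder`, so the signature stays minimal without growing a new named parameter.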
This is the sort of function complexity that I think is more detrimental than helpful, unfortunately: https://huggingface.co/docs/transformers/v4.25.1/en/main_classes/text_generation#transformers.GenerationMixin.generate
The error occurs on line 173 where
* Fix error message
* Fix code quality

Fixing error message
Ah, so it happens before then. Let's do it your way then, does it work?
No it's a syntax error :( Can we do this?

```diff
- def __init__(self, feature_extractor: Union["SequenceFeatureExtractor", str], *args, **kwargs):
+ def __init__(self, feature_extractor: Union["SequenceFeatureExtractor", str], decoder: Optional[Union["BeamSearchDecoderCTC", str]] = None, *args, **kwargs):
```
This will interpret

Can you try:

```diff
+ def __init__(self, feature_extractor: Union["SequenceFeatureExtractor", str], *, decoder: Optional[Union["BeamSearchDecoderCTC", str]] = None, **kwargs):
```

Maybe?
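A stripped-down sketch of the keyword-only pattern suggested above; `init_sketch` is a hypothetical stand-in for the pipeline `__init__`. The bare `*` in the signature forces `decoder` to be passed by name, so a stray positional argument can no longer be silently interpreted as the decoder:

```python
def init_sketch(feature_extractor, *, decoder=None, **kwargs):
    """Everything after the bare `*` must be passed as a keyword argument."""
    return feature_extractor, decoder, kwargs
```

Calling `init_sketch("fe", decoder="my_decoder")` works, while `init_sketch("fe", "my_decoder")` raises a `TypeError` instead of misassigning the argument.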
No we need
Remove it there too.
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
…nghuang/transformers into add-gen-kwargs-asr-pipeline
@Narsil Oops, the commit history seems to be messed up. Let me create a new one!
Closed as the other one is cleaner: #20952
What does this PR do?
Hi @Narsil 👋,
In this PR, I tried to add generate arguments to `AutomaticSpeechRecognitionPipeline` in order to run the pipeline with seq2seq models using beam search, contrastive search, etc. I followed the style in `TextGenerationPipeline`.

Before submitting

- [ ] Did you read the contributor guideline, Pull Request section?
- [ ] Was this discussed/approved via a Github issue or the forum? Please add a link to it if that's the case.
- [ ] Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.