
Add generate kwargs to AutomaticSpeechRecognitionPipeline#20935

Closed
bofenghuang wants to merge 15 commits into huggingface:main from bofenghuang:add-gen-kwargs-asr-pipeline

Conversation

Contributor

@bofenghuang commented Dec 29, 2022

What does this PR do?

Hi @Narsil 👋,

In this PR, I tried to add generate arguments to AutomaticSpeechRecognitionPipeline in order to run the pipeline with seq2seq models using beam search, contrastive search, etc. I followed the style of TextGenerationPipeline.

import torch
from transformers import pipeline

pipe = pipeline(model="openai/whisper-base", device=0, torch_dtype=torch.float16)

pipe("https://huggingface.co/datasets/Narsil/asr_dummy/resolve/main/1.flac", max_new_tokens=5)
# {'text': ' He hoped'}

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Dec 29, 2022

The documentation is not available anymore as the PR was closed or merged.

Contributor

@Narsil left a comment


Thanks for the addition.

I proposed several changes:

  • Using kwargs directly is indeed done in the text-generation pipeline, but it actually showed its limits because max_length is used both by generate and by the tokenization part, so I now try to avoid it.

pipeline(..., generate_kwargs={"num_beams": 5}) is not exactly as elegant, but at least it enables ALL parameters without risking clashing later on with another parameter.

Since max_new_tokens is likely to be the most used parameter, I think we can definitely lift it as a first-class parameter: pipeline(..., max_new_tokens=5) works.
That way we don't risk more obscure parameter clashing and we can still enable complex use cases.

I added an error for the odd case where a user would do pipeline(..., max_new_tokens=5, generate_kwargs={"max_new_tokens": 10}).
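The merging and clash check described here could be sketched roughly like this (a standalone sketch with a hypothetical sanitize_generate_parameters helper, not the actual PR diff):

```python
# Illustrative sketch of the behaviour described above: `max_new_tokens` is
# lifted as a first-class parameter, everything else goes through
# `generate_kwargs`, and defining the same key in both raises an error.
def sanitize_generate_parameters(max_new_tokens=None, generate_kwargs=None):
    forward_params = {}
    if generate_kwargs is not None:
        if max_new_tokens is not None and "max_new_tokens" in generate_kwargs:
            raise ValueError(
                "`max_new_tokens` is defined both as an argument and inside"
                " `generate_kwargs`; please use only one of them."
            )
        forward_params.update(generate_kwargs)
    if max_new_tokens is not None:
        forward_params["max_new_tokens"] = max_new_tokens
    return forward_params
```

With this shape, pipeline(..., generate_kwargs={"num_beams": 5}) and pipeline(..., max_new_tokens=5) both work, while passing max_new_tokens through both paths at once fails loudly.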

Wdyt ?

Do you mind adding a small test for this feature and updating the docstring?

Otherwise LGTM

Comment threads on src/transformers/pipelines/automatic_speech_recognition.py (Outdated)
@bofenghuang
Contributor Author

@Narsil it's indeed better this way. Thanks for the explanation!

@bofenghuang
Contributor Author

Hi @Narsil,

Some tests of ctc_with_lm models failed. I think we could

  1. Lift decoder in __init__ as an individual argument
  2. Add **kwargs into _sanitize_parameters

Personally I prefer the 1st one since the other one may introduce some silent errors. What's your opinion?
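The silent-error risk in option 2 can be illustrated with a toy sketch (hypothetical function names, not the real pipeline code): once a signature accepts **kwargs, a misspelled parameter is swallowed instead of raising.

```python
# Sketch of the "silent error" risk of adding **kwargs to _sanitize_parameters:
# a misspelled parameter raises a TypeError with a strict signature, but is
# silently ignored once **kwargs is accepted.
def sanitize_strict(max_new_tokens=None):
    return {"max_new_tokens": max_new_tokens}

def sanitize_permissive(max_new_tokens=None, **kwargs):
    # typos like `max_new_token` land in kwargs and vanish here
    return {"max_new_tokens": max_new_tokens}
```

Calling sanitize_strict(max_new_token=5) raises a TypeError, whereas sanitize_permissive(max_new_token=5) silently returns {"max_new_tokens": None}, which is exactly the kind of silent error mentioned above.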

@Narsil
Contributor

Narsil commented Dec 29, 2022

Personally I prefer the 1st one since the other one may introduce some silent errors. What's your opinion?

In general I would agree with you. Since pipelines accept so many parameters, I would tend to keep it simple, and maybe just change line 183:

- self.decoder = kwargs["decoder"]
+ self.decoder = kwargs.pop("decoder")

This would be just so the signature is kept to a minimum (the docstring should be good) and to avoid accepting decoder as a positional argument instead of a keyword one. (I know we can do that within the signature, but it does complexify the docs, notably this part: https://huggingface.co/docs/transformers/v4.25.1/en/main_classes/pipelines#transformers.AutomaticSpeechRecognitionPipeline)
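The reason .pop matters here can be shown with a minimal sketch (illustrative class names, not the real pipeline classes): kwargs["decoder"] leaves the key in kwargs, so it would be forwarded again to the parent, while .pop consumes it.

```python
# Minimal sketch: a child class must consume `decoder` from kwargs before
# forwarding the rest to its parent, otherwise the parent receives an
# unexpected `decoder` keyword.
class Parent:
    def __init__(self, **kwargs):
        # the parent is not expected to ever see `decoder`
        assert "decoder" not in kwargs, "unexpected `decoder` forwarded to parent"

class Child(Parent):
    def __init__(self, **kwargs):
        self.decoder = kwargs.pop("decoder")  # .pop consumes the key
        super().__init__(**kwargs)
```

With kwargs["decoder"] instead of kwargs.pop("decoder"), the key would still be present when super().__init__(**kwargs) runs.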

@Narsil
Contributor

Narsil commented Dec 29, 2022

This is the sort of function complexity that I think is more detrimental than helpful, unfortunately: https://huggingface.co/docs/transformers/v4.25.1/en/main_classes/text_generation#transformers.GenerationMixin.generate

@bofenghuang
Contributor Author

In general I would agree with you. Since pipelines accept so many parameters, I would tend to keep it simple, and maybe just change line 183:

- self.decoder = kwargs["decoder"]
+ self.decoder = kwargs.pop("decoder")

The error occurs at line 173, where _sanitize_parameters is called in the parent :(

@Narsil
Contributor

Narsil commented Dec 30, 2022

The error occurs in the line 173 where _sanitize_parameters is called in parent :(

Ah, so it happens before then; let's do it your way then.

does

__init__(self, ....,, *args, *, decoder, **kwargs)

work?
(Try and force disabling the positional argument for decoder?)

@bofenghuang
Contributor Author

does

__init__(self, ....,, *args, *, decoder, **kwargs)

work? (Try and force disabling the positional argument for decoder?)

No, it's a syntax error :(

Can we do this ?

- def __init__(self, feature_extractor: Union["SequenceFeatureExtractor", str], *args, **kwargs):
+ def __init__(self, feature_extractor: Union["SequenceFeatureExtractor", str], decoder: Optional[Union["BeamSearchDecoderCTC", str]] = None, *args, **kwargs):

@Narsil
Contributor

Narsil commented Dec 30, 2022

This will make AutomaticSpeechRecognitionPipeline(feature_extractor, model) interpret model as decoder, which will lead to confusing errors.

Can you try :

+ def __init__(self, feature_extractor: Union["SequenceFeatureExtractor", str], *, decoder: Optional[Union["BeamSearchDecoderCTC", str]] = None, **kwargs):

Maybe ?
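The difference between the two signatures can be sketched with toy functions (illustrative names, not the real __init__): with decoder as a plain positional parameter, a second positional argument is silently bound to it, while the bare * makes that a TypeError.

```python
# Sketch of why the bare `*` helps: decoder as a positional parameter
# silently captures a stray positional argument, whereas the keyword-only
# form rejects it with a TypeError.
def init_positional(feature_extractor, decoder=None, **kwargs):
    return decoder

def init_keyword_only(feature_extractor, *, decoder=None, **kwargs):
    return decoder
```

So init_positional("fe", model) returns model as the decoder (the confusing case described above), while init_keyword_only("fe", model) raises a TypeError and forces init_keyword_only("fe", decoder=...).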

@bofenghuang
Contributor Author

Can you try :

+ def __init__(self, feature_extractor: Union["SequenceFeatureExtractor", str], *, decoder: Optional[Union["BeamSearchDecoderCTC", str]] = None, **kwargs):

Maybe ?

No, we need *args for line 173.

@Narsil
Contributor

Narsil commented Dec 30, 2022

Can you try :

+ def __init__(self, feature_extractor: Union["SequenceFeatureExtractor", str], *, decoder: Optional[Union["BeamSearchDecoderCTC", str]] = None, **kwargs):

Maybe ?

No, we need *args for line 173.

Remove it there too.

@bofenghuang
Contributor Author

@Narsil Oops, the commit history seems to be messed up. Let me create a new one!

@bofenghuang
Contributor Author

Closed, as the other one (#20952) is cleaner.

@bofenghuang bofenghuang deleted the add-gen-kwargs-asr-pipeline branch January 6, 2023 10:39

5 participants