[WIP] seq2seq example #3402
mgoldey wants to merge 14 commits into huggingface:clean_encoder_decocer_modeling from
Conversation
As a non-blocking question, I do note that a lot of the examples use argparse to parse comparatively long lists of arguments. I've maintained the extant style in this PR to avoid causing noise and confusion. Would it be acceptable if I broke with this style to use a JSON file to store all the arguments for an experiment?
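For what it's worth, a minimal sketch of the JSON approach might look like this (the function name, keys, and defaults here are hypothetical placeholders, not from the actual example):

```python
import json


def load_experiment_args(path):
    """Load experiment arguments from a JSON file, filling in defaults.

    The keys below are hypothetical stand-ins for whatever schema the
    example would actually define.
    """
    defaults = {"learning_rate": 5e-5, "num_train_epochs": 3}
    with open(path) as f:
        overrides = json.load(f)
    # Values from the JSON file win over the defaults, so an experiment
    # file only needs to list the arguments it changes.
    return {**defaults, **overrides}
```

An experiment could then be launched as something like `python run_seq2seq.py experiment.json` instead of a long argparse command line.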
That sounds good. I'm still tweaking things on my end for accuracy and improved logic as I get more familiar with the code base here. I'll see if I can rebase off #3383 by then, depending on my other workload. Feel free to reach out via Google Hangouts if you're comfortable.
…mbeddings as input to decoder when no decoder_input_ids or decoder_inputs_embeds provided
8e3e8d4 to b782352
```python
features = []
...
cls_token, sep_token, pad_token = cls_token = (
    tokenizer.cls_token,
```
Hi, I've been using your example to guide my own setup of a seq2seq model (thank you!). Is this line in your pull request a typo? I think it's just supposed to be `cls_token, sep_token, pad_token = ( ...`
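For anyone puzzling over the difference: a chained assignment like that one also rebinds `cls_token` to the whole tuple, which is almost certainly not intended. A standalone sketch of the behavior (with a stub tokenizer, since the real one isn't needed to see it):

```python
# Minimal stand-in for the tokenizer attributes used in the example.
class Tok:
    cls_token, sep_token, pad_token = "[CLS]", "[SEP]", "[PAD]"

tokenizer = Tok()

# The line as written (chained assignment): Python assigns targets left
# to right, so after unpacking into the three names, cls_token is
# rebound to the whole tuple.
cls_token, sep_token, pad_token = cls_token = (
    tokenizer.cls_token,
    tokenizer.sep_token,
    tokenizer.pad_token,
)
print(cls_token)  # ('[CLS]', '[SEP]', '[PAD]') -- not '[CLS]'

# The presumably intended version: plain tuple unpacking.
cls_token, sep_token, pad_token = (
    tokenizer.cls_token,
    tokenizer.sep_token,
    tokenizer.pad_token,
)
print(cls_token)  # '[CLS]'
```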
```python
formatted_tokens += [pad_token_label_id]
segment_ids += [cls_token_segment_id]
# gpt2 has no cls_token
elif model_type not in ["gpt2"]:
```
I believe this is supposed to be

```python
elif model_type in ["gpt2"]:
```

instead of "not in".
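Whether `in` or `not in` is right depends on what the branch is meant to do, but the two conditions select opposite sets. A quick sketch (the model names here are chosen only for illustration):

```python
model_types = ["bert", "gpt2", "roberta"]

# `in ["gpt2"]` matches only gpt2 ...
gpt2_only = [m for m in model_types if m in ["gpt2"]]

# ... while `not in ["gpt2"]` matches everything except gpt2.
all_but_gpt2 = [m for m in model_types if m not in ["gpt2"]]

print(gpt2_only)     # ['gpt2']
print(all_but_gpt2)  # ['bert', 'roberta']
```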
Sorry to answer only now! I will soon add an Encoder-Decoder Google Colab that shows how to use seq2seq.
Thanks - fine to close. We've moved forward without using seq2seq due to poor overall accuracy with the scale of data in place.
This PR presents an example seq2seq use case and bug fixes necessary for this to execute with reasonable accuracy.
The utils_seq2seq.py file defines the data format for training data, and the run_seq2seq.py file takes training, development, and test data and produces a model. The README.md discusses how to execute this toy problem. The specific toy problem used here, formatting a date string to the American style, is deliberately trivial. On my local setup using GPUs, this example executes within 5 minutes. Production models would, of course, require substantially more data and training.
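As a sketch of the toy task (the exact source and target formats are my guess; utils_seq2seq.py defines the real data format), a training pair might map an ISO-style date string to the American style:

```python
from datetime import date


def make_date_pair(d):
    """Build one (source, target) pair for the toy reformatting task.

    Hypothetical formats: ISO "YYYY-MM-DD" in, American "MM/DD/YYYY" out.
    """
    return d.isoformat(), d.strftime("%m/%d/%Y")


print(make_date_pair(date(2020, 3, 14)))  # ('2020-03-14', '03/14/2020')
```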
I welcome feedback about how to strengthen performance here and the best route to increase testing.
This relies on a few bug fixes, which have been incorporated in this branch.
I strongly suspect that the input to the decoder in the PreTrainedEncoderDecoder class is incorrect as it stands in the code base, and commit 9fcf73a has a proposed fix. It doesn't make sense to feed the expected token ids to the decoder when the decoder needs to learn how to decode from the embeddings. (Edit: incomplete understanding - will fix.)