
[wip] [fsmt] possible support of iwslt14 in fsmt#8374

Closed
stas00 wants to merge 1 commit into huggingface:master from stas00:iwslt14

Conversation

@stas00
Contributor

@stas00 stas00 commented Nov 6, 2020

This is an attempt to see whether fsmt can support older fairseq archs, based on this request: #8233

Currently it's just changing the conversion code directly to see if it can be converted.

python src/transformers/convert_fsmt_original_pytorch_checkpoint_to_pytorch.py --fsmt_checkpoint_path ./fairseq-en-el-model/checkpoint_best.pt --pytorch_dump_folder_path ./model-data

@lighteternal, please have a look.

I made the model configuration args based on the arch configuration. The only issue at the moment is the encoder/decoder embedding sizes - for some reason the vocab sizes seem to be reversed?

$wc -l fairseq-en-el-model/*txt
 12892 fairseq-en-el-model/dict.el.txt
  9932 fairseq-en-el-model/dict.en.txt

So this is English to Greek, correct? And not the other way around, correct? So the source is 9932 and target is 12892-long.

Your issue mentioned "Greek<->English", but this model must be one way - which is it?

When I run the script:

        size mismatch for model.encoder.embed_tokens.weight: copying a param with shape torch.Size([12896, 512]) from checkpoint, the shape in current model is torch.Size([9936, 512]).
        size mismatch for model.decoder.output_projection.weight: copying a param with shape torch.Size([9936, 512]) from checkpoint, the shape in current model is torch.Size([12896, 512]).

So it suggests that the encoder is 12896-long, which should be the other way around, no? Unless it was trained on Greek to English.
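As an aside, the off-by-4 gap between the dict line counts (12892 / 9932) and the tensor shapes (12896 / 9936) has a likely explanation: fairseq's Dictionary reserves slots for special symbols on top of the entries read from dict.txt. A minimal sketch of that arithmetic (assuming the default 4 specials: `<s>`, `<pad>`, `</s>`, `<unk>`):

```python
# Assumption: fairseq's Dictionary prepends 4 special symbols
# (<s>, <pad>, </s>, <unk>) before the entries loaded from dict.txt.
FAIRSEQ_NUM_SPECIAL = 4

def embedding_rows(num_dict_lines):
    """Expected embedding-matrix rows for a dict.txt with this many lines."""
    return num_dict_lines + FAIRSEQ_NUM_SPECIAL

print(embedding_rows(12892))  # dict.el.txt -> 12896, matching the error above
print(embedding_rows(9932))   # dict.en.txt -> 9936
```

So the shapes in the error message are consistent with the dict files; only the src/tgt assignment is swapped.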

Well, you can also experiment with the conversion.

# model config
fsmt_model_config_file = os.path.join(pytorch_dump_folder_path, "config.json")

args = {
Contributor Author

@stas00 stas00 Nov 6, 2020


@lighteternal, also please check I got these defaults right.

wmt19 fairseq supplies all these in the checkpoint.

@lighteternal

lighteternal commented Nov 6, 2020

You were right, it was an EN to EL model, so the src-tgt were reversed; however, I changed the args and tried again.
It gave me an error:

RuntimeError: Error(s) in loading state_dict for FSMTForConditionalGeneration:
        size mismatch for model.decoder.embed_tokens.weight: copying a param with shape torch.Size([12896, 512]) from checkpoint, the shape in current model is torch.Size([9936, 512]).

which I fixed by changing one of the ignore keys (probably a typo) in line 251 (see comment below):

    # remove unneeded keys
    ignore_keys = [
        "model.model",
        "model.encoder.version",
        "model.decoder.version",
        "model.encoder_embed_tokens.weight", 
        "model.decoder.embed_tokens.weight",#here, the original was: model.decoder_embed_tokens.weight
        "model.encoder.embed_positions._float_tensor",
        "model.decoder.embed_positions._float_tensor",
    ]
    for k in ignore_keys:
        model_state_dict.pop(k, None)

After that, the conversion script concludes successfully:

Generating data/wmt16-el-en-dist/vocab-src.json
Generating data/wmt16-el-en-dist/vocab-tgt.json
Generating data/wmt16-el-en-dist/merges.txt
Generating data/wmt16-el-en-dist/config.json
Generating data/wmt16-el-en-dist/tokenizer_config.json
Generating data/wmt16-el-en-dist/pytorch_model.bin
Conversion is done!

Last step is to upload the files to s3
cd data
transformers-cli upload wmt16-el-en-dist

I guess the next thing is to try to load this model locally to test that it's working.

@stas00
Contributor Author

stas00 commented Nov 7, 2020

Fantastic! Yes, load it up and let us know whether it works.

If you want ready scripts to adapt from, perhaps try https://github.com/stas00/porting/blob/master/transformers/fairseq-wmt19/scripts/fsmt-translate.py
and if you convert the reversed model (hint: just swap src and tgt languages in the conversion script) you can even do a paraphrase:
https://github.com/stas00/porting/blob/master/transformers/fairseq-wmt19/scripts/fsmt-paraphrase.py

@stas00
Contributor Author

stas00 commented Nov 7, 2020

which I fixed by changing one of the ignore keys due to a typo probably, in line 251 (see comment below):
    # remove unneeded keys
    ignore_keys = [
        "model.decoder.embed_tokens.weight",  # here, the original was: model.decoder_embed_tokens.weight

I don't think that this solution works. While you made the conversion script complete, your ported model now has random weights for model.decoder.embed_tokens.weight - you probably are going to see garbage output.
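To illustrate the failure mode with a toy module (a sketch, not the actual FSMT code): popping a key from the checkpoint's state_dict makes `load_state_dict` stop complaining, but the corresponding parameter silently keeps its random initialization:

```python
import torch
from torch import nn

torch.manual_seed(0)
model = nn.Embedding(10, 4)            # toy stand-in for decoder.embed_tokens
before = model.weight.clone()

state = {"weight": torch.ones(12, 4)}  # wrong shape, as in the checkpoint
state.pop("weight")                    # "fixing" the error by dropping the key
result = model.load_state_dict(state, strict=False)

print(result.missing_keys)                # ['weight'] - nothing was loaded
print(torch.equal(model.weight, before))  # True - still the random init
```

The script completes, but the embedding never receives the trained weights.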

The error is:

size mismatch for model.decoder.embed_tokens.weight: copying a param with shape torch.Size([9936, 512]) from checkpoint, the shape in current model is torch.Size([12896, 512]).
  • source here is 12892 (el)
  • and target is 9932 (en)

For some reason it's creating a decoder with the size of the encoder dict.

I think there might have been a bug introduced since I wrote and used this script. I will debug and get back to you.

@stas00
Contributor Author

stas00 commented Nov 7, 2020

Hmm, it looks like fairseq has introduced some breaking changes - that's why the script wasn't working out of the box. The args in the checkpoint appear to be mostly empty, so none of the wmt19 models can be converted either. Will investigate.

This is the breaking change: facebookresearch/fairseq@3b27ed7

They did away with the args object

@stas00
Contributor Author

stas00 commented Nov 7, 2020

OK, I updated the conversion script to support the latest fairseq and it now converts your model out of the box w/o needing any changes - please use the version in PR #8377 if it hasn't been merged yet.

Please let me know whether the results are satisfactory and you get a good translation out of it - note it uses some default hparams (see the script) so you can adjust those to your liking.

Once you're happy you can upload the model to s3 as explained here: https://huggingface.co/transformers/model_sharing.html

@stas00
Contributor Author

stas00 commented Nov 7, 2020

Resolved in #8377

@stas00 stas00 closed this Nov 7, 2020
@stas00 stas00 deleted the iwslt14 branch November 7, 2020 04:58
@lighteternal

This is great @stas00, I just tried it locally and it works as intended. Many thanks!

One quick question: during training with fairseq, the tokenization also converted all letters to lower-case (to reduce the vocab, I assume), so now, in order to get correct translations, the input text needs to be lowercase only. I can of course add a line to my script to do that automatically, but I was wondering how I can force the uploaded model to do that (so that anyone wanting to test it doesn't have to download it locally and add that additional line). Probably with a config argument...?

I am uploading this to s3 soon, your help has been invaluable 👍

@stas00
Contributor Author

stas00 commented Nov 7, 2020

Excellent. I'm glad to hear we sorted it out.

So have you validated that the translation works and the bleu score evals are satisfactory? You can do it easily with transformers using the examples here: https://github.com/huggingface/transformers/tree/master/scripts/fsmt (scripts starting with eval_).

One quick question: during training with fairseq, the tokenization also converted all letters to lower-case (to reduce the vocab, I assume), so now, in order to get correct translations, the input text needs to be lowercase only. I can of course add a line to my script to do that automatically, but I was wondering how I can force the uploaded model to do that (so that anyone wanting to test it doesn't have to download it locally and add that additional line). Probably with a config argument...?

It's currently not supported as all the models I worked with didn't have that restriction. I will add this functionality to the transformers implementation of FSMT now that I know that this is still needed. I will let you know when this is done. Perhaps hold off on making the release to ensure that your model works out of box. You will also need to update your ported model's config when this is done.

Out of curiosity, if you don't mind sharing, is there a special reason why you chose to train an older more restricted architecture and not one of the newer ones? Surely, losing the normal casing would be a hurdle for practical use.

@lighteternal

So have you validated that the translation works and the bleu score evals are satisfactory? You can do it easily with transformers using the examples here: https://github.com/huggingface/transformers/tree/master/scripts/fsmt (scripts starting with eval_).

I validated the bleu and chrF scores on the fairseq equivalent of the model (before converting it to huggingface) on the Tatoeba testset, but now that there are additional evaluation scripts I will try these as well!

It's currently not supported as all the models I worked with didn't have that restriction. I will add this functionality to the transformers implementation of FSMT now that I know that this is still needed. I will let you know when this is done. Perhaps hold off on making the release to ensure that your model works out of box. You will also need to update your ported model's config when this is done.

Thanks for this, sure I can wait if there's the option of adding that feature too!

Out of curiosity, if you don't mind sharing, is there a special reason why you chose to train an older more restricted architecture and not one of the newer ones? Surely, losing the normal casing would be a hurdle for practical use.

Tbh, I was just following a fairseq guide that was suggesting this arch over the following possible choices:
Possible choices: transformer, transformer_iwslt_de_en, transformer_wmt_en_de, transformer_vaswani_wmt_en_de_big, transformer_vaswani_wmt_en_fr_big, transformer_wmt_en_de_big, transformer_wmt_en_de_big_t2t, multilingual_transformer, multilingual_transformer_iwslt_de_en, fconv, fconv_iwslt_de_en, fconv_wmt_en_ro, fconv_wmt_en_de, fconv_wmt_en_fr, nonautoregressive_transformer, nonautoregressive_transformer_wmt_en_de, nacrf_transformer, iterative_nonautoregressive_transformer, iterative_nonautoregressive_transformer_wmt_en_de, cmlm_transformer, cmlm_transformer_wmt_en_de, levenshtein_transformer, levenshtein_transformer_wmt_en_de, levenshtein_transformer_vaswani_wmt_en_de_big, levenshtein_transformer_wmt_en_de_big, insertion_transformer, bart_large, bart_base, mbart_large, mbart_base, mbart_base_wmt20, lstm, lstm_wiseman_iwslt_de_en, lstm_luong_wmt_en_de, transformer_lm, transformer_lm_big, transformer_lm_baevski_wiki103, transformer_lm_wiki103, transformer_lm_baevski_gbw, transformer_lm_gbw, transformer_lm_gpt, transformer_lm_gpt2_small, transformer_lm_gpt2_medium, transformer_lm_gpt2_big, transformer_align, transformer_wmt_en_de_big_align, hf_gpt2, hf_gpt2_medium, hf_gpt2_large, hf_gpt2_xl, transformer_from_pretrained_xlm, lightconv, lightconv_iwslt_de_en, lightconv_wmt_en_de, lightconv_wmt_en_de_big, lightconv_wmt_en_fr_big, lightconv_wmt_zh_en_big, lightconv_lm, lightconv_lm_gbw, fconv_self_att, fconv_self_att_wp, fconv_lm, fconv_lm_dauphin_wikitext103, fconv_lm_dauphin_gbw, lstm_lm, roberta, roberta_base, roberta_large, xlm, masked_lm, bert_base, bert_large, xlm_base, s2t_berard, s2t_berard_256_3_3, s2t_berard_512_3_2, s2t_berard_512_5_3, s2t_transformer, s2t_transformer_s, s2t_transformer_sp, s2t_transformer_m, s2t_transformer_mp, s2t_transformer_l, s2t_transformer_lp, wav2vec, wav2vec2, wav2vec_ctc, wav2vec_seq2seq, dummy_model, transformer_lm_megatron, transformer_lm_megatron_11b, transformer_iwslt_de_en_pipeline_parallel, 
transformer_wmt_en_de_big_pipeline_parallel, model_parallel_roberta, model_parallel_roberta_base, model_parallel_roberta_large

If you could indicate a more recent architecture that has about the same number of parameters (I have a constraint on complexity as I am using a single GTX2080SUPER) I would be happy to re-train!

@stas00
Contributor Author

stas00 commented Nov 7, 2020

So have you validated that the translation works and the bleu score evals are satisfactory? You can do it easily with transformers using the examples here: https://github.com/huggingface/transformers/tree/master/scripts/fsmt (scripts starting with eval_).

I validated the bleu and chrF scores on the fairseq equivalent of the model (before converting it to huggingface) on the Tatoeba testset, but now that there are additional evaluation scripts I will try these as well!

My only concern here is the forced lower-casing, which won't match the references that the BLEU scores are evaluated against.

Out of curiosity, if you don't mind sharing, is there a special reason why you chose to train an older more restricted architecture and not one of the newer ones? Surely, losing the normal casing would be a hurdle for practical use.

Tbh, I was just following a fairseq guide that was suggesting this arch over the following possible choices:
`Possible choices: transformer, transformer_iwslt_de_en, transformer_wmt_en_de, transformer_vaswani_wmt_en_de_big, [...]`

If you could indicate a more recent architecture that has about the same number of parameters (I have a constraint on complexity as I am using a single GTX2080SUPER) I would be happy to re-train!

I re-read the guide and I'm not sure what you mean when you said: "was suggesting this arch over the following possible choices" - I can't find any recommendations to use this particular model over the dozens of the ones you listed. e.g. how did you know that it's a smaller model than some others?

I'm gradually getting to know the fairseq models and have only dealt with wmt-variations of transformer. I suppose you can see all the variations defined here and below: https://github.com/pytorch/fairseq/blob/master/fairseq/models/transformer.py#L985
So primarily these appear to differ in the size and shape of the model.

When you did the training, was there an option not to force lowercase input or did it come automatic with the transformer_iwslt_de_en? I don't see an option to toggle this on/off in fairseq-train command. And looking around the code I don't quite see a configurable option to do so.

@lighteternal

True, the forced lower-case may give a slightly higher BLEU score.

I re-read the guide and I'm not sure what you mean when you said: "was suggesting this arch over the following possible choices" - I can't find any recommendations to use this particular model over the dozens of the ones you listed. e.g. how did you know that it's a smaller model than some others?

By "suggesting" I mean that I just used the pre-defined arch on the available script (see below):

CUDA_VISIBLE_DEVICES=0 fairseq-train \
    data-bin/iwslt14.tokenized.de-en \
    --arch transformer_iwslt_de_en --share-decoder-input-output-embed \
    --optimizer adam --adam-betas '(0.9, 0.98)' --clip-norm 0.0 \
    --lr 5e-4 --lr-scheduler inverse_sqrt --warmup-updates 4000 \
    --dropout 0.3 --weight-decay 0.0001 \
    --criterion label_smoothed_cross_entropy --label-smoothing 0.1 \
    --max-tokens 4096 \
    --eval-bleu \
    --eval-bleu-args '{"beam": 5, "max_len_a": 1.2, "max_len_b": 10}' \
    --eval-bleu-detok moses \
    --eval-bleu-remove-bpe \
    --eval-bleu-print-samples \
    --best-checkpoint-metric bleu --maximize-best-checkpoint-metric

I didn't know it would lead to a smaller model before digging into it a bit and discovering that, e.g., there's a difference in the FFN hidden-layer dimension. I experimented with some of them (not all) and, since I am not an expert, ofc I ended up on that one based on (a) the fact that it actually worked, (b) the perplexity getting lower quicker than in other cases, and (c) more importantly, whether I would get an OOM after a while (most of the cases) :P
Do you think that there would be a significant gain from trying a newer architecture that you have in mind?

When you did the training, was there an option not to force lowercase input or did it come automatic with the transformer_iwslt_de_en? I don't see an option to toggle this on/off in fairseq-train command. And looking around the code I don't quite see a configurable option to do so.

The lowercase command comes at the data preparation script provided in the guide: https://github.com/pytorch/fairseq/blob/master/examples/translation/prepare-iwslt14.sh
(line 11). It was not part of the fairseq training/preprocessing.

@stas00
Contributor Author

stas00 commented Nov 7, 2020

If you could indicate a more recent architecture that has about the same number of parameters (I have a constraint on complexity as I am using a single GTX2080SUPER) I would be happy to re-train!

I think most of the recent ones are quite a bit bigger, but one that we ported that may warrant your attention is the distilled variation: https://github.com/jungokasai/deep-shallow/
(ported scripts https://github.com/huggingface/transformers/tree/master/scripts/fsmt - start with convert-allenai-,
the wmt model cards are at the end of the list here https://huggingface.co/allenai)

@stas00
Contributor Author

stas00 commented Nov 7, 2020

I didn't know it would lead to a smaller model before digging into it a bit and discovering that, e.g., there's a difference in the FFN hidden-layer dimension. I experimented with some of them (not all) and, since I am not an expert, ofc I ended up on that one based on (a) the fact that it actually worked, (b) the perplexity getting lower quicker than in other cases, and (c) more importantly, whether I would get an OOM after a while (most of the cases) :P
Do you think that there would be a significant gain from trying a newer architecture that you have in mind?

I'm relatively new to this myself, so I haven't tried enough variations yet to make such recommendations. Perhaps asking at the forums stating your limitations would lead to some excellent recommendations - or perhaps what you have done is just fine - it all depends on your needs. Do check out the distilled approach I mentioned in the comment above.

The lowercase command comes at the data preparation script provided in the guide: https://github.com/pytorch/fairseq/blob/master/examples/translation/prepare-iwslt14.sh
(line 11). It was not part of the fairseq training/preprocessing.

Oh, I see, this is totally circumstantial - you just trained on lower-cased input so this is the world it knows. This makes total sense. Thank you for helping me understand this nuance.

@stas00
Contributor Author

stas00 commented Nov 7, 2020

OK, I implemented the lowercase config and wanted to automate the discovery of when this option should be pre-set, but the latter didn't work - my detector code discovered upcase letters. I looked at both vocabs you supplied and both have upcase letters in them - a lot of them in the el one and some in the en one (merges/code too).

I tried this very simple heuristic:

    # detect whether this is a do_lower_case situation, which can be derived by checking whether we
    # have at least one upcase letter in the source vocab
    do_lower_case = True
    for k in src_vocab.keys():
        if not k.islower():
            do_lower_case = False
            break

I suppose this has to be set manually then, because you know you trained mainly on lowercase - but perhaps there is a bug somewhere on the fairseq side, and there should not be any upcase letters in any of the 3 files if it were properly lower-cased?
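A side note on the heuristic itself: `str.islower()` returns `False` for strings with no cased characters at all (digits, punctuation, BPE markers), so a vocab entry like `"123"` would disable `do_lower_case` even in a fully lower-cased vocab. A possible refinement (a sketch, not the converter's actual code) that only trips on genuine uppercase characters:

```python
def vocab_is_lowercase(vocab):
    """True if no vocab key contains an uppercase character."""
    return not any(ch.isupper() for key in vocab for ch in key)

# "123".islower() is False, but it should not count as evidence of uppercase
print(vocab_is_lowercase({"hello": 0, "123": 1, "&": 2}))  # True
print(vocab_is_lowercase({"Hello": 0}))                    # False
```

That said, since these vocabs genuinely contain uppercase letters, no heuristic would flag them as lower-cased here.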

@stas00
Contributor Author

stas00 commented Nov 7, 2020

The PR that adds lower-case support is here: #8389. But for the converter to work with the recent fairseq, #8377 needs to be merged first - or you can apply just this on top of #8389:

diff --git a/src/transformers/convert_fsmt_original_pytorch_checkpoint_to_pytorch.py b/src/transformers/convert_fsmt_original_pytorch_checkpoint_to_pytorch.py
index 2cc42718..61ef9010 100755
--- a/src/transformers/convert_fsmt_original_pytorch_checkpoint_to_pytorch.py
+++ b/src/transformers/convert_fsmt_original_pytorch_checkpoint_to_pytorch.py
@@ -113,7 +113,7 @@ def convert_fsmt_checkpoint_to_pytorch(fsmt_checkpoint_path, pytorch_dump_folder
         fsmt_folder_path, checkpoint_file, data_name_or_path, archive_map=models, **kwargs
     )

-    args = dict(vars(chkpt["args"]))
+    args = vars(chkpt["args"]["model"])
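If it helps, both checkpoint layouts can be handled side by side; here is a hedged sketch (`extract_model_args` is a hypothetical helper, not part of the converter) accepting both the old flat Namespace and the newer nested layout shown in the diff:

```python
from argparse import Namespace

def extract_model_args(chkpt):
    """Return the model args dict from either fairseq checkpoint layout (sketch)."""
    args = chkpt["args"]
    if isinstance(args, Namespace):   # old layout: a flat Namespace
        return dict(vars(args))
    return dict(vars(args["model"]))  # newer layout: nested under "model"

old = {"args": Namespace(arch="transformer_iwslt_de_en")}
new = {"args": {"model": Namespace(arch="transformer_iwslt_de_en")}}
print(extract_model_args(old) == extract_model_args(new))  # True
```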

Or if you don't want to mess with these, we can wait until both are merged.

In either case, before uploading to s3 you will need to manually set "do_lower_case": true in the tokenizer_config.json of the converted model - since, as I mentioned in the comment above, there is no way of automatically detecting the need to preset "do_lower_case": true during conversion, as all the vocabs in your model have uppercase letters in them.
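For reference, the manual edit is a one-key addition to the converted model's tokenizer_config.json; the rest of the file stays as the converter generated it (the surrounding fields shown here are illustrative, not taken from the actual converted output):

```json
{
  "model_max_length": 1024,
  "do_lower_case": true
}
```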

@lighteternal

I think most of the recent ones are quite a bit bigger, but one that we ported that may warrant your attention is the distilled variation: https://github.com/jungokasai/deep-shallow/
(ported scripts https://github.com/huggingface/transformers/tree/master/scripts/fsmt - start with convert-allenai-,
the wmt model cards are at the end of the list here https://huggingface.co/allenai)

I will try this for sure, although I remember that the --arch transformer used in the script led to OOM in my machine.

Or if you don't want to mess with these, we can wait until both are merged.
In either case before uploading to s3 you will need to manually set "do_lower_case": true in tokenizer_config.json of the converted model - since as I mentioned in the comment above there is no way of automatically detecting the need to preset "do_lower_case": true during conversion as all vocabs in your model have uppercase letters in them.

Well I couldn't wait, so I tried following your steps and it works perfectly! :D Kudos once again!
Now, my only question is should I upload the EN2EL and EL2EN models or wait until the PR is merged? I guess that the transformers version that is currently loading all s3-uploaded models is not up-to-date yet, so it will actually miss the capital letters.

@stas00
Contributor Author

stas00 commented Nov 8, 2020

I will try this for sure, although I remember that the --arch transformer used in the script led to OOM in my machine.

I'm not sure whether fairseq has some doc that compares the different arch configs, but this piece of code seems to be pretty clear on the differences: https://github.com/pytorch/fairseq/blob/master/fairseq/models/transformer.py#L985

The default transformer uses pretty big layers - so it requires a lot of gpu memory.

Well I couldn't wait, so I tried following your steps and it works perfectly! :D Kudos once again!

Fantastic!

I suppose you're not concerned with the upcase letters in the dict/merge files of your pre-trained model. I'd have thought that the fairseq pre-processor would have lowercased all inputs. But if you think it's no problem, then all is good.

Now, my only question is should I upload the EN2EL and EL2EN models or wait until the PR is merged? I guess that the transformers version that is currently loading all s3-uploaded models is not up-to-date yet, so it will actually miss the capital letters.

You first have to wait until the lowercasing PR is merged - probably Monday or early next week. Then, AFAIK, the online version doesn't get updated automatically - the models' code doesn't change often - so we will have to ask for that to happen. Once you see the demo working on the site, it's in the clear to share with others.

@lighteternal

Thank you for the code showcasing the differences between models. I couldn't find a doc with that info.

I suppose you're not concerned with the upcase letters in the dict/merge files of your pre-trained model. I'd have thought that the fairseq pre-processor would have lowercased all inputs. But if you think it's no problem, then all is good.

Probably the perl command included in the fairseq preparation script didn't catch all cases; I can't think of another explanation. In any case I will be modifying this script to re-train without lower-casing and with a bigger number of BPE tokens, just to see if I get a more convenient model that doesn't need the lower-case argument (and hopefully without losing much BLEU).

You first have to wait until the lowercasing PR is merged - probably Monday or early next week. Then, AFAIK, the online version doesn't get updated automatically - the models' code doesn't change often - so we will have to ask for that to happen. Once you see the demo working on the site, it's in the clear to share with others.

OK so I'm waiting for the merge, then upload and probably come back in this thread to request an update on the online version if possible. Thanks for the help @stas00 !

@stas00
Contributor Author

stas00 commented Nov 9, 2020

FYI, the lower-casing PR has been merged, so please let me know whether you're waiting to re-train with mixed-casing or whether you want to upload the lower-case model and I will then ask to update the code on the models server.

@lighteternal

I already started the mixed-casing training and I was thinking I can upload all 4 of them (lower EN2EL, lower EL2EN, mixed EN2EL, mixed EL2EN) together. The mixed ones also have a bigger vocabulary (almost double) and BLEU scores are very similar to the older lower case ones.

@stas00
Contributor Author

stas00 commented Nov 9, 2020

Great!

I made the request to update the server - I will update when this is done.

It's good to know that there is not much difference with the bigger vocab. I'm curious how the scores would have differed if your original model were truly lower-case (since it isn't at the moment, if you check the vocab). (This is just for my learning, should you ever run this test.)

@lighteternal

Hi @stas00, just pinging to check if the code on the model hub is updated.
I also trained cased models, and I uploaded one to the hub already: https://huggingface.co/lighteternal/SSE-TUC-mt-en-el-cased

But it returns an error when used online from the link above:

Unrecognized configuration class for this kind of AutoModel: AutoModelForCausalLM. Model type should be one of CamembertConfig, XLMRobertaConfig, RobertaConfig, BertConfig, OpenAIGPTConfig, GPT2Config, TransfoXLConfig, XLNetConfig, XLMConfig, CTRLConfig, ReformerConfig, BertGenerationConfig, XLMProphetNetConfig, ProphetNetConfig.

I also noticed that the tag directly above the input box is incorrectly assigned as "text-generation".

@Narsil
Contributor

Narsil commented Nov 18, 2020

Hi, it seems your model card is defined in Markdown format, not in YAML: https://huggingface.co/lighteternal/SSE-TUC-mt-en-el-cased, leading to incorrect pipeline detection (hence the error you are seeing). Can you try setting the pipeline correctly?

https://huggingface.co/docs#how-are-model-tags-determined

Let us know if it works better

@lighteternal

lighteternal commented Nov 18, 2020

Thanks @Narsil I updated the model card, but it doesn't seem to have made any difference yet. Is it possible that it takes some time to change pipeline?

@lighteternal

It appears that editing the README from the browser doesn't work; after pulling, editing and pushing again, it worked! Many thanks! :)

@julien-c
Member

Thanks @Narsil I updated the model card, but it doesn't seem to have made any difference yet. Is it possible that it takes some time to change pipeline?

Maybe we forgot to hook a refresh on commits from the website, @Pierrci?

@Pierrci
Member

Pierrci commented Nov 18, 2020

Indeed, I pushed a fix that will be deployed soon, next time you edit a readme or another file on the website the changes will reflect instantly @lighteternal, thanks for reporting this!
