Update to the latest version (31 Oct) by ratthachat · Pull Request #1 · ratthachat/transformers

ratthachat · 2020-10-30T23:09:28Z

Just update to my forked repo

* Create README.md
* Update model_cards/ynie/roberta-large-snli_mnli_fever_anli_R1_R2_R3-nli/README.md
* Add Meta information for dataset identifier.

Co-authored-by: Julien Chaumond <chaumond@gmail.com>

* Add MLflow integration class
  Add integration code for MLflow in integrations.py, along with the code that checks that MLflow is installed.
* Add MLflowCallback import
  Add import of MLflowCallback in trainer.py.
* Handle model argument
  Allow the callback to handle the model argument and store model config items as hyperparameters.
* Log parameters to MLflow in batches
  MLflow cannot log more than a hundred parameters at once. Code added to split the parameters into batches of 100 items and log the batches one by one.
* Fix style
* Add docs on MLflow callback
* Fix issue with unfinished runs
  The "fluent" API used in the MLflow integration allows only one run to be active at any given moment. If the Trainer is disposed of and a new one is created while the training is not finished, MLflow refuses to log the results when the next trainer is created.
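The batching step described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the code from the pull request: the function name `log_params_in_batches` and the `log_fn` parameter are hypothetical, and the limit of 100 is taken from the commit message. A plain list stands in for the logger so the sketch runs without MLflow installed.

```python
from typing import Any, Callable, Dict, List

# Limit quoted in the commit message ("batches of 100 items").
MAX_PARAMS_PER_BATCH = 100


def log_params_in_batches(
    params: Dict[str, Any],
    log_fn: Callable[[Dict[str, Any]], Any],
    batch_size: int = MAX_PARAMS_PER_BATCH,
) -> int:
    """Split ``params`` into chunks of at most ``batch_size`` items and
    pass each chunk to ``log_fn``. Returns the number of batches sent.

    ``log_fn`` is a hypothetical stand-in for a real logging call.
    """
    items = list(params.items())
    n_batches = 0
    for start in range(0, len(items), batch_size):
        log_fn(dict(items[start:start + batch_size]))
        n_batches += 1
    return n_batches


# Usage with a stand-in logger that just records what it receives.
received: List[Dict[str, Any]] = []
hyperparams = {f"param_{i}": i for i in range(250)}
batches = log_params_in_batches(hyperparams, received.append)
print(batches, [len(b) for b in received])  # 3 [100, 100, 50]
```

With MLflow available, one would presumably pass `mlflow.log_params` as `log_fn`; that wiring is an assumption and is not shown in the source.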

* Fix minor typos
  Fix minor typos in the docs.
* Update docs/source/preprocessing.rst
  Clearer data structure description.

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

… pad_token (#8043)
* make sure padding is implemented for non-padding tokens models as well
* add better error message
* add better warning
* remove results files
* Update examples/seq2seq/seq2seq_trainer.py
* remove unnecessary copy line
* correct usage of labels
* delete test files

…8026)
* mc for new cross lingual sentence model
* fat text
* url spelling fix
* more url spelling fixes
* slight thanks change
* small improvements in text
* multilingual word xchange
* change colab link
* xval fold number
* add model links
* line break in model names
* Update README.md
* Update README.md
* new examples link
* new examples link
* add evaluation dataset name
* add more about multi lingual
* typo fix
* typo
* typos
* hyperparameter typos
* hyperparameter typo
* add metadata
* add metadata
* Update README.md
* typo fix
* Small improvement

* Important files
* Styling them all
* Revert "Styling them all"
  This reverts commit 7d02939.
* Styling them for realsies
* Fix syntax error
* Fix benchmark_utils
* More fixes
* Fix modeling auto and script
* Remove new line
* Fixes
* More fixes
* Fix more files
* Style
* Add FSMT
* More fixes
* More fixes
* More fixes
* More fixes
* Fixes
* More fixes
* More fixes
* Last fixes
* Make sphinx happy

* Add a template for example scripts and apply it to mlm
* Formatting
* Fix test
* Add plm script
* Styling

* ADD: add whole word mask proxy for both eng and chinese
* MOD: adjust format
* MOD: reformat code
* MOD: update import
* MOD: fix bug
* MOD: add import
* MOD: fix bug
* MOD: decouple code and update readme
* MOD: reformat code
* Update examples/language-modeling/README.md
* Update examples/language-modeling/README.md
* Update examples/language-modeling/run_language_modeling.py
* Update examples/language-modeling/run_language_modeling.py
* Update examples/language-modeling/run_language_modeling.py
* Update examples/language-modeling/run_language_modeling.py
* change wwm to whole_word_mask
* reformat code
* reformat
* format
* Code quality
* ADD: update chinese ref readme
* MOD: small changes
* MOD: small changes2
* update readme
* fix eval ref file miss bug
* format file
* MOD: move ref code to contrib
* MOD: add delimiter check
* reformat code
* reformat code
* Update examples/language-modeling/README.md

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
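For readers unfamiliar with the term, whole word masking means that all sub-word pieces of a word are masked together rather than independently. The sketch below illustrates the idea for WordPiece-style tokens, where continuation pieces start with `##`. It is an assumption-laden illustration and not the code added in this commit: the function names are hypothetical, words are selected with a simple per-word coin flip, and the Chinese reference-file handling mentioned in the commit is not covered.

```python
import random
from typing import List


def group_into_words(tokens: List[str]) -> List[List[int]]:
    """Group token indices so that '##' continuation pieces join the
    word they belong to."""
    words: List[List[int]] = []
    for i, tok in enumerate(tokens):
        if tok.startswith("##") and words:
            words[-1].append(i)
        else:
            words.append([i])
    return words


def whole_word_mask(
    tokens: List[str],
    mask_prob: float = 0.15,
    seed: int = 0,
    mask_token: str = "[MASK]",
) -> List[str]:
    """Mask whole words: if a word is selected, every piece of it is
    replaced by ``mask_token``."""
    rng = random.Random(seed)
    masked = list(tokens)
    for word in group_into_words(tokens):
        if rng.random() < mask_prob:
            for i in word:
                masked[i] = mask_token
    return masked


tokens = ["the", "trans", "##form", "##ers", "library"]
print(group_into_words(tokens))  # [[0], [1, 2, 3], [4]]
print(whole_word_mask(tokens, mask_prob=0.5, seed=1))
```

The property to notice is that `trans`, `##form` and `##ers` are always masked or kept as a unit, whatever the seed.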

* Test TF GPU CI
* Change cache
* Fix missing torch requirement
* Fix some model tests
  Style
* LXMERT
* MobileBERT
* Longformer skip test
* XLNet
* The rest of the tests
* RAG goes OOM in multi gpu setup
* YAML test files
* Last fixes
* Skip doctests
* Fill mask tests
* Yaml files
* Last test fix
* Style
* Update cache
* Change ONNX tests to slow + use tiny model

* Start plumbing
* Marian close
* Small stubs for all children
* Fixed bart
* marian working
* pegasus test is good, but failing
* Checkin tests
* More model files
* Subtle marian, pegasus integration test failures
* Works well
* rm print
* boom boom
* Still failing model2doc
* merge master
* Equivalence test failing, all others fixed
* cleanup
* Fix embed_scale
* Cleanup marian pipeline test
* Undo extra changes
* Smaller delta
* Cleanup model testers
* undo delta
* fix tests import structure
* cross test decorator
* Cleaner set_weights
* Respect authorized_unexpected_keys
* No warnings
* No warnings
* style
* Nest tf import
* black
* Apply suggestions from code review
* functional dropout
* fixup
* Fixup
* style_doc
* embs
* shape list
* delete slow force_token_id_to_be_generated func
* fixup

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Finish the cleanup of the language-modeling examples
* Update main README
* Apply suggestions from code review
* Apply suggestions from code review
* Propagate changes

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>

* Replace swish with silu
* revert nn.silu to nn.swish due to older version
* simplify optimized silu conditional and fix format
* Update activations.py
* Update activations_tf.py
* Update modeling_flax_utils.py
* Update modeling_openai.py
* add swish testcase
* add pytorch swish testcase
* Add more robust python version check
* more formatting fixes

Co-authored-by: TFUsers <TFUsers@gmail.com>
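As background for the rename: SiLU and swish (with its scaling parameter fixed to 1) are the same function, x · sigmoid(x). The following pure-Python sketch shows that equivalence. It is only an illustration of the general definition; it is not taken from activations.py, and the `beta` parameter is included here purely to show the relationship.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def silu(x: float) -> float:
    """SiLU: x * sigmoid(x)."""
    return x * sigmoid(x)


def swish(x: float, beta: float = 1.0) -> float:
    """Swish: x * sigmoid(beta * x); with beta = 1 this is SiLU."""
    return x * sigmoid(beta * x)


for value in (-2.0, 0.0, 1.0, 3.5):
    print(value, silu(value), swish(value))
```

Both columns print identical values for every input, which is why the two names can be treated as aliases.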

* Minor style improvements:
  1. Use `@nn.compact` rather than `@compact` (so as not to make it seem like compact is a standard Python decorator).
  2. Move attribute docstrings from two `__call__` methods to comments on the attributes themselves. (This was probably a remnant from the pre-Linen version where the attributes were arguments to `call`.)
* Use black on the Flax modeling code

easonnie and others added 30 commits October 24, 2020 03:16

[doc prepare_seq2seq_batch] fix docs (#8013)

38f6739

[Model Card] DJSammy/bert-base-danish-uncased_BotXO,ai (#8025)

5148f43

* Create README.md
* Update README.md

Fixup #8025

efc4a21

Close #8030

[model_cards] bert-base-danish Fixup

7087d9b

#8030

[tokenizers] Fixing #8001 - Adding tests on tokenizers serialization (#…

79eb391

…8006)
* fixing #8001
* make T5 tokenizer serialization more robust - style

Remove codecov.yml

829b9f8

Minor typo fixes to the tokenizer summary (#8045)

9aa2826

Minor typo fixes to the tokenizer summary

Add mixed precision evaluation (#8036)

c153bcc

* Add mixed precision evaluation
* use original flag

[docs] [testing] distributed training (#7993)

101186b

* distributed training
* fix
* fix formatting
* wording

fsmt slow test uses lists (#8031)

f20aec1

update version for scipy (#7998)

20a0894

Cleanup pytorch tests (#8033)

8bbe824

Fix label name in DataCollatorForNextSentencePrediction test (#8048)

0774786

Tiny TF Bart fixes (#8023)

8be9cb0

minor model card description updates (#8051)

b0a9076

Minor error fix of 'bart-large-cnn' details in the pretrained_models …

a9ac1db

…doc (#8053)

add multiclass field to default zero shot example

fbcddb8

Update README.md (#8050)

098ddc2

`--wwm` cannot be used as an argument to run_language_modeling.py and should be changed to `--whole_word_mask`.

Fix + Test (#8049)

cbad90d

fixing crash (#8057)

7ff7c49

[TF] from_pt should respect authorized_unexpected_keys (#8056)

bc9332b

Fix TF training arguments instantiation (#8063)

3a10764

Doc fixes in preparation for the docstyle PR (#8061)

04a17f8

* Fixes in preparation for doc styling
* More fixes
* Better syntax
* Fixes
* Style
* More fixes
* More fixes

fix doc bug (#8082)

985bba9

Signed-off-by: mymusise <mymusise1@gmail.com>

mrm8488 and others added 28 commits October 29, 2020 08:19

Create README.md (#8017)

52cea7d

Model Card for Gujarati-XLM-R-Base (#8038)

ba2ad3a

* Add model card for Gujarati-XLM-R-Base
* Update README.md
  Add the model card for the Gujarati-XLM-R-Base.
* Apply suggestions from code review

Co-authored-by: Julien Chaumond <chaumond@gmail.com>

Add two model_cards: ethanyt/guwenbert-base and ethanyt/guwenbert-lar…

b215090

…ge (#8041)

Create README.md (#8075)

5d76859

* Create README.md
* Update model_cards/gurkan08/bert-turkish-text-classification/README.md

Co-authored-by: Julien Chaumond <chaumond@gmail.com>

Create README.md (#8088)

234a6dc

* Create README.md
* metadata

Co-authored-by: Julien Chaumond <chaumond@gmail.com>

Create README.md (#8089)

cc8941d

Add model_cards (#7969)

e566adc

* add readme
* add readmes
* Add metadata

Update README.md (#8090)

2388760

Update widget examples. (#8149)

4731a00

Co-authored-by: yantan <yantan@effyic.com>

Fix doc errors and typos across the board (#8139)

969859d

* Fix doc errors and typos across the board
* Fix a typo
* Fix the CI
* Fix more typos
* Fix CI
* More fixes
* Fix CI
* More fixes
* More fixes

Document tokenizer_class in configurations (#8152)

b0f1c0e

Smarter prediction loop and no- -> no_ in console args (#8151)

acf5640

* Smarter prediction loop and no- -> no_ in console args
* Fix test

[s2s] distillBART docs for paper replication (#8150)

49e4fec

improve error checking (#8157)

c83cec4

Fix typo: indinces -> indices (#8159)

fdf893c

* Fix typo: indinces -> indices
* Fix some more
* Fix some more
* Fix some more
* Fix CI

[CI] Better reports #2 (#8163)

0538820

Fixing some warnings in DeBerta (#8176)

7e36dee

* Fixing some warnings in DeBerta
* Fixing docs with their rewritten version.

Fix typo: s/languaged/language/ (#8165)

6279072

Doc fixes and filter warning in wandb (#8189)

089cc10

Remove deprecated arguments from new run_clm (#8197)

9eb3a41

Fix two bugs with --logging_first_step (#8193)

8f1c960

* make sure that logging_first_step evaluates
* fix bug with incorrect loss on logging_first_step
* fix style
* logging_first_step only logs, not evals

ratthachat merged commit 28e589c into ratthachat:master Oct 30, 2020

ratthachat pushed a commit that referenced this pull request Dec 14, 2020

Fix typo huggingface#9012 (#1) (huggingface#9038)

91ab02a

There is a tiny typo in the code "transformers/examples/language-modeling/run_mlm_wwm.py" at line 284. [Details.](huggingface#9012)

ratthachat merged 92 commits into ratthachat:master from huggingface:master

20 participants
