
run_swag.py should use AdamW #951

Closed
jeff-da wants to merge 1 commit into huggingface:master from octouw:master

Conversation

@jeff-da

@jeff-da jeff-da commented Aug 2, 2019

run_swag.py doesn't run currently; BertAdam has been removed (per the README).

@codecov-io

Codecov Report

Merging #951 into master will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master     #951   +/-   ##
=======================================
  Coverage   79.04%   79.04%           
=======================================
  Files          34       34           
  Lines        6242     6242           
=======================================
  Hits         4934     4934           
  Misses       1308     1308

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 44dd941...a5e7d11. Read the comment docs.

1 similar comment
@codecov-io

codecov-io commented Aug 2, 2019


@@ -467,10 +467,13 @@ def main():
if (step + 1) % args.gradient_accumulation_steps == 0:
if args.fp16:
Member


We don't need this separate adjustment of the learning rate for fp16 anymore with the schedulers

warmup=args.warmup_proportion,
t_total=num_train_optimization_steps)
optimizer = AdamW(optimizer_grouped_parameters, lr=args.learning_rate)
scheduler = WarmupLinearSchedule(optimizer,
Member


The scheduler should also be created if we are in fp16.
The fp16 optimizer should now be created as in the run_glue example, where there is no distinction between fp16 and normal operation.
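For reference, the warmup-then-linear-decay policy that WarmupLinearSchedule applies can be sketched as a plain learning-rate multiplier function (a minimal sketch of the policy only, not the library's code; the step counts below are made-up examples):

```python
def warmup_linear_multiplier(step, warmup_steps, t_total):
    """LR multiplier: ramps 0 -> 1 over warmup_steps, then decays linearly to 0 at t_total."""
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    return max(0.0, (t_total - step) / max(1.0, t_total - warmup_steps))

print(warmup_linear_multiplier(50, 100, 1000))    # mid-warmup -> 0.5
print(warmup_linear_multiplier(100, 100, 1000))   # warmup done -> 1.0
print(warmup_linear_multiplier(1000, 100, 1000))  # end of training -> 0.0
```

Because the multiplier already encodes warmup and decay, the same schedule works for fp16 and fp32 runs alike, which is why no separate fp16 learning-rate adjustment is needed.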

@thomwolf
Member

thomwolf commented Aug 7, 2019

Added a few comments. If you take a look at the run_glue and run_squad examples, you'll see they are much simpler now in terms of optimizer setup. This example could take advantage of the same refactoring if you want to give it a look!
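For context, the run_glue-style setup referred to here can be sketched roughly as follows. This is a hedged illustration using plain PyTorch's `torch.optim.AdamW` and `LambdaLR` to stand in for the library's `AdamW` and `WarmupLinearSchedule`; the model, hyperparameters, and loop are illustrative stand-ins, not the example's actual code:

```python
import torch

# Hypothetical stand-ins for the script's model and schedule lengths.
model = torch.nn.Linear(4, 2)
learning_rate, warmup_steps, t_total = 5e-5, 10, 100

# run_glue-style setup: one optimizer + one scheduler, no fp16 special case.
no_decay = ["bias", "LayerNorm.weight"]
grouped_parameters = [
    {"params": [p for n, p in model.named_parameters()
                if not any(nd in n for nd in no_decay)],
     "weight_decay": 0.01},
    {"params": [p for n, p in model.named_parameters()
                if any(nd in n for nd in no_decay)],
     "weight_decay": 0.0},
]
optimizer = torch.optim.AdamW(grouped_parameters, lr=learning_rate)
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer,
    # Linear warmup to step warmup_steps, then linear decay to 0 at t_total.
    lambda step: step / warmup_steps if step < warmup_steps
    else max(0.0, (t_total - step) / (t_total - warmup_steps)),
)

for step in range(t_total):
    loss = model(torch.randn(3, 4)).pow(2).mean()
    loss.backward()
    optimizer.step()
    scheduler.step()  # advance the LR schedule once per optimization step
    optimizer.zero_grad()
```

The key point of the refactoring is that the warmup logic lives entirely in the scheduler, so the training loop is identical whether or not fp16 is enabled.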

@thomwolf
Member

Thanks for this @jeff-da, we'll close this PR in favor of #1004 for now.
Feel free to re-open if there are other things you would like to change.

@thomwolf thomwolf closed this Aug 30, 2019
