Add wmt translation example #3428
LysandreJik left a comment:
Cool, that looks clean. I guess we don't need all the usual arguments (data parallel, fp16, and others) since this is an evaluation script and not a training script?
It doesn't seem to run on GPU; does it take a long time to compute? Shouldn't we try to cast the model to a GPU if one is available?
Yeah, we should definitely try to run it on a GPU - will take a look at that :-)

Not sure whether we need fp16 and multi-GPU support. I think a single GPU is enough, and T5 + WMT does not take much memory. But happy to take a look into it if you guys think it's worth it :-) @thomwolf @LysandreJik @julien-c
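The GPU fallback discussed above can be sketched like this (a minimal sketch; a stand-in `nn.Linear` module is used in place of the actual T5 model so the pattern is visible without downloading weights, and the same `.to(device)` call applies to the real model and inputs):

```python
import torch

# Pick a GPU when one is available, otherwise fall back to CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Any nn.Module behaves the same way as the T5 model here:
# .to(device) moves the module's weights onto the chosen device.
model = torch.nn.Linear(4, 4).to(device)

# Inputs must live on the same device as the model before the forward pass.
batch = torch.randn(2, 4).to(device)
output = model(batch)
print(output.device.type)  # "cuda" on a GPU machine, "cpu" otherwise
```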
Code quality test fails because of the unpinned isort library (see #3449).
@@ -0,0 +1,51 @@
***This script evaluates the [T5 Model](https://arxiv.org/pdf/1910.10683.pdf) ``t5-base`` on the English to German WMT dataset. Please note that the results in the paper were attained using a ``t5-base`` model fine-tuned on translation, so results will be slightly worse here.***
Codecov Report

@@            Coverage Diff            @@
##           master    #3428   +/-   ##
=======================================
  Coverage   52.51%   52.51%
=======================================
  Files         100      100
  Lines       17051    17051
=======================================
  Hits         8954     8954
  Misses       8097     8097

Continue to review the full report at Codecov.
This PR adds a translation example for T5. It uses the sacrebleu BLEU scorer. I adapted the README.md a bit so that users are aware that the results in the official paper were attained with a fine-tuned T5. @craffel