[seq2seq] correctly handle mt5 #9879
sgugger
left a comment
LGTM as a quick fix. @patil-suraj when you port this to the new run_seq2seq, it would be great to try to find a way to make this not use any special code for a given model (but that's out of scope of this PR).
patil-suraj
left a comment
LGTM! Thanks Stas for fixing this!
I'm working on it in #9844, it's not finished though. We might need to add
If we need to add some methods to deal with the special cases, I would prefer it (otherwise the script might fail with new seq2seq models).
This PR fixes seq2seq/utils.py to handle mt5 like it does t5.
Ideally there should be a test, which would require creating a tiny model for mt5, but I'm being told this code is going away anyway, so there is no point investing energy into it.
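For illustration only, here is a minimal sketch of the kind of check this implies, assuming the script branches on the model's type string (for example the model_type value from the model config). The set name and the helper name are hypothetical and are not taken from the actual diff.

```python
# Hypothetical sketch: these names are illustrative and are NOT the real
# seq2seq/utils.py code. The assumption is that the script branches on a
# model-type string such as "t5", "mt5", or "bart".
T5_LIKE_MODEL_TYPES = {"t5", "mt5"}


def uses_t5_handling(model_type: str) -> bool:
    """Return True if this model type should follow the T5 code path."""
    return model_type in T5_LIKE_MODEL_TYPES
```

Checking membership in a set instead of comparing against the single string "t5" means another T5 variant only needs one new entry, although it is still model-specific code of the kind the review asks to avoid in the new run_seq2seq.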
Fixes: #9865
@patil-suraj, @sgugger