Fix GA loss bugs and add unit test #35121
ArthurZucker merged 9 commits into huggingface:main from techkang:main
Conversation
muellerzr
left a comment
Thanks! These solutions make sense, and I ran the tests myself. Left some nits to allow less leeway in the test closeness. cc @ArthurZucker
Let's be a bit more aggressive and do 0.15 (this passes). I still feel that's quite big but I can't figure out why (my tests showed 0.001 should be doable).
- self.assertLess(max(diff_truth), 0.3, f"Difference {max(diff_truth)} is not within 0.3")
+ self.assertLess(max(diff_truth), 0.15, f"Difference {max(diff_truth)} is not within 0.15")
It is strange: I tested on both Mac and Windows, and max(diff_truth) is 0.144 on both, so 0.15 may fail on some other machine.
Done! I used TinyStories to narrow the gap down to the same value as yours. The code is submitted.
- self.assertLess(max(diff_truth), 0.3, f"Difference {max(diff_truth)} is not within 0.3")
+ self.assertLess(max(diff_truth), 0.2, f"Difference {max(diff_truth)} is not within 0.2")
Similarly we can be aggressive here too
I managed to reduce the gap to 1e-4 by padding all input labels to the same length. However, this method did not work for the GPT-2 model. I will continue to explore other solutions.
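The padding approach mentioned above can be sketched as follows (a minimal illustration with a hypothetical helper, not the PR's actual code): padding every label sequence to a common length with `-100`, the ignore index for cross-entropy, keeps the number of valid items per batch fixed, so losses computed with and without gradient accumulation become directly comparable.

```python
def pad_labels(batches, pad_id=-100):
    """Pad each label list to the longest length using the CE ignore index.

    This is a hypothetical helper sketching the idea described in the
    comment above; it is not code from the PR itself.
    """
    max_len = max(len(labels) for labels in batches)
    # -100 tokens are ignored by cross-entropy, so padding does not
    # change the loss value, only equalizes sequence lengths.
    return [labels + [pad_id] * (max_len - len(labels)) for labels in batches]
```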
ArthurZucker
left a comment
Sorry @muellerzr but this does not solve:
FAILED examples/pytorch/test_pytorch_examples.py::ExamplesTests::test_run_speech_recognition_seq2seq - TypeError: Wav2Vec2Model.forward() got an unexpected keyword argument 'num_items_in_batch'
so I am not sure I understand. Related to #35113 and #35128.
We can't merge with the broken test.
@ArthurZucker The Wav2Vec2Model bug is because SpeechEncoderDecoderModel takes variable arguments as forward parameters, but it dispatches them to its encoder and decoder, which do not accept variable arguments. I think the better solution is to modify its decoder to accept variable arguments. I proposed a new commit and the test succeeds.
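The dispatch problem described here can be illustrated with a small sketch (hypothetical function names, not the actual transformers code): a wrapper whose `forward(**kwargs)` accepts anything must not blindly pass extras such as `num_items_in_batch` to a sub-module with a fixed signature. One generic option is to filter kwargs against that signature before dispatching.

```python
import inspect

def filter_kwargs(fn, kwargs):
    """Keep only the kwargs that fn's signature explicitly accepts.

    A generic sketch of how a wrapper model could avoid passing
    unexpected keyword arguments (e.g. num_items_in_batch) to a
    sub-module; the PR instead modifies the decoder itself.
    """
    accepted = inspect.signature(fn).parameters
    return {k: v for k, v in kwargs.items() if k in accepted}

# Stand-in for a decoder forward() with a fixed signature.
def decoder_forward(input_ids=None, labels=None):
    return {"input_ids": input_ids, "labels": labels}

safe = filter_kwargs(decoder_forward, {"input_ids": [1], "num_items_in_batch": 8})
# num_items_in_batch is dropped, so decoder_forward no longer raises TypeError.
```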
Finally, all unit tests passed. Please check again. @muellerzr @ArthurZucker
Thanks a lot @techkang !
I did not dive deep enough into the test, my bad 🤗
Merging ASAP and doing the patch
+ import datasets

- model_name = "distilgpt2"
+ model_name = "nickypro/tinyllama-110M"
Okay, would be nice if we had a safetensors model here but alright.
* fix GA bugs and add unit test
* narrow down model loss unit test diff gap
* format code to make ruff happy
* send num_items_in_batch argument to decoder
* fix GA loss bug in BertLMHeadModel
* use TinyStories-33M to narrow down diff gap
* fotmat code
* missing .config
* avoid add extra args

Co-authored-by: kangsheng <kangsheng@meituan.com>
What does this PR do?
There are two ways to fix GA loss bugs:

1. Pass `num_items_in_batch` to the loss function defined by the model. In this case, `model_accepts_loss_kwargs` is `True`.
2. Use a custom `compute_loss_func`.

However, the previous unit test only covered the second condition, so I introduced a new unit test to cover the first condition and fixed the bugs along the way.
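The reason `num_items_in_batch` matters can be shown with a minimal sketch (assumed names, not the Trainer's actual implementation): with gradient accumulation, averaging the loss inside each micro-batch and then averaging the averages weights tokens unevenly when micro-batches differ in size. Summing token losses and dividing once by the global token count fixes this.

```python
def ga_loss(micro_batch_token_losses, num_items_in_batch):
    """Sum per-token losses across micro-batches, divide by the global count.

    Hypothetical sketch of the correct gradient-accumulation loss:
    the division happens once, over num_items_in_batch, rather than
    per micro-batch.
    """
    total = sum(sum(losses) for losses in micro_batch_token_losses)
    return total / num_items_in_batch

# Two micro-batches of unequal size: 3 tokens and 1 token.
batches = [[1.0, 1.0, 1.0], [2.0]]
correct = ga_loss(batches, num_items_in_batch=4)        # 5.0 / 4 = 1.25
naive = sum(sum(b) / len(b) for b in batches) / 2       # mean of means = 1.5
```

The naive mean-of-means over-weights the short micro-batch, which is exactly the discrepancy the new unit test checks for.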
Who can review?
@muellerzr @ArthurZucker