Why does setting a different batch_size affect the test results?