Thanks for this great work.
However, one thing still confuses me: how is Z calculated in the NCE loss?
In the code, I found that Z is defined as '2876934.2 / 1281167 * self.data_len'.
I am wondering what 2876934.2 and 1281167 mean.
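
To make the question concrete, here is a minimal sketch of how I currently read that expression. The function name `scaled_z` and the standalone `data_len` argument are hypothetical and only for illustration; the two constants are copied from the code. My guess (an assumption, not something the code states) is that 1281167 is the number of ImageNet (ILSVRC-2012) training images, so the expression rescales a fixed Z linearly with the dataset size.

```python
def scaled_z(data_len: int) -> float:
    """Evaluate the Z expression from the code for a given dataset size.

    Hypothetical helper for illustration only; the constants are copied
    verbatim from the expression quoted above.
    """
    z_constant = 2876934.2    # meaning unknown to me; this is what I am asking about
    reference_len = 1281167   # assumption: ImageNet (ILSVRC-2012) training-set size
    # Z grows linearly with data_len; the per-sample factor is the ratio
    # of the two constants.
    return z_constant / reference_len * data_len


if __name__ == "__main__":
    # With data_len equal to the second constant, the first constant is recovered.
    print(scaled_z(1281167))
    # A smaller dataset gets a proportionally smaller Z.
    print(scaled_z(50000))
```

If this reading is right, the remaining question is where 2876934.2 itself comes from.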
Thanks for any helpful advice :)