It looks like the batchnorm doesn't take the masking into account: https://github.com/freewym/espresso/blob/6fca6cacd9d475d2676c527999e2d1bde08e7cbb/espresso/models/speech_tdnn.py#L170 Surely this isn't right? However, I don't know how to take it into account.
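For what it's worth, here is a minimal sketch of what I mean by taking the masking into account: compute the per-channel mean and variance over the real frames only, and zero out the padded ones. This is not espresso's code. The function name `masked_batch_norm`, the `[batch][time][channel]` nested-list layout, and the `lengths` argument are all assumptions made for illustration, and the learnable scale/shift and running statistics of a real batchnorm layer are left out. It uses plain Python lists just to show the arithmetic; I'd expect the same sums to carry over to tensor operations.

```python
import math


def masked_batch_norm(x, lengths, eps=1e-5):
    """Normalize x[b][t][c] per channel using only frames with t < lengths[b].

    Hypothetical helper for illustration; assumes lengths[b] <= len(x[b]).
    """
    num_channels = len(x[0][0])
    # Number of real (non-padded) frames across the whole batch.
    count = sum(lengths)
    mean = [0.0] * num_channels
    var = [0.0] * num_channels
    for c in range(num_channels):
        # Collect only the frames that are not padding.
        valid = [x[b][t][c] for b in range(len(x)) for t in range(lengths[b])]
        mean[c] = sum(valid) / count
        # Biased variance, i.e. divided by the number of valid frames.
        var[c] = sum((v - mean[c]) ** 2 for v in valid) / count
    out = [
        [
            [
                (frame[c] - mean[c]) / math.sqrt(var[c] + eps)
                if t < lengths[b]
                else 0.0  # keep padded frames at zero
                for c in range(num_channels)
            ]
            for t, frame in enumerate(seq)
        ]
        for b, seq in enumerate(x)
    ]
    return out, mean, var


# Two sequences, one channel; the second sequence has one padded frame (100.0).
x = [[[1.0], [3.0]], [[5.0], [100.0]]]
out, mean, var = masked_batch_norm(x, lengths=[2, 1])
```

In this toy batch the masked mean is 3.0, whereas averaging over all four frames, padding included, would give 27.25. That is the kind of skew I'm worried about when the padded positions are not excluded.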