Use config.layer_norm_eps in some nn.LayerNorm #20699
Conversation
| """Construct the overlapping patch embeddings.""" | ||
|
|
||
| def __init__(self, patch_size, stride, num_channels, hidden_size): | ||
| def __init__(self, config, patch_size, stride, num_channels, hidden_size): |
Need this new argument so we can use eps=config.layer_norm_eps. As this is an internal class, should be fine.
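A minimal sketch of what the change amounts to, assuming a hypothetical internal class named OverlapPatchEmbeddings (the real class name and layer layout may differ): the constructor now receives the model config so the LayerNorm epsilon comes from config.layer_norm_eps instead of nn.LayerNorm's default of 1e-5.

```python
from types import SimpleNamespace

import torch.nn as nn


class OverlapPatchEmbeddings(nn.Module):
    """Hypothetical sketch of the internal patch-embedding class after this PR:
    it takes `config` as an extra argument so eps can be read from the config."""

    def __init__(self, config, patch_size, stride, num_channels, hidden_size):
        super().__init__()
        # Overlapping patches: stride < patch_size, with padding to cover edges.
        self.proj = nn.Conv2d(
            num_channels,
            hidden_size,
            kernel_size=patch_size,
            stride=stride,
            padding=patch_size // 2,
        )
        # Before this PR: nn.LayerNorm(hidden_size)  -> eps defaults to 1e-5.
        # After this PR: eps comes from the model config.
        self.layer_norm = nn.LayerNorm(hidden_size, eps=config.layer_norm_eps)


# SimpleNamespace stands in for the real config object here.
config = SimpleNamespace(layer_norm_eps=1e-12)
embed = OverlapPatchEmbeddings(
    config, patch_size=7, stride=4, num_channels=3, hidden_size=64
)
```

Since the class is internal, threading `config` through its constructor does not change any public API.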
| """Construct the overlapping patch embeddings.""" | ||
|
|
||
| def __init__(self, patch_size, stride, num_channels, hidden_size): | ||
| def __init__(self, config, patch_size, stride, num_channels, hidden_size): |
There was a problem hiding this comment.
Need this new argument so we can use eps=config.layer_norm_eps. As this is an internal class, should be fine.
|
The documentation is not available anymore as the PR was closed or merged.
|
Just to confirm: it has an impact even for integration tests, since the change is in the constant (eps goes from 1e-5 to 1e-12).
I agree, but I am not sure, for recent models, whether all these attributes were set according to the papers, or whether people just added new models from templates ...
|
This is too breaking, I think. We need to be more careful that newly added models use this attribute consistently, but I don't think we should touch old models like this, as it will change the results of the forward pass.
|
OK! I will keep this list of models to skip in the WIP PR where we add a test that checks for unused config attributes.
|
Close as it is too breaking!
What does this PR do?

Similar to #20554, but this time, instead of removing the attribute from the config, we use config.layer_norm_eps in some nn.LayerNorm. This changes the eps of those LayerNorm layers from (the default) 1e-5 to 1e-12, and the outputs will have slight differences before/after this PR.
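To illustrate why the output changes, here is a small self-contained sketch of layer normalization (not the Transformers implementation, just the formula (x - mean) / sqrt(var + eps)) evaluated with both epsilon values:

```python
import math


def layer_norm(values, eps):
    """Normalize a list of floats: (x - mean) / sqrt(var + eps)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    return [(v - mean) / math.sqrt(var + eps) for v in values]


x = [0.1, 0.2, 0.3, 0.4]
out_default = layer_norm(x, eps=1e-5)   # nn.LayerNorm's default eps
out_config = layer_norm(x, eps=1e-12)   # eps taken from config.layer_norm_eps

# The outputs differ by a small but nonzero amount, which is why the change
# is breaking for models whose checkpoints were produced with the old eps.
diff = max(abs(a - b) for a, b in zip(out_default, out_config))
```

For this toy input the maximum difference is on the order of 1e-4, small in absolute terms but enough to shift integration-test expectations.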