A sharded BERT has been implemented in our shardformer branch. We need to verify that its numerical results, both the forward output and the gradients, match those of the non-sharded version.
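A minimal sketch of what such an alignment check could look like, assuming PyTorch. It does not use the shardformer API or BERT: a single `torch.nn.Linear` stands in for the model, and "sharding" is simulated in one process by splitting its weight along the output dimension. The helper `check_alignment` and the tolerances are hypothetical choices for illustration, not taken from the branch.

```python
import torch
import torch.nn.functional as F


def check_alignment(reference, candidate, rtol=1e-5, atol=1e-5):
    """Hypothetical helper: raise AssertionError if the tensors differ beyond tolerance."""
    torch.testing.assert_close(candidate.detach(), reference.detach(), rtol=rtol, atol=atol)


torch.manual_seed(0)
x = torch.randn(4, 8)

# Non-sharded reference layer (stand-in for the original model).
full = torch.nn.Linear(8, 6)

# Simulated shards: each one holds half of the output features.
w0, w1 = (w.detach().clone().requires_grad_(True) for w in full.weight.chunk(2, dim=0))
b0, b1 = (b.detach().clone().requires_grad_(True) for b in full.bias.chunk(2, dim=0))

# Forward and backward on the reference.
ref_out = full(x)
ref_out.sum().backward()

# Forward and backward on the shards, gathering outputs along the feature dim.
shard_out = torch.cat([F.linear(x, w0, b0), F.linear(x, w1, b1)], dim=-1)
shard_out.sum().backward()

# Compare the output, then the gradients gathered back to the full shape.
check_alignment(ref_out, shard_out)
check_alignment(full.weight.grad, torch.cat([w0.grad, w1.grad], dim=0))
check_alignment(full.bias.grad, torch.cat([b0.grad, b1.grad], dim=0))
```

For the real test, the same pattern would apply with the non-sharded BERT as the reference and the sharded one as the candidate: identical inputs and initial weights, then a comparison of outputs and of each parameter's gradient after gathering the shards. Tolerances may need to be looser for half precision.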