[shardformer] align BERT value #3906

@FoolPlayer

Description

A sharded BERT has been implemented in our shardformer branch. We need to verify that its numerical values, both the forward outputs and the gradients, match those of the non-sharded version.
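A minimal sketch of what such an alignment check could look like, using a toy linear layer in pure Python rather than the actual sharded BERT. All names here (`sharded_forward_backward`, `assert_aligned`, the tolerances) are hypothetical and not taken from the shardformer code. The weight is split by output rows across two simulated shards, and both the gathered output and the gathered weight gradient are compared against the non-sharded reference.

```python
import math
import random


def matvec(weight, x):
    """Dense forward pass: y[i] = sum_j weight[i][j] * x[j]."""
    return [sum(w * v for w, v in zip(row, x)) for row in weight]


def grad_weight(grad_out, x):
    """Gradient of y = W x with respect to W, given dL/dy (outer product)."""
    return [[g * v for v in x] for g in grad_out]


def sharded_forward_backward(weight, x, grad_out, num_shards):
    """Hypothetical row-wise sharding: each shard owns a slice of output rows."""
    rows_per_shard = len(weight) // num_shards
    out, grad = [], []
    for rank in range(num_shards):
        lo, hi = rank * rows_per_shard, (rank + 1) * rows_per_shard
        out.extend(matvec(weight[lo:hi], x))                # gather outputs
        grad.extend(grad_weight(grad_out[lo:hi], x))        # gather gradients
    return out, grad


def flatten(values):
    """Yield the scalars of an arbitrarily nested list."""
    for v in values:
        if isinstance(v, list):
            yield from flatten(v)
        else:
            yield v


def assert_aligned(actual, expected, rel_tol=1e-5, abs_tol=1e-8):
    """Raise AssertionError on the first mismatch; return the max abs difference."""
    a, e = list(flatten(actual)), list(flatten(expected))
    if len(a) != len(e):
        raise AssertionError(f"size mismatch: {len(a)} vs {len(e)}")
    max_diff = 0.0
    for i, (p, q) in enumerate(zip(a, e)):
        if not math.isclose(p, q, rel_tol=rel_tol, abs_tol=abs_tol):
            raise AssertionError(f"element {i}: {p} vs {q}")
        max_diff = max(max_diff, abs(p - q))
    return max_diff


random.seed(0)
weight = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(4)]
x = [random.uniform(-1, 1) for _ in range(3)]
grad_out = [1.0] * 4  # assumes loss = sum(y), so dL/dy is all ones

ref_out = matvec(weight, x)
ref_grad = grad_weight(grad_out, x)
shard_out, shard_grad = sharded_forward_backward(weight, x, grad_out, num_shards=2)

out_diff = assert_aligned(shard_out, ref_out)
grad_diff = assert_aligned(shard_grad, ref_grad)
print("max output diff:", out_diff)
print("max gradient diff:", grad_diff)
```

In a real PyTorch test the same comparison would more likely be done on tensors, for example with `torch.testing.assert_close`, with tolerances chosen to suit the dtype in use.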

Metadata

Projects

Status

✅ Done
