[Deepspeed zero3] lazy weights init  #12272

@stas00

I'm pretty sure we need to follow up on the lazy weights init feature (#11471).
Under ZeRO-3 the parameters are partitioned across ranks, so the weight init should be wrapped in deepspeed.zero.GatheredParameters here (or inside _init_weights):

https://github.com/huggingface/transformers/pull/11471/files#diff-6b72b98c4c2dcfc6cc606843917733f5d858374fbc22a735ff483bbc0c1e63eaR1275-R1276

This also needs a test.
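
To make the proposal concrete, here is a minimal sketch of what gathering the parameters around the init could look like. It is not the code from #11471: `gathered_params_context` and `init_missing_weights` are hypothetical helpers, the `zero3_enabled` flag stands in for whatever check the library uses to detect ZeRO-3, and `modifier_rank=0` is an assumption (rank 0 modifies the gathered weights and the result is propagated when the context exits).

```python
import contextlib


def gathered_params_context(params, zero3_enabled, modifier_rank=0):
    """Return a context in which `params` are fully materialized.

    Hypothetical helper: without ZeRO-3 this is a no-op context.
    """
    if not zero3_enabled:
        return contextlib.nullcontext()
    # Imported lazily so that DeepSpeed is only required when ZeRO-3 is in use.
    import deepspeed

    return deepspeed.zero.GatheredParameters(params, modifier_rank=modifier_rank)


def init_missing_weights(model, missing_modules, zero3_enabled):
    """Run model._init_weights on each module that was not loaded from the checkpoint.

    Hypothetical helper: `missing_modules` is assumed to be the list of
    submodules whose weights were absent from the state dict.
    """
    for module in missing_modules:
        # Gather only this module's own parameters, not those of its children.
        params = list(module.parameters(recurse=False))
        with gathered_params_context(params, zero3_enabled):
            model._init_weights(module)
```

Gathering per module keeps peak memory lower than gathering the whole model at once, at the cost of more communication steps; gathering inside `_init_weights` itself would be the alternative mentioned above.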

Labels: DeepSpeed, WIP