[booster] Implement CheckpointIO for Native PyTorch #3053

@FrankLeeeee

Overview

CheckpointIO handles the Booster.save and Booster.load logic for saving, resuming, and loading models. Note that CheckpointIO is often paired with a Plugin, since a Plugin may require a specific saving/loading strategy. However, we should provide general implementations for native PyTorch models and for DTensor-based models. As DTensor is still under development, we should focus on the native PyTorch implementation first.

Wanna track the development progress? Take a look at

proposal: #3046
project kanban: API Refactoring

Goal

The CheckpointIO should allow the user to save and load a native PyTorch model, optimizer, and LR scheduler.
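
To make the goal concrete, here is a minimal sketch of what a native PyTorch checkpoint round trip could look like. The class name `NativeCheckpointIOSketch`, its `save`/`load` methods, and the `roundtrip_demo` helper are hypothetical and only for illustration; they are not the proposed Booster API. The sketch relies solely on the standard `state_dict()` / `load_state_dict()` protocol together with `torch.save` and `torch.load`.

```python
import os
import tempfile

import torch
from torch import nn
from torch.optim import SGD
from torch.optim.lr_scheduler import StepLR


class NativeCheckpointIOSketch:
    """Hypothetical sketch, not the actual Booster API.

    Works with any object exposing state_dict() / load_state_dict(),
    which covers nn.Module, Optimizer, and the built-in LR schedulers.
    """

    def save(self, obj, path: str) -> None:
        # Persist only the state dict, not the whole Python object.
        torch.save(obj.state_dict(), path)

    def load(self, obj, path: str) -> None:
        # Load onto CPU first; load_state_dict copies into existing tensors.
        state = torch.load(path, map_location="cpu")
        obj.load_state_dict(state)


def roundtrip_demo():
    """Save model/optimizer/scheduler, restore into fresh objects, compare."""
    torch.manual_seed(0)
    model = nn.Linear(4, 2)
    optimizer = SGD(model.parameters(), lr=0.1, momentum=0.9)
    scheduler = StepLR(optimizer, step_size=1, gamma=0.5)

    # Take one training step so optimizer and scheduler hold non-trivial state.
    model(torch.randn(3, 4)).sum().backward()
    optimizer.step()
    scheduler.step()

    io = NativeCheckpointIOSketch()
    new_model = nn.Linear(4, 2)
    new_optimizer = SGD(new_model.parameters(), lr=0.1, momentum=0.9)
    new_scheduler = StepLR(new_optimizer, step_size=1, gamma=0.5)

    with tempfile.TemporaryDirectory() as ckpt_dir:
        targets = [
            ("model.pt", model, new_model),
            ("optimizer.pt", optimizer, new_optimizer),
            ("scheduler.pt", scheduler, new_scheduler),
        ]
        for name, src, dst in targets:
            path = os.path.join(ckpt_dir, name)
            io.save(src, path)
            io.load(dst, path)

    weights_ok = all(
        torch.equal(a, b)
        for a, b in zip(model.state_dict().values(), new_model.state_dict().values())
    )
    lr_ok = new_optimizer.param_groups[0]["lr"] == optimizer.param_groups[0]["lr"]
    epoch_ok = new_scheduler.last_epoch == scheduler.last_epoch
    return weights_ok, lr_ok, epoch_ok


if __name__ == "__main__":
    print(roundtrip_demo())
```

Keeping the three objects in separate files is just one possible layout; a Plugin with its own strategy could bundle or shard them differently behind the same interface.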

Metadata

Labels

API (related to API changes), enhancement (New feature or request)

Projects

Status

✅ Done
