[api] implemented the checkpoint io module #3205

Merged
ver217 merged 3 commits into hpcaitech:main from FrankLeeeee:feature/checkpoint
Mar 23, 2023

@FrankLeeeee
Contributor

📌 Checklist before creating the PR

  • I have created an issue for this PR for traceability
  • The title follows the standard format: [doc/gemini/tensor/...]: A concise description
  • I have added relevant tags if possible for us to better distinguish different PRs

🚨 Issue number

Link this PR to your issue with words like fixed to automatically close the linked issue upon merge

e.g. fixed #1234, closed #1234, resolved #1234

Fixed #3053

📝 What does this PR do?

Summarize your work here.
if you have any plots/diagrams/screenshots/tables, please attach them here.

This PR adds the checkpoint IO module, which abstracts the loading and saving logic for training and inference. Both unsharded and sharded checkpoints are supported. A test is provided for saving and loading unsharded checkpoints.
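
For readers who have not opened the diff, here is a rough sketch of what such an abstraction could look like. It is not the code in this PR: the class names `CheckpointIOSketch` and `PickleCheckpointIO`, the `keys_per_shard` parameter, the `index.json` layout, and the "directory means sharded" rule are all assumptions made for illustration, and plain dicts pickled with the standard library stand in for real model state.

```python
import json
import pickle
from abc import ABC, abstractmethod
from pathlib import Path


class CheckpointIOSketch(ABC):
    """Hypothetical interface; names are illustrative, not the PR's API."""

    def save(self, state, path, shard=False, keys_per_shard=2):
        # One entry point; the flag picks the sharded or unsharded writer.
        if shard:
            self.save_sharded(state, Path(path), keys_per_shard)
        else:
            self.save_unsharded(state, Path(path))

    def load(self, path):
        # Assumption: a directory means sharded, a single file means unsharded.
        path = Path(path)
        return self.load_sharded(path) if path.is_dir() else self.load_unsharded(path)

    @abstractmethod
    def save_unsharded(self, state, path): ...

    @abstractmethod
    def load_unsharded(self, path): ...

    @abstractmethod
    def save_sharded(self, state, path, keys_per_shard): ...

    @abstractmethod
    def load_sharded(self, path): ...


class PickleCheckpointIO(CheckpointIOSketch):
    """Toy backend that pickles plain dicts with string keys."""

    def save_unsharded(self, state, path):
        with open(path, "wb") as f:
            pickle.dump(state, f)

    def load_unsharded(self, path):
        with open(path, "rb") as f:
            return pickle.load(f)

    def save_sharded(self, state, path, keys_per_shard):
        path.mkdir(parents=True, exist_ok=True)
        keys = list(state)
        index = {}  # maps each key to the shard file that holds it
        for start in range(0, len(keys), keys_per_shard):
            name = f"shard_{start // keys_per_shard:05d}.bin"
            chunk = {k: state[k] for k in keys[start:start + keys_per_shard]}
            self.save_unsharded(chunk, path / name)
            index.update({k: name for k in chunk})
        (path / "index.json").write_text(json.dumps(index))

    def load_sharded(self, path):
        index = json.loads((path / "index.json").read_text())
        state = {}
        # Load every shard listed in the index once and merge the results.
        for name in sorted(set(index.values())):
            state.update(self.load_unsharded(path / name))
        return state
```

With this shape, callers only touch `save` and `load`; a different backend would override the four abstract methods and leave the routing untouched.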

💥 Checklist before requesting a review

  • I have linked my PR to an issue (instruction)
  • My issue clearly describes the problem/feature/proposal, with diagrams/charts/table/code if possible
  • I have performed a self-review of my code
  • I have added thorough tests.
  • I have added docstrings for all the functions/methods I implemented

⭐️ Do you enjoy contributing to Colossal-AI?

  • 🌝 Yes, I do.
  • 🌚 No, I don't.

Tell us more if you don't enjoy contributing to Colossal-AI.

@FrankLeeeee added the labels enhancement (New feature or request), API (related to API changes), and Run Build and Test on Mar 22, 2023
Comment thread colossalai/checkpoint_io/checkpoint_io_base.py
@github-actions
Contributor

The code coverage for the changed files is 63%.

Click me to view the complete report
Name                                                     Stmts   Miss  Cover
----------------------------------------------------------------------------
colossalai/checkpoint_io/__init__.py                         3      0   100%
colossalai/checkpoint_io/checkpoint_io_base.py              95     54    43%
colossalai/checkpoint_io/general_checkpoint_io.py           35      8    77%
tests/test_checkpoint_io/test_general_checkpoint_io.py      36      1    97%
----------------------------------------------------------------------------
TOTAL                                                      169     63    63%
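
As a quick sanity check on the report above, the Cover column can be recomputed from the Stmts and Miss columns. The snippet below is only a sketch: it assumes coverage is (Stmts - Miss) / Stmts rounded to the nearest whole percent, which may not match the reporting tool's exact rounding rules in edge cases.

```python
# (statements, missed) per file, copied from the report above.
report = {
    "colossalai/checkpoint_io/__init__.py": (3, 0),
    "colossalai/checkpoint_io/checkpoint_io_base.py": (95, 54),
    "colossalai/checkpoint_io/general_checkpoint_io.py": (35, 8),
    "tests/test_checkpoint_io/test_general_checkpoint_io.py": (36, 1),
}


def cover_percent(stmts, miss):
    # Assumption: plain rounding to the nearest whole percent.
    return round(100 * (stmts - miss) / stmts)


per_file = {name: cover_percent(s, m) for name, (s, m) in report.items()}
total_stmts = sum(s for s, _ in report.values())
total_miss = sum(m for _, m in report.values())
total = cover_percent(total_stmts, total_miss)
```

Under that assumption the per-file and TOTAL figures agree with the table, and most of the uncovered statements sit in `checkpoint_io_base.py`.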

@ver217 ver217 merged commit cd142fb into hpcaitech:main Mar 23, 2023
Labels

API (related to API changes), enhancement (New feature or request)

Development

Successfully merging this pull request may close these issues.

[booster] Implement CheckpointIO for Native PyTorch

2 participants