PyTorch provides a new distributed checkpointing implementation, Distributed Checkpoint (DCP, torch.distributed.checkpoint):
https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html
This implementation supports asynchronous checkpointing and other operations with relatively little effort, and it exposes two overridable storage interfaces, StorageWriter (the write side) and StorageReader (the load side). In principle, veTurboIO can therefore be integrated into this path by implementing these two interfaces.