Make init function atomic #42

@BenjaminBossan

I just had a look at init and it's not atomic:

def init(*, model: Union[str, Path], requirements: List[str], dst: Union[str, Path]):
    """Initialize a scikit-learn based HuggingFace repo.

    Given a model pickle and a set of required packages, this function
    initializes a folder to be a valid HuggingFace scikit-learn based repo.

    Parameters
    ----------
    model: str, or Path
        The path to a model pickle file.

    requirements: list of str
        A list of required packages. The versions are then extracted from the
        current environment.

    dst: str, or Path
        The path to a non-existing or empty folder which is to be initialized.

    Returns
    -------
    None
    """
    dst = Path(dst)
    if dst.exists() and next(dst.iterdir(), None):
        raise OSError("None-empty dst path already exists!")
    dst.mkdir(parents=True, exist_ok=True)
    shutil.copy2(src=model, dst=dst)
    model_name = Path(model).name
    _create_config(model_path=model_name, requirements=requirements, dst=dst)

For example, if the last step (_create_config) fails, the model file has already been copied into dst. When the user fixes the problem and calls init again, the call now fails at an earlier step, the emptiness check, because the directory is no longer empty. A possible fix would be to create a temp directory, put everything there, and move that directory to the destination only as the last step. @adrinjalali @merveenoyan WDYT?
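To make the temp-directory idea concrete, here is a minimal sketch of how it could look. It is only an illustration, not a proposed final implementation: `init_atomic` and the `create_config` parameter are hypothetical names, with `create_config` standing in for the private `_create_config` helper so the snippet runs on its own. It also assumes that staging next to `dst` is acceptable, so that the final move is a rename on the same filesystem.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable, List, Union


def init_atomic(
    *,
    model: Union[str, Path],
    requirements: List[str],
    dst: Union[str, Path],
    create_config: Callable[..., None],  # hypothetical stand-in for _create_config
) -> None:
    """Build the repo in a staging folder and move it to dst only on success."""
    dst = Path(dst)
    if dst.exists() and next(dst.iterdir(), None):
        raise OSError("Non-empty dst path already exists!")
    dst.parent.mkdir(parents=True, exist_ok=True)

    # Stage inside dst's parent so the final rename stays on one filesystem.
    with tempfile.TemporaryDirectory(dir=dst.parent) as tmp:
        staging = Path(tmp) / "staging"
        staging.mkdir()
        shutil.copy2(src=model, dst=staging)
        create_config(
            model_path=Path(model).name, requirements=requirements, dst=staging
        )
        # Only reached if every step above succeeded.
        if dst.exists():
            dst.rmdir()  # dst was checked to be empty; remove it so rename works everywhere
        staging.rename(dst)
    # On failure, the context manager deletes the staging folder and dst is untouched.
```

With this shape, a failing config step leaves `dst` exactly as it was, so a second call after fixing the problem passes the emptiness check. Two caveats: there is still a small window between `rmdir` and `rename` in which another process could create `dst`, and missing parent folders of `dst` are created even when a later step fails.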
