Reloading of corrupted cached files for PersistentDataset #5723

@Phinnik

Description

I use PersistentDataset in my training loop. I had to abort the training script, and when I ran it again, one of the cached files was corrupted because of the abort: PersistentDataset had not finished saving that cache file.
It threw an exception: RuntimeError: Invalid magic number; corrupt file?

I would like to add a way to handle corrupted cache files. If a cached file is corrupted, PersistentDataset would continue as if there were no cached file and replace it with a valid one.

I see that if I add an "except" block here, I can get this behaviour.
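To make the idea concrete, here is a minimal sketch of the pattern. It is not MONAI's actual code: `load_or_rebuild` and `compute` are hypothetical names, and the standard-library `pickle` stands in for the real cache loader so the example runs without extra dependencies. `RuntimeError` is included in the caught exceptions only because that is what the traceback above reports.

```python
import os
import pickle
import tempfile
from pathlib import Path


def load_or_rebuild(cache_file: Path, compute):
    """Return the cached value, rebuilding it if the file is missing or unreadable.

    Hypothetical helper for illustration; `compute` is any zero-argument
    callable that produces the value to cache.
    """
    if cache_file.is_file():
        try:
            with open(cache_file, "rb") as f:
                return pickle.load(f)
        except (pickle.UnpicklingError, EOFError, RuntimeError):
            # Truncated or corrupted cache file: behave as if it did not
            # exist and fall through to recompute it.
            pass

    value = compute()

    # Write to a temporary file in the same directory, then rename it into
    # place, so an aborted run cannot leave a half-written cache file.
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=cache_file.parent)
    try:
        with os.fdopen(fd, "wb") as f:
            pickle.dump(value, f)
        os.replace(tmp_name, cache_file)
    except BaseException:
        # Clean up the temporary file if writing or renaming failed.
        if os.path.exists(tmp_name):
            os.remove(tmp_name)
        raise
    return value
```

The temporary-file-plus-rename step is an extra suggestion beyond the "except" block: it would reduce the chance of a corrupted file being produced in the first place, while the "except" block recovers from files that are already corrupted.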

May I take this ticket and make a PR?
