tests: mark test_data_cloud::test_pull_git_imports as flaky #3790
Also kind of interesting that this hasn't shown up before in other tests where we are deleting the cache dir and files between dvc command runs.
Wow, great findings @pmrowla. Were you able to reproduce it locally?
> Also kind of interesting that this hasn't shown up before in other tests where we are deleting the cache dir and files between dvc command runs
AFAIK, we also check for mtime and size before returning checksum. Maybe the "mtime" also matched here?
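To make the mtime/size question concrete, here is a minimal sketch of an inode-keyed lookup that only trusts a stored checksum when mtime and size still match. This is not DVC's actual state code: `StateDB`, `save_state`, and `lookup_checksum` are hypothetical names, and the in-memory dict stands in for the real state db.

```python
import os
import tempfile
from typing import Dict, Optional, Tuple

# Hypothetical stand-in for a state db: inode -> (mtime_ns, size, checksum).
StateDB = Dict[int, Tuple[int, int, str]]


def save_state(state: StateDB, path: str, checksum: str) -> None:
    """Record the checksum for a path, keyed by its inode."""
    st = os.stat(path)
    state[st.st_ino] = (st.st_mtime_ns, st.st_size, checksum)


def lookup_checksum(state: StateDB, path: str) -> Optional[str]:
    """Return the stored checksum only if inode, mtime and size all match."""
    st = os.stat(path)
    entry = state.get(st.st_ino)
    if entry is None:
        return None
    mtime_ns, size, checksum = entry
    if (mtime_ns, size) != (st.st_mtime_ns, st.st_size):
        # Same inode but different metadata: treat the entry as stale.
        return None
    return checksum
```

With a check like this, a reused inode would only produce a false hit if the new file also happened to have the same mtime and size as the deleted one.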
I was not able to reproduce it locally, and I'm not even sure what the best way to go about reproducing it would be. Maybe something like setting up a VM and manually creating a disk partition with a really low max inode count, to force reuse of inodes on file deletion?
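Short of a dedicated VM, one rough way to see whether a given filesystem reuses inodes right after deletion is to probe it directly. The sketch below is only an illustration (`probe_inode_reuse` is a hypothetical helper, not part of the test suite), and the result depends entirely on the filesystem and on what else is creating files at the time.

```python
import os
import tempfile


def probe_inode_reuse(directory: str, attempts: int = 100) -> int:
    """Count how often a new file gets the inode of a just-deleted file."""
    reused = 0
    for i in range(attempts):
        old_path = os.path.join(directory, f"old_{i}")
        with open(old_path, "w") as f:
            f.write("old")
        old_inode = os.stat(old_path).st_ino
        os.remove(old_path)

        # Create a different file right after the deletion and compare inodes.
        new_path = os.path.join(directory, f"new_{i}")
        with open(new_path, "w") as f:
            f.write("new")
        if os.stat(new_path).st_ino == old_inode:
            reused += 1
        os.remove(new_path)
    return reused
```

Running it against a `tempfile.TemporaryDirectory()` on the CI filesystem would give a rough idea of how likely the collision is there.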
@pmrowla Great research! We used to have similar problems with inodes back when we had
Fixes #3570.
❗ I have followed the Contributing to DVC checklist.
📖 If this PR requires documentation updates, I have created a separate PR (or issue, at least) in dvc.org and linked it here. If the CLI API is changed, I have updated tab completion scripts.
❌ I will check DeepSource, CodeClimate, and other sanity checks below. (We consider them recommendatory and don't expect everything to be addressed. Please fix things that actually improve code or fix bugs.)
Thank you for the contribution - we'll try to review it as soon as possible. 🙏
As far as I can tell, there is nothing wrong with the test as written. I think the issue is related to inodes sometimes being reused when we delete files in between `pull` calls.
From the logs, inode `2307858` is reused after the file deletion calls. Note that the test failure is due to `foo` not being re-fetched during the second `pull` call after the deletions.
For the second `pull` call, what we do is:
1. … `dir` into `new_dir`
2. … `new_dir/bar`
3. … `foo`
4. … `foo`, which coincidentally matches the inode for the deleted `new_dir/bar`
5. … `new_dir/bar` (inode `2307858`) still exists
6. … `new_dir/bar` (but thinks it's the checksum for new `foo`)
7. … `new_dir/bar` is already in cache (since it was fetched in step 1)
8. … `2307858` are still correct, and eventually fail to fetch/checkout `foo`
So I think this may actually be a minor state bug, although I'm not sure how serious it is, given that manually deleting your `.dvc/cache` directory (but not the state db) in this way is not exactly a typical use case.
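The failure mode described above can be illustrated with a toy model. This is a sketch under simplifying assumptions, not DVC's code: `FakeFS`, `fetch`, the `foo.tmp` path, and the `md5-*` strings are all hypothetical, the fake filesystem always reuses a freed inode, and the state is keyed on inode alone (no mtime/size check).

```python
from typing import Dict, List, Set


class FakeFS:
    """Toy filesystem that hands a freed inode to the next created file."""

    def __init__(self) -> None:
        self.inode_of: Dict[str, int] = {}
        self._free: List[int] = []
        self._next = 1

    def create(self, path: str) -> int:
        if self._free:
            inode = self._free.pop()  # reuse the inode of a deleted file
        else:
            inode = self._next
            self._next += 1
        self.inode_of[path] = inode
        return inode

    def delete(self, path: str) -> None:
        self._free.append(self.inode_of.pop(path))


def fetch(fs: FakeFS, state: Dict[int, str], cache: Set[str],
          tmp_path: str, real_checksum: str) -> bool:
    """Return True if the file was actually downloaded into the cache."""
    inode = fs.create(tmp_path)
    # A leftover state entry for this inode wins over the real checksum.
    checksum = state.get(inode, real_checksum)
    if checksum in cache:
        return False  # believed to be cached already, so the download is skipped
    state[inode] = checksum
    cache.add(checksum)
    return True


fs, state, cache = FakeFS(), {}, set()
first = fetch(fs, state, cache, "new_dir/bar", "md5-bar")  # True
fs.delete("new_dir/bar")  # file removed, state entry left behind
second = fetch(fs, state, cache, "foo.tmp", "md5-foo")  # False: foo is never fetched
```

In this model, dropping the stale state entry when the file is deleted (or validating it against more than the inode) is enough for the second `fetch` to download `foo`.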