add: do not verify hardlink if file is empty #3428

Merged
efiop merged 3 commits into treeverse:master from skshetry:fix-3390
Mar 11, 2020
Collaborator

@skshetry skshetry commented Mar 2, 2020

Fixes #3390

  • ❗ Have you followed the guidelines in the Contributing to DVC list?

  • 📖 Check this box if this PR does not require documentation updates, or if it does and you have created a separate PR in dvc.org with such updates (or at least opened an issue about it in that repo). Please link below to your PR (or issue) in the dvc.org repo.

  • ❌ Have you checked DeepSource, CodeClimate, and other sanity checks below? We consider their findings recommendatory and don't expect everything to be addressed. Please review them carefully and fix those that actually improve code or fix bugs.

Thank you for the contribution - we'll try to review it as soon as possible. 🙏

@skshetry skshetry self-assigned this Mar 2, 2020
Comment thread dvc/remote/local.py Outdated

codecov Bot commented Mar 2, 2020

Codecov Report

Merging #3428 into master will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master    #3428   +/-   ##
=======================================
  Coverage   93.08%   93.08%           
=======================================
  Files         140      140           
  Lines        8515     8515           
=======================================
  Hits         7926     7926           
  Misses        589      589           

Comment thread dvc/remote/local.py Outdated
Contributor

@gurobokum gurobokum left a comment

What do you think about fixing the issue another way:

  1. in _verify_link check the case when the file is not a link and has size 0 bytes
  2. if it's True - just don't raise an Exception
  3. put the comment with the issue and short description about the case

This allows fixing it in one place without passing an inconsistent return value.

UPD
I see @pared proposed the same in the comment above
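
For illustration, the alternative listed above could look roughly like the sketch below. It is not DVC's actual `_verify_link`: `CacheLinkError`, `is_hardlink`, and `verify_hardlink` are hypothetical names, the hard-link test via `st_nlink` is an assumption, and so is the premise that empty files end up as fresh files rather than real hard links.

```python
import os


class CacheLinkError(Exception):
    """Hypothetical error: the workspace file is not linked to the cache."""


def is_hardlink(path):
    # Assumption: a hard-linked regular file has more than one directory entry.
    return os.stat(path).st_nlink > 1


def verify_hardlink(path):
    if is_hardlink(path):
        return
    # Not a link. For a 0-byte file this is tolerated instead of raising
    # (the case reported in issue #3390); the size is only read on this path.
    if os.path.getsize(path) == 0:
        return
    raise CacheLinkError(f"{path} is not a hardlink")
```

Compared with an early `return` before the link check, this ordering only reads the file size when the link check has already failed.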

@skshetry
Collaborator Author

We (@efiop and I) discussed this in a 1:1. A better way to fix this would be to try to create a temporary file to check whether the system supports {ref,hard,sym}links. That will need some refactoring, though, so I'll create an issue to work on it later.
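
A minimal sketch of that probing idea, using only the standard library. `supported_link_types` is a hypothetical helper, not an existing DVC function, and reflinks are left out because the standard library has no portable call for them.

```python
import os
import tempfile


def supported_link_types(directory):
    """Return the subset of {"hardlink", "symlink"} that works in `directory`."""
    makers = {"hardlink": os.link, "symlink": os.symlink}
    supported = set()
    with tempfile.TemporaryDirectory(dir=directory) as tmp:
        src = os.path.join(tmp, "src")
        with open(src, "w") as fobj:
            # Non-empty probe file, so the empty-file special case cannot interfere.
            fobj.write("probe")
        for name, make_link in makers.items():
            dst = os.path.join(tmp, name)
            try:
                make_link(src, dst)
            except (OSError, NotImplementedError):
                # The filesystem or platform refused this link type.
                continue
            supported.add(name)
    return supported
```

This only probes within a single directory. If the cache and the workspace live on different filesystems, a real check would presumably have to link from one to the other, since hard links cannot cross devices.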

Comment thread dvc/remote/local.py
Comment on lines +96 to +98
if link_type == "hardlink" and self.getsize(path_info) == 0:
    return

Contributor

@efiop efiop Mar 10, 2020

Not great that it will be checking the size for each link it creates, might get expensive. Though, it does that in hardlink anyway...

Collaborator Author

We did that anyway on the hardlink() though.

Contributor

@skshetry Maybe there is some nicer way to do this? Like making hardlink (and the other link class methods) verify themselves? This is an honest question, I don't know myself either 🙂

Contributor

@skshetry Could keep it as is and create a ticket for it to reconsider later. Just asking.

Collaborator Author

I like this suggestion #3428 (comment) more. Running the verification in hardlink itself only once (by setting self.cache_type_confirmed) is also hackish.

I'd say we revisit this in the next sprint and fix it?

Contributor

@skshetry Ok, please create an issue and add it to the next sprint.

Contributor

We could just check for cache_type_verified at the beginning again. I know that is duplication, but I don't see any obvious workaround.
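
One way to picture the "check the flag at the beginning" idea is the sketch below. `LinkVerifier` and the injected `is_link` predicate are hypothetical stand-ins, not DVC classes, and the flag name `cache_type_confirmed` is borrowed from the earlier comment in this thread.

```python
import os


class LinkVerifier:
    """Hypothetical stand-in for the cache object discussed in this thread."""

    def __init__(self, link_type, is_link):
        self.link_type = link_type
        self.is_link = is_link  # hypothetical predicate: callable(path) -> bool
        self.cache_type_confirmed = False
        self.checks = 0  # how many times the filesystem was actually consulted

    def verify(self, path):
        # Bail out first, so a confirmed link type costs nothing per file.
        if self.cache_type_confirmed:
            return
        self.checks += 1
        if self.link_type == "hardlink" and os.path.getsize(path) == 0:
            # An empty file says nothing about hardlink support, so leave
            # the flag unset and try again with the next file.
            return
        if not self.is_link(path):
            raise RuntimeError(f"{path} is not a {self.link_type}")
        self.cache_type_confirmed = True
```

With this shape, the size check runs only until the first non-empty file confirms the link type, which would address the per-link cost concern raised earlier.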

@skshetry skshetry requested review from gurobokum and pared March 10, 2020 10:00
Contributor

efiop commented Mar 10, 2020

@skshetry Check your tests, travis failed.

Contributor

@pared pared left a comment

So, the conclusion for this issue is that we will create an issue to fix the way we handle confirmation of the cache type?

Comment thread tests/func/test_add.py Outdated
Contributor

@pared pared left a comment

One additional comment in the discussion with @efiop, nothing major.

Comment thread tests/func/test_add.py Outdated
Contributor

efiop commented Mar 11, 2020

@skshetry Please check the tests.

@efiop efiop merged commit 682275d into treeverse:master Mar 11, 2020
@skshetry skshetry deleted the fix-3390 branch March 12, 2020 00:40
Development

Successfully merging this pull request may close these issues.

add: empty files add broken when cache mode is hardlinks

4 participants