[WIP] Support for alternative hashes (#1676) #2022
shcheklein
left a comment
#1920 - address comments from the previous PR
Why not simply:
return bool(modchecksum.checksum_types_from_str(...))?
Could be:
return next((self.info[t] for t in self.remote.checksum_types() if t in self.info), None)
# or
return next(filter(bool, map(self.info.get, self.remote.checksum_types())), None) :)
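The two one-liners above are close but not quite interchangeable. A minimal sketch of the difference, assuming `info` is a plain dict and the remote's types come back as an ordered list; `first_present`, `first_truthy`, and the sample data are hypothetical names for illustration, not DVC API:

```python
def first_present(info, types):
    """Return the value of the first type that is a key in info, or None."""
    return next((info[t] for t in types if t in info), None)


def first_truthy(info, types):
    """Return the first truthy value stored under any of the types, or None."""
    return next(filter(bool, map(info.get, types)), None)


types = ["sha256", "md5"]            # assumed preference order
info = {"sha256": "", "md5": "abc"}  # sha256 is present but empty

present = first_present(info, types)  # takes the empty sha256 entry
truthy = first_truthy(info, types)    # skips it and takes the md5 entry
```

The variants only disagree when a key is present with a falsy value, so the choice depends on whether empty checksums can ever be stored.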
Or
return self.info[self.checksum]
Make these properties, as above?
Also, is it an error if we have more than one here? If yes, then I suggest adding an assert.
Use a decorator to abstract away caching?
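One possible shape for such a decorator, as a rough sketch only: the code under review is not shown here, so `cached`, `Output`, and the `checksum` property are hypothetical stand-ins.

```python
import functools


def cached(func):
    """Turn a method into a property that is computed once per instance."""
    attr = "_cached_" + func.__name__

    @property
    @functools.wraps(func)
    def wrapper(self):
        # Compute on first access, then reuse the stored value.
        if not hasattr(self, attr):
            setattr(self, attr, func(self))
        return getattr(self, attr)

    return wrapper


class Output:  # hypothetical stand-in for the class under review
    def __init__(self):
        self.calls = 0

    @cached
    def checksum(self):
        self.calls += 1  # stands in for an expensive computation
        return "abc"


out = Output()
values = (out.checksum, out.checksum)
```

This keeps the caching logic in one place, so each cached attribute no longer needs its own "compute if missing" branch.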
No need to call .keys(); just use types = set(checksum_info). Python actually has an optimized code path for making sets from dicts. Also, the intersection is computed twice.
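A small sketch of both points, assuming `checksum_info` is a dict keyed by checksum type; the sample data and the `supported` set are made up for illustration:

```python
checksum_info = {"md5": "abc", "sha256": "def"}  # hypothetical data
supported = {"md5", "etag"}                      # hypothetical supported types

types = set(checksum_info)  # iterating a dict yields its keys
common = types & supported  # intersection computed once, then reused

has_common = bool(common)
```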
Is the only purpose of this addition_check to pass self.changed_cache there? If so, wouldn't it be clearer to simply use a bool parameter?
Shouldn't these go to the appropriate modules and just stay contained there? That would eliminate one level of indirection: these constants.
They will go to the appropriate modules for sure 👍
It looks like the second result of file_md5/file_checksum is never used. Maybe now is a good time to drop it and return only the hex.
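For illustration, a hex-only helper might look roughly like this. It is a sketch built on the standard hashlib module, not the project's actual file_checksum; the signature and `chunk_size` default are assumptions.

```python
import hashlib
import os
import tempfile


def file_checksum(path, name="md5", chunk_size=1024 * 1024):
    """Return only the hex digest of the file at path (hypothetical signature)."""
    hasher = hashlib.new(name)
    with open(path, "rb") as fobj:
        # Read in chunks so large files are never loaded into memory at once.
        for chunk in iter(lambda: fobj.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


# Usage: hash a small temporary file, then clean it up.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
digest = file_checksum(tmp.name)
os.remove(tmp.name)
```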
This is just:
SCHEMA.update({
    Optional(PARAM_CMD): Or(str, None),
    ...
})
Chaining the conditions with or instead of any() here will simplify the code and provide short-circuit behavior.
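A sketch of the short-circuit difference, assuming the reviewed code passes a list of already-evaluated calls to any(); `check` is a hypothetical helper that records when it runs:

```python
calls = []


def check(name, result):
    """Record that the check ran, then return its result."""
    calls.append(name)
    return result


# any() over a list literal: every operand is evaluated before any() starts.
eager = any([check("a", True), check("b", False)])
eager_calls = list(calls)

calls.clear()

# Chained `or`: evaluation stops at the first truthy operand.
lazy = check("a", True) or check("b", False)
lazy_calls = list(calls)
```

Note that any() does short-circuit when given a generator expression; it is only the eagerly built list that forces every check to run.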
checksum = {k: d[k] for k in modchecksum.CHECKSUM_MAP if d.get(k)}
# Or if you don't expect falsy values in `d`:
checksum = {k: d[k] for k in modchecksum.CHECKSUM_MAP if k in d}
Or using funcy:
from funcy import project
checksum = project(d, modchecksum.CHECKSUM_MAP)
Funcy is my lib :)
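The two comprehensions differ only in how they treat keys with falsy values. A stdlib-only sketch, with a made-up `CHECKSUM_MAP` and `d` standing in for the real ones:

```python
CHECKSUM_MAP = {"md5": None, "sha256": None}  # hypothetical stand-in; only the keys matter
d = {"md5": "abc", "sha256": "", "path": "data.csv"}

# Keeps a key only when its value is truthy.
truthy_only = {k: d[k] for k in CHECKSUM_MAP if d.get(k)}

# Keeps a key whenever it is present, even with an empty value.
present_only = {k: d[k] for k in CHECKSUM_MAP if k in d}
```

As far as I can tell, funcy's project behaves like the membership variant, so it is a drop-in replacement only if falsy values are not expected in d.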
r = {k: v for k, v in self.checksum.items() if v}
Or using funcy:
from funcy import compact
r = compact(self.checksum)
The whole thing is a merge of two dicts with a filter:
from itertools import chain
d1 = self.checksum
d2 = {...}
return {k: v for k, v in chain(d1.items(), d2.items()) if v}
# or with funcy
from funcy import merge, compact
d1 = self.checksum
d2 = {...}
return compact(merge(d1, d2))
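One subtlety when choosing between the two forms: filtering while merging and filtering after merging can disagree when the second dict holds a falsy value for a key the first dict also has. A stdlib-only sketch with hypothetical function names and data, where `merge_then_filter` is meant to mirror the compact(merge(...)) ordering:

```python
from itertools import chain


def filter_then_merge(d1, d2):
    """Drop falsy pairs first, so a falsy value in d2 cannot override d1."""
    return {k: v for k, v in chain(d1.items(), d2.items()) if v}


def merge_then_filter(d1, d2):
    """Let d2 override d1 first, then drop whatever ended up falsy."""
    merged = {**d1, **d2}
    return {k: v for k, v in merged.items() if v}


d1 = {"md5": "abc", "etag": ""}
d2 = {"md5": "", "sha256": "def"}

a = filter_then_merge(d1, d2)
b = merge_then_filter(d1, d2)
```

Here `a` keeps the md5 value from d1 while `b` drops md5 entirely, so the two are interchangeable only if d2 never carries falsy values for keys shared with d1.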
Could be
if not set(checksum_types) <= set(supported_types):
    return None
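The `<=` operator on sets is a subset test, so the condition reads as "some requested type is not supported". A runnable sketch around that check; `pick_types` and its return value are hypothetical, since the surrounding function is not shown:

```python
def pick_types(checksum_types, supported_types):
    """Return the requested types, or None if any of them is unsupported."""
    if not set(checksum_types) <= set(supported_types):
        return None
    return list(checksum_types)


ok = pick_types(["md5"], ["md5", "sha256"])
rejected = pick_types(["md5", "crc32"], ["md5", "sha256"])
empty = pick_types([], ["md5"])  # the empty set is a subset of any set
```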
A good place to use a list comprehension.
These look like functional tests, not unit ones. They could also easily be written in the new pytest style using the dvc fixture:
def test_compatibility(dvc):
    ret = main(...)
    assert ret == 0
    # ....
@Suor thanks for reviewing this. It was started by an external contributor and I am trying to get it done, though I don't expect it to be fast. I'll ping you again when it's in better shape and some fundamental issues are solved. I'll address your comments along the way, thanks!
rename get_checksum_type_list to checksum_types
[x] Have you followed the guidelines in our Contributing document?
Does your PR affect documented changes or does it add new functionality that should be documented? If yes, have you created a PR for dvc.org documenting it or at least opened an issue for it? If so, please add a link to it.
Fixes #1676
Based on a fork by @vyloy.