Display path relative to temporary repo when get/import-ing files#2798
Conversation
b23a015 to
a836705
Compare
|
Thanks for the PR @skshetry ! Just a heads up: some tests failed, please take a look at travis logs. 🙂 |
There was a problem hiding this comment.
@efiop, this test is failing for SSH/HDFS/s3/GS as they are remote. Others are passing.
Should I just separate this test (and, keep it only for local)?
There was a problem hiding this comment.
I have separated the test for now. Tell me, I feel, I have done something wrong here 😃
There was a problem hiding this comment.
Now the three old tests are not tested for other remotes than local.
There was a problem hiding this comment.
Considering all the test complications I suggest not touching existing ones, but adding a separate pytest-style test to check output stringification for an output in external repo.
There was a problem hiding this comment.
Now the three old tests are not tested for other remotes than local.
They do run, as the above TestCase is not changed at all. Or, did you mean this specific example?
Considering all the test complications I suggest not touching existing ones, but adding a separate pytest-style test to check output stringification for an output in external repo.
You mean, for each remote separately? Please, clarify.
There was a problem hiding this comment.
They do run, as the above TestCase is not changed at all. Or, did you mean this specific example?
Oh, I see, I was inattentive. And OutputLOCALBase not really used confused me.
You mean, for each remote separately? Please, clarify.
You don't need to do anything for each remote. You are only changing local one and thus only need to add tests for it. You can do that without touching existing tests at all and instead create two new test functions to check the specific changes you introduced.
There was a problem hiding this comment.
@skshetry @Suor is saying that you've modified TestOutputLOCAL which is used in tests for other remotes, hence why the tests are failing and why not all tests are now running for those remotes. You need to not touch the old tests, but create a new one pytest-style test to test that stringification works properly for local output when you are in and outside of the repo. 🙂
There was a problem hiding this comment.
This pattern is used over and over in our code. This is probably a good time to extract this into some path_isin() utility.
There was a problem hiding this comment.
Now the three old tests are not tested for other remotes than local.
There was a problem hiding this comment.
The original issue this PR aims to solve was about outputs in external repo and thus outside it. Testing with stage of the same repo might not cover that or might break in the future.
There was a problem hiding this comment.
Agreed. We could have two tests. One with Stage(dvc_repo) to test that it works correctly when in the repo, and one more test with Stage(erepo.dvc) to test that it works when outside of the repo. Should be enough.
There was a problem hiding this comment.
Considering all the test complications I suggest not touching existing ones, but adding a separate pytest-style test to check output stringification for an output in external repo.
There was a problem hiding this comment.
If yes, then simply using as argument will do the job.
There was a problem hiding this comment.
Looks like it's not necessary. Removing as erepo fixture already creates one.
There was a problem hiding this comment.
I think all those test could be done with one, where parametrize child, path and expected result. But I guess it is a matter of taste.
There was a problem hiding this comment.
Parametrizing for one line test is just a boiler plate.
There was a problem hiding this comment.
I tried that one but I found parameterizing expected_result as well too confusing for myself (and, too much repetition). What do you think?
There was a problem hiding this comment.
I would still prefer several asserts in a single test. This is mere indirection, no upsides.
There was a problem hiding this comment.
I also prefer having multiple asserts, but I think, this is easier to follow and understand.
pared
left a comment
There was a problem hiding this comment.
One question, one note, besides that LGTM.
There was a problem hiding this comment.
They do run, as the above TestCase is not changed at all. Or, did you mean this specific example?
Oh, I see, I was inattentive. And OutputLOCALBase not really used confused me.
You mean, for each remote separately? Please, clarify.
You don't need to do anything for each remote. You are only changing local one and thus only need to add tests for it. You can do that without touching existing tests at all and instead create two new test functions to check the specific changes you introduced.
There was a problem hiding this comment.
I would call it path_isin() to be more in line with PathInfo.isin().
There was a problem hiding this comment.
If yes, then simply using as argument will do the job.
There was a problem hiding this comment.
Parametrizing for one line test is just a boiler plate.
There was a problem hiding this comment.
What about is_child_path('a/b/', 'a/b')? Comparing only things built with os.path.join() doesn't quite handle it.
There was a problem hiding this comment.
@Suor I think that os.path.join(path, "") which is inside is_child_path handles this one.
There was a problem hiding this comment.
Yes, I forgot to test about the case (which fails atm). I'll add that (using os.path.join(), as I am not sure about testing via raw strings.)
From what I understood, you are asking me to also test strings directly? But, that'd not be platform agnostic. That specific string test might need skipping on Windows (to add to that, the implementation is already quite dependent on os.path.join() although it claims it's purely lexical :) ).
Also, should there be a test for demonstrating support for PathInfo? (saw #2137, up-level references might fail on PathInfo, maybe normpath could help.)
There was a problem hiding this comment.
@Suor, I refactored for PathInfo, as well as added few tests for absolute path and the case you just mentioned.
|
I have 12 commits already. I'd prefer rebasing/squashing (or, is it not preferred for some reasons?) |
@pared @MrOutis is the team policy flexible now? @skshetry I think it's totally up to you to manage commits in your branch in any way! And it's of course better for us in terms of merging if they look nice and clean :) |
Suor
left a comment
There was a problem hiding this comment.
Looks like we reached cleanup. So there are many comments below, but they should go easy.
There was a problem hiding this comment.
We don't split imports like that anymore.
There was a problem hiding this comment.
The last os.path.join() is not needed.
There was a problem hiding this comment.
return child != parent and ...
There was a problem hiding this comment.
I would still prefer several asserts in a single test. This is mere indirection, no upsides.
There was a problem hiding this comment.
Why are using pass by name here? I suggest treating those args as positional only.
There was a problem hiding this comment.
I thought, it'd be less confusing here if I added the keywords. Removing ...
There was a problem hiding this comment.
Test is about substring not some completely different path, test name is inpresize.
There was a problem hiding this comment.
I guess it won't work in Python 3.5 and earlier.
There was a problem hiding this comment.
Oops, didn't test on Python3.5. Fixing ...
There was a problem hiding this comment.
Move this to dvc.utils.fs please.
There was a problem hiding this comment.
I thought, dvc.utils.fs were mostly related to low-level like inodes and stuffs. Moving ...
There was a problem hiding this comment.
@Suor, I have moved it to dvc.utils.fs and the tests as well.
704c420 to
f07a70a
Compare
|
@Suor, I have rebased, please review again. |
| """Check if given `child` path is inside `parent`.""" | ||
|
|
||
| def normalize_path(path): | ||
| return os.path.normpath(fspath(path)) |
There was a problem hiding this comment.
It should be fspath_py35() instead of fspath() here, since in Python 3.6+ you may pass Path-like object to os utils directly.
|
|
||
| def test_path_isin_on_same_path(): | ||
| path = os.path.join("path", "to", "folder") | ||
| path2 = os.path.join(path, "") |
There was a problem hiding this comment.
| path2 = os.path.join(path, "") | |
| path_with_sep = os.path.join(path, "") |
|
@skshetry Please check tests, they are failing. |
|
Ah, wait, those failures are probably unrelated. Looking into it... |
|
Fixed the tests on master, everything should be smooth now. Sorry for the inconvenience 🙂 |
|
When this patch is merged, we should create ticket for replacing |
|
Ah, I see @MrOutis already approved. |
|
Thanks @skshetry ! 🎉 |
|
@skshetry Please don't forget to create a ticket for #2798 (comment) . 🙂 |
❗ Have you followed the guidelines in the Contributing to DVC list?
📖 Check this box if this PR does not require documentation updates, or if it does and you have created a separate PR in dvc.org with such updates (or at least opened an issue about it in that repo). Please link below to your PR (or issue) in the dvc.org repo.
❌ Have you checked DeepSource, CodeClimate, and other sanity checks below? We consider their findings recommendatory and don't expect everything to be addresses. Please review them carefully and fix those that actually improve code or fix bugs.
Thank you for the contribution - we'll try to review it as soon as possible. 🙏
Description
This PR fixes #2605. This will print the relative path of files being downloaded in the console.