import: fix importing into subdirectory bug#4340
Conversation
| scheme = urlparse(out).scheme | ||
| if os.name == "nt" and scheme == abspath.drive[0].lower(): | ||
| # urlparse interprets windows drive letters as URL scheme | ||
| scheme = "" | ||
| if ( | ||
| os.path.exists(dirname) # out might not exist yet, so | ||
| and PathInfo(abspath).isin_or_eq(repo.root_dir) | ||
| not scheme | ||
| and abspath.isin_or_eq(repo.root_dir) |
There was a problem hiding this comment.
out may be a local path or a remote://... url, but os.path.exists isn't a good enough check for whether or not out is a valid local path.
In the case where we are importing with -o subdir/file, if subdir doesn't already exist it is still a valid local output path (and we need to makedirs(subdir) later)
| if not os.path.exists(wdir): | ||
| makedirs(wdir) | ||
|
|
There was a problem hiding this comment.
Would it make sense to error-out if it doesn't exist? Kinda like cp and other tools do.
There was a problem hiding this comment.
Creating the directory if it doesn't exist feels like the more natural import -o behavior for me, but given that the current docs for -o say:
If an existing directory is specified, the target data will be placed inside.
it would also be okay for us to error out here.
There was a problem hiding this comment.
please, don't forget to update the docs if we change this :)
There was a problem hiding this comment.
@pmrowla We also don't makedirs in dvc run, but rather raise an error that wdir doesn't exist, so it seems it would be more consistent if we raise and error here too. Same with dvc add that shares the same logic with dvc run. I wonder why that https://github.com/iterative/dvc/blob/35dd1fda9cfb4b645ae431f4621f2d1853d4fb7c/dvc/stage/utils.py#L27 check wasn't triggered here as well 🤔
There was a problem hiding this comment.
The issue before was that we set wdir to os.cwd() if the output subdirectory didn't already exist, because we assumed that subdir not existing actually meant it was a remote URL and not a local path
There was a problem hiding this comment.
Behavior has been updated here so that we use the correct subdir as wdir, and then error out in utils.check_stage_path if it doesn't already exist
efiop
left a comment
There was a problem hiding this comment.
Thank you! Could double check the docs as well, please? 🙏
Codecov Report
@@ Coverage Diff @@
## master #4340 +/- ##
=======================================
Coverage 91.19% 91.19%
=======================================
Files 177 177
Lines 12209 12209
=======================================
Hits 11134 11134
Misses 1075 1075 Continue to review full report at Codecov.
|
❗ I have followed the Contributing to DVC checklist.
📖 If this PR requires documentation updates, I have created a separate PR (or issue, at least) in dvc.org and linked it here.
Thank you for the contribution - we'll try to review it as soon as possible. 🙏
Fixes #4169.
Makes
dvc import ... -o subdir/filebehave the same whether or notsubdiralready exists (new output will besubdir/file.dvc)docs: treeverse/dvc.org#1668