New option jobs for dvc import
#4977
Changes from all commits
660ee32
66d1744
53b4316
4c2325c
f6fa37b
ae368c1
5cca616
498e94e
@@ -68,14 +68,14 @@ def save(self):
     def dumpd(self):
         return {self.PARAM_PATH: self.def_path, self.PARAM_REPO: self.def_repo}

-    def download(self, to):
+    def download(self, to, jobs=None):
         cache = self.repo.cache.local

         with self._make_repo(cache_dir=cache.cache_dir) as repo:
             if self.def_repo.get(self.PARAM_REV_LOCK) is None:
                 self.def_repo[self.PARAM_REV_LOCK] = repo.get_rev()

-            _, _, cache_infos = repo.fetch_external([self.def_path])
+            _, _, cache_infos = repo.fetch_external([self.def_path], jobs=jobs)
Collaborator
It wasn't quite apparent how this was being used. Finally, at the 10th/11th level, I found it using
Contributor Author
Same for me; they need refactoring.
         cache.checkout(to.path_info, cache_infos[0])

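To make the threading of `jobs` through the call layers easier to follow, here is a minimal, self-contained sketch. It is not DVC's code: `fetch_all`, `download`, and `fetch_one` are hypothetical stand-ins, and the use of `ThreadPoolExecutor` is an assumption about how a job count might eventually be consumed at the bottom of the stack.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Optional


def fetch_all(
    paths: List[str],
    fetch_one: Callable[[str], str],
    jobs: Optional[int] = None,
) -> List[str]:
    """Hypothetical innermost layer: the only place `jobs` is actually used."""
    # jobs=None lets ThreadPoolExecutor choose its own default worker count.
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        # pool.map returns results in the same order as `paths`.
        return list(pool.map(fetch_one, paths))


def download(
    paths: List[str],
    fetch_one: Callable[[str], str],
    jobs: Optional[int] = None,
) -> List[str]:
    """Hypothetical outer layer: it only forwards `jobs` unchanged."""
    return fetch_all(paths, fetch_one, jobs=jobs)


if __name__ == "__main__":
    print(download(["a", "b", "c"], str.upper, jobs=2))
```

Keeping `jobs=None` as the default at every layer means callers that never pass the option keep the previous behaviour.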
@@ -472,7 +472,8 @@ def run(
             self.remove_outs(ignore_remove=False, force=False)

         if not self.frozen and self.is_import:
-            sync_import(self, dry, force)
+            jobs = kwargs.get("jobs", None)
Collaborator
Suggested change
Contributor Author
@skshetry With this change, tests on Windows would fail, so I have reverted it.
Contributor
@karajan1001 Oops, I didn't notice this before merging. I think the Windows tests were failing for an unrelated reason: they were failing because of gitpython, and we fixed them yesterday.
+            sync_import(self, dry, force, jobs)
         elif not self.frozen and self.cmd:
             run_stage(self, dry, force, **kwargs)
         else:
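For readers following the hunk above, this is a rough sketch of the pattern it uses: an optional value is pulled out of `**kwargs` and passed on explicitly. The functions here are simplified, hypothetical stand-ins rather than DVC's real `run` and `sync_import`. As a general Python fact, `kwargs.get("jobs")` and `kwargs.get("jobs", None)` return the same value, because `None` is already the default for `dict.get`.

```python
from typing import Any, Dict, Optional


def sync_import(
    stage: str, dry: bool = False, force: bool = False, jobs: Optional[int] = None
) -> Dict[str, Any]:
    """Hypothetical stand-in that only reports the arguments it received."""
    return {"stage": stage, "dry": dry, "force": force, "jobs": jobs}


def run(stage: str, dry: bool = False, force: bool = False, **kwargs: Any) -> Dict[str, Any]:
    """Hypothetical stand-in for a stage runner that forwards `jobs`."""
    # dict.get defaults to None when the key is missing.
    jobs = kwargs.get("jobs")
    return sync_import(stage, dry, force, jobs)


if __name__ == "__main__":
    print(run("my-import"))          # jobs is None when the caller omits it
    print(run("my-import", jobs=4))  # jobs is forwarded unchanged
```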
I know this might come from some other DVC commands, but let's reconsider this message please? `Number of jobs` is not very informative. Number of parallel connections? Number of download jobs?
@shcheklein Let's not do that please, it is totally out of scope for this PR.
Though I do see your point 🙂
Could just refer to `dvc pull/fetch/status` here, to smooth this out. E.g. `Please refer to dvc fetch help for description` or something like that.
I'm not referring to push/pull or any other commands (we can create a ticket for this if needed). `this message` in my comment was about this specific help message only.
@shcheklein I understand that. This one was copied over from push/pull/etc, so the questions might arise there as well. If we use `Please refer to dvc fetch ...` here we'll dodge the bullet 🙂 While still keeping the analogy correct, since this is pretty much a `pull` from an external repo.
Hmm... I guess my take is that it's fine to change it only here (and propagate later if needed, if it's needed at all). Also, `Please refer to dvc fetch` will force us to go and look at `fetch`, see what's there, and so on, which complicates the UX. To be honest, I haven't seen that kind of redirect in help messages.
I agreed that we can change it here, and submit another one for push/pull/etc.
In fact, per #4838, the `import --jobs` option has a special meaning? "This external tracked data might be stored in a remote DVC repository. In this situation --job which controls the parallelism level for DVC to download data from remote storage." So it seems like it's not really the same as in other commands?

Unrelated, but `dvc get-url -h` used to refer to `import-url` (for the `url` arg. details), I think. Now only `import-url` has the full list of URLs.
@jorgeorpinel `dvc import` clones an external repo and then pulls down the data, so I think it means the same as in `dvc pull`.
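To make the help-message discussion concrete, here is a small `argparse` sketch of a `-j/--jobs` option with a more descriptive help string. The parser, its `prog` name, the positional arguments, and the help wording are all hypothetical; this is not the text that DVC ships, only one way the "number of download jobs" idea from this thread could be phrased.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a hypothetical parser resembling an import-style command."""
    parser = argparse.ArgumentParser(prog="import-sketch")
    parser.add_argument("url", help="Location of the external repository.")
    parser.add_argument("path", help="Path of the data to import.")
    parser.add_argument(
        "-j",
        "--jobs",
        type=int,
        default=None,  # None means "let the downloader choose a default"
        metavar="<number>",
        # Hypothetical wording, written for this sketch only.
        help="Number of parallel jobs used to download data from remote storage.",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["https://example.com/repo", "data", "-j", "4"])
    print(args.jobs)
```

Whatever wording is chosen, saying what the jobs do (download in parallel) addresses the concern that `Number of jobs` alone is not informative.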