erepo: fix relative url on default remote (#2756)#3378
Conversation
|
@pared @efiop added a test, I'm not entirely sure about it. What I tried to do there is to reproduce the script in the description of the issue, but maybe that is too much and it would be better to just check that the path is absolute and that it exists? not sure. I will wait for your input on that. |
|
Note that this change only modifies the default remote, should all remotes with relative urls be taken care of instead? |
|
Thanks for the feedback everyone. Just pushed a new version based on your comments. Branching has been moved to it's own function ( If other local relative remotes happen to need handling, |
pared
left a comment
There was a problem hiding this comment.
Seems good, left few comments.
- use `tmp_dir` fixture - commit config file changes
|
@pared thank you! just made the changes you requested. |
| "url": cache_dir | ||
| } | ||
| self.config["core"]["remote"] = "auto-generated-upstream" | ||
| upstream_name = self.config["core"].get("remote") |
There was a problem hiding this comment.
remote_name is better name then upstream_name IMHO cause it is used in _fix_local_remote and more intuitive
There was a problem hiding this comment.
Yes, but this one isn't any remote, right? what is being checked here is the default remote, not the other remotes (if any). Anyway, both names work for me, let me know what you think.
| src_repo.close() | ||
|
|
||
| def _fix_local_remote(self, src_repo, remote_name): | ||
| # Keep the relative path for a local remote relative |
There was a problem hiding this comment.
I think the comment is not relevant:
- Cause the paths are absolute as
Config.RelPath - There are no any conversions of the paths
There was a problem hiding this comment.
Ok, I guess the wording is not correct in that comment. The path is absolute, but it was built from a relative URL, and thats why it is being changed with the path from the source repo.
What do you think of making the test a check to see if the path is relative instead?
if isinstance(new_remote["url"], RelPath):
...This makes it explicit that what we care about there is paths with urls relative to the repository, with no more clarification required.
There was a problem hiding this comment.
@tizoc maybe if not os.path.isabs(new_url) then? It would work with strings and pathlib objects. Just worried about this being easilly broken with some hand-written dict config in tests or elsewhere.
There was a problem hiding this comment.
Ok, I had not thought about that. But you are right, if not os.path.isabs(new_url) would work the same and without those possible breakages. The benefit of checking for RelPath instances was only to better express intent.
| upstream_name = self.config["core"].get("remote") | ||
| src_repo = Repo(self.url) | ||
| try: | ||
| if upstream_name: |
There was a problem hiding this comment.
try:
self._add_upstream(src_repo)
...and move the logic of resolving upstream into the _add_upstream method. Cause here the upstream is added, what type of it and how - can be handled inside the method
If you want to use conditions here then methods can be named differently, for example as
try:
if remote_name:
self._add_upstream_from_remote(..) # not sure in relevance of `add_upstream`
else:
self._add_upstream_from_repo(...)There was a problem hiding this comment.
I'm thinking that maybe the two separate flows are not required? Consider my question here #3378 (comment)
If instead of going one way or another depending on if there is a default remote set, I go through every remote and replace every RelPath with the one from the source repo, and then I check if there is a default repo set up (and handle that accordingly) there will be a single branch of execution.
Something like this:
def _fix_remotes(self):
if not os.path.isdir(self.url):
return
src_repo = Repo(self.url)
try:
self._fix_source_relative_remotes(src_repo)
if "remote" not in self.config["core"]:
self._set_upstream(src_repo)
finally:
src_repo.close()
def _fix_source_relative_remotes(self, src_repo):
for remote_name, new_remote in self.config["remote"].items():
if isinstance(new_remote["url"], RelPath):
old_remote = src_repo.config["remote"][remote_name]
new_remote["url"] = old_remote["url"]/cc @efiop
There was a problem hiding this comment.
Maybe rename _set_upstream to _set_upstream_to_repo (also, the comment there is probably not needed anymore, it is describing what the three lines of code in that function already express clearly).
There was a problem hiding this comment.
@tizoc I think two branches of execution are necessary, since _set_upstream is only needed when there is no default remote and _fix_upstream is only needed when there is a default remote but it is a local one with relpath url.
@JIoJIaJIu So you mean moving
if upstream_name:
self._fix_local_remote(src_repo, upstream_name)
else:
self._add_upstream(src_repo)
into a new method? Well, we could, but I don't see much value in doing that, tbh. I would be fine with the current approach as well.
There was a problem hiding this comment.
@efiop my reasoning is, if relative urls are imported from the source repo, we want all of them to be kept relative to the source repo, right? not just the one that happens to be the default. If that is the case, that is independent from if there is a default remote set or not.
There was a problem hiding this comment.
@tizoc But we don't really care about other remotes in this case. We only care about default remote, as it is the only thing that we are actually going to use, at least for now.
|
I just pushed the requested changes. I have not included the change I proposed in #3378 (comment) If you think doing that would be an improvement I will commit that too (but using |
| if new_remote["url"] != old_remote["url"]: | ||
| new_remote["url"] = old_remote["url"] | ||
|
|
There was a problem hiding this comment.
Would at least add a comment about what is going on here, though.
There was a problem hiding this comment.
Done. Btw, should I squash all commits into a single one?
There was a problem hiding this comment.
As you wish. But we'll squash automatically anyways 🙂 Though if you think this is a final patch, you need to remove WIP prefix from the header.
There was a problem hiding this comment.
Ah cool! no need then. "WIP" removed.
|
Thanks @tizoc ! 🙏 |
❗ Have you followed the guidelines in the Contributing to DVC list?
📖 Check this box if this PR does not require documentation updates, or if it does and you have created a separate PR in dvc.org with such updates (or at least opened an issue about it in that repo). Please link below to your PR (or issue) in the dvc.org repo.
❌ Have you checked DeepSource, CodeClimate, and other sanity checks below? We consider their findings recommendatory and don't expect everything to be addressed. Please review them carefully and fix those that actually improve code or fix bugs.
Thank you for the contribution - we'll try to review it as soon as possible. 🙏
Pending