tree,remote: add support for WebDAV #4256
Yes, should be safe, the webdavclient3 API seems to be rather fixed. Just used what I got out of pip freeze...
It'd be great if you could implement walk_files, no pressure though.
You probably don't need root here, as the url from `REMOTE_COMMON` should work, no?
Yes, probably it is not necessary any more, I used it to specify the location of the WebDAV API endpoint (e.g. /public.php/webdav). But I still have a bit of confusion there...
Since I now terminate the makedirs at the path part of the initial path_info (which comes from the url of REMOTE_COMMON), the root option might be removed to not confuse others as well.
I somehow still wonder if there could be a valid use case for the option... At least there must be a reason for webdavclient3 to offer this option?
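The idea of terminating makedirs at the path part of the initial path_info can be sketched roughly as follows. This is only an illustration, not the code from the PR: the helper name `dirs_to_create` and its arguments are hypothetical, and it only computes which directories would need creating below the configured base path, without talking to a server.

```python
import posixpath


def dirs_to_create(base_path, target_path):
    """Return the directories to create, shallowest first.

    Hypothetical helper: walks up from target_path and stops at
    base_path (the path part of the remote url), so nothing above
    the WebDAV API endpoint is ever touched.
    """
    base = posixpath.normpath(base_path)
    current = posixpath.normpath(target_path)

    # Refuse targets that are not located below the base path.
    if current != base and not current.startswith(base.rstrip("/") + "/"):
        raise ValueError(f"{target_path!r} is not below {base_path!r}")

    missing = []
    while current != base:
        missing.append(current)
        current = posixpath.dirname(current)

    missing.reverse()
    return missing
```

With a base of /public.php/webdav, a target of /public.php/webdav/data/cache yields only the two directories below the endpoint, which is the behaviour described above.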
Can't we split the url into hostname and root?
E.g.: http://owncloud.com/remote.php/webdav -> http://owncloud.com and /remote.php/webdav?
The port is not passed; perhaps try something like the following?
http_info = HTTPURLInfo(self.path_info.url)
hostname = http_info.replace(path="").url
Yes, you are right, I somehow forgot that there is more in the path_info than scheme and host... Should be fixed now.
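For readers without the DVC codebase at hand, the hostname/root split with the port preserved can be sketched using only the standard library. This is an assumption-laden stand-in for the `HTTPURLInfo.replace(path="")` approach suggested in the thread, not the PR's actual code; the function name `split_webdav_url` is hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def split_webdav_url(url):
    """Split a remote url into (hostname, root).

    Hypothetical helper. The netloc is kept whole, so an explicit
    port (and any userinfo) in the url is not lost.
    """
    parts = urlsplit(url)
    # Rebuild scheme://netloc with an empty path, query and fragment.
    hostname = urlunsplit((parts.scheme, parts.netloc, "", "", ""))
    root = parts.path or "/"
    return hostname, root
```

Calling it on http://owncloud.com:8080/remote.php/webdav gives http://owncloud.com:8080 as the hostname and /remote.php/webdav as the root.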
Thanks @iksnagreb for the PR. Just tried it locally, and it works for the most part (I hit an issue with the port though, see above). Also, I am not quite sure about webdav:// urls (and root can just be deduced from the url, as I mentioned above). 🙂
Relates to iterative#1153, treeverse/dvc#4256
These are only used in webdav.py, so let's move them there. Also, I'm not sure we need a special WebDAVConfigError exception. We could just raise ConfigError or at least inherit from it.
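A minimal sketch of the "inherit from it" option, under stated assumptions: the `ConfigError` class below is a local stand-in so the snippet runs on its own (in DVC it would be the existing exception), and the constructor argument and message are hypothetical.

```python
class ConfigError(Exception):
    """Stand-in for DVC's ConfigError, defined here only for the sketch."""


class WebDAVConfigError(ConfigError):
    """WebDAV-specific configuration error (hypothetical message)."""

    def __init__(self, host):
        super().__init__(f"Configuration for WebDAV {host} is invalid.")


def handle(exc_factory):
    """Show that generic ConfigError handlers also catch the subclass."""
    try:
        raise exc_factory()
    except ConfigError as exc:
        return str(exc)
```

Because of the inheritance, callers that already catch ConfigError keep working without knowing about the WebDAV-specific subclass.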
Some of these comments are not useful; they state the obvious.
I don't think this method is needed.
Yes, it is probably not needed. I implemented it because it comes almost for free, as it directly matches the webdavclient3 method (the same goes for move/remove). Shall I remove it?
Yeah, let's remove it to not leave dead code around. If we ever need it - it will be trivial to introduce.
Shame this library doesn't support progress bars for download/upload, the ui will not be very helpful :(
We need to at least put a dummy progress bar for this.
Is etag or something like that part of the webdav standard?
As WebDAV is an extension of HTTP, it should be part of the standard.
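For context, RFC 4918 defines a `getetag` property in the `DAV:` namespace that a server can return in a PROPFIND response, although whether a given server actually reports it is server-dependent. The sketch below parses a made-up response body with the standard library; the sample XML, the ETag value and the function name `extract_etag` are all illustrative assumptions, and weak ETags are not handled.

```python
import xml.etree.ElementTree as ET

# Made-up PROPFIND multistatus body, for illustration only.
SAMPLE_PROPFIND_RESPONSE = """<d:multistatus xmlns:d="DAV:">
  <d:response>
    <d:href>/remote.php/webdav/data.csv</d:href>
    <d:propstat>
      <d:prop>
        <d:getetag>"5f1a2b3c"</d:getetag>
      </d:prop>
      <d:status>HTTP/1.1 200 OK</d:status>
    </d:propstat>
  </d:response>
</d:multistatus>"""


def extract_etag(xml_text):
    """Return the first DAV:getetag value without quotes, or None."""
    root = ET.fromstring(xml_text)
    node = root.find(".//{DAV:}getetag")
    if node is None or node.text is None:
        return None
    # Strong ETags are wrapped in double quotes; strip them.
    return node.text.strip().strip('"')
```

A missing property returns None, so a caller could fall back to another way of identifying file versions in that case.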
Webdav support is based on https://pypi.org/project/webdavclient3/ and supports basic download/upload operations, directory creation, and existence, file hash and isdir queries. Copy, move and remove are also implemented, though probably not used yet.
WebdavURLInfo is taken from https://github.com/shizacat/dvc/tree/remote-webdav
Fixes iterative#1153
Webdav token auth, certificate and key path and connection timeout are configurable. Webdav username might be specified or extracted from URL. Refs iterative#1153
Refs iterative#1153
This enables the WebDAV api location (e.g. '/public.php/webdav') to be part of the remote 'url' configuration instead of being specified separately via the 'root' option. The 'root' option may then be used to specify real directories at the WebDAV storage, although using it to set the api location is still possible. Refs iterative#1153
Context: treeverse#4256 (comment) Refs iterative#1153
The WebDAV 'root' option was rather confusing and should be handled by the initial 'path_info' from the config 'url' option. Context: treeverse#4256 (comment)
While stripping the path/root from the hostname the port got lost, which is fixed now by simply using the URLInfo 'replace' method as suggested. Context: treeverse#4256 (comment)
The WebDAV client connection is tested by probing the existence of the root (self.path_info.path). Refs iterative#1153
Context: treeverse#4256 (comment) Refs: iterative#1153
efiop left a comment
This is great! Thank you!
I know that we've discussed tests on Discord, but as @pmrowla and @skshetry noted, maybe we could use a docker-based fixture for testing this. E.g. see https://github.com/iterative/dvc/blob/master/tests/remotes/azure.py , where we use the azurite docker image declared in https://github.com/iterative/dvc/blob/master/tests/docker-compose.yml . Maybe we could use something like https://hub.docker.com/r/bytemark/webdav to test webdav the same way?
But regardless of the docker-based test, you've confirmed that this implementation works for you, so let's merge for now 🙂
We can also use |
❗ I have followed the Contributing to DVC checklist.
📖 If this PR requires documentation updates, I have created a separate PR (or issue, at least) in dvc.org and linked it here.
❌ I will check DeepSource, CodeClimate, and other sanity checks below. (We consider them recommendatory and don't expect everything to be addressed. Please fix things that actually improve code or fix bugs.)
Thank you for the contribution - we'll try to review it as soon as possible. 🙏
Should fix #1153
Took WebDAVURLInfo from #3647
Relates to treeverse/dvc.org#1617
TODO