docs: consistent remote location URLs + related copy edits #1695
Changes from all commits
b350213
81bcdba
91a46df
eef6283
e864a40
db15a96
64b2a9c
5dc0cb1
144c458
4619506
fe810b1
eaef091
aa87054
ee0f42a
500ac14
1805191
2fe9b75
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -12,9 +12,8 @@ Download a file or directory from a supported URL (for example `s3://`, | |
| usage: dvc get-url [-h] [-q | -v] url [out] | ||
|
|
||
| positional arguments: | ||
| url Location of the data to download. | ||
| See supported URLs below. | ||
| out Destination path to put files in | ||
| url (See supported URLs in the description.) | ||
| out Destination path to put files in. | ||
| ``` | ||
|
|
||
| ## Description | ||
|
|
@@ -34,23 +33,33 @@ directory will be placed inside. | |
|
|
||
| DVC supports several types of (local or) remote locations (protocols): | ||
|
|
||
| | Type | Description | `url` format | | ||
| | ------- | -------------- | ------------------------------------------ | | ||
| | `local` | Local path | `/path/to/local/data` | | ||
| | `s3` | Amazon S3 | `s3://mybucket/data` | | ||
| | `gs` | Google Storage | `gs://mybucket/data` | | ||
| | `ssh` | SSH server | `ssh://user@example.com/path/to/data` | | ||
| | `hdfs` | HDFS to file\* | `hdfs://user@example.com/path/to/data.csv` | | ||
| | `http` | HTTP to file\* | `https://example.com/path/to/data.csv` | | ||
| | Type | Description | `url` format example | | ||
| | -------- | ---------------------------- | ---------------------------------------------------------- | | ||
| | `s3` | Amazon S3 | `s3://bucket/data` | | ||
| | `azure` | Microsoft Azure Blob Storage | `azure://container/data` | | ||
| | `gdrive` | Google Drive | `gdrive://<folder-id>/data` | | ||
| | `gs` | Google Cloud Storage | `gs://bucket/data` | | ||
| | `ssh` | SSH server | `ssh://user@example.com/path/to/data` | | ||
| | `hdfs` | HDFS to file\* | `hdfs://user@example.com/path/to/data.csv` | | ||
| | `http` | HTTP to file\* | `https://example.com/path/to/data.csv` | | ||
| | `webdav` | WebDav to file\* | `webdavs://example.com/public.php/webdav/path/to/data.csv` | | ||
|
Contributor
is
Contributor
Author
I'm not familiar TBH but apparently yes there's some sort of endpoint. Not sure if it's typically PHP. I took this from https://dvc.org/doc/command-reference/remote/add#supported-storage-types But let me check, maybe it's quick to figure this out. ⌛
Contributor
Author
From https://linuxconfig.org/webdav-server-setup-on-ubuntu-linux and https://docs.microsoft.com/en-us/iis/install/installing-publishing-technologies/installing-and-configuring-webdav-on-iis WebDAV is set up similarly to any HTTP server, so no special need for PHP here... We should probably review all these in a separate PR though, so merging this for now (will create a ticket). |
||
| | `local` | Local path | `/path/to/local/data` | | ||
|
jorgeorpinel marked this conversation as resolved.
|
||
|
|
||
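The table and the single-file restriction can be illustrated with a minimal Python sketch. This is not DVC's own code: `classify_url`, `SCHEME_TO_TYPE`, and `SINGLE_FILE_ONLY` are hypothetical names, the scheme list is copied from the table in this diff, and the directory restriction follows the note that HDFS, HTTP, and WebDAV only support single files. Treating `https` and `webdavs` as variants of `http` and `webdav` is an assumption based on the example URLs.

```python
from urllib.parse import urlparse

# Schemes from the table, mapped to the table's "Type" column.
# Hypothetical mapping for illustration; DVC's real detection may differ.
SCHEME_TO_TYPE = {
    "s3": "s3",
    "azure": "azure",
    "gdrive": "gdrive",
    "gs": "gs",
    "ssh": "ssh",
    "hdfs": "hdfs",
    "http": "http",
    "https": "http",      # assumption: secure variant of the same type
    "webdav": "webdav",
    "webdavs": "webdav",  # assumption: secure variant of the same type
}

# Per the notes in this diff, these types only support single files.
SINGLE_FILE_ONLY = {"hdfs", "http", "webdav"}


def classify_url(url):
    """Return (type, supports_directories) for a get-url style location."""
    scheme = urlparse(url).scheme
    if not scheme:
        # No scheme: treat it as a local path such as /path/to/local/data.
        return "local", True
    remote_type = SCHEME_TO_TYPE.get(scheme)
    if remote_type is None:
        raise ValueError(f"unsupported URL scheme: {scheme!r}")
    return remote_type, remote_type not in SINGLE_FILE_ONLY


if __name__ == "__main__":
    for example in ("s3://bucket/data", "https://example.com/path/to/data.csv"):
        print(example, classify_url(example))
```

The sketch only inspects the scheme, so it does not handle Windows drive letters or validate that the location exists.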
| > If you installed DVC via `pip` and plan to use cloud services as remote | ||
| > storage, you might need to install these optional dependencies: `[s3]`, | ||
| > `[azure]`, `[gdrive]`, `[gs]`, `[oss]`, `[ssh]`. Alternatively, use `[all]` to | ||
| > include them all. The command should look like this: `pip install "dvc[s3]"`. | ||
| > (This example installs `boto3` library along with DVC to support S3 storage.) | ||
|
|
||
| \* HDFS and HTTP **do not** support downloading entire directories, only single | ||
| files. | ||
| \* Notes on remote locations: | ||
|
|
||
| - HDFS, HTTP, and WebDav **do not** support downloading entire directories, only | ||
| single files. | ||
|
|
||
| - `remote://myremote/path/to/file` notation just means that a DVC | ||
| [remote](/doc/command-reference/remote) `myremote` is defined, and when DVC is | ||
| running, it automatically expands this URL into a regular S3, SSH, GS, etc. | ||
| URL by appending `/path/to/file` to `myremote`'s configured base path. | ||
|
|
||
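The `remote://` expansion described in the note can be sketched in a few lines of Python. This is only an illustration of "append the path to the remote's configured base path", not DVC's implementation: `expand_remote_url` is a hypothetical helper, and the `remotes` mapping with the base URL `s3://mybucket/dir` is made up for the example.

```python
from urllib.parse import urlparse


def expand_remote_url(url, remotes):
    """Expand remote://<name>/<path> using a {name: base_url} mapping.

    `remotes` stands in for the remotes defined in the DVC config;
    this helper is a hypothetical illustration, not a DVC API.
    """
    parsed = urlparse(url)
    if parsed.scheme != "remote":
        # Already a regular URL or a local path: nothing to expand.
        return url
    name = parsed.netloc
    if name not in remotes:
        raise KeyError(f"remote {name!r} is not defined")
    # Append the path part to the remote's configured base path.
    return remotes[name].rstrip("/") + parsed.path


if __name__ == "__main__":
    remotes = {"myremote": "s3://mybucket/dir"}  # assumed example config
    print(expand_remote_url("remote://myremote/path/to/file", remotes))
```

With the assumed mapping, `remote://myremote/path/to/file` becomes `s3://mybucket/dir/path/to/file`.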
| Another way to understand the `dvc get-url` command is as a tool for downloading | ||
| data files. On GNU/Linux systems for example, instead of `dvc get-url` with | ||
|
|
@@ -73,19 +82,6 @@ $ wget https://example.com/path/to/data.csv | |
|
|
||
| <details> | ||
|
|
||
| ### Click and expand for a local example | ||
|
|
||
| ```dvc | ||
| $ dvc get-url /local/path/to/data | ||
| ``` | ||
|
|
||
| The above command will copy the `/local/path/to/data` file or directory into | ||
| `./dir`. | ||
|
|
||
| </details> | ||
|
|
||
| <details> | ||
|
|
||
| ### Click for Amazon S3 example | ||
|
jorgeorpinel marked this conversation as resolved.
|
||
|
|
||
| This command will copy an S3 object into the current working directory with the | ||
|
|
@@ -157,3 +153,16 @@ $ dvc get-url https://example.com/path/to/file | |
| ``` | ||
|
|
||
| </details> | ||
|
|
||
| ### Click and expand for a local example | ||
|
|
||
| ```dvc | ||
| $ dvc get-url /local/path/to/data | ||
| ``` | ||
|
|
||
| The above command will copy the `/local/path/to/data` file or directory into | ||
| `./dir`. | ||
|
|
||
| </details> | ||
|
|
||
| <details> | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -12,9 +12,8 @@ usage: dvc remote add [-h] [--global | --system | --local] [-q | -v] | |
| [-d] [-f] name url | ||
|
|
||
| positional arguments: | ||
| name Name of the remote | ||
| url Remote location. | ||
| See full list of supported URLs below. | ||
| name Name of the remote. | ||
| url (See supported URLs in the examples below.) | ||
| ``` | ||
|
|
||
| ## Description | ||
|
|
@@ -94,7 +93,7 @@ The following are the types of remote storage (protocols) supported: | |
| > [Create a Bucket](https://docs.aws.amazon.com/AmazonS3/latest/gsg/CreatingABucket.html). | ||
|
|
||
| ```dvc | ||
| $ dvc remote add -d s3remote url s3://my-bucket/my-key | ||
| $ dvc remote add -d s3remote url s3://mybucket/path | ||
| ``` | ||
|
|
||
| By default, DVC expects your AWS CLI is already | ||
|
|
@@ -134,7 +133,7 @@ configure the remote's `endpointurl` explicitly: | |
| For example: | ||
|
|
||
| ```dvc | ||
| $ dvc remote add -d myremote s3://my-bucket/path/to/dir | ||
| $ dvc remote add -d myremote s3://mybucket/path/to/dir | ||
|
Comment on lines -137 to +136
Contributor
Author
Cosmetic: I removed |
||
| $ dvc remote modify myremote endpointurl \ | ||
| https://object-storage.example.com | ||
| ``` | ||
|
|
@@ -146,7 +145,7 @@ S3 remotes can also be configured entirely via environment variables: | |
| ```dvc | ||
| $ export AWS_ACCESS_KEY_ID="<my-access-key>" | ||
| $ export AWS_SECRET_ACCESS_KEY="<my-secret-key>" | ||
| $ dvc remote add -d myremote s3://my-bucket/my/key | ||
| $ dvc remote add -d myremote s3://mybucket/my/path | ||
| ``` | ||
|
|
||
| For more information about the variables DVC supports, please visit | ||
|
|
@@ -159,7 +158,7 @@ For more information about the variables DVC supports, please visit | |
| ### Click for Microsoft Azure Blob Storage | ||
|
|
||
| ```dvc | ||
| $ dvc remote add -d myremote azure://my-container-name/path | ||
| $ dvc remote add -d myremote azure://mycontainer/path | ||
| $ dvc remote modify --local myremote connection_string \ | ||
| 'my-connection-string' | ||
| ``` | ||
|
|
@@ -173,7 +172,7 @@ variables: | |
|
|
||
| ```dvc | ||
| $ export AZURE_STORAGE_CONNECTION_STRING='<my-connection-string>' | ||
| $ export AZURE_STORAGE_CONTAINER_NAME='my-container-name' | ||
| $ export AZURE_STORAGE_CONTAINER_NAME='mycontainer' | ||
| $ dvc remote add -d myremote 'azure://' | ||
| ``` | ||
|
|
||
|
|
@@ -410,7 +409,7 @@ region. | |
| > [Create a Bucket](https://docs.aws.amazon.com/AmazonS3/latest/gsg/CreatingABucket.html). | ||
|
|
||
| ```dvc | ||
| $ dvc remote add -d myremote s3://mybucket/myproject | ||
| $ dvc remote add -d myremote s3://mybucket/path | ||
| Setting 'myremote' as a default remote. | ||
|
|
||
| $ dvc remote modify myremote region us-east-2 | ||
|
|
@@ -420,7 +419,7 @@ The <abbr>project</abbr>'s config file (`.dvc/config`) now looks like this: | |
|
|
||
| ```ini | ||
| ['remote "myremote"'] | ||
| url = s3://mybucket/myproject | ||
| url = s3://mybucket/path | ||
| region = us-east-2 | ||
| [core] | ||
| remote = myremote | ||
|
|
@@ -430,19 +429,19 @@ The list of remotes should now be: | |
|
|
||
| ```dvc | ||
| $ dvc remote list | ||
| myremote s3://mybucket/myproject | ||
| myremote s3://mybucket/path | ||
| ``` | ||
|
|
||
| You can overwrite existing remotes using `-f` with `dvc remote add`: | ||
|
|
||
| ```dvc | ||
| $ dvc remote add -f myremote s3://mybucket/mynewproject | ||
| $ dvc remote add -f myremote s3://mybucket/another-path | ||
| ``` | ||
|
|
||
| List remotes again to view the updated remote: | ||
|
|
||
| ```dvc | ||
| $ dvc remote list | ||
|
|
||
| myremote s3://mybucket/mynewproject | ||
| myremote s3://mybucket/another-path | ||
| ``` | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -103,7 +103,7 @@ remote = myremote | |
| > [Create a Bucket](https://docs.aws.amazon.com/AmazonS3/latest/gsg/CreatingABucket.html). | ||
|
|
||
| ```dvc | ||
| $ dvc remote add newremote s3://mybucket/myproject | ||
| $ dvc remote add newremote s3://mybucket/path | ||
|
Contributor
Author
Changed possibly confusing concepts like "project" or "key" to a more generic "path" in URLs (mostly S3 URLs had this problem).
Contributor
since we change this - we might want to use
Contributor
Author
Probably not in this one under the Customize an additional S3 remote example (meant as a continuation of the previous example where a default local remote is added). |
||
| $ dvc remote modify newremote endpointurl https://object-storage.example.com | ||
| ``` | ||
|
|
||
|
|
@@ -115,7 +115,7 @@ url = /path/to/remote | |
| [core] | ||
| remote = myremote | ||
| ['remote "newremote"'] | ||
| url = s3://mybucket/myproject | ||
| url = s3://mybucket/path | ||
| endpointurl = https://object-storage.example.com | ||
| ``` | ||
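To see how the config shown in this hunk relates to the `dvc remote list` output, here is a rough Python sketch that reads the same INI text with the standard library's `configparser`. It is not how DVC itself loads its config. `list_remotes` is a hypothetical helper, and the `myremote` section header at the top of `CONFIG_TEXT` is an assumption reconstructed from the hunk context (`url = /path/to/remote`) and the remote list that follows.

```python
import configparser

# Config text after the change in this hunk. The first section header is
# an assumption: the hunk only shows its `url = /path/to/remote` line.
CONFIG_TEXT = """\
['remote "myremote"']
url = /path/to/remote
[core]
remote = myremote
['remote "newremote"']
url = s3://mybucket/path
endpointurl = https://object-storage.example.com
"""


def list_remotes(config_text):
    """Return {remote_name: url} for every remote section in the text."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    remotes = {}
    for section in parser.sections():
        # Section names keep their quotes, e.g. 'remote "newremote"'.
        cleaned = section.strip("'")
        if cleaned.startswith('remote "') and cleaned.endswith('"'):
            name = cleaned[len('remote "'):-1]
            remotes[name] = parser[section]["url"]
    return remotes


if __name__ == "__main__":
    for name, url in list_remotes(CONFIG_TEXT).items():
        print(name, url)
```

The printed pairs correspond to the two rows that `dvc remote list` reports in the next hunk.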
|
|
||
|
|
@@ -124,7 +124,7 @@ endpointurl = https://object-storage.example.com | |
| ```dvc | ||
| $ dvc remote list | ||
| myremote /path/to/remote | ||
| newremote s3://mybucket/myproject | ||
| newremote s3://mybucket/path | ||
| ``` | ||
|
|
||
| ## Example: Change the name of a remote | ||
|
|
||