diff --git a/content/docs/command-reference/get-url.md b/content/docs/command-reference/get-url.md index 5f4b1312ec..e4ee535110 100644 --- a/content/docs/command-reference/get-url.md +++ b/content/docs/command-reference/get-url.md @@ -12,9 +12,8 @@ Download a file or directory from a supported URL (for example `s3://`, usage: dvc get-url [-h] [-q | -v] url [out] positional arguments: - url Location of the data to download. - See supported URLs below. - out Destination path to put files in + url (See supported URLs in the description.) + out Destination path to put files in. ``` ## Description @@ -34,14 +33,17 @@ directory will be placed inside. DVC supports several types of (local or) remote locations (protocols): -| Type | Description | `url` format | -| ------- | -------------- | ------------------------------------------ | -| `local` | Local path | `/path/to/local/data` | -| `s3` | Amazon S3 | `s3://mybucket/data` | -| `gs` | Google Storage | `gs://mybucket/data` | -| `ssh` | SSH server | `ssh://user@example.com/path/to/data` | -| `hdfs` | HDFS to file\* | `hdfs://user@example.com/path/to/data.csv` | -| `http` | HTTP to file\* | `https://example.com/path/to/data.csv` | +| Type | Description | `url` format example | +| -------- | ---------------------------- | ---------------------------------------------------------- | +| `s3` | Amazon S3 | `s3://bucket/data` | +| `azure` | Microsoft Azure Blob Storage | `azure://container/data` | +| `gdrive` | Google Drive | `gdrive://<folder-id>/data` | +| `gs` | Google Cloud Storage | `gs://bucket/data` | +| `ssh` | SSH server | `ssh://user@example.com/path/to/data` | +| `hdfs` | HDFS to file\* | `hdfs://user@example.com/path/to/data.csv` | +| `http` | HTTP to file\* | `https://example.com/path/to/data.csv` | +| `webdav` | WebDav to file\* | `webdavs://example.com/public.php/webdav/path/to/data.csv` | +| `local` | Local path | `/path/to/local/data` | > If you installed DVC via `pip` and plan to use cloud services as remote > storage,
you might need to install these optional dependencies: `[s3]`, @@ -49,8 +51,15 @@ DVC supports several types of (local or) remote locations (protocols): > include them all. The command should look like this: `pip install "dvc[s3]"`. > (This example installs `boto3` library along with DVC to support S3 storage.) -\* HDFS and HTTP **do not** support downloading entire directories, only single -files. +\* Notes on remote locations: + +- HDFS, HTTP, and WebDav **do not** support downloading entire directories, only + single files. + +- `remote://myremote/path/to/file` notation just means that a DVC + [remote](/doc/command-reference/remote) `myremote` is defined. When DVC is + running, it automatically expands this URL into a regular S3, SSH, GS, etc. + URL by appending `/path/to/file` to `myremote`'s configured base path. Another way to understand the `dvc get-url` command is as a tool for downloading data files. On GNU/Linux systems for example, instead of `dvc get-url` with @@ -73,19 +82,6 @@ $ wget https://example.com/path/to/data.csv
-### Click and expand for a local example - -```dvc -$ dvc get-url /local/path/to/data -``` - -The above command will copy the `/local/path/to/data` file or directory into -`./dir`. - -
- -
- ### Click for Amazon S3 example This command will copy an S3 object into the current working directory with the @@ -157,3 +153,16 @@ $ dvc get-url https://example.com/path/to/file ```
+ +### Click and expand for a local example + +```dvc +$ dvc get-url /local/path/to/data +``` + +The above command will copy the `/local/path/to/data` file or directory into +the current working directory. + + + +
diff --git a/content/docs/command-reference/import-url.md b/content/docs/command-reference/import-url.md index fe12923b97..30e8b811d2 100644 --- a/content/docs/command-reference/import-url.md +++ b/content/docs/command-reference/import-url.md @@ -14,9 +14,8 @@ usage: dvc import-url [-h] [-q | -v] [--file <filename>] [--no-exec] url [out] positional arguments: - url Location of the data to import. - See supported URLs below. - out Destination path to put files in + url (See supported URLs in the description.) + out Destination path to put files in. ``` ## Description @@ -56,16 +55,18 @@ source. DVC supports several types of (local or) remote locations (protocols): -| Type | Description | `url` format | -| -------- | --------------------------------------------------- | ------------------------------------------ | -| `local` | Local path | `/path/to/local/data` | -| `s3` | Amazon S3 | `s3://mybucket/data` | -| `azure` | Microsoft Azure Blob Storage | `azure://my-container-name/path/to/data` | -| `gs` | Google Cloud Storage | `gs://mybucket/data` | -| `ssh` | SSH server | `ssh://user@example.com/path/to/data` | -| `hdfs` | HDFS to file (explanation below) | `hdfs://user@example.com/path/to/data.csv` | -| `http` | HTTP to file with _strong ETag_ (explanation below) | `https://example.com/path/to/data.csv` | -| `remote` | Remote path (see explanation below) | `remote://myremote/path/to/data` | +| Type | Description | `url` format example | +| -------- | --------------------------------- | ---------------------------------------------------------- | +| `s3` | Amazon S3 | `s3://bucket/data` | +| `azure` | Microsoft Azure Blob Storage | `azure://container/data` | +| `gdrive` | Google Drive | `gdrive://<folder-id>/data` | +| `gs` | Google Cloud Storage | `gs://bucket/data` | +| `ssh` | SSH server | `ssh://user@example.com/path/to/data` | +| `hdfs` | HDFS to file\* | `hdfs://user@example.com/path/to/data.csv` | +| `http` | HTTP to file with _strong ETag_\* |
`https://example.com/path/to/data.csv` | +| `webdav` | WebDav to file\* | `webdavs://example.com/public.php/webdav/path/to/data.csv` | +| `local` | Local path | `/path/to/local/data` | +| `remote` | Remote path\* | `remote://remote-name/data` | > If you installed DVC via `pip` and plan to use cloud services as remote > storage, you might need to install these optional dependencies: `[s3]`, @@ -73,10 +74,10 @@ DVC supports several types of (local or) remote locations (protocols): > include them all. The command should look like this: `pip install "dvc[s3]"`. > (This example installs `boto3` library along with DVC to support S3 storage.) -Specific explanations: +\* Notes on remote locations: -- HDFS and HTTP **do not** support downloading entire directories, only single - files. +- HDFS, HTTP, and WebDav **do not** support downloading entire directories, only + single files. - In case of HTTP, [strong ETag](https://en.wikipedia.org/wiki/HTTP_ETag#Strong_and_weak_validation) diff --git a/content/docs/command-reference/remote/add.md b/content/docs/command-reference/remote/add.md index ec1b03c0e3..64c8087664 100644 --- a/content/docs/command-reference/remote/add.md +++ b/content/docs/command-reference/remote/add.md @@ -12,9 +12,8 @@ usage: dvc remote add [-h] [--global | --system | --local] [-q | -v] [-d] [-f] name url positional arguments: - name Name of the remote - url Remote location. - See full list of supported URLs below. + name Name of the remote. + url (See supported URLs in the examples below.) ``` ## Description @@ -94,7 +93,7 @@ The following are the types of remote storage (protocols) supported: > [Create a Bucket](https://docs.aws.amazon.com/AmazonS3/latest/gsg/CreatingABucket.html). 
```dvc -$ dvc remote add -d s3remote url s3://my-bucket/my-key +$ dvc remote add -d s3remote s3://mybucket/path ``` By default, DVC expects your AWS CLI is already @@ -134,7 +133,7 @@ configure the remote's `endpointurl` explicitly: For example: ```dvc -$ dvc remote add -d myremote s3://my-bucket/path/to/dir +$ dvc remote add -d myremote s3://mybucket/path/to/dir $ dvc remote modify myremote endpointurl \ https://object-storage.example.com ``` @@ -146,7 +145,7 @@ S3 remotes can also be configured entirely via environment variables: ```dvc $ export AWS_ACCESS_KEY_ID="<aws_access_key_id>" $ export AWS_SECRET_ACCESS_KEY="<aws_secret_access_key>" -$ dvc remote add -d myremote s3://my-bucket/my/key +$ dvc remote add -d myremote s3://mybucket/my/path ``` For more information about the variables DVC supports, please visit @@ -159,7 +158,7 @@ For more information about the variables DVC supports, please visit ### Click for Microsoft Azure Blob Storage ```dvc -$ dvc remote add -d myremote azure://my-container-name/path +$ dvc remote add -d myremote azure://mycontainer/path $ dvc remote modify --local myremote connection_string \ 'my-connection-string' ``` @@ -173,7 +172,7 @@ variables: ```dvc $ export AZURE_STORAGE_CONNECTION_STRING='<connection-string>' -$ export AZURE_STORAGE_CONTAINER_NAME='my-container-name' +$ export AZURE_STORAGE_CONTAINER_NAME='mycontainer' $ dvc remote add -d myremote 'azure://' ``` @@ -410,7 +409,7 @@ region. > [Create a Bucket](https://docs.aws.amazon.com/AmazonS3/latest/gsg/CreatingABucket.html). ```dvc -$ dvc remote add -d myremote s3://mybucket/myproject +$ dvc remote add -d myremote s3://mybucket/path Setting 'myremote' as a default remote.
$ dvc remote modify myremote region us-east-2 @@ -420,7 +419,7 @@ The project's config file (`.dvc/config`) now looks like this: ```ini ['remote "myremote"'] -url = s3://mybucket/myproject +url = s3://mybucket/path region = us-east-2 [core] remote = myremote @@ -430,13 +429,13 @@ The list of remotes should now be: ```dvc $ dvc remote list -myremote s3://mybucket/myproject +myremote s3://mybucket/path ``` You can overwrite existing remotes using `-f` with `dvc remote add`: ```dvc -$ dvc remote add -f myremote s3://mybucket/mynewproject +$ dvc remote add -f myremote s3://mybucket/another-path ``` List remotes again to view the updated remote: @@ -444,5 +443,5 @@ List remotes again to view the updated remote: ```dvc $ dvc remote list -myremote s3://mybucket/mynewproject +myremote s3://mybucket/another-path ``` diff --git a/content/docs/command-reference/remote/index.md b/content/docs/command-reference/remote/index.md index c1d516d9db..6aa54dee17 100644 --- a/content/docs/command-reference/remote/index.md +++ b/content/docs/command-reference/remote/index.md @@ -103,7 +103,7 @@ remote = myremote > [Create a Bucket](https://docs.aws.amazon.com/AmazonS3/latest/gsg/CreatingABucket.html). 
```dvc -$ dvc remote add newremote s3://mybucket/myproject +$ dvc remote add newremote s3://mybucket/path $ dvc remote modify newremote endpointurl https://object-storage.example.com ``` @@ -115,7 +115,7 @@ url = /path/to/remote [core] remote = myremote ['remote "newremote"'] -url = s3://mybucket/myproject +url = s3://mybucket/path endpointurl = https://object-storage.example.com ``` @@ -124,7 +124,7 @@ endpointurl = https://object-storage.example.com ```dvc $ dvc remote list myremote /path/to/remote -newremote s3://mybucket/myproject +newremote s3://mybucket/path ``` ## Example: Change the name of a remote diff --git a/content/docs/command-reference/remote/modify.md b/content/docs/command-reference/remote/modify.md index b11f63b73e..3de966a489 100644 --- a/content/docs/command-reference/remote/modify.md +++ b/content/docs/command-reference/remote/modify.md @@ -66,7 +66,7 @@ The following config options are available for all remote types: below): ```dvc - $ dvc remote modify s3remote url s3://my-bucket/my/key + $ dvc remote modify s3remote url s3://mybucket/path ``` Or a _local remote_ (a directory in the file system): @@ -105,7 +105,7 @@ these settings, you could use the following options. 
- `url` - remote location, in the `s3://<bucket>/<key>` format: ```dvc - $ dvc remote modify myremote url s3://my-bucket/my/key + $ dvc remote modify myremote url s3://mybucket/my/path ``` - `region` - change S3 remote region: @@ -240,7 +240,7 @@ To communicate with a remote object storage that supports an S3 compatible API configure the remote's `endpointurl` explicitly: ```dvc -$ dvc remote add -d myremote s3://my-bucket/path/to/dir +$ dvc remote add -d myremote s3://mybucket/path/to/dir $ dvc remote modify myremote endpointurl \ https://object-storage.example.com ``` @@ -250,7 +250,7 @@ S3 remotes can also be configured entirely via environment variables: ```dvc $ export AWS_ACCESS_KEY_ID='<aws_access_key_id>' $ export AWS_SECRET_ACCESS_KEY='<aws_secret_access_key>' -$ dvc remote add -d myremote s3://my-bucket/my/key +$ dvc remote add -d myremote s3://mybucket/my/path ``` For more information about the variables DVC supports, please visit @@ -265,7 +265,7 @@ For more information about the variables DVC supports, please visit - `url` - remote location, in the `azure://<container>/<path>` format: ```dvc - $ dvc remote modify myremote url azure://my-container-name/path + $ dvc remote modify myremote url azure://mycontainer/path ``` - `connection_string` - connection string: @@ -717,7 +717,7 @@ Let's first set up a _default_ S3 remote. > [Create a Bucket](https://docs.aws.amazon.com/AmazonS3/latest/gsg/CreatingABucket.html). ```dvc -$ dvc remote add -d myremote s3://mybucket/myproject +$ dvc remote add -d myremote s3://mybucket/path Setting 'myremote' as a default remote.
``` @@ -731,7 +731,7 @@ Now the project config file should look like this: ```ini ['remote "myremote"'] -url = s3://mybucket/storage +url = s3://mybucket/path profile = myusername [core] remote = myremote diff --git a/content/docs/command-reference/remote/remove.md b/content/docs/command-reference/remote/remove.md index c1939aa9cf..12bfa6eb36 100644 --- a/content/docs/command-reference/remote/remove.md +++ b/content/docs/command-reference/remote/remove.md @@ -46,7 +46,7 @@ The `name` argument is required. Add Amazon S3 remote: ```dvc -$ dvc remote add myremote s3://mybucket/myproject +$ dvc remote add myremote s3://mybucket/path ``` Remove it: diff --git a/content/docs/command-reference/remote/rename.md b/content/docs/command-reference/remote/rename.md index 7504492f5b..06f61c8aac 100644 --- a/content/docs/command-reference/remote/rename.md +++ b/content/docs/command-reference/remote/rename.md @@ -50,7 +50,7 @@ DVC remote, respectively. Add Amazon S3 remote: ```dvc -$ dvc remote add myremote s3://mybucket/myproject +$ dvc remote add myremote s3://mybucket/path ``` Rename it: diff --git a/content/docs/command-reference/status.md b/content/docs/command-reference/status.md index 1ab31228ca..a31e00195c 100644 --- a/content/docs/command-reference/status.md +++ b/content/docs/command-reference/status.md @@ -227,7 +227,7 @@ what files we have generated but haven't pushed to the remote yet: ```dvc $ dvc remote list -storage s3://dvc-remote +storage s3://bucket/path ``` And would like to check what files we have generated but haven't pushed to the diff --git a/content/docs/use-cases/sharing-data-and-model-files.md b/content/docs/use-cases/sharing-data-and-model-files.md index 7b46bd91af..e2930f470b 100644 --- a/content/docs/use-cases/sharing-data-and-model-files.md +++ b/content/docs/use-cases/sharing-data-and-model-files.md @@ -31,7 +31,7 @@ to the bucket where the data should be stored to the `dvc remote add` command. 
For example: ```dvc -$ dvc remote add -d myremote s3://mybucket/myproject +$ dvc remote add -d myremote s3://mybucket/path Setting 'myremote' as a default remote. ``` @@ -43,7 +43,7 @@ remote section for it: ```dvc ['remote "myremote"'] -url = s3://mybucket/myproject +url = s3://mybucket/path [core] remote = myremote ``` diff --git a/content/docs/user-guide/external-dependencies.md b/content/docs/user-guide/external-dependencies.md index 3241fa0be7..c23a486f12 100644 --- a/content/docs/user-guide/external-dependencies.md +++ b/content/docs/user-guide/external-dependencies.md @@ -54,12 +54,12 @@ $ dvc run -n download_file ```dvc $ dvc run -n download_file - -d azure://my-container-name/data.txt \ + -d azure://mycontainer/data.txt \ -o data.txt \ az storage copy \ -d data.json \ --source-account-name my-account \ - --source-container my-container-name \ + --source-container mycontainer \ --source-blob data.txt ```