diff --git a/content/docs/command-reference/init.md b/content/docs/command-reference/init.md
index 365499f15b..b0aaf1766a 100644
--- a/content/docs/command-reference/init.md
+++ b/content/docs/command-reference/init.md
@@ -32,83 +32,54 @@ advanced scenarios:
 ### Initializing DVC in subdirectories
 
 `--subdir` must be provided to initialize DVC in a subdirectory of a Git
-repository. DVC still expects to find the Git repository (will check all
-directories up to the system root to find `.git/`). This options does not affect
-any config files, `.dvc/` directory is created the same way as in the default
-mode. This way multiple DVC projects can be initialized in a single
-Git repository, providing isolation between projects.
+repository. DVC still expects to find a Git root (will check all directories up
+to the system root to find `.git/`). This option does not affect any config
+files, `.dvc/` directory is created the same way as in the default mode. This
+way multiple DVC projects can be initialized in a single Git
+repository, providing isolation between projects.
 
 #### When is this useful?
 
 This option is mostly useful in the scenario of a
-[monorepo](https://en.wikipedia.org/wiki/Monorepo) (Git repository split into
-several project directories), but can also be used with other patterns when such
-isolation is needed. `dvc init --subdir` mitigates the issues of initializing
-DVC in the Git repo root:
+[monorepo](https://en.wikipedia.org/wiki/Monorepo) (Git repo split into several
+project directories), but can also be used with other patterns when such
+isolation is needed. `dvc init --subdir` mitigates possible limitations of
+initializing DVC in the Git root:
 
 - Repository maintainers might not allow a top level `.dvc/` directory,
-  especially if DVC is being used by several sub-projects (monorepo).
+  especially if DVC is already being used by several sub-projects (monorepo).
 
 - DVC [internals](/doc/user-guide/dvc-files-and-directories) (config file, cache
-  directory, etc.) are shared across different sub-projects. This forces all of
-  them to use the same DVC settings and
+  directory, etc.) would be shared across different subdirectories. This forces
+  all of them to use the same DVC settings and
   [remote storage](/doc/command-reference/remote).
 
 - By default, DVC commands like `dvc pull` and `dvc repro` explore the whole DVC
   repository to find DVC-tracked data and pipelines to work with. This can be
   inefficient for large monorepos.
 
-- Other commands such as `dvc status` and `dvc metrics show` would produce
-  unexpected results if not constrained to a single project scope.
+- Commands such as `dvc status` and `dvc metrics show` would produce unexpected
+  results if not constrained to a single project scope.
 
 #### How does it affect DVC commands?
 
 The project root is found by DVC by looking for `.dvc/` from the current working
 directory, up. It defines the scope of action for most DVC
-commands (e.g. `dvc repro`, `dvc pull`, `dvc metrics diff`), meaning that only
-`dvc.yaml`, `.dvc` files, etc. inside the project are usable by the commands.
+commands (e.g. `dvc repro`, `dvc pull`, `dvc metrics diff`, etc.), meaning that
+only `dvc.yaml` and `.dvc` files inside the project are usable by these
+commands.
 
-With `--subdir`, the project root will be found before the Git root, making sure
-the scope of DVC commands run here is constrained to this project alone, even if
-there are more DVC-related files elsewhere in the repo.
+With `--subdir`, the project root will be found before the Git root, so the
+scope of DVC commands run here is constrained to this project alone. DVC-related
+files elsewhere in the repo are ignored by them. Similarly, DVC commands run
+outside this project root (if nested inside another DVC project) will ignore
+this project's contents completely.
 
-If there are multiple `--subdir` projects, but not nested, e.g.:
+The only thing shared among nested `--subdir` projects and parent repository is
+the Git history.
 
-```dvc
-. # git init
-├── .git
-├── project-A
-│   ├── .dvc # dvc init --subdir
-│   ...
-├── project-B
-│   ├── .dvc # dvc init --subdir
-│   ...
-```
-
-DVC considers A and B separate projects. Any DVC command run in `project-A` is
-not aware of `project-B`. However, commands that involve versioning (like
-`dvc diff`, among others) access the commit history from the Git root (`.`).
-
-> `.` is not a DVC project in this case, so most DVC commands can't be run
-> there.
-
-If there are nested `--subdir` projects e.g.:
-
-```dvc
-project-A
-├── .dvc # git init && dvc init
-├── .git
-├── dvc.yaml
-├── ...
-├── project-B
-│   ├── .dvc # dvc init --subdir
-│   ├── data-B.dvc
-│   ...
-```
-
-Nothing changes for the inner projects. And any DVC command run in the outer one
-actively ignores the nested project directories. For example, using `dvc pull`
-in `project-A` wouldn't download data for the `data-B.dvc` file.
+> Note that nested DVC projects are always isolated from their parents, and vice
+> versa, whether using `--subdir` or not.
 
 ### Initializing DVC without Git
 
diff --git a/content/docs/command-reference/remote/add.md b/content/docs/command-reference/remote/add.md
index adc643d7aa..4d3ade3bcb 100644
--- a/content/docs/command-reference/remote/add.md
+++ b/content/docs/command-reference/remote/add.md
@@ -131,7 +131,8 @@ For example:
 
 ```dvc
 $ dvc remote add -d myremote s3://mybucket/path/to/dir
-$ dvc remote modify myremote endpointurl https://object-storage.example.com
+$ dvc remote modify myremote endpointurl \
+      https://object-storage.example.com
 ```
 
 > See `dvc remote modify` for a full list of S3 API parameters.
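Review note on the `init.md` hunk above: the changed text says the project root is found by looking for `.dvc/` from the current working directory upward, and that with `--subdir` this root is found before the Git root. The following is only a minimal Python sketch of that lookup rule, written to illustrate the wording. It makes no claim about how DVC implements it, and `find_project_root` is a hypothetical helper name, not a DVC API.

```python
from pathlib import Path
from typing import Optional


def find_project_root(start: Path) -> Optional[Path]:
    """Return the nearest directory, from `start` upward, that contains `.dvc/`.

    Hypothetical illustration of the rule described in the docs text;
    this is not DVC's actual code.
    """
    current = start.resolve()
    # Check the starting directory first, then each parent up to the filesystem root.
    for candidate in (current, *current.parents):
        if (candidate / ".dvc").is_dir():
            return candidate
    return None
```

Under this rule, a call made from inside a nested `--subdir` project stops at the inner `.dvc/` and never reaches the outer one, which matches the statement that commands run there are constrained to that project alone.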
diff --git a/content/docs/command-reference/remote/modify.md b/content/docs/command-reference/remote/modify.md
index 279a791719..ce9211a97d 100644
--- a/content/docs/command-reference/remote/modify.md
+++ b/content/docs/command-reference/remote/modify.md
@@ -90,19 +90,19 @@ these settings, you could use the following options:
   $ dvc remote modify myremote region us-east-2
   ```
 
-- `profile` - credentials profile name to use to access S3:
+- `profile` - credentials profile name to access S3:
 
   ```dvc
   $ dvc remote modify myremote profile myprofile
   ```
 
-- `credentialpath` - credentials path to use to access S3:
+- `credentialpath` - credentials path to access S3:
 
   ```dvc
   $ dvc remote modify myremote credentialpath /path/to/my/creds
   ```
 
-- `endpointurl` - endpoint URL to use to access S3:
+- `endpointurl` - endpoint URL to access S3:
 
   ```dvc
   $ dvc remote modify myremote endpointurl https://myendpoint.com
@@ -168,21 +168,24 @@ these settings, you could use the following options:
   for specific grantees\*\*. Grantee can read object and its metadata.
 
   ```dvc
-  $ dvc remote modify myremote grant_read id=aws-canonical-user-id,id=another-aws-canonical-user-id
+  $ dvc remote modify myremote grant_read \
+      id=aws-canonical-user-id,id=another-aws-canonical-user-id
   ```
 
 - `grant_read_acp`\* - grants `READ_ACP` permissions at object level access
   control list for specific grantees\*\*. Grantee can read the object's ACP.
 
   ```dvc
-  $ dvc remote modify myremote grant_read_acp id=aws-canonical-user-id,id=another-aws-canonical-user-id
+  $ dvc remote modify myremote grant_read_acp \
+      id=aws-canonical-user-id,id=another-aws-canonical-user-id
   ```
 
 - `grant_write_acp`\* - grants `WRITE_ACP` permissions at object level access
   control list for specific grantees\*\*. Grantee can modify the object's ACP.
 
   ```dvc
-  $ dvc remote modify myremote grant_write_acp id=aws-canonical-user-id,id=another-aws-canonical-user-id
+  $ dvc remote modify myremote grant_write_acp \
+      id=aws-canonical-user-id,id=another-aws-canonical-user-id
   ```
 
 - `grant_full_control`\* - grants `FULL_CONTROL` permissions at object level
@@ -190,7 +193,8 @@ these settings, you could use the following options:
   grant_read_acp + grant_write_acp
 
   ```dvc
-  $ dvc remote modify myremote grant_full_control id=aws-canonical-user-id,id=another-aws-canonical-user-id
+  $ dvc remote modify myremote grant_full_control \
+      id=aws-canonical-user-id,id=another-aws-canonical-user-id
   ```
 
   > \* `grant_read`, `grant_read_acp`, `grant_write_acp` and
@@ -221,7 +225,8 @@ For example:
 
 ```dvc
 $ dvc remote add myremote s3://path/to/dir
-$ dvc remote modify myremote endpointurl https://object-storage.example.com
+$ dvc remote modify myremote endpointurl \
+      https://object-storage.example.com
 ```
 
 S3 remotes can also be configured entirely via environment variables:
@@ -250,7 +255,8 @@ For more information about the variables DVC supports, please visit
 - `connection_string` - connection string.
 
   ```dvc
-  $ dvc remote modify --local myremote connection_string "my-connection-string"
+  $ dvc remote modify --local myremote connection_string \
+      "my-connection-string"
   ```
 
   > The connection string contains sensitive user info. Therefore, it's safer to
@@ -274,8 +280,8 @@ a full guide on using Google Drive as DVC remote storage.
   [possible formats](/doc/user-guide/setup-google-drive-remote#url-format).
 
   ```dvc
-  $ dvc remote modify myremote \
-      url gdrive://0AIac4JZqHhKmUk9PDA/dvcstore
+  $ dvc remote modify myremote url \
+      gdrive://0AIac4JZqHhKmUk9PDA/dvcstore
   ```
 
 - `gdrive_client_id` - Client ID for authentication with OAuth 2.0 when using a
@@ -415,13 +421,13 @@ more information.
   $ dvc remote modify myremote oss_endpoint endpoint
   ```
 
-- `oss_key_id` - OSS key ID to use to access a remote.
+- `oss_key_id` - OSS key ID to access the remote.
 
   ```dvc
   $ dvc remote modify myremote --local oss_key_id my-key-id
   ```
 
-- `oss_key_secret` - OSS secret key for authorizing access into a remote.
+- `oss_key_secret` - OSS secret key for authorizing access into the remote.
 
   ```dvc
   $ dvc remote modify myremote --local oss_key_secret my-key-secret
@@ -440,41 +446,43 @@ more information.
 - `url` - remote location URL.
 
   ```dvc
-  $ dvc remote modify myremote url ssh://user@example.com:1234/path/to/remote
+  $ dvc remote modify myremote url \
+      ssh://user@example.com:1234/absolute/path
   ```
 
-- `user` - username to use to access a remote. The order in which dvc searches
-  for username:
-
-  1. `user` specified in one of the dvc configs;
-  2. `user` specified in the url(e.g. `ssh://user@example.com/path`);
-  3. `user` specified in `~/.ssh/config` for remote host;
-  4. current user;
+- `user` - username to access the remote.
 
   ```dvc
   $ dvc remote modify --local myremote user myuser
   ```
 
-- `port` - port to use to access a remote. The order in which dvc searches for
-  port:
+  The order in which DVC picks the username:
 
-  1. `port` specified in one of the dvc configs;
-  2. `port` specified in the url(e.g. `ssh://example.com:1234/path`);
-  3. `port` specified in `~/.ssh/config` for remote host;
-  4. default ssh port 22;
+  1. `user` parameter set with this command (found in `.dvc/config`);
+  2. User defined in the URL (e.g. `ssh://user@example.com/path`);
+  3. User defined in `~/.ssh/config` for this host (URL);
+  4. Current user
+
+- `port` - port to access the remote.
 
   ```dvc
   $ dvc remote modify myremote port 2222
   ```
 
-- `keyfile` - path to private key to use to access a remote.
+  The order in which DVC decides the port number:
+
+  1. `port` parameter set with this command (found in `.dvc/config`);
+  2. Port defined in the URL (e.g. `ssh://example.com:1234/path`);
+  3. Port defined in `~/.ssh/config` for this host (URL);
+  4. Default SSH port 22
+
+- `keyfile` - path to private key to access the remote.
 
   ```dvc
   $ dvc remote modify myremote keyfile /path/to/keyfile
   ```
 
-- `password` - a private key passphrase or a password to use to use when
-  accessing a remote.
+- `password` - a private key passphrase or a password to access the remote.
 
   ```dvc
   $ dvc remote modify --local myremote password mypassword
@@ -484,8 +492,8 @@ more information.
   > safer to add them with the `--local` option, so they're written to a
   > Git-ignored config file.
 
-- `ask_password` - ask for a private key passphrase or a password to use when
-  accessing a remote.
+- `ask_password` - ask for a private key passphrase or a password to access the
+  remote.
 
   ```dvc
   $ dvc remote modify myremote ask_password true
@@ -509,7 +517,7 @@ more information.
 
 ### Click for HDFS
 
-- `user` - username to use to access a remote.
+- `user` - username to access the remote.
 
   ```dvc
   $ dvc remote modify --local myremote user myuser
@@ -524,7 +532,7 @@ more information.
 
 ### Click for HTTP
 
-- `auth` - authentication method to use when accessing a remote. The accepted
+- `auth` - authentication method to use when accessing the remote. The accepted
   values are:
 
   - `basic` -
@@ -551,15 +559,17 @@ more information.
   ```
 
 - `user` - username to use when the `auth` parameter is set to `basic` or
-  `digest`. The order in which DVC searches for username:
-
-  1. `user` specified in one of the DVC configs;
-  2. `user` specified in the url(e.g. `http://user@example.com/path`);
+  `digest`.
 
   ```dvc
   $ dvc remote modify --local myremote user myuser
   ```
 
+  The order in which DVC picks the username:
+
+  1. `user` parameter set with this command (found in `.dvc/config`);
+  2. User defined in the URL (e.g. `http://user@example.com/path`);
+
 - `password` - password to use for any `auth` method.
 
   ```dvc