Closed
46 commits
0be6f03
Update import-data.md
sarthakforwet May 20, 2020
b0f01a5
Update add-files.md
sarthakforwet May 20, 2020
2ba532f
Update versioning.md
sarthakforwet May 20, 2020
bb1edce
Update pipelines.md
sarthakforwet May 20, 2020
9a68cef
Update define-ml-pipeline.md
sarthakforwet May 20, 2020
c2fe145
Update data-registries.md
sarthakforwet May 20, 2020
0fbf551
Update dvc-files-and-directories.md
sarthakforwet May 20, 2020
ec28a1d
Update external-dependencies.md
sarthakforwet May 20, 2020
af3a1ff
Update add.md
sarthakforwet May 20, 2020
afe146c
Update checkout.md
sarthakforwet May 20, 2020
980fd31
Update commit.md
sarthakforwet May 20, 2020
f5775c5
Update destroy.md
sarthakforwet May 20, 2020
9d04a9b
Update init.md
sarthakforwet May 20, 2020
d33df7e
Update move.md
sarthakforwet May 20, 2020
bfcc276
Update index.md
sarthakforwet May 20, 2020
ae564a6
Update index.md
sarthakforwet May 20, 2020
a57d17a
Update push.md
sarthakforwet May 21, 2020
d2552c8
Update index.md
sarthakforwet May 21, 2020
ed96d47
Update index.md
sarthakforwet May 21, 2020
31a973e
Update pull.md
sarthakforwet May 21, 2020
d71fcec
Update add-files.md
sarthakforwet May 21, 2020
9af23c8
Update add-files.md
sarthakforwet May 21, 2020
51b5bfa
Update import-data.md
sarthakforwet May 21, 2020
80ab644
Update import-data.md
sarthakforwet May 21, 2020
1e375c6
Update define-ml-pipeline.md
sarthakforwet May 21, 2020
03ec69d
Update data-registries.md
sarthakforwet May 21, 2020
66da2ea
Update external-dependencies.md
sarthakforwet May 21, 2020
3fa7974
Update add.md
sarthakforwet May 21, 2020
06538a8
Update checkout.md
sarthakforwet May 21, 2020
987433c
Added reference to .dvcignore to checkout.md
sarthakforwet May 21, 2020
318b225
Added reference to .dvcignore in commit.md
sarthakforwet May 21, 2020
8a49687
Added reference to .dvcignore
sarthakforwet May 21, 2020
0a67b3b
Added reference to .dvcignore
sarthakforwet May 21, 2020
b791813
Added reference to .dvcignore
sarthakforwet May 21, 2020
8d875c7
Added reference to .dvcignore
sarthakforwet May 21, 2020
22345a2
Added reference to .dvcignore
sarthakforwet May 21, 2020
fc469b2
Added referene to .dvcignore
sarthakforwet May 21, 2020
4ecbc84
Added reference to .dvcignore
sarthakforwet May 21, 2020
3614bce
Added reference to .dvcignore
sarthakforwet May 21, 2020
ac3a99e
Added reference to .dvcignore
sarthakforwet May 21, 2020
75fa027
Added reference to .dvcignore
sarthakforwet May 21, 2020
15cd8d4
Added reference to .dvcignore
sarthakforwet May 21, 2020
e06b0f8
Added reference to .dvcignore
sarthakforwet May 21, 2020
a9caad9
Added reference to .dvcignore
sarthakforwet May 21, 2020
8270449
Added reference to .dvcignore
sarthakforwet May 21, 2020
9bce697
Added reference to .dvcignore
sarthakforwet May 21, 2020
3 changes: 3 additions & 0 deletions content/docs/command-reference/add.md
Expand Up @@ -243,3 +243,6 @@ In this case, a DVC-file is generated for each file in the `pics/` directory
tree. No top-level DVC-file is generated, which is typically less convenient.
For example, we cannot use the directory structure as one unit with `dvc run` or
other commands.

To untrack a file or directory just add [patterns](https://git-scm.com/docs/gitignore) (corresponding to the location of file or directory) under `.dvcignore` file.<br>
See [.dvcignore](docs/user-guide/.dvcignore) for more details.
Comment on lines +247 to +248
Contributor

Adding a dvcignore ref in this doc is def. needed but here you're just appending at the end of the examples. Please read the page (rendered in the website), check the structure, and find a better place to add this ref.

Adding an add example specific to dvcignore would also be nice.
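
A dvcignore-specific `dvc add` example could look roughly like the sketch below. The `pics/` layout, the file names, and the `*.tmp` pattern are made up for illustration. The behavior noted in the comments (ignored files being left out when a directory is added) is an assumption that should be confirmed against the `.dvcignore` user guide and a local run before anything like this goes into the docs.

```shell
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical layout: two images plus a temporary file we do not want tracked.
mkdir -p pics
touch pics/cat.jpg pics/dog.jpg pics/notes.tmp

# Ignore temporary files using a gitignore-style pattern.
echo '*.tmp' >> .dvcignore

# Only call DVC when it is installed.
# Assumption: pics/notes.tmp is then left out of the tracked directory.
if command -v dvc >/dev/null 2>&1; then
  dvc init --no-scm && dvc add pics
fi
```

The ignored file stays in the workspace either way; only DVC's view of the directory is expected to change.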

2 changes: 2 additions & 0 deletions content/docs/command-reference/checkout.md
Expand Up @@ -230,3 +230,5 @@ MD5 (model.pkl) = 662eb7f64216d9c2c1088d0a5e2c6951
Previously this took two commands, `git checkout` followed by `dvc checkout`. We
can now skip the second one, which is automatically run for us. The workspace is
automatically synchronized accordingly.

One thing to note here is when we are using `dvc checkout`, it does not affect state of files and directories listed under `.dvcignore` as these are currently untracked by DVC and `dvc checkout` synchronizes only tracked files and directories with the versions specified in the current DVC-files. <br> See [.dvcignore](docs/user-guide/.dvcignore) for more details.
2 changes: 2 additions & 0 deletions content/docs/command-reference/commit.md
Expand Up @@ -70,6 +70,8 @@ force-update the [DVC-files](/doc/user-guide/dvc-file-format) and save data to
cache. They are still useful, but keep in mind that DVC can't guarantee
reproducibility in those cases.

Note that [patterns](https://git-scm.com/docs/gitignore) listed in `.dvcignore` are not updated as a result of `dvc commit` as they are not currently tracked by DVC. See [.dvcignore](docs/user-guide/.dvcignore) for more details.

Comment on lines 72 to +74
Contributor

This note was in a more appropriate place. The paragraph just needed formatting (see docs contrib guide). Also, I would not link "patterns" to the Git website in any of these docs, to avoid confusion.

Contributor

The link href is incorrect though, it's missing a / in the beginning, please run the app locally so you can see and try your changes. See docs contrib guide.

## Options

- `-d`, `--with-deps` - determines files to commit by tracking dependencies to
3 changes: 3 additions & 0 deletions content/docs/command-reference/destroy.md
Expand Up @@ -21,6 +21,9 @@ directory.) If you were using
cache, DVC will replace them with copies, so that your data is intact after the
project's destruction.

Note that `.dvcignore` will not get deleted as a result of `dvc destroy`.<br>
See [.dvcignore](docs/user-guide/.dvcignore) for more details.

## Options

- `-f`, `--force` - do not prompt when destroying this project.
3 changes: 3 additions & 0 deletions content/docs/command-reference/init.md
Expand Up @@ -223,3 +223,6 @@ repo
└── project-a
└── .dvc
```

Its quite intutive to add `.dvcignore` at the time of project intialization for management of tracking of files throughout the project.<br>
See [.dvcignore](docs/user-guide/.dvcignore) for more details.
11 changes: 11 additions & 0 deletions content/docs/command-reference/move.md
Expand Up @@ -71,6 +71,17 @@ outs:
md5: c8263e8422925b0872ee1fb7c953742a
path: other.csv
```
Note that when we try to use `dvc move` over a file whose pattern matches one of the patterns listed in `.dvcignore`, it would raise an error because that DVC-file was not tracked by DVC.

```dvc
$ dvc add data.csv
$ echo data.* >> .dvcignore
$ dvc move data.csv other.csv
ERROR: failed to move 'data.csv' -> 'other.csv' - Unable to find DVC-file with output 'data.csv'

Having any troubles? Hit us up at https://dvc.org/support, we are always happy to help!
```
See [.dvcignore](docs/user-guide/.dvcignore) for more details.
Comment on lines +74 to +84
Contributor

This also looks good to me, maybe just put it under an H2 heading such as ## Dvcignore effects. Same in other docs.

Contributor

Actually it's probably best without header and also without example. Just a very short note should do the trick here. We just want users to be aware of .dvcignore, not to explain it in every command ref.


## Options

2 changes: 2 additions & 0 deletions content/docs/command-reference/pipeline/index.md
Expand Up @@ -35,6 +35,8 @@ interconnected by their dependencies and outputs later. (See `dvc repro`.)
`dvc pipeline` commands help users display the existing project pipelines in
different ways.

DVC might remove ignored files(files listed in `.dvcignore`) upon `dvc run` or `dvc repro`. If they are not produced by a pipeline stage, they can be deleted permanently.<br>
See [.dvcignore](docs/user-guide/.dvcignore) for more details.
## Options

- `-h`, `--help` - prints the usage/help message, and exit.
3 changes: 3 additions & 0 deletions content/docs/command-reference/pull.md
Expand Up @@ -56,6 +56,9 @@ After a data file is in cache, `dvc pull` can use OS-specific mechanisms like
reflinks or hardlinks to put it in the workspace without copying. See
`dvc checkout` for more details.

Note that when you do `dvc pull` then the missing files whose corresponding DVC-files matches with the DVC-files in remote storage will be downloaded. But if a DVC-file is listed under `.dvcignore` then its corresponding file won't be downloaded.<br>
See [.dvcignore](docs/user-guide/.dvcignore) for more details.
Comment on lines +59 to +60
Contributor

Missing a space, and please don't add <br>s, just regular line breaks.

Suggested change
Note that when you do `dvc pull` then the missing files whose corresponding DVC-files matches with the DVC-files in remote storage will be downloaded. But if a DVC-file is listed under `.dvcignore` then its corresponding file won't be downloaded.<br>
See [.dvcignore](docs/user-guide/.dvcignore) for more details.
DVC might remove ignored files (files listed in `.dvcignore`) upon `dvc run` or `dvc repro`. If they are not produced by a pipeline stage, they can be deleted permanently. See [.dvcignore](docs/user-guide/.dvcignore) for more details.

It also needs formatting (max 80 char lines).
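
As a rough illustration of the 80-character limit, a long line can be hard-wrapped with the standard `fold` utility. This is only a sketch: the placeholder sentence is invented, and whatever formatter the docs contributing guide prescribes takes precedence over this approach.

```shell
# A made-up long line standing in for an unwrapped docs paragraph.
long_line='This is a placeholder sentence that is deliberately longer than eighty characters so that it has to be wrapped onto a second line.'

# fold -s breaks at spaces; -w 80 caps each output line at 80 columns.
wrapped=$(printf '%s\n' "$long_line" | fold -s -w 80)
printf '%s\n' "$wrapped"

# Length of the longest wrapped line, as a quick sanity check.
longest=$(printf '%s\n' "$wrapped" | awk '{ if (length($0) > max) max = length($0) } END { print max }')
echo "longest line: $longest"
```

Note that `fold -s` can leave a trailing space at each break, so a formatter-aware tool is usually the safer choice for Markdown.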


## Options

- `-a`, `--all-branches` - determines the files to download by examining
2 changes: 2 additions & 0 deletions content/docs/command-reference/push.md
Expand Up @@ -70,6 +70,8 @@ backward from the target [stage files](/doc/command-reference/run), through the
corresponding [pipelines](/doc/command-reference/pipeline), to find data files
to push.

Note that as `dvc push` uploads tracked files and directories to remote storage, it won't upload files and directories listed under `.dvcignore`.<br> See [.dvcignore](docs/user-guide/.dvcignore) for more details.
Contributor

Are you sure this is the case? Even if the data was tracked before adding it to .dvcignore? If so this note is OK, just again: no <br> and needs formatting.


## Options

- `-a`, `--all-branches` - determines the files to upload by examining DVC-files
2 changes: 2 additions & 0 deletions content/docs/command-reference/remote/index.md
Expand Up @@ -58,6 +58,8 @@ be used or these files could be edited manually.
For the typical process to share the <abbr>project</abbr> via remote, see
[Sharing Data And Model Files](/doc/use-cases/sharing-data-and-model-files).

Only those files will be present in remote storage which are not listed in `.dvcignore`.<br> See [.dvcignore](docs/user-guide/.dvcignore) for more details.

## Options

- `-h`, `--help` - prints the usage/help message, and exit.
3 changes: 3 additions & 0 deletions content/docs/tutorials/deep/define-ml-pipeline.md
Expand Up @@ -403,5 +403,8 @@ data/eval.txt:AUC: 0.624652
> document, our focus is DVC, not ML modeling, so we use a relatively small
> dataset without any advanced ML techniques.

Note that if a file is not produced by a pipeline stage and listed under `.dvcignore` then DVC might remove them upon `dvc run`<br>
See [.dvcignore](docs/user-guide/.dvcignore) for more details.

In the next chapter we will try to improve the metrics by changing our modeling
code and using reproducibility in our pipeline.
2 changes: 2 additions & 0 deletions content/docs/tutorials/get-started/add-files.md
Expand Up @@ -36,6 +36,8 @@ $ git commit -m "Add raw data to project"
Committing DVC-files with Git allows us to track different versions of the
<abbr>project</abbr> data as it evolves with the source code tracked by Git.

When we don't want DVC to track specific files and directories, we list them under `.dvcignore`.
<br>See [.dvcignore](docs/user-guide/.dvcignore) for more details.
<details>

### Expand to learn about DVC internals
2 changes: 2 additions & 0 deletions content/docs/tutorials/get-started/import-data.md
Expand Up @@ -76,6 +76,8 @@ The `url` and `rev_lock` subfields under `repo` are used to save the origin and

</details>

Suppose we want only a subset of files from the imported ones to work on and need remaining files in later stages. Meanwhile we also don't want DVC to track those files (as files as automatically tracked by DVC when imported) so we can just list them under `.dvcignore`. See [.dvcignore](docs/user-guide/.dvcignore) for more details.

Since this is not an official part of this _Get Started_, bring everything back
to normal with:

2 changes: 2 additions & 0 deletions content/docs/tutorials/pipelines.md
Expand Up @@ -396,3 +396,5 @@ DVC streamlines all of your experiments into a single, reproducible
<abbr>project</abbr>, and it makes it easy to share it with Git, including
dependencies. This collaboration feature provides the ability to review data
science research.

See also [.dvcignore](docs/user-guide/.dvcignore) to untrack specific files in a pipeline.
6 changes: 5 additions & 1 deletion content/docs/tutorials/versioning.md
Expand Up @@ -276,6 +276,9 @@ If you run `git status` you'll see that `data.dvc` is modified and currently
points to the `v1.0` version of the dataset, while code and model files are from
the `v2.0` tag.

Note that the contents under `.dvcgnore` file won't get affected when switching between versions concluding that the files that are untracked in one version will also remain untracked in other versions.
See [.dvcignore](docs/user-guide/.dvcignore) for more details.

<details>

### Expand to learn more about DVC internals
Expand Up @@ -342,7 +345,8 @@ was a dependency change. It also updates outputs and puts them into the
To make things a little simpler: if `dvc add` and `dvc checkout` provide a basic
mechanism to version control large data files or models, `dvc run` and
`dvc repro` provide a build system for ML models, which is similar to
[Make](https://www.gnu.org/software/make/) in software build automation.
[Make](https://www.gnu.org/software/make/) in software build automation.


## What's next?

3 changes: 3 additions & 0 deletions content/docs/use-cases/data-registries.md
Expand Up @@ -150,6 +150,9 @@ the data source (registry repo). This is achieved by creating a particular kind
of [DVC-file](/doc/user-guide/dvc-file-format) (a.k.a. _import stage_). This
file can be used staged and committed with Git.

As DVC automatically tracks the files downloaded via `dvc import`, we can list files which we don't want DVC to track under `.dvcignore`.<br>
See [.dvcignore](docs/user-guide/.dvcignore) for more details.

As an addition to the import workflow, and enabled the saved dependency, we can
easily bring it up to date in our consumer project(s) with `dvc update` whenever
the the dataset changes in the source repo (data registry):
2 changes: 2 additions & 0 deletions content/docs/user-guide/dvc-files-and-directories.md
Expand Up @@ -122,3 +122,5 @@ $ cat .dvc/cache/19/6a322c107c2572335158503c64bfba.dir
```

See also `dvc cache dir` to set the location of the cache directory.

Refer to `.dvcignore` to add [patterns](https://git-scm.com/docs/gitignore) which DVC ignores as if they are non-existent to it.
2 changes: 2 additions & 0 deletions content/docs/user-guide/external-dependencies.md
Expand Up @@ -165,6 +165,8 @@ Importing 'model.pkl (git@github.com:iterative/example-get-started)'
The command above creates `model.pkl.dvc`, where the external dependency is
specified (with the `repo` field).

See [.dvcignore](docs/user-guide/.dvcignore) for untracking unecessary files which were automatically tracked by DVC on running `dvc import` or `dvc import-url`.

<details>

### Expand to see resulting DVC-file