From 28f25c6827601ecb50c3bc1211e56b717c7c632a Mon Sep 17 00:00:00 2001
From: jeremydesroches <18587991+jeremydesroches@users.noreply.github.com>
Date: Fri, 13 Nov 2020 21:51:44 -0700
Subject: [PATCH 01/59] Initial commit with outline of new doc and navigation
---
content/docs/sidebar.json | 5 +
content/docs/user-guide/basic-concepts.md | 129 ++++++++++++++++++++++
2 files changed, 134 insertions(+)
create mode 100644 content/docs/user-guide/basic-concepts.md
diff --git a/content/docs/sidebar.json b/content/docs/sidebar.json
index 6794fbff8f..90241f4386 100644
--- a/content/docs/sidebar.json
+++ b/content/docs/sidebar.json
@@ -87,6 +87,11 @@
"slug": "what-is-dvc",
"source": "what-is-dvc.md"
},
+ {
+ "label": "Basic Concepts",
+ "slug": "basic-concepts",
+ "source": "basic-concepts.md"
+ },
"dvc-files-and-directories",
"merge-conflicts",
{
diff --git a/content/docs/user-guide/basic-concepts.md b/content/docs/user-guide/basic-concepts.md
new file mode 100644
index 0000000000..55b98df36a
--- /dev/null
+++ b/content/docs/user-guide/basic-concepts.md
@@ -0,0 +1,129 @@
+# Basic Concepts
+
+Intro and DVC philosophy...possible diagram of cache/remote/workspace
+
+## Cache
+
+_From `dvc cache`_
+
+The DVC Cache is where your data files, models, etc. (anything you want to
+version with DVC) are actually stored. The data files and directories visible in
+the workspace are links\* to (or copies of) the ones in cache.
+Learn more about its
+[structure](/doc/user-guide/dvc-files-and-directories#structure-of-the-cache-directory).
+
+_from tooltip_
+
+The DVC cache is a hidden storage area (by default located in the `.dvc/cache`
+directory) for files that are tracked by DVC, and their different versions.
+Learn more about its
+[structure](/doc/user-guide/dvc-files-and-directories#structure-of-the-cache-directory).
+
+## DVC Files
+
+_from dvc-files-and-directories_
+
+Once initialized in a project, DVC populates its installation
+directory (`.dvc/`) with the
+[internal directories and files](/doc/user-guide/dvc-files-and-directories#internal-directories-and-files) needed for DVC
+operation.
+
+Additionally, there are a few metafiles that support DVC's features:
+
+- Files ending with the `.dvc` extension are placeholders to track data files
+ and directories. A DVC project usually has one `.dvc` file per
+ large data file or directory being tracked.
+- `dvc.yaml` files (or _pipelines files_) specify stages that form the
+ pipeline(s) of a project, and how they connect (_dependency graph_ or DAG).
+
+ These normally have a matching `dvc.lock` file to record the pipeline state
+ and track its outputs.
+
+Both `.dvc` files and `dvc.yaml` use human-friendly YAML 1.2 schemas (see
+[DVC Files and Directories](/doc/user-guide/dvc-files-and-directories)). Get
+familiar with them so you can create, generate, and edit them on your own.
+
+Both the internal directory and these metafiles should be versioned with Git (in
+Git-enabled repositories).
+
+## Metrics and Plots
+
+_from plots and metrics intros_
+
+DVC has two concepts for metrics that represent different results of machine
+learning training or data processing:
+
+1. `dvc metrics` represent **scalar numbers** such as AUC, _true positive rate_,
+ etc.
+2. `dvc plots` can be used to visualize **data series** such as AUC curves, loss
+ functions, confusion matrices, etc.
+
+_from `dvc metrics`_
+
+To follow the performance of machine learning experiments, DVC can mark
+certain stage outputs as metrics. These metrics are project-specific
+floating-point or integer values, for example AUC, ROC, or the number of
+false positives.
+
+_from `dvc plots` description_
+
+DVC provides a set of commands to visualize certain metrics of machine learning
+experiments as plots. Typical examples include AUC curves, loss functions, and
+confusion matrices.
+
+_probably should mention diff..._
+
+## Pipelines
+
+_from `dvc dag`_
+
+A data pipeline, in general, is a series of data processing
+[stages](/doc/command-reference/run) (for example, console commands that take an
+input and produce an output). A pipeline may produce intermediate
+data, and has a final result.
+
+Data science and machine learning pipelines typically start with large raw
+datasets, include intermediate featurization and training stages, and produce a
+final model, as well as accuracy [metrics](/doc/command-reference/metrics).
+
+In DVC, pipeline stages and commands, their data I/O, interdependencies, and
+results (intermediate or final) are specified in `dvc.yaml`, which can be
+written manually or built using the helper command `dvc run`. This allows DVC to
+restore one or more pipelines later (see `dvc repro`).
+
+> DVC builds a dependency graph
+> ([DAG](https://en.wikipedia.org/wiki/Directed_acyclic_graph)) to do this.
+
+## Remote
+
+_from `dvc remote`_
+
+What is a data remote?
+
+Just as GitHub provides storage hosting for Git repositories, DVC
+remotes provide a location to store and share data and models. You can pull data
+assets created by colleagues from DVC remotes without spending time and
+resources to build or process them locally. Remote storage can also save space
+on your local environment – DVC can [fetch](/doc/command-reference/fetch) into
+the cache directory only the data you need for a specific
+branch/commit.
+
+Using DVC with remote storage is optional. DVC commands use the local cache
+(usually the `.dvc/cache` directory) as data storage by default. This enables
+the main DVC usage scenarios out of the box.
+
+## Workspace
+
+_from workspace tooltip_
+
+The directory containing all your project files, e.g. raw datasets, source
+code, ML models, etc. Typically, it's also a Git repository. It will contain
+your DVC project.
+
+_from dvc-project tooltip_
+
+A DVC project is initialized by running `dvc init` in the **workspace**
+(typically a Git repository). It will contain the
+[`.dvc/` directory](/doc/user-guide/dvc-files-and-directories), as well as
+`dvc.yaml` and `.dvc` files created with commands such as `dvc add` or
+`dvc run`.
From 6434169e40bf18d2e1903c5f87c35070615ceea3 Mon Sep 17 00:00:00 2001
From: jeremydesroches <18587991+jeremydesroches@users.noreply.github.com>
Date: Tue, 17 Nov 2020 16:43:03 -0700
Subject: [PATCH 02/59] Attempt to rename basic-concepts -> glossary, test
tooltip link
---
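Note: the two directory guards touched by this patch are complementary. The
docs model skips the glossary directory so no pages are created for it, while
the glossary model handles only that directory. A minimal sketch of that split
follows. The helper names `docsModelHandles` and `glossaryModelHandles` are
hypothetical and exist only for illustration; the real handlers receive a
`parentNode` and return early instead of returning a boolean.

```javascript
// Directory that holds tooltip-only glossary entries after the rename.
const GLOSSARY_DIR = 'docs/user-guide/glossary'

// Hypothetical helper: true when the docs model would go on to build a page.
// It mirrors the two early returns in createMarkdownDocsNode.
const docsModelHandles = dir =>
  dir !== GLOSSARY_DIR && dir.split('/')[0] === 'docs'

// Hypothetical helper: true when the glossary model would create an entry.
const glossaryModelHandles = dir => dir === GLOSSARY_DIR

for (const dir of ['docs/user-guide', GLOSSARY_DIR, 'blog']) {
  console.log(dir, docsModelHandles(dir), glossaryModelHandles(dir))
}
```

Under these assumptions, a glossary entry is never handled by both models, and
anything outside `docs/` is ignored by both.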
.../docs/user-guide/{basic-concepts => glossary}/dependency.md | 0
.../docs/user-guide/{basic-concepts => glossary}/dvc-cache.md | 3 +++
.../user-guide/{basic-concepts => glossary}/dvc-project.md | 0
.../{basic-concepts => glossary}/external-dependency.md | 0
.../user-guide/{basic-concepts => glossary}/import-stage.md | 0
content/docs/user-guide/{basic-concepts => glossary}/output.md | 0
.../docs/user-guide/{basic-concepts => glossary}/parameter.md | 0
.../docs/user-guide/{basic-concepts => glossary}/workspace.md | 0
src/gatsby/models/docs/onCreateMarkdownContentNode.js | 2 +-
src/gatsby/models/glossary/index.js | 3 +--
10 files changed, 5 insertions(+), 3 deletions(-)
rename content/docs/user-guide/{basic-concepts => glossary}/dependency.md (100%)
rename content/docs/user-guide/{basic-concepts => glossary}/dvc-cache.md (79%)
rename content/docs/user-guide/{basic-concepts => glossary}/dvc-project.md (100%)
rename content/docs/user-guide/{basic-concepts => glossary}/external-dependency.md (100%)
rename content/docs/user-guide/{basic-concepts => glossary}/import-stage.md (100%)
rename content/docs/user-guide/{basic-concepts => glossary}/output.md (100%)
rename content/docs/user-guide/{basic-concepts => glossary}/parameter.md (100%)
rename content/docs/user-guide/{basic-concepts => glossary}/workspace.md (100%)
diff --git a/content/docs/user-guide/basic-concepts/dependency.md b/content/docs/user-guide/glossary/dependency.md
similarity index 100%
rename from content/docs/user-guide/basic-concepts/dependency.md
rename to content/docs/user-guide/glossary/dependency.md
diff --git a/content/docs/user-guide/basic-concepts/dvc-cache.md b/content/docs/user-guide/glossary/dvc-cache.md
similarity index 79%
rename from content/docs/user-guide/basic-concepts/dvc-cache.md
rename to content/docs/user-guide/glossary/dvc-cache.md
index 49c0644100..81b624549f 100644
--- a/content/docs/user-guide/basic-concepts/dvc-cache.md
+++ b/content/docs/user-guide/glossary/dvc-cache.md
@@ -7,3 +7,6 @@ The DVC cache is a hidden storage (by default located in the `.dvc/cache`
directory) for files that are tracked by DVC, and their different versions.
Learn more about it's
[structure](/doc/user-guide/dvc-files-and-directories#structure-of-the-cache-directory).
+
+Learn more about the [concept of cache](/doc/user-guide/basic-concepts#cache) in
+DVC.
diff --git a/content/docs/user-guide/basic-concepts/dvc-project.md b/content/docs/user-guide/glossary/dvc-project.md
similarity index 100%
rename from content/docs/user-guide/basic-concepts/dvc-project.md
rename to content/docs/user-guide/glossary/dvc-project.md
diff --git a/content/docs/user-guide/basic-concepts/external-dependency.md b/content/docs/user-guide/glossary/external-dependency.md
similarity index 100%
rename from content/docs/user-guide/basic-concepts/external-dependency.md
rename to content/docs/user-guide/glossary/external-dependency.md
diff --git a/content/docs/user-guide/basic-concepts/import-stage.md b/content/docs/user-guide/glossary/import-stage.md
similarity index 100%
rename from content/docs/user-guide/basic-concepts/import-stage.md
rename to content/docs/user-guide/glossary/import-stage.md
diff --git a/content/docs/user-guide/basic-concepts/output.md b/content/docs/user-guide/glossary/output.md
similarity index 100%
rename from content/docs/user-guide/basic-concepts/output.md
rename to content/docs/user-guide/glossary/output.md
diff --git a/content/docs/user-guide/basic-concepts/parameter.md b/content/docs/user-guide/glossary/parameter.md
similarity index 100%
rename from content/docs/user-guide/basic-concepts/parameter.md
rename to content/docs/user-guide/glossary/parameter.md
diff --git a/content/docs/user-guide/basic-concepts/workspace.md b/content/docs/user-guide/glossary/workspace.md
similarity index 100%
rename from content/docs/user-guide/basic-concepts/workspace.md
rename to content/docs/user-guide/glossary/workspace.md
diff --git a/src/gatsby/models/docs/onCreateMarkdownContentNode.js b/src/gatsby/models/docs/onCreateMarkdownContentNode.js
index 7a250780ff..1393e1dd99 100644
--- a/src/gatsby/models/docs/onCreateMarkdownContentNode.js
+++ b/src/gatsby/models/docs/onCreateMarkdownContentNode.js
@@ -3,7 +3,7 @@ const path = require('path')
async function createMarkdownDocsNode(api, { parentNode, createChildNode }) {
// Suppress page creation for Basic Concepts and the Glossary
// They're only used in tooltips now, but we intend to expand on them later.
- if (parentNode.relativeDirectory === 'docs/user-guide/basic-concepts') return
+ if (parentNode.relativeDirectory === 'docs/user-guide/glossary') return
const splitDir = parentNode.relativeDirectory.split('/')
if (splitDir[0] !== 'docs') return
diff --git a/src/gatsby/models/glossary/index.js b/src/gatsby/models/glossary/index.js
index c53d4c01de..7188e42d3c 100644
--- a/src/gatsby/models/glossary/index.js
+++ b/src/gatsby/models/glossary/index.js
@@ -23,8 +23,7 @@ module.exports = {
},
async onCreateMarkdownContentNode(api, { parentNode, createChildNode }) {
// Only operate on nodes within the docs/glossary folder.
- if (parentNode.relativeDirectory !== 'docs/user-guide/basic-concepts')
- return
+ if (parentNode.relativeDirectory !== 'docs/user-guide/glossary') return
const { node, createNodeId, createContentDigest } = api
From fa56570bb66ad3838e9075569abe3963cfa328a3 Mon Sep 17 00:00:00 2001
From: rogermparent
Date: Wed, 18 Nov 2020 15:59:07 -0500
Subject: [PATCH 03/59] Add tooltip field that overrides basic concept tooltip
content
---
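Note: the new `tooltip` field prefers the frontmatter override and otherwise
falls back to the entry's rendered HTML. The sketch below shows only that
fallback shape. `htmlPassthrough` and the `parentHtml` property are
hypothetical stand-ins for `parentResolverPassthrough({ field: 'html' })` and
the parent node's HTML; they are not part of the patch.

```javascript
// Hypothetical stand-in for the parent HTML passthrough resolver.
const htmlPassthrough = source => source.parentHtml

// Same `override || fallback` shape as the resolver added in this patch.
function resolveTooltip(source) {
  return source.tooltip || htmlPassthrough(source)
}

const withOverride = {
  tooltip: '<p>Short text</p>',
  parentHtml: '<p>Full entry</p>'
}
const withoutOverride = { tooltip: undefined, parentHtml: '<p>Full entry</p>' }

console.log(resolveTooltip(withOverride))
console.log(resolveTooltip(withoutOverride))
```

Because the fallback uses `||`, an empty-string `tooltip` also falls through to
the full HTML in this sketch.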
package.json | 5 +-
.../Documentation/Markdown/Tooltip/index.tsx | 2 +-
src/gatsby/models/glossary/index.js | 24 +-
src/utils/front/glossary.ts | 2 +-
yarn.lock | 385 ++++++++++++++++--
5 files changed, 376 insertions(+), 42 deletions(-)
diff --git a/package.json b/package.json
index f16dfd37b3..e3798888f5 100644
--- a/package.json
+++ b/package.json
@@ -77,6 +77,7 @@
"react-slick": "^0.25.2",
"react-use": "^14.0.0",
"rehype-react": "^5.0.1",
+ "remark-preset-lint-recommended": "^5.0.0",
"repo-link-check": "^0.7.1",
"reset-css": "^5.0.1",
"s3-client": "^4.4.2",
@@ -151,8 +152,8 @@
"prettier": "^2.0.4",
"rehype-parse": "^6.0.2",
"rehype-stringify": "^7.0.0",
- "remark": "^12.0.0",
- "remark-html": "^11.0.1",
+ "remark": "^13.0.0",
+ "remark-html": "^13.0.1",
"remark-parse": "^8.0.2",
"stylelint": "^13.3.0",
"stylelint-config-standard": "^20.0.0",
diff --git a/src/components/Documentation/Markdown/Tooltip/index.tsx b/src/components/Documentation/Markdown/Tooltip/index.tsx
index 4341b7a526..608a143c44 100644
--- a/src/components/Documentation/Markdown/Tooltip/index.tsx
+++ b/src/components/Documentation/Markdown/Tooltip/index.tsx
@@ -30,7 +30,7 @@ const Tooltip: React.FC<{ text: string }> = ({ text }) => {
})
}
})
- }, [text])
+ }, [text, glossary])
if (!state.match) {
return {text}
diff --git a/src/gatsby/models/glossary/index.js b/src/gatsby/models/glossary/index.js
index 7188e42d3c..1bf0801ac5 100644
--- a/src/gatsby/models/glossary/index.js
+++ b/src/gatsby/models/glossary/index.js
@@ -1,4 +1,9 @@
const { parentResolverPassthrough } = require('gatsby-plugin-parent-resolvers')
+const remark = require('remark')
+const recommended = require('remark-preset-lint-recommended')
+const remarkHtml = require('remark-html')
+
+const tooltipHTMLProcessor = remark().use(recommended).use(remarkHtml)
module.exports = {
createSchemaCustomization({
@@ -14,6 +19,20 @@ module.exports = {
type: 'String!',
resolve: parentResolverPassthrough()
},
+ tooltip: {
+ type: 'String!',
+ resolve: (source, args, context, info) => {
+ return (
+ source.tooltip ||
+ parentResolverPassthrough({ field: 'html' })(
+ source,
+ args,
+ context,
+ info
+ )
+ )
+ }
+ },
name: 'String!',
match: '[String]'
}
@@ -28,12 +47,13 @@ module.exports = {
const { node, createNodeId, createContentDigest } = api
const {
- frontmatter: { name, match }
+ frontmatter: { name, match, tooltip }
} = node
const fieldData = {
name,
- match
+ match,
+ tooltip: tooltip && tooltipHTMLProcessor.processSync(tooltip).toString()
}
const entryNode = {
diff --git a/src/utils/front/glossary.ts b/src/utils/front/glossary.ts
index 55f25fe97d..292ff52cbf 100644
--- a/src/utils/front/glossary.ts
+++ b/src/utils/front/glossary.ts
@@ -14,7 +14,7 @@ const useGlossary = (): IGlossary =>
query GlossaryEntries {
allGlossaryEntry {
contents: nodes {
- desc: html
+ desc: tooltip
name
match
}
diff --git a/yarn.lock b/yarn.lock
index 2d1aa73b86..135c0a96db 100644
--- a/yarn.lock
+++ b/yarn.lock
@@ -4220,6 +4220,11 @@ clone@^1.0.2:
resolved "https://registry.yarnpkg.com/clone/-/clone-1.0.4.tgz#da309cc263df15994c688ca902179ca3c7cd7c7e"
integrity sha1-2jCcwmPfFZlMaIypAheco8fNfH4=
+co@3.1.0:
+ version "3.1.0"
+ resolved "https://registry.yarnpkg.com/co/-/co-3.1.0.tgz#4ea54ea5a08938153185e15210c68d9092bc1b78"
+ integrity sha1-TqVOpaCJOBUxheFSEMaNkJK8G3g=
+
co@^4.6.0:
version "4.6.0"
resolved "https://registry.yarnpkg.com/co/-/co-4.6.0.tgz#6ea6bdf3d853ae54ccb8e47bfa0bf3f9031fb184"
@@ -4239,7 +4244,7 @@ code-point-at@^1.0.0:
resolved "https://registry.yarnpkg.com/code-point-at/-/code-point-at-1.1.0.tgz#0d070b4d043a5bea33a2f1a40e2edb3d9a4ccf77"
integrity sha1-DQcLTQQ6W+ozovGkDi7bPZpMz3c=
-collapse-white-space@^1.0.0, collapse-white-space@^1.0.2:
+collapse-white-space@^1.0.0, collapse-white-space@^1.0.2, collapse-white-space@^1.0.4:
version "1.0.6"
resolved "https://registry.yarnpkg.com/collapse-white-space/-/collapse-white-space-1.0.6.tgz#e63629c0016665792060dbbeb79c42239d2c5287"
integrity sha512-jEovNnrhMuqyCcjfEJA56v0Xq8SkIoPKDyaHahwo3POf4qcSXqMYuwNcOTzp74vTsR9Tn08z4MxWqAhcekogkQ==
@@ -5024,6 +5029,13 @@ debug@^3.0.0, debug@^3.1.0, debug@^3.1.1, debug@^3.2.5, debug@^3.2.6:
dependencies:
ms "^2.1.1"
+debug@^4.0.0:
+ version "4.2.0"
+ resolved "https://registry.yarnpkg.com/debug/-/debug-4.2.0.tgz#7f150f93920e94c58f5574c2fd01a3110effe7f1"
+ integrity sha512-IX2ncY78vDTjZMFUdmsvIRFY2Cf4FnD0wRs+nQwJU8Lu99/tPFdb0VybiiMTPe3I6rQmwsqQqRBvxU+bZ/I8sg==
+ dependencies:
+ ms "2.1.2"
+
debug@^4.0.1, debug@^4.1.0, debug@^4.1.1, debug@~4.1.0:
version "4.1.1"
resolved "https://registry.yarnpkg.com/debug/-/debug-4.1.1.tgz#3b72260255109c6b589cee050f1d516139664791"
@@ -8416,10 +8428,10 @@ hast-util-raw@^4.0.0:
xtend "^4.0.1"
zwitch "^1.0.0"
-hast-util-sanitize@^2.0.0:
- version "2.0.2"
- resolved "https://registry.yarnpkg.com/hast-util-sanitize/-/hast-util-sanitize-2.0.2.tgz#8a4299bccba6cc8836284466d446060d2ecb2f5c"
- integrity sha512-ppfgtI6pVb0/dopboV/N2SZju/CKEJzLs6jm58NxoYU1c1ib+/sh14JV5bjLDOEYvyeb5hYIttFKanYm0rtnHQ==
+hast-util-sanitize@^3.0.0:
+ version "3.0.1"
+ resolved "https://registry.yarnpkg.com/hast-util-sanitize/-/hast-util-sanitize-3.0.1.tgz#c6a0853a3cbd174995e394aa1fee218f2dafbfad"
+ integrity sha512-XQmIuBSa+DHfAhkrVvtHoSdSLgOnNeBijLY4NPIJgxvpH8MjLof0p68ZADWfWebU7nCWY450HQJZvas6RFKwDA==
dependencies:
xtend "^4.0.0"
@@ -10855,7 +10867,7 @@ longest-streak@^1.0.0:
resolved "https://registry.yarnpkg.com/longest-streak/-/longest-streak-1.0.0.tgz#d06597c4d4c31b52ccb1f5d8f8fe7148eafd6965"
integrity sha1-0GWXxNTDG1LMsfXY+P5xSOr9aWU=
-longest-streak@^2.0.1:
+longest-streak@^2.0.0, longest-streak@^2.0.1:
version "2.0.4"
resolved "https://registry.yarnpkg.com/longest-streak/-/longest-streak-2.0.4.tgz#b8599957da5b5dab64dee3fe316fa774597d90e4"
integrity sha512-vM6rUVCVUJJt33bnmHiZEvr7wPT78ztX7rojL+LW51bHtLh6HTjx84LA5W4+oa6aKEJA7jJu5LR6vQRBpA5DVg==
@@ -11051,6 +11063,11 @@ md5@^2.2.1:
crypt "~0.0.1"
is-buffer "~1.1.1"
+mdast-comment-marker@^1.0.0:
+ version "1.1.2"
+ resolved "https://registry.yarnpkg.com/mdast-comment-marker/-/mdast-comment-marker-1.1.2.tgz#5ad2e42cfcc41b92a10c1421a98c288d7b447a6d"
+ integrity sha512-vTFXtmbbF3rgnTh3Zl3irso4LtvwUq/jaDvT2D1JqTGAwaipcS7RpTxzi6KjoRqI9n2yuAhzLDAC8xVTF3XYVQ==
+
mdast-squeeze-paragraphs@^4.0.0:
version "4.0.0"
resolved "https://registry.yarnpkg.com/mdast-squeeze-paragraphs/-/mdast-squeeze-paragraphs-4.0.0.tgz#7c4c114679c3bee27ef10b58e2e015be79f1ef97"
@@ -11079,13 +11096,6 @@ mdast-util-definitions@^1.2.0, mdast-util-definitions@^1.2.5:
dependencies:
unist-util-visit "^1.0.0"
-mdast-util-definitions@^2.0.0:
- version "2.0.1"
- resolved "https://registry.yarnpkg.com/mdast-util-definitions/-/mdast-util-definitions-2.0.1.tgz#2c931d8665a96670639f17f98e32c3afcfee25f3"
- integrity sha512-Co+DQ6oZlUzvUR7JCpP249PcexxygiaKk9axJh+eRzHDZJk2julbIdKB4PXHVxdBuLzvJ1Izb+YDpj2deGMOuA==
- dependencies:
- unist-util-visit "^2.0.0"
-
mdast-util-definitions@^3.0.0:
version "3.0.1"
resolved "https://registry.yarnpkg.com/mdast-util-definitions/-/mdast-util-definitions-3.0.1.tgz#06af6c49865fc63d6d7d30125569e2f7ae3d0a86"
@@ -11093,6 +11103,28 @@ mdast-util-definitions@^3.0.0:
dependencies:
unist-util-visit "^2.0.0"
+mdast-util-definitions@^4.0.0:
+ version "4.0.0"
+ resolved "https://registry.yarnpkg.com/mdast-util-definitions/-/mdast-util-definitions-4.0.0.tgz#c5c1a84db799173b4dcf7643cda999e440c24db2"
+ integrity sha512-k8AJ6aNnUkB7IE+5azR9h81O5EQ/cTDXtWdMq9Kk5KcEW/8ritU5CeLg/9HhOC++nALHBlaogJ5jz0Ybk3kPMQ==
+ dependencies:
+ unist-util-visit "^2.0.0"
+
+mdast-util-from-markdown@^0.8.0:
+ version "0.8.1"
+ resolved "https://registry.yarnpkg.com/mdast-util-from-markdown/-/mdast-util-from-markdown-0.8.1.tgz#781371d493cac11212947226190270c15dc97116"
+ integrity sha512-qJXNcFcuCSPqUF0Tb0uYcFDIq67qwB3sxo9RPdf9vG8T90ViKnksFqdB/Coq2a7sTnxL/Ify2y7aIQXDkQFH0w==
+ dependencies:
+ "@types/mdast" "^3.0.0"
+ mdast-util-to-string "^1.0.0"
+ micromark "~2.10.0"
+ parse-entities "^2.0.0"
+
+mdast-util-heading-style@^1.0.2:
+ version "1.0.6"
+ resolved "https://registry.yarnpkg.com/mdast-util-heading-style/-/mdast-util-heading-style-1.0.6.tgz#6410418926fd5673d40f519406b35d17da10e3c5"
+ integrity sha512-8ZuuegRqS0KESgjAGW8zTx4tJ3VNIiIaGFNEzFpRSAQBavVc7AvOo9I4g3crcZBfYisHs4seYh0rAVimO6HyOw==
+
mdast-util-to-hast@9.1.0:
version "9.1.0"
resolved "https://registry.yarnpkg.com/mdast-util-to-hast/-/mdast-util-to-hast-9.1.0.tgz#6ef121dd3cd3b006bf8650b1b9454da0faf79ffe"
@@ -11110,6 +11142,20 @@ mdast-util-to-hast@9.1.0:
unist-util-position "^3.0.0"
unist-util-visit "^2.0.0"
+mdast-util-to-hast@^10.0.0:
+ version "10.0.1"
+ resolved "https://registry.yarnpkg.com/mdast-util-to-hast/-/mdast-util-to-hast-10.0.1.tgz#0cfc82089494c52d46eb0e3edb7a4eb2aea021eb"
+ integrity sha512-BW3LM9SEMnjf4HXXVApZMt8gLQWVNXc3jryK0nJu/rOXPOnlkUjmdkDlmxMirpbU9ILncGFIwLH/ubnWBbcdgA==
+ dependencies:
+ "@types/mdast" "^3.0.0"
+ "@types/unist" "^2.0.0"
+ mdast-util-definitions "^4.0.0"
+ mdurl "^1.0.0"
+ unist-builder "^2.0.0"
+ unist-util-generated "^1.0.0"
+ unist-util-position "^3.0.0"
+ unist-util-visit "^2.0.0"
+
mdast-util-to-hast@^3.0.4:
version "3.0.4"
resolved "https://registry.yarnpkg.com/mdast-util-to-hast/-/mdast-util-to-hast-3.0.4.tgz#132001b266031192348d3366a6b011f28e54dc40"
@@ -11127,20 +11173,17 @@ mdast-util-to-hast@^3.0.4:
unist-util-visit "^1.1.0"
xtend "^4.0.1"
-mdast-util-to-hast@^8.2.0:
- version "8.2.0"
- resolved "https://registry.yarnpkg.com/mdast-util-to-hast/-/mdast-util-to-hast-8.2.0.tgz#adf9f824defcd382e53dd7bace4282a45602ac67"
- integrity sha512-WjH/KXtqU66XyTJQ7tg7sjvTw1OQcVV0hKdFh3BgHPwZ96fSBCQ/NitEHsN70Mmnggt+5eUUC7pCnK+2qGQnCA==
+mdast-util-to-markdown@^0.5.0:
+ version "0.5.3"
+ resolved "https://registry.yarnpkg.com/mdast-util-to-markdown/-/mdast-util-to-markdown-0.5.3.tgz#e05c54a3ccd239bab63c48a1e5b5747f0dcd5aca"
+ integrity sha512-sr8q7fQJ1xoCqZSXW6dO/MYu2Md+a4Hfk9uO+XHCfiBhVM0EgWtfAV7BuN+ff6otUeu2xDyt1o7vhZGwOG3+BA==
dependencies:
- collapse-white-space "^1.0.0"
- detab "^2.0.0"
- mdast-util-definitions "^2.0.0"
- mdurl "^1.0.0"
- trim-lines "^1.0.0"
- unist-builder "^2.0.0"
- unist-util-generated "^1.0.0"
- unist-util-position "^3.0.0"
- unist-util-visit "^2.0.0"
+ "@types/unist" "^2.0.0"
+ longest-streak "^2.0.0"
+ mdast-util-to-string "^1.0.0"
+ parse-entities "^2.0.0"
+ repeat-string "^1.0.0"
+ zwitch "^1.0.0"
mdast-util-to-nlcst@^3.2.0:
version "3.2.3"
@@ -11152,7 +11195,7 @@ mdast-util-to-nlcst@^3.2.0:
unist-util-position "^3.0.0"
vfile-location "^2.0.0"
-mdast-util-to-string@^1.0.5, mdast-util-to-string@^1.1.0:
+mdast-util-to-string@^1.0.0, mdast-util-to-string@^1.0.2, mdast-util-to-string@^1.0.5, mdast-util-to-string@^1.1.0:
version "1.1.0"
resolved "https://registry.yarnpkg.com/mdast-util-to-string/-/mdast-util-to-string-1.1.0.tgz#27055500103f51637bd07d01da01eb1967a43527"
integrity sha512-jVU0Nr2B9X3MU4tSK7JP1CMkSvOj7X5l/GboG1tKRw52lLF1x2Ju92Ms9tNetCcbfX3hzlM73zYo2NKkWSfF/A==
@@ -11275,6 +11318,14 @@ methods@~1.1.2:
resolved "https://registry.yarnpkg.com/methods/-/methods-1.1.2.tgz#5529a4d67654134edcc5266656835b0f851afcee"
integrity sha1-VSmk1nZUE07cxSZmVoNbD4Ua/O4=
+micromark@~2.10.0:
+ version "2.10.1"
+ resolved "https://registry.yarnpkg.com/micromark/-/micromark-2.10.1.tgz#cd73f54e0656f10e633073db26b663a221a442a7"
+ integrity sha512-fUuVF8sC1X7wsCS29SYQ2ZfIZYbTymp0EYr6sab3idFjigFFjGa5UwoniPlV9tAgntjuapW1t9U+S0yDYeGKHQ==
+ dependencies:
+ debug "^4.0.0"
+ parse-entities "^2.0.0"
+
micromatch@^3.1.10, micromatch@^3.1.4:
version "3.1.10"
resolved "https://registry.yarnpkg.com/micromatch/-/micromatch-3.1.10.tgz#70859bc95c9840952f359a068a3fc49f9ecfac23"
@@ -11556,7 +11607,7 @@ ms@2.1.1:
resolved "https://registry.yarnpkg.com/ms/-/ms-2.1.1.tgz#30a5864eb3ebb0a66f2ebe6d727af06a09d86e0a"
integrity sha512-tgp+dl5cGk28utYktBsrFqA7HKgrhgPsg6Z/EfhWI4gl1Hwq8B/GmY/0oXZ6nF8hDVesS/FpnYaD/kOWhYQvyg==
-ms@^2.1.1:
+ms@2.1.2, ms@^2.1.1:
version "2.1.2"
resolved "https://registry.yarnpkg.com/ms/-/ms-2.1.2.tgz#d09d1f357b443f493382a8eb3ccd183872ae6009"
integrity sha512-sGkPx+VjMtmA6MX27oA4FBFELFCZZ4S4XqeGOXCv68tT+jb3vk/RyaKWP0PTKyWtmLSM0b+adUTEvbs1PEaH2w==
@@ -12785,6 +12836,11 @@ please-upgrade-node@^3.2.0:
dependencies:
semver-compare "^1.0.0"
+pluralize@^8.0.0:
+ version "8.0.0"
+ resolved "https://registry.yarnpkg.com/pluralize/-/pluralize-8.0.0.tgz#1a6fa16a38d12a1901e0320fa017051c539ce3b1"
+ integrity sha512-Nc3IT5yHzflTfbjgqWcCPpo7DaKy4FnpB0l/zCAW0Tc7jxAiuqSxHasntB3D7887LSrA93kDJ9IXovxJYxyLCA==
+
pngjs@^3.0.0, pngjs@^3.3.3:
version "3.4.0"
resolved "https://registry.yarnpkg.com/pngjs/-/pngjs-3.4.0.tgz#99ca7d725965fb655814eaf65f38f12bbdbf555f"
@@ -14183,15 +14239,174 @@ remark-footnotes@1.0.0:
resolved "https://registry.yarnpkg.com/remark-footnotes/-/remark-footnotes-1.0.0.tgz#9c7a97f9a89397858a50033373020b1ea2aad011"
integrity sha512-X9Ncj4cj3/CIvLI2Z9IobHtVi8FVdUrdJkCNaL9kdX8ohfsi18DXHsCVd/A7ssARBdccdDb5ODnt62WuEWaM/g==
-remark-html@^11.0.1:
- version "11.0.2"
- resolved "https://registry.yarnpkg.com/remark-html/-/remark-html-11.0.2.tgz#76f6f7c8981c736f01cb65f8853dbe5c2e546dfa"
- integrity sha512-U7qPKZq6Aai+UTpH5YrblLvqvdSUCRA4YmZYRTtbtknm/WUGmNUI0dvThbSuTNSf6TtC8btmbbScWi1wtUIxnw==
+remark-html@^13.0.1:
+ version "13.0.1"
+ resolved "https://registry.yarnpkg.com/remark-html/-/remark-html-13.0.1.tgz#d5b2d8be01203e61fc37403167ca7584879ad675"
+ integrity sha512-K5KQCXWVz+harnyC+UVM/J9eJWCgjYRqFeZoZf2NgP0iFbuuw/RgMZv3MA34b/OEpGnstl3oiOUtZzD3tJ+CBw==
dependencies:
- hast-util-sanitize "^2.0.0"
+ hast-util-sanitize "^3.0.0"
hast-util-to-html "^7.0.0"
- mdast-util-to-hast "^8.2.0"
- xtend "^4.0.1"
+ mdast-util-to-hast "^10.0.0"
+
+remark-lint-final-newline@^1.0.0:
+ version "1.0.5"
+ resolved "https://registry.yarnpkg.com/remark-lint-final-newline/-/remark-lint-final-newline-1.0.5.tgz#666f609a91f97c44f5ab7facf1fb3c5b3ffe398f"
+ integrity sha512-rfLlW8+Fz2dqnaEgU4JwLA55CQF1T4mfSs/GwkkeUCGPenvEYwSkCN2KO2Gr1dy8qPoOdTFE1rSufLjmeTW5HA==
+ dependencies:
+ unified-lint-rule "^1.0.0"
+
+remark-lint-hard-break-spaces@^2.0.0:
+ version "2.0.1"
+ resolved "https://registry.yarnpkg.com/remark-lint-hard-break-spaces/-/remark-lint-hard-break-spaces-2.0.1.tgz#2149b55cda17604562d040c525a2a0d26aeb0f0f"
+ integrity sha512-Qfn/BMQFamHhtbfLrL8Co/dbYJFLRL4PGVXZ5wumkUO5f9FkZC2RsV+MD9lisvGTkJK0ZEJrVVeaPbUIFM0OAw==
+ dependencies:
+ unified-lint-rule "^1.0.0"
+ unist-util-generated "^1.1.0"
+ unist-util-position "^3.0.0"
+ unist-util-visit "^2.0.0"
+
+remark-lint-list-item-bullet-indent@^3.0.0:
+ version "3.0.0"
+ resolved "https://registry.yarnpkg.com/remark-lint-list-item-bullet-indent/-/remark-lint-list-item-bullet-indent-3.0.0.tgz#3c902e75e841850da8b37126da45fc1fe850d7d6"
+ integrity sha512-X2rleWP8XReC4LXKF7Qi5vYiPJkA4Grx5zxsjHofFrVRz6j0PYOCuz7vsO+ZzMunFMfom6FODnscSWz4zouDVw==
+ dependencies:
+ pluralize "^8.0.0"
+ unified-lint-rule "^1.0.0"
+ unist-util-generated "^1.1.0"
+ unist-util-visit "^2.0.0"
+
+remark-lint-list-item-indent@^2.0.0:
+ version "2.0.1"
+ resolved "https://registry.yarnpkg.com/remark-lint-list-item-indent/-/remark-lint-list-item-indent-2.0.1.tgz#c6472514e17bc02136ca87936260407ada90bf8d"
+ integrity sha512-4IKbA9GA14Q9PzKSQI6KEHU/UGO36CSQEjaDIhmb9UOhyhuzz4vWhnSIsxyI73n9nl9GGRAMNUSGzr4pQUFwTA==
+ dependencies:
+ pluralize "^8.0.0"
+ unified-lint-rule "^1.0.0"
+ unist-util-generated "^1.1.0"
+ unist-util-position "^3.0.0"
+ unist-util-visit "^2.0.0"
+
+remark-lint-no-auto-link-without-protocol@^2.0.0:
+ version "2.0.1"
+ resolved "https://registry.yarnpkg.com/remark-lint-no-auto-link-without-protocol/-/remark-lint-no-auto-link-without-protocol-2.0.1.tgz#f75e5c24adb42385593e0d75ca39987edb70b6c4"
+ integrity sha512-TFcXxzucsfBb/5uMqGF1rQA+WJJqm1ZlYQXyvJEXigEZ8EAxsxZGPb/gOQARHl/y0vymAuYxMTaChavPKaBqpQ==
+ dependencies:
+ mdast-util-to-string "^1.0.2"
+ unified-lint-rule "^1.0.0"
+ unist-util-generated "^1.1.0"
+ unist-util-position "^3.0.0"
+ unist-util-visit "^2.0.0"
+
+remark-lint-no-blockquote-without-marker@^4.0.0:
+ version "4.0.0"
+ resolved "https://registry.yarnpkg.com/remark-lint-no-blockquote-without-marker/-/remark-lint-no-blockquote-without-marker-4.0.0.tgz#856fb64dd038fa8fc27928163caa24a30ff4d790"
+ integrity sha512-Y59fMqdygRVFLk1gpx2Qhhaw5IKOR9T38Wf7pjR07bEFBGUNfcoNVIFMd1TCJfCPQxUyJzzSqfZz/KT7KdUuiQ==
+ dependencies:
+ unified-lint-rule "^1.0.0"
+ unist-util-generated "^1.0.0"
+ unist-util-position "^3.0.0"
+ unist-util-visit "^2.0.0"
+ vfile-location "^3.0.0"
+
+remark-lint-no-duplicate-definitions@^2.0.0:
+ version "2.0.1"
+ resolved "https://registry.yarnpkg.com/remark-lint-no-duplicate-definitions/-/remark-lint-no-duplicate-definitions-2.0.1.tgz#588039881f63fe01df69d3b64265760b3e83b477"
+ integrity sha512-XL22benJZB01m+aOse91nsu1IMFqeWJWme9QvoJuxIcBROO1BG1VoqLOkwNcawE/M/0CkvTo5rfx0eMlcnXOIw==
+ dependencies:
+ unified-lint-rule "^1.0.0"
+ unist-util-generated "^1.1.0"
+ unist-util-position "^3.0.0"
+ unist-util-stringify-position "^2.0.0"
+ unist-util-visit "^2.0.0"
+
+remark-lint-no-heading-content-indent@^3.0.0:
+ version "3.0.0"
+ resolved "https://registry.yarnpkg.com/remark-lint-no-heading-content-indent/-/remark-lint-no-heading-content-indent-3.0.0.tgz#faa323a52fcb5db9b3ce16cb8e417e43ab433af1"
+ integrity sha512-yULDoVSIqKylLDfW6mVUbrHlyEWUSFtVFiKc+/BA412xDIhm8HZLUnP+FsuBC0OzbIZ+bO9Txy52WtO3LGnK1A==
+ dependencies:
+ mdast-util-heading-style "^1.0.2"
+ pluralize "^8.0.0"
+ unified-lint-rule "^1.0.0"
+ unist-util-generated "^1.1.0"
+ unist-util-position "^3.0.0"
+ unist-util-visit "^2.0.0"
+
+remark-lint-no-inline-padding@^3.0.0:
+ version "3.0.0"
+ resolved "https://registry.yarnpkg.com/remark-lint-no-inline-padding/-/remark-lint-no-inline-padding-3.0.0.tgz#14c2722bcddc648297a54298107a922171faf6eb"
+ integrity sha512-3s9uW3Yux9RFC0xV81MQX3bsYs+UY7nPnRuMxeIxgcVwxQ4E/mTJd9QjXUwBhU9kdPtJ5AalngdmOW2Tgar8Cg==
+ dependencies:
+ mdast-util-to-string "^1.0.2"
+ unified-lint-rule "^1.0.0"
+ unist-util-generated "^1.1.0"
+ unist-util-visit "^2.0.0"
+
+remark-lint-no-literal-urls@^2.0.0:
+ version "2.0.1"
+ resolved "https://registry.yarnpkg.com/remark-lint-no-literal-urls/-/remark-lint-no-literal-urls-2.0.1.tgz#731908f9866c1880e6024dcee1269fb0f40335d6"
+ integrity sha512-IDdKtWOMuKVQIlb1CnsgBoyoTcXU3LppelDFAIZePbRPySVHklTtuK57kacgU5grc7gPM04bZV96eliGrRU7Iw==
+ dependencies:
+ mdast-util-to-string "^1.0.2"
+ unified-lint-rule "^1.0.0"
+ unist-util-generated "^1.1.0"
+ unist-util-position "^3.0.0"
+ unist-util-visit "^2.0.0"
+
+remark-lint-no-shortcut-reference-image@^2.0.0:
+ version "2.0.1"
+ resolved "https://registry.yarnpkg.com/remark-lint-no-shortcut-reference-image/-/remark-lint-no-shortcut-reference-image-2.0.1.tgz#d174d12a57e8307caf6232f61a795bc1d64afeaa"
+ integrity sha512-2jcZBdnN6ecP7u87gkOVFrvICLXIU5OsdWbo160FvS/2v3qqqwF2e/n/e7D9Jd+KTq1mR1gEVVuTqkWWuh3cig==
+ dependencies:
+ unified-lint-rule "^1.0.0"
+ unist-util-generated "^1.1.0"
+ unist-util-visit "^2.0.0"
+
+remark-lint-no-shortcut-reference-link@^2.0.0:
+ version "2.0.1"
+ resolved "https://registry.yarnpkg.com/remark-lint-no-shortcut-reference-link/-/remark-lint-no-shortcut-reference-link-2.0.1.tgz#8f963f81036e45cfb7061b3639e9c6952308bc94"
+ integrity sha512-pTZbslG412rrwwGQkIboA8wpBvcjmGFmvugIA+UQR+GfFysKtJ5OZMPGJ98/9CYWjw9Z5m0/EktplZ5TjFjqwA==
+ dependencies:
+ unified-lint-rule "^1.0.0"
+ unist-util-generated "^1.1.0"
+ unist-util-visit "^2.0.0"
+
+remark-lint-no-undefined-references@^3.0.0:
+ version "3.0.0"
+ resolved "https://registry.yarnpkg.com/remark-lint-no-undefined-references/-/remark-lint-no-undefined-references-3.0.0.tgz#59dab8f815f8de9f1dcbd69e7cc705978e931cb0"
+ integrity sha512-0hzaJS9GuzSQVOeeNdJr/s66LRQOzp618xuOQPYWHcJdd+SCaRTyWbjMrTM/cCI5L1sYjgurp410NkIBQ32Vqg==
+ dependencies:
+ collapse-white-space "^1.0.4"
+ unified-lint-rule "^1.0.0"
+ unist-util-generated "^1.1.0"
+ unist-util-position "^3.1.0"
+ unist-util-visit "^2.0.0"
+ vfile-location "^3.1.0"
+
+remark-lint-no-unused-definitions@^2.0.0:
+ version "2.0.1"
+ resolved "https://registry.yarnpkg.com/remark-lint-no-unused-definitions/-/remark-lint-no-unused-definitions-2.0.1.tgz#ba45d9105b61b77ae02b92d3d339a638ab4ed59a"
+ integrity sha512-+BMc0BOjc364SvKYLkspmxDch8OaKPbnUGgQBvK0Bmlwy42baR4C9zhwAWBxm0SBy5Z4AyM4G4jKpLXPH40Oxg==
+ dependencies:
+ unified-lint-rule "^1.0.0"
+ unist-util-generated "^1.1.0"
+ unist-util-visit "^2.0.0"
+
+remark-lint-ordered-list-marker-style@^2.0.0:
+ version "2.0.1"
+ resolved "https://registry.yarnpkg.com/remark-lint-ordered-list-marker-style/-/remark-lint-ordered-list-marker-style-2.0.1.tgz#183c31967e6f2ae8ef00effad03633f7fd00ffaa"
+ integrity sha512-Cnpw1Dn9CHn+wBjlyf4qhPciiJroFOEGmyfX008sQ8uGoPZsoBVIJx76usnHklojSONbpjEDcJCjnOvfAcWW1A==
+ dependencies:
+ unified-lint-rule "^1.0.0"
+ unist-util-generated "^1.1.0"
+ unist-util-position "^3.0.0"
+ unist-util-visit "^2.0.0"
+
+remark-lint@^8.0.0:
+ version "8.0.0"
+ resolved "https://registry.yarnpkg.com/remark-lint/-/remark-lint-8.0.0.tgz#6e40894f4a39eaea31fc4dd45abfaba948bf9a09"
+ integrity sha512-ESI8qJQ/TIRjABDnqoFsTiZntu+FRifZ5fJ77yX63eIDijl/arvmDvT+tAf75/Nm5BFL4R2JFUtkHRGVjzYUsg==
+ dependencies:
+ remark-message-control "^6.0.0"
remark-mdx@^1.6.4:
version "1.6.4"
@@ -14207,6 +14422,14 @@ remark-mdx@^1.6.4:
remark-parse "8.0.2"
unified "9.0.0"
+remark-message-control@^6.0.0:
+ version "6.0.0"
+ resolved "https://registry.yarnpkg.com/remark-message-control/-/remark-message-control-6.0.0.tgz#955b054b38c197c9f2e35b1d88a4912949db7fc5"
+ integrity sha512-k9bt7BYc3G7YBdmeAhvd3VavrPa/XlKWR3CyHjr4sLO9xJyly8WHHT3Sp+8HPR8lEUv+/sZaffL7IjMLV0f6BA==
+ dependencies:
+ mdast-comment-marker "^1.0.0"
+ unified-message-control "^3.0.0"
+
remark-parse@8.0.2, remark-parse@^8.0.0, remark-parse@^8.0.2:
version "8.0.2"
resolved "https://registry.yarnpkg.com/remark-parse/-/remark-parse-8.0.2.tgz#5999bc0b9c2e3edc038800a64ff103d0890b318b"
@@ -14265,6 +14488,35 @@ remark-parse@^6.0.0, remark-parse@^6.0.3:
vfile-location "^2.0.0"
xtend "^4.0.1"
+remark-parse@^9.0.0:
+ version "9.0.0"
+ resolved "https://registry.yarnpkg.com/remark-parse/-/remark-parse-9.0.0.tgz#4d20a299665880e4f4af5d90b7c7b8a935853640"
+ integrity sha512-geKatMwSzEXKHuzBNU1z676sGcDcFoChMK38TgdHJNAYfFtsfHDQG7MoJAjs6sgYMqyLduCYWDIWZIxiPeafEw==
+ dependencies:
+ mdast-util-from-markdown "^0.8.0"
+
+remark-preset-lint-recommended@^5.0.0:
+ version "5.0.0"
+ resolved "https://registry.yarnpkg.com/remark-preset-lint-recommended/-/remark-preset-lint-recommended-5.0.0.tgz#cc0da5bf532a47392e01ad2ee34c8076edad1207"
+ integrity sha512-uu+Ab8JCwMMaKvvB0LOWTWtM3uAvJbKQM/oyWCEJqj7lUVNTKZS575Ro5rKM3Dx7kQjjR1iw0e99bpAYTc5xNA==
+ dependencies:
+ remark-lint "^8.0.0"
+ remark-lint-final-newline "^1.0.0"
+ remark-lint-hard-break-spaces "^2.0.0"
+ remark-lint-list-item-bullet-indent "^3.0.0"
+ remark-lint-list-item-indent "^2.0.0"
+ remark-lint-no-auto-link-without-protocol "^2.0.0"
+ remark-lint-no-blockquote-without-marker "^4.0.0"
+ remark-lint-no-duplicate-definitions "^2.0.0"
+ remark-lint-no-heading-content-indent "^3.0.0"
+ remark-lint-no-inline-padding "^3.0.0"
+ remark-lint-no-literal-urls "^2.0.0"
+ remark-lint-no-shortcut-reference-image "^2.0.0"
+ remark-lint-no-shortcut-reference-link "^2.0.0"
+ remark-lint-no-undefined-references "^3.0.0"
+ remark-lint-no-unused-definitions "^2.0.0"
+ remark-lint-ordered-list-marker-style "^2.0.0"
+
remark-retext@^3.1.3:
version "3.1.3"
resolved "https://registry.yarnpkg.com/remark-retext/-/remark-retext-3.1.3.tgz#77173b1d9d13dab15ce5b38d996195fea522ee7f"
@@ -14333,6 +14585,13 @@ remark-stringify@^8.0.0:
unherit "^1.0.4"
xtend "^4.0.1"
+remark-stringify@^9.0.0:
+ version "9.0.0"
+ resolved "https://registry.yarnpkg.com/remark-stringify/-/remark-stringify-9.0.0.tgz#8ba0c9e4167c42733832215a81550489759e3793"
+ integrity sha512-8x29DpTbVzEc6Dwb90qhxCtbZ6hmj3BxWWDpMhA+1WM4dOEGH5U5/GFe3Be5Hns5MvPSFAr1e2KSVtKZkK5nUw==
+ dependencies:
+ mdast-util-to-markdown "^0.5.0"
+
remark@^10.0.1:
version "10.0.1"
resolved "https://registry.yarnpkg.com/remark/-/remark-10.0.1.tgz#3058076dc41781bf505d8978c291485fe47667df"
@@ -14351,6 +14610,15 @@ remark@^12.0.0:
remark-stringify "^8.0.0"
unified "^9.0.0"
+remark@^13.0.0:
+ version "13.0.0"
+ resolved "https://registry.yarnpkg.com/remark/-/remark-13.0.0.tgz#d15d9bf71a402f40287ebe36067b66d54868e425"
+ integrity sha512-HDz1+IKGtOyWN+QgBiAT0kn+2s6ovOxHyPAFGKVE81VSzJ+mq7RwHFledEvB5F1p4iJvOah/LOKdFuzvRnNLCA==
+ dependencies:
+ remark-parse "^9.0.0"
+ remark-stringify "^9.0.0"
+ unified "^9.1.0"
+
remark@^5.0.1:
version "5.1.0"
resolved "https://registry.yarnpkg.com/remark/-/remark-5.1.0.tgz#cb463bd3dbcb4b99794935eee1cf71d7a8e3068c"
@@ -15222,6 +15490,11 @@ slice-ansi@^4.0.0:
astral-regex "^2.0.0"
is-fullwidth-code-point "^3.0.0"
+sliced@^1.0.1:
+ version "1.0.1"
+ resolved "https://registry.yarnpkg.com/sliced/-/sliced-1.0.1.tgz#0b3a662b5d04c3177b1926bea82b03f837a2ef41"
+ integrity sha1-CzpmK10Ewxd7GSa+qCsD+Dei70E=
+
slick-carousel@^1.8.1:
version "1.8.1"
resolved "https://registry.yarnpkg.com/slick-carousel/-/slick-carousel-1.8.1.tgz#a4bfb29014887bb66ce528b90bd0cda262cc8f8d"
@@ -16692,6 +16965,21 @@ unicode-property-aliases-ecmascript@^1.0.4:
resolved "https://registry.yarnpkg.com/unicode-property-aliases-ecmascript/-/unicode-property-aliases-ecmascript-1.1.0.tgz#dd57a99f6207bedff4628abefb94c50db941c8f4"
integrity sha512-PqSoPh/pWetQ2phoj5RLiaqIk4kCNwoV3CI+LfGmWLKI3rE3kl1h59XpX2BjgDrmbxD9ARtQobPGU1SguCYuQg==
+unified-lint-rule@^1.0.0:
+ version "1.0.6"
+ resolved "https://registry.yarnpkg.com/unified-lint-rule/-/unified-lint-rule-1.0.6.tgz#b4ab801ff93c251faa917a8d1c10241af030de84"
+ integrity sha512-YPK15YBFwnsVorDFG/u0cVVQN5G2a3V8zv5/N6KN3TCG+ajKtaALcy7u14DCSrJI+gZeyYquFL9cioJXOGXSvg==
+ dependencies:
+ wrapped "^1.0.1"
+
+unified-message-control@^3.0.0:
+ version "3.0.1"
+ resolved "https://registry.yarnpkg.com/unified-message-control/-/unified-message-control-3.0.1.tgz#7018855daea9af96082cbea35970d48c9c4dbbf2"
+ integrity sha512-K2Kvvp1DBzeuxYLLsumZh/gDWUTl4e2z/P3VReFirC78cfHKtQifbhnfRrSBtKtd1Uc6cvYTW0/SZIUaMAEcTg==
+ dependencies:
+ unist-util-visit "^2.0.0"
+ vfile-location "^3.0.0"
+
unified@9.0.0, unified@^9.0.0:
version "9.0.0"
resolved "https://registry.yarnpkg.com/unified/-/unified-9.0.0.tgz#12b099f97ee8b36792dbad13d278ee2f696eed1d"
@@ -16742,6 +17030,18 @@ unified@^7.0.0:
vfile "^3.0.0"
x-is-string "^0.1.0"
+unified@^9.1.0:
+ version "9.2.0"
+ resolved "https://registry.yarnpkg.com/unified/-/unified-9.2.0.tgz#67a62c627c40589edebbf60f53edfd4d822027f8"
+ integrity sha512-vx2Z0vY+a3YoTj8+pttM3tiJHCwY5UFbYdiWrwBEbHmK8pvsPj2rtAX2BFfgXen8T39CJWblWRDT4L5WGXtDdg==
+ dependencies:
+ bail "^1.0.0"
+ extend "^3.0.0"
+ is-buffer "^2.0.0"
+ is-plain-obj "^2.0.0"
+ trough "^1.0.0"
+ vfile "^4.0.0"
+
union-value@^1.0.0:
version "1.0.1"
resolved "https://registry.yarnpkg.com/union-value/-/union-value-1.0.1.tgz#0b6fe7b835aecda61c6ea4d4f02c14221e109847"
@@ -16845,7 +17145,7 @@ unist-util-modify-children@^1.0.0:
dependencies:
array-iterate "^1.0.0"
-unist-util-position@^3.0.0:
+unist-util-position@^3.0.0, unist-util-position@^3.1.0:
version "3.1.0"
resolved "https://registry.yarnpkg.com/unist-util-position/-/unist-util-position-3.1.0.tgz#1c42ee6301f8d52f47d14f62bbdb796571fa2d47"
integrity sha512-w+PkwCbYSFw8vpgWD0v7zRCl1FpY3fjDSQ3/N/wNd9Ffa4gPi8+4keqt99N3XW6F99t/mUzp2xAhNmfKWp95QA==
@@ -17198,6 +17498,11 @@ vfile-location@^3.0.0:
resolved "https://registry.yarnpkg.com/vfile-location/-/vfile-location-3.0.1.tgz#d78677c3546de0f7cd977544c367266764d31bb3"
integrity sha512-yYBO06eeN/Ki6Kh1QAkgzYpWT1d3Qln+ZCtSbJqFExPl1S3y2qqotJQXoh6qEvl/jDlgpUJolBn3PItVnnZRqQ==
+vfile-location@^3.1.0:
+ version "3.2.0"
+ resolved "https://registry.yarnpkg.com/vfile-location/-/vfile-location-3.2.0.tgz#d8e41fbcbd406063669ebf6c33d56ae8721d0f3c"
+ integrity sha512-aLEIZKv/oxuCDZ8lkJGhuhztf/BW4M+iHdCwglA/eWc+vtuRFJj8EtgceYFX4LRjOhCAAiNHsKGssC6onJ+jbA==
+
vfile-message@*, vfile-message@^2.0.0:
version "2.0.4"
resolved "https://registry.yarnpkg.com/vfile-message/-/vfile-message-2.0.4.tgz#5b43b88171d409eae58477d13f23dd41d52c371a"
@@ -17651,6 +17956,14 @@ wrap-ansi@^6.2.0:
string-width "^4.1.0"
strip-ansi "^6.0.0"
+wrapped@^1.0.1:
+ version "1.0.1"
+ resolved "https://registry.yarnpkg.com/wrapped/-/wrapped-1.0.1.tgz#c783d9d807b273e9b01e851680a938c87c907242"
+ integrity sha1-x4PZ2Aeyc+mwHoUWgKk4yHyQckI=
+ dependencies:
+ co "3.1.0"
+ sliced "^1.0.1"
+
wrappy@1:
version "1.0.2"
resolved "https://registry.yarnpkg.com/wrappy/-/wrappy-1.0.2.tgz#b5243d8f3ec1aa35f1364605bc0d1036e30ab69f"
From 1136b3b77b85144e76cc75f9e1da653516e41cac Mon Sep 17 00:00:00 2001
From: rogermparent
Date: Wed, 18 Nov 2020 17:56:49 -0500
Subject: [PATCH 04/59] Add example tooltip that shows off tooltip overrides
---
content/docs/user-guide/glossary/dvc-project.md | 8 ++++++++
1 file changed, 8 insertions(+)
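
The override in this patch can be read as a simple fallback rule: when a glossary entry defines `tooltip` in its frontmatter, that text is presumably what the tooltip shows, and the page body is used otherwise. A minimal sketch of that rule follows. The `resolveTooltip` helper and the `{ tooltip, body }` entry shape are hypothetical and are not taken from the site's code.

```javascript
// Hypothetical glossary entry shape: { tooltip?: string, body: string }.
// Neither the shape nor the helper name comes from the site's source.
function resolveTooltip(entry) {
  // Prefer the explicit `tooltip` override when it is a non-empty string.
  if (typeof entry.tooltip === 'string' && entry.tooltip.trim() !== '') {
    return entry.tooltip.trim()
  }
  // Otherwise fall back to the full body of the glossary page.
  return entry.body.trim()
}
```

Keeping the fallback means entries without a `tooltip` field would behave as before, while longer pages can carry a short summary for the tooltip.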
diff --git a/content/docs/user-guide/glossary/dvc-project.md b/content/docs/user-guide/glossary/dvc-project.md
index af2a191a50..fbfb5f8b82 100644
--- a/content/docs/user-guide/glossary/dvc-project.md
+++ b/content/docs/user-guide/glossary/dvc-project.md
@@ -11,6 +11,11 @@ match:
repository,
repositories,
]
+tooltip: >-
+ Initialized by running `dvc init` in the **workspace** (typically a Git
+ repository). It will contain the [`.dvc/`
+ directory](/doc/user-guide/dvc-files-and-directories), as well as `dvc.yaml`
+ and `.dvc` files created with commands such as `dvc add` or `dvc run`.
---
Initialized by running `dvc init` in the **workspace** (typically a Git
@@ -18,3 +23,6 @@ repository). It will contain the
[`.dvc/` directory](/doc/user-guide/dvc-files-and-directories), as well as
`dvc.yaml` and `.dvc` files created with commands such as `dvc add` or
`dvc run`.
+
+This page extends for a _while_, elaborating on what a DVC Project is in a way
+where it doesn't make sense to put **all** this stuff in a tooltip.
From 852da9cbb0ed20b0ad9cf8813855f61e5f6cac86 Mon Sep 17 00:00:00 2001
From: jeremydesroches <18587991+jeremydesroches@users.noreply.github.com>
Date: Thu, 19 Nov 2020 15:45:33 -0700
Subject: [PATCH 05/59] break out concepts, add example links in tooltips
---
content/docs/sidebar.json | 13 +-
content/docs/user-guide/basic-concepts.md | 129 ------------------
content/docs/user-guide/concepts/cache.md | 18 +++
content/docs/user-guide/concepts/dvc-files.md | 26 ++++
.../docs/user-guide/concepts/metrics-plots.md | 26 ++++
content/docs/user-guide/concepts/pipelines.md | 20 +++
content/docs/user-guide/concepts/remote.md | 17 +++
content/docs/user-guide/concepts/workspace.md | 15 ++
content/docs/user-guide/glossary/dvc-cache.md | 18 ++-
content/docs/user-guide/glossary/workspace.md | 3 +
10 files changed, 148 insertions(+), 137 deletions(-)
delete mode 100644 content/docs/user-guide/basic-concepts.md
create mode 100644 content/docs/user-guide/concepts/cache.md
create mode 100644 content/docs/user-guide/concepts/dvc-files.md
create mode 100644 content/docs/user-guide/concepts/metrics-plots.md
create mode 100644 content/docs/user-guide/concepts/pipelines.md
create mode 100644 content/docs/user-guide/concepts/remote.md
create mode 100644 content/docs/user-guide/concepts/workspace.md
diff --git a/content/docs/sidebar.json b/content/docs/sidebar.json
index 90241f4386..41399b81b5 100644
--- a/content/docs/sidebar.json
+++ b/content/docs/sidebar.json
@@ -88,9 +88,16 @@
"source": "what-is-dvc.md"
},
{
- "label": "Basic Concepts",
- "slug": "basic-concepts",
- "source": "basic-concepts.md"
+ "slug": "concepts",
+ "source": false,
+ "children": [
+ "cache",
+ "dvc-files",
+ "metrics-plots",
+ "pipelines",
+ "remote",
+ "workspace"
+ ]
},
"dvc-files-and-directories",
"merge-conflicts",
diff --git a/content/docs/user-guide/basic-concepts.md b/content/docs/user-guide/basic-concepts.md
deleted file mode 100644
index 55b98df36a..0000000000
--- a/content/docs/user-guide/basic-concepts.md
+++ /dev/null
@@ -1,129 +0,0 @@
-# Basic Concepts
-
-Intro and DVC philosophy...possible diagram of cache/remote/workspace
-
-## Cache
-
-_From `dvc cache`_
-
-The DVC Cache is where your data files, models, etc. (anything you want to
-version with DVC) are actually stored. The data files and directories visible in
-the workspace are links\* to (or copies of) the ones in cache.
-Learn more about it's
-[structure](/doc/user-guide/dvc-files-and-directories#structure-of-the-cache-directory).
-
-_from tooltip_
-
-The DVC cache is a hidden storage (by default located in the `.dvc/cache`
-directory) for files that are tracked by DVC, and their different versions.
-Learn more about it's
-[structure](/doc/user-guide/dvc-files-and-directories#structure-of-the-cache-directory).
-
-## DVC Files
-
-_from dvc-files-and-directories_
-
-Once initialized in a project, DVC populates its installation
-directory (`.dvc/`) with the
-[internal directories and files](#internal-directories-and-files) needed for DVC
-operation.
-
-Additionally, there are a few metafiles that support DVC's features:
-
-- Files ending with the `.dvc` extension are placeholders to track data files
- and directories. A DVC project usually has one `.dvc` file per
- large data file or directory being tracked.
-- `dvc.yaml` files (or _pipelines files_) specify stages that form the
- pipeline(s) of a project, and how they connect (_dependency graph_ or DAG).
-
- These normally have a matching `dvc.lock` file to record the pipeline state
- and track its outputs.
-
-Both `.dvc` files and `dvc.yaml` use human-friendly YAML 1.2 schemas, described
-below. We encourage you to get familiar with them so you may create, generate,
-and edit them on your own.
-
-Both the internal directory and these metafiles should be versioned with Git (in
-Git-enabled repositories).
-
-## Metrics and Plots
-
-_from plots and metrics intros_
-
-DVC has two concepts for metrics, that represent different results of machine
-learning training or data processing:
-
-1. `dvc metrics` represent **scalar numbers** such as AUC, _true positive rate_,
- etc.
-2. `dvc plots` can be used to visualize **data series** such as AUC curves, loss
- functions, confusion matrices, etc.
-
-_from `dvc metrics`_
-
-In order to follow the performance of machine learning experiments, DVC has the
-ability to mark a certain stage outputs as metrics. These metrics
-are project-specific floating-point or integer values e.g. AUC, ROC, false
-positives, etc.
-
-_from `dvc plots` description_
-
-DVC provides a set of commands to visualize certain metrics of machine learning
-experiments as plots. Usual plot examples are AUC curves, loss functions,
-confusion matrices, among others.
-
-_probably should mention diff..._
-
-## Pipelines
-
-_from `dvc dag`_
-
-A data pipeline, in general, is a series of data processing
-[stages](/doc/command-reference/run) (for example, console commands that take an
-input and produce an output). A pipeline may produce intermediate
-data, and has a final result.
-
-Data science and machine learning pipelines typically start with large raw
-datasets, include intermediate featurization and training stages, and produce a
-final model, as well as accuracy [metrics](/doc/command-reference/metrics).
-
-In DVC, pipeline stages and commands, their data I/O, interdependencies, and
-results (intermediate or final) are specified in `dvc.yaml`, which can be
-written manually or built using the helper command `dvc run`. This allows DVC to
-restore one or more pipelines later (see `dvc repro`).
-
-> DVC builds a dependency graph
-> ([DAG](https://en.wikipedia.org/wiki/Directed_acyclic_graph)) to do this.
-
-## Remote
-
-_from `dvc remote`_
-
-What is data remote?
-
-The same way as GitHub provides storage hosting for Git repositories, DVC
-remotes provide a location to store and share data and models. You can pull data
-assets created by colleagues from DVC remotes without spending time and
-resources to build or process them locally. Remote storage can also save space
-on your local environment – DVC can [fetch](/doc/command-reference/fetch) into
-the cache directory only the data you need for a specific
-branch/commit.
-
-Using DVC with remote storage is optional. DVC commands use the local cache
-(usually in dir `.dvc/cache`) as data storage by default. This enables the main
-DVC usage scenarios out of the box.
-
-## Workspace
-
-_from workspace tooltip_
-
-Directory containing all your project files e.g. raw datasets, source code, ML
-models, etc. Typically, it's also a Git repository. It will contain your DVC
-project.
-
-_from dvc-project tooltip_
-
-Initialized by running `dvc init` in the **workspace** (typically a Git
-repository). It will contain the
-[`.dvc/` directory](/doc/user-guide/dvc-files-and-directories), as well as
-`dvc.yaml` and `.dvc` files created with commands such as `dvc add` or
-`dvc run`.
diff --git a/content/docs/user-guide/concepts/cache.md b/content/docs/user-guide/concepts/cache.md
new file mode 100644
index 0000000000..e768a6ebbe
--- /dev/null
+++ b/content/docs/user-guide/concepts/cache.md
@@ -0,0 +1,18 @@
+# Cache
+
+Diagram of cache/remote/workspace...
+
+_From `dvc cache`_
+
+The DVC Cache is where your data files, models, etc. (anything you want to
+version with DVC) are actually stored. The data files and directories visible in
+the workspace are links\* to (or copies of) the ones in cache.
+Learn more about its
+[structure](/doc/user-guide/dvc-files-and-directories#structure-of-the-cache-directory).
+
+_from tooltip_
+
+The DVC cache is a hidden storage (by default located in the `.dvc/cache`
+directory) for files that are tracked by DVC, and their different versions.
+Learn more about its
+[structure](/doc/user-guide/dvc-files-and-directories#structure-of-the-cache-directory).
diff --git a/content/docs/user-guide/concepts/dvc-files.md b/content/docs/user-guide/concepts/dvc-files.md
new file mode 100644
index 0000000000..db41610c48
--- /dev/null
+++ b/content/docs/user-guide/concepts/dvc-files.md
@@ -0,0 +1,26 @@
+# DVC Files
+
+_from dvc-files-and-directories_
+
+Once initialized in a project, DVC populates its installation
+directory (`.dvc/`) with the
+[internal directories and files](#internal-directories-and-files) needed for DVC
+operation.
+
+Additionally, there are a few metafiles that support DVC's features:
+
+- Files ending with the `.dvc` extension are placeholders to track data files
+ and directories. A DVC project usually has one `.dvc` file per
+ large data file or directory being tracked.
+- `dvc.yaml` files (or _pipelines files_) specify stages that form the
+ pipeline(s) of a project, and how they connect (_dependency graph_ or DAG).
+
+ These normally have a matching `dvc.lock` file to record the pipeline state
+ and track its outputs.
+
+Both `.dvc` files and `dvc.yaml` use human-friendly YAML 1.2 schemas, described
+below. We encourage you to get familiar with them so you may create, generate,
+and edit them on your own.
+
+Both the internal directory and these metafiles should be versioned with Git (in
+Git-enabled repositories).
diff --git a/content/docs/user-guide/concepts/metrics-plots.md b/content/docs/user-guide/concepts/metrics-plots.md
new file mode 100644
index 0000000000..676f7a9b23
--- /dev/null
+++ b/content/docs/user-guide/concepts/metrics-plots.md
@@ -0,0 +1,26 @@
+# Metrics and Plots
+
+_from plots and metrics intros_
+
+DVC has two concepts for metrics that represent different results of machine
+learning training or data processing:
+
+1. `dvc metrics` represent **scalar numbers** such as AUC, _true positive rate_,
+ etc.
+2. `dvc plots` can be used to visualize **data series** such as AUC curves, loss
+ functions, confusion matrices, etc.
+
+_from `dvc metrics`_
+
+In order to follow the performance of machine learning experiments, DVC has the
+ability to mark certain stage outputs as metrics. These metrics
+are project-specific floating-point or integer values e.g. AUC, ROC, false
+positives, etc.
+
+_from `dvc plots` description_
+
+DVC provides a set of commands to visualize certain metrics of machine learning
+experiments as plots. Usual plot examples are AUC curves, loss functions,
+confusion matrices, among others.
+
+_probably should mention diff..._
diff --git a/content/docs/user-guide/concepts/pipelines.md b/content/docs/user-guide/concepts/pipelines.md
new file mode 100644
index 0000000000..5fadacbcc2
--- /dev/null
+++ b/content/docs/user-guide/concepts/pipelines.md
@@ -0,0 +1,20 @@
+# Pipelines
+
+_from `dvc dag`_
+
+A data pipeline, in general, is a series of data processing
+[stages](/doc/command-reference/run) (for example, console commands that take an
+input and produce an output). A pipeline may produce intermediate
+data, and has a final result.
+
+Data science and machine learning pipelines typically start with large raw
+datasets, include intermediate featurization and training stages, and produce a
+final model, as well as accuracy [metrics](/doc/command-reference/metrics).
+
+In DVC, pipeline stages and commands, their data I/O, interdependencies, and
+results (intermediate or final) are specified in `dvc.yaml`, which can be
+written manually or built using the helper command `dvc run`. This allows DVC to
+restore one or more pipelines later (see `dvc repro`).
+
+> DVC builds a dependency graph
+> ([DAG](https://en.wikipedia.org/wiki/Directed_acyclic_graph)) to do this.
diff --git a/content/docs/user-guide/concepts/remote.md b/content/docs/user-guide/concepts/remote.md
new file mode 100644
index 0000000000..00846ee118
--- /dev/null
+++ b/content/docs/user-guide/concepts/remote.md
@@ -0,0 +1,17 @@
+# Remote
+
+_from `dvc remote`_
+
+What is a data remote?
+
+The same way as GitHub provides storage hosting for Git repositories, DVC
+remotes provide a location to store and share data and models. You can pull data
+assets created by colleagues from DVC remotes without spending time and
+resources to build or process them locally. Remote storage can also save space
+on your local environment – DVC can [fetch](/doc/command-reference/fetch) into
+the cache directory only the data you need for a specific
+branch/commit.
+
+Using DVC with remote storage is optional. DVC commands use the local cache
+(usually in the `.dvc/cache` directory) as data storage by default. This enables the main
+DVC usage scenarios out of the box.
diff --git a/content/docs/user-guide/concepts/workspace.md b/content/docs/user-guide/concepts/workspace.md
new file mode 100644
index 0000000000..f2f070adb3
--- /dev/null
+++ b/content/docs/user-guide/concepts/workspace.md
@@ -0,0 +1,15 @@
+# Workspace
+
+_from workspace tooltip_
+
+Directory containing all your project files e.g. raw datasets, source code, ML
+models, etc. Typically, it's also a Git repository. It will contain your DVC
+project.
+
+_from dvc-project tooltip_
+
+Initialized by running `dvc init` in the **workspace** (typically a Git
+repository). It will contain the
+[`.dvc/` directory](/doc/user-guide/dvc-files-and-directories), as well as
+`dvc.yaml` and `.dvc` files created with commands such as `dvc add` or
+`dvc run`.
diff --git a/content/docs/user-guide/glossary/dvc-cache.md b/content/docs/user-guide/glossary/dvc-cache.md
index 81b624549f..49d9d31333 100644
--- a/content/docs/user-guide/glossary/dvc-cache.md
+++ b/content/docs/user-guide/glossary/dvc-cache.md
@@ -1,12 +1,20 @@
---
name: 'DVC Cache'
match: ['DVC cache', cache, caches, cached]
+tooltip: >-
+ The DVC cache is a hidden storage (by default located in the `.dvc/cache`
+ directory) for files that are tracked by DVC, and their different versions.
+ Learn more about its
+ [structure](/doc/user-guide/dvc-files-and-directories#structure-of-the-cache-directory).
+
+ Learn more about the [concept of cache](/doc/user-guide/concepts/cache) in
+ DVC.
---
-The DVC cache is a hidden storage (by default located in the `.dvc/cache`
-directory) for files that are tracked by DVC, and their different versions.
+_From `dvc cache`_
+
+The DVC Cache is where your data files, models, etc. (anything you want to
+version with DVC) are actually stored. The data files and directories visible in
+the workspace are links\* to (or copies of) the ones in cache.
Learn more about it's
[structure](/doc/user-guide/dvc-files-and-directories#structure-of-the-cache-directory).
-
-Learn more about the [concept of cache](/doc/user-guide/basic-concepts#cache) in
-DVC.
diff --git a/content/docs/user-guide/glossary/workspace.md b/content/docs/user-guide/glossary/workspace.md
index 76621c5e0c..5e5d1cfe24 100644
--- a/content/docs/user-guide/glossary/workspace.md
+++ b/content/docs/user-guide/glossary/workspace.md
@@ -6,3 +6,6 @@ match: [workspace]
Directory containing all your project files e.g. raw datasets, source code, ML
models, etc. Typically, it's also a Git repository. It will contain your DVC
project.
+
+Learn more about the [workspace concept](/doc/user-guide/concepts/workspace) in
+DVC.
From 491b367094f47d10dd24b559b8d37a4d65f24a86 Mon Sep 17 00:00:00 2001
From: Jorge Orpinel
Date: Thu, 19 Nov 2020 19:00:28 -0600
Subject: [PATCH 06/59] Update content/docs/user-guide/concepts/pipelines.md
---
content/docs/user-guide/concepts/pipelines.md | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/content/docs/user-guide/concepts/pipelines.md b/content/docs/user-guide/concepts/pipelines.md
index 5fadacbcc2..b1d68c1152 100644
--- a/content/docs/user-guide/concepts/pipelines.md
+++ b/content/docs/user-guide/concepts/pipelines.md
@@ -1,4 +1,4 @@
-# Pipelines
+# Data Pipelines
_from `dvc dag`_
From 98bbed1024136c6191b5e38836c0486403e59e58 Mon Sep 17 00:00:00 2001
From: Jorge Orpinel
Date: Thu, 19 Nov 2020 19:00:45 -0600
Subject: [PATCH 07/59] Update content/docs/user-guide/concepts/remote.md
---
content/docs/user-guide/concepts/remote.md | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/content/docs/user-guide/concepts/remote.md b/content/docs/user-guide/concepts/remote.md
index 00846ee118..a75bfd04a0 100644
--- a/content/docs/user-guide/concepts/remote.md
+++ b/content/docs/user-guide/concepts/remote.md
@@ -1,4 +1,4 @@
-# Remote
+# Remote Storage
_from `dvc remote`_
From 1a1a1ba78e566a45ae54b4d23b0e15330039e45c Mon Sep 17 00:00:00 2001
From: rogermparent
Date: Thu, 19 Nov 2020 23:15:08 -0500
Subject: [PATCH 08/59] Enable glossary page doc creation
---
src/gatsby/models/docs/onCreateMarkdownContentNode.js | 4 ----
1 file changed, 4 deletions(-)
diff --git a/src/gatsby/models/docs/onCreateMarkdownContentNode.js b/src/gatsby/models/docs/onCreateMarkdownContentNode.js
index 1393e1dd99..3754167e26 100644
--- a/src/gatsby/models/docs/onCreateMarkdownContentNode.js
+++ b/src/gatsby/models/docs/onCreateMarkdownContentNode.js
@@ -1,10 +1,6 @@
const path = require('path')
async function createMarkdownDocsNode(api, { parentNode, createChildNode }) {
- // Suppress page creation for Basic Concepts and the Glossary
- // They're only used in tooltips now, but we intend to expand on them later.
- if (parentNode.relativeDirectory === 'docs/user-guide/glossary') return
-
const splitDir = parentNode.relativeDirectory.split('/')
if (splitDir[0] !== 'docs') return
From 95738705b57d10e00a12b7dff9c685fac9422699 Mon Sep 17 00:00:00 2001
From: rogermparent
Date: Fri, 20 Nov 2020 01:12:44 -0500
Subject: [PATCH 09/59] Add DVC Project example for glossary page
---
content/docs/sidebar.json | 5 +++--
content/docs/user-guide/glossary/dvc-project.md | 3 +++
2 files changed, 6 insertions(+), 2 deletions(-)
diff --git a/content/docs/sidebar.json b/content/docs/sidebar.json
index 41399b81b5..4449cc8aae 100644
--- a/content/docs/sidebar.json
+++ b/content/docs/sidebar.json
@@ -88,10 +88,11 @@
"source": "what-is-dvc.md"
},
{
- "slug": "concepts",
+ "slug": "glossary",
"source": false,
"children": [
- "cache",
+ "dvc-project",
+ "dvc-cache",
"dvc-files",
"metrics-plots",
"pipelines",
diff --git a/content/docs/user-guide/glossary/dvc-project.md b/content/docs/user-guide/glossary/dvc-project.md
index fbfb5f8b82..204f037266 100644
--- a/content/docs/user-guide/glossary/dvc-project.md
+++ b/content/docs/user-guide/glossary/dvc-project.md
@@ -1,4 +1,5 @@
---
+title: 'DVC Project'
name: 'DVC Project'
match:
[
@@ -18,6 +19,8 @@ tooltip: >-
and `.dvc` files created with commands such as `dvc add` or `dvc run`.
---
+# DVC Project
+
Initialized by running `dvc init` in the **workspace** (typically a Git
repository). It will contain the
[`.dvc/` directory](/doc/user-guide/dvc-files-and-directories), as well as
From 7ed4ddf2284ff67b99dbd48939db67e73186fb4c Mon Sep 17 00:00:00 2001
From: jeremydesroches <18587991+jeremydesroches@users.noreply.github.com>
Date: Fri, 20 Nov 2020 10:31:50 -0700
Subject: [PATCH 10/59] Merge concepts and glossary, add frontmatter, remove
duplicate content
---
content/docs/user-guide/concepts/cache.md | 18 ------------------
content/docs/user-guide/concepts/workspace.md | 15 ---------------
content/docs/user-guide/glossary/dvc-cache.md | 12 +++++++-----
.../{concepts => glossary}/dvc-files.md | 6 ++++++
.../{concepts => glossary}/metrics-plots.md | 6 ++++++
.../{concepts => glossary}/pipelines.md | 6 ++++++
.../{concepts => glossary}/remote.md | 6 ++++++
content/docs/user-guide/glossary/workspace.md | 18 +++++++++++++-----
8 files changed, 44 insertions(+), 43 deletions(-)
delete mode 100644 content/docs/user-guide/concepts/cache.md
delete mode 100644 content/docs/user-guide/concepts/workspace.md
rename content/docs/user-guide/{concepts => glossary}/dvc-files.md (91%)
rename content/docs/user-guide/{concepts => glossary}/metrics-plots.md (89%)
rename content/docs/user-guide/{concepts => glossary}/pipelines.md (88%)
rename content/docs/user-guide/{concepts => glossary}/remote.md (85%)
diff --git a/content/docs/user-guide/concepts/cache.md b/content/docs/user-guide/concepts/cache.md
deleted file mode 100644
index e768a6ebbe..0000000000
--- a/content/docs/user-guide/concepts/cache.md
+++ /dev/null
@@ -1,18 +0,0 @@
-# Cache
-
-Diagram of cache/remote/workspace...
-
-_From `dvc cache`_
-
-The DVC Cache is where your data files, models, etc. (anything you want to
-version with DVC) are actually stored. The data files and directories visible in
-the workspace are links\* to (or copies of) the ones in cache.
-Learn more about its
-[structure](/doc/user-guide/dvc-files-and-directories#structure-of-the-cache-directory).
-
-_from tooltip_
-
-The DVC cache is a hidden storage (by default located in the `.dvc/cache`
-directory) for files that are tracked by DVC, and their different versions.
-Learn more about its
-[structure](/doc/user-guide/dvc-files-and-directories#structure-of-the-cache-directory).
diff --git a/content/docs/user-guide/concepts/workspace.md b/content/docs/user-guide/concepts/workspace.md
deleted file mode 100644
index f2f070adb3..0000000000
--- a/content/docs/user-guide/concepts/workspace.md
+++ /dev/null
@@ -1,15 +0,0 @@
-# Workspace
-
-_from workspace tooltip_
-
-Directory containing all your project files e.g. raw datasets, source code, ML
-models, etc. Typically, it's also a Git repository. It will contain your DVC
-project.
-
-_from dvc-project tooltip_
-
-Initialized by running `dvc init` in the **workspace** (typically a Git
-repository). It will contain the
-[`.dvc/` directory](/doc/user-guide/dvc-files-and-directories), as well as
-`dvc.yaml` and `.dvc` files created with commands such as `dvc add` or
-`dvc run`.
diff --git a/content/docs/user-guide/glossary/dvc-cache.md b/content/docs/user-guide/glossary/dvc-cache.md
index 49d9d31333..62c6f0c7e2 100644
--- a/content/docs/user-guide/glossary/dvc-cache.md
+++ b/content/docs/user-guide/glossary/dvc-cache.md
@@ -1,16 +1,18 @@
---
name: 'DVC Cache'
-match: ['DVC cache', cache, caches, cached]
+match: ['DVC cache', 'cache', 'caches', 'cached']
tooltip: >-
The DVC cache is a hidden storage (by default located in the `.dvc/cache`
directory) for files that are tracked by DVC, and their different versions.
 Learn more about its
- [structure](/doc/user-guide/dvc-files-and-directories#structure-of-the-cache-directory).
-
- Learn more about the [concept of cache](/doc/user-guide/concepts/cache) in
- DVC.
+ [structure](/doc/user-guide/dvc-files-and-directories#structure-of-the-cache-directory).
+ Learn more about the [concept of cache](/doc/user-guide/glossary/dvc-cache) in DVC.
---
+# DVC Cache
+
+Diagram of cache/remote/workspace...
+
_From `dvc cache`_
The DVC Cache is where your data files, models, etc. (anything you want to
diff --git a/content/docs/user-guide/concepts/dvc-files.md b/content/docs/user-guide/glossary/dvc-files.md
similarity index 91%
rename from content/docs/user-guide/concepts/dvc-files.md
rename to content/docs/user-guide/glossary/dvc-files.md
index db41610c48..1b1e4b5f26 100644
--- a/content/docs/user-guide/concepts/dvc-files.md
+++ b/content/docs/user-guide/glossary/dvc-files.md
@@ -1,3 +1,9 @@
+---
+name: 'DVC Files'
+match: ['DVC files', 'files', 'directories']
+tooltip: 'DVC files tooltip...'
+---
+
# DVC Files
_from dvc-files-and-directories_
diff --git a/content/docs/user-guide/concepts/metrics-plots.md b/content/docs/user-guide/glossary/metrics-plots.md
similarity index 89%
rename from content/docs/user-guide/concepts/metrics-plots.md
rename to content/docs/user-guide/glossary/metrics-plots.md
index 676f7a9b23..408b5ab4a6 100644
--- a/content/docs/user-guide/concepts/metrics-plots.md
+++ b/content/docs/user-guide/glossary/metrics-plots.md
@@ -1,3 +1,9 @@
+---
+name: 'Metrics and Plots'
+match: ['metrics', 'plots']
+tooltip: 'Metrics and plots tooltip...'
+---
+
# Metrics and Plots
_from plots and metrics intros_
diff --git a/content/docs/user-guide/concepts/pipelines.md b/content/docs/user-guide/glossary/pipelines.md
similarity index 88%
rename from content/docs/user-guide/concepts/pipelines.md
rename to content/docs/user-guide/glossary/pipelines.md
index b1d68c1152..634f8f768b 100644
--- a/content/docs/user-guide/concepts/pipelines.md
+++ b/content/docs/user-guide/glossary/pipelines.md
@@ -1,3 +1,9 @@
+---
+name: 'Data Pipelines'
+match: ['data pipeline', 'pipeline', 'pipelines']
+tooltip: 'DVC pipelines tooltip...'
+---
+
# Data Pipelines
_from `dvc dag`_
diff --git a/content/docs/user-guide/concepts/remote.md b/content/docs/user-guide/glossary/remote.md
similarity index 85%
rename from content/docs/user-guide/concepts/remote.md
rename to content/docs/user-guide/glossary/remote.md
index a75bfd04a0..17f0b7928e 100644
--- a/content/docs/user-guide/concepts/remote.md
+++ b/content/docs/user-guide/glossary/remote.md
@@ -1,3 +1,9 @@
+---
+name: 'Remote Storage'
+match: ['DVC remote', 'remote', 'remote storage']
+tooltip: 'DVC remote storage tooltip...'
+---
+
# Remote Storage
_from `dvc remote`_
diff --git a/content/docs/user-guide/glossary/workspace.md b/content/docs/user-guide/glossary/workspace.md
index 5e5d1cfe24..6acde9c328 100644
--- a/content/docs/user-guide/glossary/workspace.md
+++ b/content/docs/user-guide/glossary/workspace.md
@@ -1,11 +1,19 @@
---
name: Workspace
match: [workspace]
+tooltip: >-
+ Directory containing all your project files e.g. raw datasets, source code, ML
+ models, etc. Typically, it's also a Git repository. It will contain your DVC
+ project. Learn more about the [workspace
+ concept](/doc/user-guide/glossary/workspace) in DVC.
---
-Directory containing all your project files e.g. raw datasets, source code, ML
-models, etc. Typically, it's also a Git repository. It will contain your DVC
-project.
+# Workspace
-Learn more about the [workspace concept](/doc/user-guide/concepts/workspace) in
-DVC.
+_from dvc-project tooltip_
+
+Initialized by running `dvc init` in the **workspace** (typically a Git
+repository). It will contain the
+[`.dvc/` directory](/doc/user-guide/dvc-files-and-directories), as well as
+`dvc.yaml` and `.dvc` files created with commands such as `dvc add` or
+`dvc run`.
From 53c99b7875fcd2a6e853f1d5429ce9492096dd66 Mon Sep 17 00:00:00 2001
From: jeremydesroches <18587991+jeremydesroches@users.noreply.github.com>
Date: Fri, 20 Nov 2020 14:59:05 -0700
Subject: [PATCH 11/59] Files -> metafiles. Comments. Reorder concepts. Nav
Glossary -> Concepts
---
content/docs/sidebar.json | 10 +++++-----
content/docs/user-guide/glossary/dvc-cache.md | 4 ++--
.../glossary/{dvc-files.md => dvc-metafiles.md} | 8 ++++----
content/docs/user-guide/glossary/dvc-project.md | 3 ---
content/docs/user-guide/glossary/metrics-plots.md | 8 ++++----
content/docs/user-guide/glossary/pipelines.md | 2 +-
content/docs/user-guide/glossary/remote.md | 2 +-
content/docs/user-guide/glossary/workspace.md | 2 +-
8 files changed, 18 insertions(+), 21 deletions(-)
rename content/docs/user-guide/glossary/{dvc-files.md => dvc-metafiles.md} (90%)
diff --git a/content/docs/sidebar.json b/content/docs/sidebar.json
index 4449cc8aae..1678c873ff 100644
--- a/content/docs/sidebar.json
+++ b/content/docs/sidebar.json
@@ -88,16 +88,16 @@
"source": "what-is-dvc.md"
},
{
+ "label": "Concepts",
"slug": "glossary",
"source": false,
"children": [
- "dvc-project",
+ "workspace",
+ "dvc-metafiles",
"dvc-cache",
- "dvc-files",
- "metrics-plots",
"pipelines",
- "remote",
- "workspace"
+ "metrics-plots",
+ "remote"
]
},
"dvc-files-and-directories",
diff --git a/content/docs/user-guide/glossary/dvc-cache.md b/content/docs/user-guide/glossary/dvc-cache.md
index 62c6f0c7e2..68cfd46cf3 100644
--- a/content/docs/user-guide/glossary/dvc-cache.md
+++ b/content/docs/user-guide/glossary/dvc-cache.md
@@ -11,9 +11,9 @@ tooltip: >-
# DVC Cache
-Diagram of cache/remote/workspace...
+
The DVC Cache is where your data files, models, etc. (anything you want to
version with DVC) are actually stored. The data files and directories visible in
diff --git a/content/docs/user-guide/glossary/dvc-files.md b/content/docs/user-guide/glossary/dvc-metafiles.md
similarity index 90%
rename from content/docs/user-guide/glossary/dvc-files.md
rename to content/docs/user-guide/glossary/dvc-metafiles.md
index 1b1e4b5f26..5980712679 100644
--- a/content/docs/user-guide/glossary/dvc-files.md
+++ b/content/docs/user-guide/glossary/dvc-metafiles.md
@@ -1,12 +1,12 @@
---
-name: 'DVC Files'
+name: 'DVC Metafiles'
match: ['DVC files', 'files', 'directories']
-tooltip: 'DVC files tooltip...'
+tooltip: 'DVC metafiles tooltip...'
---
-# DVC Files
+# DVC Metafiles
-_from dvc-files-and-directories_
+
Once initialized in a project, DVC populates its installation
directory (`.dvc/`) with the
diff --git a/content/docs/user-guide/glossary/dvc-project.md b/content/docs/user-guide/glossary/dvc-project.md
index 204f037266..346d6a153e 100644
--- a/content/docs/user-guide/glossary/dvc-project.md
+++ b/content/docs/user-guide/glossary/dvc-project.md
@@ -26,6 +26,3 @@ repository). It will contain the
[`.dvc/` directory](/doc/user-guide/dvc-files-and-directories), as well as
`dvc.yaml` and `.dvc` files created with commands such as `dvc add` or
`dvc run`.
-
-This page extends for a _while_, elaborating on what a DVC Project is in a way
-where it doesn't make sense to put **all** this stuff in a tooltip.
diff --git a/content/docs/user-guide/glossary/metrics-plots.md b/content/docs/user-guide/glossary/metrics-plots.md
index 408b5ab4a6..518b67ee69 100644
--- a/content/docs/user-guide/glossary/metrics-plots.md
+++ b/content/docs/user-guide/glossary/metrics-plots.md
@@ -6,7 +6,7 @@ tooltip: 'Metrics and plots tooltip...'
# Metrics and Plots
-_from plots and metrics intros_
+
DVC has two concepts for metrics, that represent different results of machine
learning training or data processing:
@@ -16,17 +16,17 @@ learning training or data processing:
2. `dvc plots` can be used to visualize **data series** such as AUC curves, loss
functions, confusion matrices, etc.
-_from `dvc metrics`_
+
In order to follow the performance of machine learning experiments, DVC has the
ability to mark a certain stage outputs as metrics. These metrics
are project-specific floating-point or integer values e.g. AUC, ROC, false
positives, etc.
-_from `dvc plots` description_
+
DVC provides a set of commands to visualize certain metrics of machine learning
experiments as plots. Usual plot examples are AUC curves, loss functions,
confusion matrices, among others.
-_probably should mention diff..._
+
diff --git a/content/docs/user-guide/glossary/pipelines.md b/content/docs/user-guide/glossary/pipelines.md
index 634f8f768b..a5367ca7ed 100644
--- a/content/docs/user-guide/glossary/pipelines.md
+++ b/content/docs/user-guide/glossary/pipelines.md
@@ -6,7 +6,7 @@ tooltip: 'DVC pipelines tooltip...'
# Data Pipelines
-_from `dvc dag`_
+
A data pipeline, in general, is a series of data processing
[stages](/doc/command-reference/run) (for example, console commands that take an
diff --git a/content/docs/user-guide/glossary/remote.md b/content/docs/user-guide/glossary/remote.md
index 17f0b7928e..c0a5ab4c0f 100644
--- a/content/docs/user-guide/glossary/remote.md
+++ b/content/docs/user-guide/glossary/remote.md
@@ -6,7 +6,7 @@ tooltip: 'DVC remote storage tooltip...'
# Remote Storage
-_from `dvc remote`_
+
What is data remote?
diff --git a/content/docs/user-guide/glossary/workspace.md b/content/docs/user-guide/glossary/workspace.md
index 6acde9c328..42ed55017c 100644
--- a/content/docs/user-guide/glossary/workspace.md
+++ b/content/docs/user-guide/glossary/workspace.md
@@ -10,7 +10,7 @@ tooltip: >-
# Workspace
-_from dvc-project tooltip_
+
Initialized by running `dvc init` in the **workspace** (typically a Git
repository). It will contain the
From 5e9ed4f2bb3a8389fe6cc47265b3c8a98458f406 Mon Sep 17 00:00:00 2001
From: Jorge Orpinel
Date: Fri, 20 Nov 2020 16:46:21 -0600
Subject: [PATCH 12/59] guide: full Basic Concepts name on nav
---
content/docs/sidebar.json | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/content/docs/sidebar.json b/content/docs/sidebar.json
index 1678c873ff..97de25faf0 100644
--- a/content/docs/sidebar.json
+++ b/content/docs/sidebar.json
@@ -88,7 +88,7 @@
"source": "what-is-dvc.md"
},
{
- "label": "Concepts",
+ "label": "Basic Concepts",
"slug": "glossary",
"source": false,
"children": [
From 05b28ca95dae160a14a8169af470bec7967e2349 Mon Sep 17 00:00:00 2001
From: jeremydesroches <18587991+jeremydesroches@users.noreply.github.com>
Date: Fri, 20 Nov 2020 16:12:12 -0700
Subject: [PATCH 13/59] Revert DVC Project glossary item.
---
content/docs/user-guide/glossary/dvc-project.md | 8 --------
1 file changed, 8 deletions(-)
diff --git a/content/docs/user-guide/glossary/dvc-project.md b/content/docs/user-guide/glossary/dvc-project.md
index 346d6a153e..af2a191a50 100644
--- a/content/docs/user-guide/glossary/dvc-project.md
+++ b/content/docs/user-guide/glossary/dvc-project.md
@@ -1,5 +1,4 @@
---
-title: 'DVC Project'
name: 'DVC Project'
match:
[
@@ -12,15 +11,8 @@ match:
repository,
repositories,
]
-tooltip: >-
- Initialized by running `dvc init` in the **workspace** (typically a Git
- repository). It will contain the [`.dvc/`
- directory](/doc/user-guide/dvc-files-and-directories), as well as `dvc.yaml`
- and `.dvc` files created with commands such as `dvc add` or `dvc run`.
---
-# DVC Project
-
Initialized by running `dvc init` in the **workspace** (typically a Git
repository). It will contain the
[`.dvc/` directory](/doc/user-guide/dvc-files-and-directories), as well as
From c4d45b2df788de80f59e260293d2e1562e4ba2d4 Mon Sep 17 00:00:00 2001
From: jeremydesroches <18587991+jeremydesroches@users.noreply.github.com>
Date: Fri, 20 Nov 2020 16:32:41 -0700
Subject: [PATCH 14/59] Update basic concepts nav titles
---
content/docs/sidebar.json | 6 +++---
.../user-guide/glossary/{pipelines.md => data-pipelines.md} | 0
.../glossary/{metrics-plots.md => metrics-and-plots.md} | 0
.../user-guide/glossary/{remote.md => remote-storage.md} | 0
4 files changed, 3 insertions(+), 3 deletions(-)
rename content/docs/user-guide/glossary/{pipelines.md => data-pipelines.md} (100%)
rename content/docs/user-guide/glossary/{metrics-plots.md => metrics-and-plots.md} (100%)
rename content/docs/user-guide/glossary/{remote.md => remote-storage.md} (100%)
diff --git a/content/docs/sidebar.json b/content/docs/sidebar.json
index 97de25faf0..3d458facb9 100644
--- a/content/docs/sidebar.json
+++ b/content/docs/sidebar.json
@@ -95,9 +95,9 @@
"workspace",
"dvc-metafiles",
"dvc-cache",
- "pipelines",
- "metrics-plots",
- "remote"
+ "data-pipelines",
+ "metrics-and-plots",
+ "remote-storage"
]
},
"dvc-files-and-directories",
diff --git a/content/docs/user-guide/glossary/pipelines.md b/content/docs/user-guide/glossary/data-pipelines.md
similarity index 100%
rename from content/docs/user-guide/glossary/pipelines.md
rename to content/docs/user-guide/glossary/data-pipelines.md
diff --git a/content/docs/user-guide/glossary/metrics-plots.md b/content/docs/user-guide/glossary/metrics-and-plots.md
similarity index 100%
rename from content/docs/user-guide/glossary/metrics-plots.md
rename to content/docs/user-guide/glossary/metrics-and-plots.md
diff --git a/content/docs/user-guide/glossary/remote.md b/content/docs/user-guide/glossary/remote-storage.md
similarity index 100%
rename from content/docs/user-guide/glossary/remote.md
rename to content/docs/user-guide/glossary/remote-storage.md
From 02aa8ea1774329255f06d4b097fe0fab84d07f56 Mon Sep 17 00:00:00 2001
From: jeremydesroches <18587991+jeremydesroches@users.noreply.github.com>
Date: Fri, 20 Nov 2020 16:39:26 -0700
Subject: [PATCH 15/59] Update content/docs/sidebar.json
Co-authored-by: Jorge Orpinel
---
content/docs/sidebar.json | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/content/docs/sidebar.json b/content/docs/sidebar.json
index 3d458facb9..44a069054f 100644
--- a/content/docs/sidebar.json
+++ b/content/docs/sidebar.json
@@ -96,8 +96,8 @@
"dvc-metafiles",
"dvc-cache",
"data-pipelines",
- "metrics-and-plots",
- "remote-storage"
+ "remote-storage",
+ "metrics-and-plots"
]
},
"dvc-files-and-directories",
From 155c9bf07566caeaf858e4e9b42d565bd8989e13 Mon Sep 17 00:00:00 2001
From: jeremydesroches <18587991+jeremydesroches@users.noreply.github.com>
Date: Fri, 20 Nov 2020 16:48:48 -0700
Subject: [PATCH 16/59] Update dvc-cache and workspace tooltips.
---
content/docs/user-guide/glossary/dvc-cache.md | 9 ++++-----
content/docs/user-guide/glossary/workspace.md | 7 +++----
2 files changed, 7 insertions(+), 9 deletions(-)
diff --git a/content/docs/user-guide/glossary/dvc-cache.md b/content/docs/user-guide/glossary/dvc-cache.md
index 68cfd46cf3..7eec6952bf 100644
--- a/content/docs/user-guide/glossary/dvc-cache.md
+++ b/content/docs/user-guide/glossary/dvc-cache.md
@@ -2,11 +2,10 @@
name: 'DVC Cache'
match: ['DVC cache', 'cache', 'caches', 'cached']
tooltip: >-
- The DVC cache is a hidden storage (by default located in the `.dvc/cache`
- directory) for files that are tracked by DVC, and their different versions.
- Learn more about it's
- [structure](/doc/user-guide/dvc-files-and-directories#structure-of-the-cache-directory).Learn
- more about the [concept of cache](/doc/user-guide/glossary/dvc-cache) in DVC.
+ The [DVC cache](/doc/user-guide/glossary/dvc-cache) is a hidden storage (by
+ default located in the `.dvc/cache` directory) for files that are tracked by
+ DVC, and their different versions. Learn more about it's
+ [structure](/doc/user-guide/dvc-files-and-directories#structure-of-the-cache-directory).
---
# DVC Cache
diff --git a/content/docs/user-guide/glossary/workspace.md b/content/docs/user-guide/glossary/workspace.md
index 42ed55017c..ec6f9595e6 100644
--- a/content/docs/user-guide/glossary/workspace.md
+++ b/content/docs/user-guide/glossary/workspace.md
@@ -2,10 +2,9 @@
name: Workspace
match: [workspace]
tooltip: >-
- Directory containing all your project files e.g. raw datasets, source code, ML
- models, etc. Typically, it's also a Git repository. It will contain your DVC
- project. Learn more about the [workspace
- concept](/doc/user-guide/glossary/workspace) in DVC.
+ The [workspace](/doc/user-guide/glossary/workspace) is the directory
+ containing all your project files e.g. raw datasets, source code, ML models,
+ etc. Typically, it's also a Git repository. It will contain your DVC project.
---
# Workspace
From 5195366270ee379f71735160f3208877171d2ca0 Mon Sep 17 00:00:00 2001
From: jeremydesroches <18587991+jeremydesroches@users.noreply.github.com>
Date: Tue, 24 Nov 2020 21:20:54 -0700
Subject: [PATCH 17/59] Add link placeholders for basic concept tooltips.
---
content/docs/user-guide/glossary/data-pipelines.md | 3 ++-
content/docs/user-guide/glossary/dvc-metafiles.md | 3 ++-
content/docs/user-guide/glossary/metrics-and-plots.md | 3 ++-
content/docs/user-guide/glossary/remote-storage.md | 3 ++-
4 files changed, 8 insertions(+), 4 deletions(-)
diff --git a/content/docs/user-guide/glossary/data-pipelines.md b/content/docs/user-guide/glossary/data-pipelines.md
index a5367ca7ed..de67f81c6a 100644
--- a/content/docs/user-guide/glossary/data-pipelines.md
+++ b/content/docs/user-guide/glossary/data-pipelines.md
@@ -1,7 +1,8 @@
---
name: 'Data Pipelines'
match: ['data pipeline', 'pipeline', 'pipelines']
-tooltip: 'DVC pipelines tooltip...'
+tooltip: >-
+ 'DVC [data pipelines](/doc/user-guide/glossary/data-pipelines) tooltip...'
---
# Data Pipelines
diff --git a/content/docs/user-guide/glossary/dvc-metafiles.md b/content/docs/user-guide/glossary/dvc-metafiles.md
index 5980712679..60a9968a02 100644
--- a/content/docs/user-guide/glossary/dvc-metafiles.md
+++ b/content/docs/user-guide/glossary/dvc-metafiles.md
@@ -1,7 +1,8 @@
---
name: 'DVC Metafiles'
match: ['DVC files', 'files', 'directories']
-tooltip: 'DVC metafiles tooltip...'
+tooltip: >-
+ 'DVC [metafiles](/doc/user-guide/glossary/dvc-metafiles) tooltip...'
---
# DVC Metafiles
diff --git a/content/docs/user-guide/glossary/metrics-and-plots.md b/content/docs/user-guide/glossary/metrics-and-plots.md
index 518b67ee69..22522def60 100644
--- a/content/docs/user-guide/glossary/metrics-and-plots.md
+++ b/content/docs/user-guide/glossary/metrics-and-plots.md
@@ -1,7 +1,8 @@
---
name: 'Metrics and Plots'
match: ['metrics', 'plots']
-tooltip: 'Metrics and plots tooltip...'
+tooltip: >-
+ '[Metrics and plots](/doc/user-guide/glossary/metrics-and-plots) tooltip...'
---
# Metrics and Plots
diff --git a/content/docs/user-guide/glossary/remote-storage.md b/content/docs/user-guide/glossary/remote-storage.md
index c0a5ab4c0f..7a778affc3 100644
--- a/content/docs/user-guide/glossary/remote-storage.md
+++ b/content/docs/user-guide/glossary/remote-storage.md
@@ -1,7 +1,8 @@
---
name: 'Remote Storage'
match: ['DVC remote', 'remote', 'remote storage']
-tooltip: 'DVC remote storage tooltip...'
+tooltip: >-
+ 'DVC [remote storage](/doc/user-guide/glossary/remote-storage) tooltip...'
---
# Remote Storage
From badc15746088ba589c0d17a6aa25a14430f81ae9 Mon Sep 17 00:00:00 2001
From: jeremydesroches <18587991+jeremydesroches@users.noreply.github.com>
Date: Tue, 24 Nov 2020 22:13:37 -0700
Subject: [PATCH 18/59] Outline workspace, add notes
---
content/docs/user-guide/glossary/workspace.md | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/content/docs/user-guide/glossary/workspace.md b/content/docs/user-guide/glossary/workspace.md
index ec6f9595e6..81ce9c032c 100644
--- a/content/docs/user-guide/glossary/workspace.md
+++ b/content/docs/user-guide/glossary/workspace.md
@@ -9,6 +9,12 @@ tooltip: >-
# Workspace
+The workspace is the directory containing all your project files, e.g. raw
+datasets, source code, and ML models. Typically, it's also a Git repository. It
+will contain your DVC project.
+
+
+
Initialized by running `dvc init` in the **workspace** (typically a Git
@@ -16,3 +22,11 @@ repository). It will contain the
[`.dvc/` directory](/doc/user-guide/dvc-files-and-directories), as well as
`dvc.yaml` and `.dvc` files created with commands such as `dvc add` or
`dvc run`.
+
+## What's the difference between workspace and project?
+
+
+
+## Things you can do in the Workspace
+
+`dvc init` to create a DVC project...
From e3deb5125a873db401bd1a627216fc9b70935612 Mon Sep 17 00:00:00 2001
From: jeremydesroches <18587991+jeremydesroches@users.noreply.github.com>
Date: Tue, 24 Nov 2020 22:48:08 -0700
Subject: [PATCH 19/59] Initial extract of cache content into basic concepts
---
content/docs/command-reference/cache/dir.md | 2 +-
content/docs/command-reference/cache/index.md | 9 +--
content/docs/command-reference/config.md | 4 +-
.../user-guide/dvc-files-and-directories.md | 63 +---------------
content/docs/user-guide/glossary/dvc-cache.md | 73 +++++++++++++++++--
5 files changed, 75 insertions(+), 76 deletions(-)
diff --git a/content/docs/command-reference/cache/dir.md b/content/docs/command-reference/cache/dir.md
index 9f2cc9e751..34aba9ebbc 100644
--- a/content/docs/command-reference/cache/dir.md
+++ b/content/docs/command-reference/cache/dir.md
@@ -18,7 +18,7 @@ positional arguments:
## Description
Helper to set the `cache.dir` configuration option. (See
-[cache directory](/doc/user-guide/dvc-files-and-directories#structure-of-the-cache-directory).)
+[cache directory](/doc/user-guide/glossary/dvc-cache#structure-of-the-cache-directory).)
Unlike doing so with `dvc config cache`, `dvc cache dir` transform paths
(`value`) that are provided relative to the current working directory into paths
**relative to the config file location**. However, if the `value` provided is an
diff --git a/content/docs/command-reference/cache/index.md b/content/docs/command-reference/cache/index.md
index 0117e2b4f7..74764b9d48 100644
--- a/content/docs/command-reference/cache/index.md
+++ b/content/docs/command-reference/cache/index.md
@@ -15,11 +15,10 @@ positional arguments:
## Description
-The DVC Cache is where your data files, models, etc. (anything you want to
-version with DVC) are actually stored. The data files and directories visible in
-the workspace are links\* to (or copies of) the ones in cache.
-Learn more about it's
-[structure](/doc/user-guide/dvc-files-and-directories#structure-of-the-cache-directory).
+The DVC cache is where your data files, models, etc. (anything you
+want to version with DVC) are actually stored. The data files and directories
+visible in the workspace are links\* to (or copies of) the ones in
+cache.
> \* Refer to
> [File link types](/doc/user-guide/large-dataset-optimization#file-link-types-for-the-dvc-cache)
diff --git a/content/docs/command-reference/config.md b/content/docs/command-reference/config.md
index 84d964ab82..18faf56306 100644
--- a/content/docs/command-reference/config.md
+++ b/content/docs/command-reference/config.md
@@ -127,8 +127,8 @@ remote. See `dvc remote` for more information.
A DVC project cache is the hidden storage (by default located in
the `.dvc/cache` directory) for files that are tracked by DVC, and their
different versions. (See `dvc cache` and
-[DVC Files and Directories](/doc/user-guide/dvc-files-and-directories#structure-of-the-cache-directory)
-for more details.) This section contains the following options:
+[DVC Files and Directories](/doc/user-guide/dvc-files-and-directories) for more
+details.) This section contains the following options:
- `cache.dir` - set/unset cache directory location. A correct value is either an
absolute path, or a path **relative to the config file location**. The default
diff --git a/content/docs/user-guide/dvc-files-and-directories.md b/content/docs/user-guide/dvc-files-and-directories.md
index 98c0769a71..45c95b9e70 100644
--- a/content/docs/user-guide/dvc-files-and-directories.md
+++ b/content/docs/user-guide/dvc-files-and-directories.md
@@ -239,7 +239,7 @@ Full parameters (key and value) are listed separately under
hand or with the command `dvc config --local`.
- `.dvc/cache`: The cache directory will store your data in a
- special [structure](#structure-of-the-cache-directory). The data files and
+ special [structure](doc/user-guide/glossary/dvc-cache). The data files and
directories in the workspace will only contain links to the data
files in the cache. (Refer to
[Large Dataset Optimization](/doc/user-guide/large-dataset-optimization). See
@@ -279,64 +279,3 @@ Full parameters (key and value) are listed separately under
- `.dvc/tmp/rwlock`: JSON file that contains read and write locks for specific
dependencies and outputs, to allow safely running multiple DVC commands in
parallel
-
-## Structure of the cache directory
-
-The DVC cache is a
-[content-addressable storage](https://en.wikipedia.org/wiki/Content-addressable_storage)
-(by default in `.dvc/cache`), which adds a layer of indirection between code and
-data.
-
-There are two ways in which the data is cached: As a single file
-(eg. `data.csv`), or as a directory.
-
-### Files
-
-DVC calculates the file hash, a 32 characters long string (usually MD5). The
-first two characters are used to name the directory inside the cache, and the
-rest become the file name of the cached file. For example, if a data file has a
-hash value of `ec1d2935f811b77cc49b031b999cbf17`, its path in the cache will be
-`.dvc/cache/ec/1d2935f811b77cc49b031b999cbf17`.
-
-> Note that file hashes are calculated from file contents only. 2 or more files
-> with different names but the same contents can exist in the workspace and be
-> tracked by DVC, but only one copy is stored in the cache. This helps avoid
-> data duplication.
-
-### Directories
-
-Let's imagine [adding](/doc/command-reference/add) a directory with 2 images:
-
-```dvc
-$ tree data/images/
-data/images/
-├── cat.jpeg
-└── index.jpeg
-
-$ dvc add data/images
-```
-
-The directory is cached as a JSON file with `.dir` extension. The files it
-contains are stored in the cache regularly, as explained earlier. It looks like
-this:
-
-```dvc
-.dvc/cache/
-├── 19
-│ └── 6a322c107c2572335158503c64bfba.dir
-├── d4
-│ └── 1d8cd98f00b204e9800998ecf8427e
-└── 20
- └── 0b40427ee0998e9802335d98f08cd98f
-```
-
-The `.dir` file contains the mapping of files in `data/images` (as a JSON
-array), including their hash values:
-
-```dvc
-$ cat .dvc/cache/19/6a322c107c2572335158503c64bfba.dir
-[{"md5": "dff70c0392d7d386c39a23c64fcc0376", "relpath": "cat.jpeg"},
-{"md5": "29a6c8271c0c8fbf75d3b97aecee589f", "relpath": "index.jpeg"}]
-```
-
-That's how DVC knows that the other two cached files belong in the directory.
diff --git a/content/docs/user-guide/glossary/dvc-cache.md b/content/docs/user-guide/glossary/dvc-cache.md
index 7eec6952bf..a93846dd8e 100644
--- a/content/docs/user-guide/glossary/dvc-cache.md
+++ b/content/docs/user-guide/glossary/dvc-cache.md
@@ -4,18 +4,79 @@ match: ['DVC cache', 'cache', 'caches', 'cached']
tooltip: >-
The [DVC cache](/doc/user-guide/glossary/dvc-cache) is a hidden storage (by
default located in the `.dvc/cache` directory) for files that are tracked by
- DVC, and their different versions. Learn more about it's
- [structure](/doc/user-guide/dvc-files-and-directories#structure-of-the-cache-directory).
+ DVC, and their different versions.
---
# DVC Cache
-
-_From `dvc cache`_ -->
+
The DVC Cache is where your data files, models, etc. (anything you want to
version with DVC) are actually stored. The data files and directories visible in
the workspace are links\* to (or copies of) the ones in cache.
-Learn more about it's
-[structure](/doc/user-guide/dvc-files-and-directories#structure-of-the-cache-directory).
+Learn more about its [structure](#structure-of-the-cache-directory).
+
+
+
+## Structure of the cache directory
+
+The DVC cache is a
+[content-addressable storage](https://en.wikipedia.org/wiki/Content-addressable_storage)
+(by default in `.dvc/cache`), which adds a layer of indirection between code and
+data.
+
+There are two ways in which the data is cached: as a single file
+(e.g. `data.csv`), or as a directory.
+
+### Files
+
+DVC calculates the file hash, a 32-character string (usually MD5). The
+first two characters are used to name the directory inside the cache, and the
+rest become the file name of the cached file. For example, if a data file has a
+hash value of `ec1d2935f811b77cc49b031b999cbf17`, its path in the cache will be
+`.dvc/cache/ec/1d2935f811b77cc49b031b999cbf17`.
+
+> Note that file hashes are calculated from file contents only. 2 or more files
+> with different names but the same contents can exist in the workspace and be
+> tracked by DVC, but only one copy is stored in the cache. This helps avoid
+> data duplication.
+
+### Directories
+
+Let's imagine [adding](/doc/command-reference/add) a directory with 2 images:
+
+```dvc
+$ tree data/images/
+data/images/
+├── cat.jpeg
+└── index.jpeg
+
+$ dvc add data/images
+```
+
+The directory is cached as a JSON file with a `.dir` extension. The files it
+contains are stored in the cache as regular files, as explained earlier. It looks
+like this:
+
+```dvc
+.dvc/cache/
+├── 19
+│ └── 6a322c107c2572335158503c64bfba.dir
+├── d4
+│ └── 1d8cd98f00b204e9800998ecf8427e
+└── 20
+ └── 0b40427ee0998e9802335d98f08cd98f
+```
+
+The `.dir` file contains the mapping of files in `data/images` (as a JSON
+array), including their hash values:
+
+```dvc
+$ cat .dvc/cache/19/6a322c107c2572335158503c64bfba.dir
+[{"md5": "dff70c0392d7d386c39a23c64fcc0376", "relpath": "cat.jpeg"},
+{"md5": "29a6c8271c0c8fbf75d3b97aecee589f", "relpath": "index.jpeg"}]
+```
+
+That's how DVC knows that the other two cached files belong in the directory.
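
The cache layout described in the section added above can be illustrated with a minimal Python sketch. It is not DVC's implementation: the function names are hypothetical, and it assumes only what the text states (an MD5 hash of the file contents, the first two characters used as the directory name, the remaining 30 as the file name, and a `.dir` file holding a JSON array of `md5`/`relpath` pairs).

```python
import hashlib
import json
from pathlib import PurePosixPath

# Default cache location named in the text; other locations are possible.
CACHE_DIR = PurePosixPath(".dvc/cache")


def cache_path_for_hash(md5_hex: str) -> PurePosixPath:
    """Split a 32-character hash into <first two chars>/<remaining 30 chars>."""
    if len(md5_hex) != 32:
        raise ValueError("expected a 32-character hex digest")
    return CACHE_DIR / md5_hex[:2] / md5_hex[2:]


def cache_path_for_content(data: bytes) -> PurePosixPath:
    """Hash the contents only; the file name plays no part in the result."""
    return cache_path_for_hash(hashlib.md5(data).hexdigest())


def resolve_dir_entries(dir_json: str) -> dict:
    """Map each relpath listed in a `.dir` file to the cache path of its contents."""
    return {
        entry["relpath"]: cache_path_for_hash(entry["md5"])
        for entry in json.loads(dir_json)
    }


if __name__ == "__main__":
    # The example hash from the text maps to the path given there.
    print(cache_path_for_hash("ec1d2935f811b77cc49b031b999cbf17"))
    # Two files with identical contents resolve to the same cache entry.
    print(cache_path_for_content(b"same") == cache_path_for_content(b"same"))
```

Because the path is derived from contents alone, two workspace files with different names but identical bytes resolve to a single cache entry, which is the deduplication behavior noted in the section.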
From 77d82bc2c5fbc04b83be6420c6253b2d6ef1b8da Mon Sep 17 00:00:00 2001
From: jeremydesroches <18587991+jeremydesroches@users.noreply.github.com>
Date: Tue, 24 Nov 2020 22:59:27 -0700
Subject: [PATCH 20/59] Fix broken link
---
content/docs/user-guide/dvc-files-and-directories.md | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/content/docs/user-guide/dvc-files-and-directories.md b/content/docs/user-guide/dvc-files-and-directories.md
index 45c95b9e70..e0f2ddb577 100644
--- a/content/docs/user-guide/dvc-files-and-directories.md
+++ b/content/docs/user-guide/dvc-files-and-directories.md
@@ -239,7 +239,7 @@ Full parameters (key and value) are listed separately under
hand or with the command `dvc config --local`.
- `.dvc/cache`: The cache directory will store your data in a
- special [structure](doc/user-guide/glossary/dvc-cache). The data files and
+ special [structure](/doc/user-guide/glossary/dvc-cache). The data files and
directories in the workspace will only contain links to the data
files in the cache. (Refer to
[Large Dataset Optimization](/doc/user-guide/large-dataset-optimization). See
From b7aed74bfa9e4647c9b6cec34acb87e3be355e68 Mon Sep 17 00:00:00 2001
From: jeremydesroches <18587991+jeremydesroches@users.noreply.github.com>
Date: Wed, 25 Nov 2020 12:55:33 -0700
Subject: [PATCH 21/59] Add remotes note
---
content/docs/user-guide/glossary/remote-storage.md | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/content/docs/user-guide/glossary/remote-storage.md b/content/docs/user-guide/glossary/remote-storage.md
index 7a778affc3..4f0d10828f 100644
--- a/content/docs/user-guide/glossary/remote-storage.md
+++ b/content/docs/user-guide/glossary/remote-storage.md
@@ -22,3 +22,7 @@ branch/commit.
Using DVC with remote storage is optional. DVC commands use the local cache
(usually in dir `.dvc/cache`) as data storage by default. This enables the main
DVC usage scenarios out of the box.
+
+
+
+
From d966b697a9e18862dca6a7267b5e5aca1bfa00be Mon Sep 17 00:00:00 2001
From: jeremydesroches <18587991+jeremydesroches@users.noreply.github.com>
Date: Wed, 25 Nov 2020 13:08:34 -0700
Subject: [PATCH 22/59] Extract data pipeline concept from dag -> basic
concepts, add tooltip
---
content/docs/command-reference/dag.md | 21 ++-----------------
.../user-guide/glossary/data-pipelines.md | 8 ++++++-
2 files changed, 9 insertions(+), 20 deletions(-)
diff --git a/content/docs/command-reference/dag.md b/content/docs/command-reference/dag.md
index 2fb7e9c5b4..7c1943d58d 100644
--- a/content/docs/command-reference/dag.md
+++ b/content/docs/command-reference/dag.md
@@ -15,25 +15,8 @@ positional arguments:
## Description
-A data pipeline, in general, is a series of data processing
-[stages](/doc/command-reference/run) (for example, console commands that take an
-input and produce an output). A pipeline may produce intermediate
-data, and has a final result.
-
-Data science and machine learning pipelines typically start with large raw
-datasets, include intermediate featurization and training stages, and produce a
-final model, as well as accuracy [metrics](/doc/command-reference/metrics).
-
-In DVC, pipeline stages and commands, their data I/O, interdependencies, and
-results (intermediate or final) are specified in `dvc.yaml`, which can be
-written manually or built using the helper command `dvc run`. This allows DVC to
-restore one or more pipelines later (see `dvc repro`).
-
-> DVC builds a dependency graph
-> ([DAG](https://en.wikipedia.org/wiki/Directed_acyclic_graph)) to do this.
-
-`dvc dag` command displays the stages of a pipeline up to the target stage. If
-`target` is omitted, it will show the full project DAG.
+The `dvc dag` command displays the stages of a data pipeline up to
+the target stage. If `target` is omitted, it will show the full project DAG.
## Options
diff --git a/content/docs/user-guide/glossary/data-pipelines.md b/content/docs/user-guide/glossary/data-pipelines.md
index de67f81c6a..ccf0d2bf9f 100644
--- a/content/docs/user-guide/glossary/data-pipelines.md
+++ b/content/docs/user-guide/glossary/data-pipelines.md
@@ -2,7 +2,11 @@
name: 'Data Pipelines'
match: ['data pipeline', 'pipeline', 'pipelines']
tooltip: >-
- 'DVC [data pipelines](/doc/user-guide/glossary/data-pipelines) tooltip...'
+ In DVC, [data pipeline](/doc/user-guide/glossary/data-pipelines) stages and
+ commands, inputs, outputs, interdependencies, and results (intermediate or
+ final) are specified in `dvc.yaml`, which can be written manually or built
+ using the helper command `dvc run`. This allows DVC to restore one or more
+ pipelines later (see `dvc repro`).
---
# Data Pipelines
@@ -25,3 +29,5 @@ restore one or more pipelines later (see `dvc repro`).
> DVC builds a dependency graph
> ([DAG](https://en.wikipedia.org/wiki/Directed_acyclic_graph)) to do this.
+
+
From 2b6b9cb0db1bd40ed991d632d7aece68be03ebff Mon Sep 17 00:00:00 2001
From: jeremydesroches <18587991+jeremydesroches@users.noreply.github.com>
Date: Wed, 25 Nov 2020 13:46:54 -0700
Subject: [PATCH 23/59] Extract remote storage concept from dvc remote -> basic
concepts
---
.../docs/command-reference/remote/index.md | 40 +++-------------
.../user-guide/glossary/remote-storage.md | 48 ++++++++++++++-----
2 files changed, 43 insertions(+), 45 deletions(-)
diff --git a/content/docs/command-reference/remote/index.md b/content/docs/command-reference/remote/index.md
index 1cbfc93175..6821e628f9 100644
--- a/content/docs/command-reference/remote/index.md
+++ b/content/docs/command-reference/remote/index.md
@@ -1,6 +1,6 @@
# remote
-A set of commands to set up and manage data remotes:
+A set of commands to set up and manage data remotes:
[add](/doc/command-reference/remote/add),
[default](/doc/command-reference/remote/default),
[list](/doc/command-reference/remote/list),
@@ -24,44 +24,18 @@ positional arguments:
## Description
-What is data remote?
-
-The same way as GitHub provides storage hosting for Git repositories, DVC
-remotes provide a location to store and share data and models. You can pull data
-assets created by colleagues from DVC remotes without spending time and
-resources to build or process them locally. Remote storage can also save space
-on your local environment – DVC can [fetch](/doc/command-reference/fetch) into
-the cache directory only the data you need for a specific
-branch/commit.
-
-Using DVC with remote storage is optional. DVC commands use the local cache
-(usually in dir `.dvc/cache`) as data storage by default. This enables the main
-DVC usage scenarios out of the box.
-
-DVC supports several types of remote storage: local file system, SSH, Amazon S3,
-Google Cloud Storage, HTTP, HDFS, among others. Refer to `dvc remote add` for
-more details.
-
-> If you installed DVC via `pip` and plan to use cloud services as remote
-> storage, you might need to install these optional dependencies: `[s3]`,
-> `[azure]`, `[gdrive]`, `[gs]`, `[oss]`, `[ssh]`. Alternatively, use `[all]` to
-> include them all. The command should look like this: `pip install "dvc[s3]"`.
-> (This example installs `boto3` library along with DVC to support S3 storage.)
-
-### Managing remote storage
-
-> For the typical process to share the project via remote, see
-> [Sharing Data And Model Files](/doc/use-cases/sharing-data-and-model-files).
-
The [add](/doc/command-reference/remote/add),
[default](/doc/command-reference/remote/default),
[list](/doc/command-reference/remote/list),
[modify](/doc/command-reference/remote/modify),
[remove](/doc/command-reference/remote/remove), and
[rename](/doc/command-reference/remote/rename) subcommands read or modify DVC
-[config files](/doc/command-reference/config), where DVC remotes are setup.
-Alternatively, `dvc config` can be used, or the config files can be edited
-manually.
+[config files](/doc/command-reference/config), where DVC remotes
+are set up. Alternatively, `dvc config` can be used, or the config files can be
+edited manually.
+
+> For the typical process to share the project via remote, see
+> [Sharing Data And Model Files](/doc/use-cases/sharing-data-and-model-files).
## Options
diff --git a/content/docs/user-guide/glossary/remote-storage.md b/content/docs/user-guide/glossary/remote-storage.md
index 4f0d10828f..8dbc3e7c70 100644
--- a/content/docs/user-guide/glossary/remote-storage.md
+++ b/content/docs/user-guide/glossary/remote-storage.md
@@ -1,28 +1,52 @@
---
name: 'Remote Storage'
-match: ['DVC remote', 'remote', 'remote storage']
+match:
+ [
+ 'DVC remote',
+ 'DVC remotes',
+ 'remote',
+ 'remote storage',
+ 'data remote',
+ 'data remotes',
+ ]
tooltip: >-
- 'DVC [remote storage](/doc/user-guide/glossary/remote-storage) tooltip...'
+ [DVC remotes](/doc/user-guide/glossary/remote-storage) provide a location to
+ store and share data and models. You can pull data assets created by
+ colleagues from DVC remotes without spending time and resources to build or
+ process them locally. Remote storage can also save space on your local
+ environment.
---
# Remote Storage
-What is data remote?
-
The same way as GitHub provides storage hosting for Git repositories, DVC
remotes provide a location to store and share data and models. You can pull data
assets created by colleagues from DVC remotes without spending time and
-resources to build or process them locally. Remote storage can also save space
-on your local environment – DVC can [fetch](/doc/command-reference/fetch) into
-the cache directory only the data you need for a specific
-branch/commit.
+resources to build or process them locally.
+
+Remote storage can also save space on your local environment – DVC can
+[fetch](/doc/command-reference/fetch) into the cache directory only
+the data you need for a specific branch/commit.
-Using DVC with remote storage is optional. DVC commands use the local cache
-(usually in dir `.dvc/cache`) as data storage by default. This enables the main
-DVC usage scenarios out of the box.
+> DVC remotes are **not** Git remotes. They are cache backups, not distributed
+> copies of the DVC project.
-
+Using DVC with remote storage is optional. DVC commands use the local
+cache (usually in dir `.dvc/cache`) as data storage by default.
+This enables the main DVC usage scenarios out of the box.
+
+## Types of remote storage
+
+DVC supports several types of remote storage: local file system, SSH, Amazon S3,
+Google Cloud Storage, HTTP, HDFS, among others. Refer to `dvc remote add` for
+more details.
+
+> If you installed DVC via `pip` and plan to use cloud services as remote
+> storage, you might need to install these optional dependencies: `[s3]`,
+> `[azure]`, `[gdrive]`, `[gs]`, `[oss]`, `[ssh]`. Alternatively, use `[all]` to
+> include them all. The command should look like this: `pip install "dvc[s3]"`.
+> (This example installs `boto3` library along with DVC to support S3 storage.)
From 3d952dd424dcd0776fda92d43e22e40c4591232d Mon Sep 17 00:00:00 2001
From: jeremydesroches <18587991+jeremydesroches@users.noreply.github.com>
Date: Thu, 26 Nov 2020 21:34:17 -0700
Subject: [PATCH 24/59] Initial extract metrics, plots -> basic concepts. Add
tooltip.
---
.../docs/command-reference/metrics/diff.md | 10 ++
.../docs/command-reference/metrics/index.md | 32 +----
content/docs/command-reference/plots/index.md | 136 ++----------------
content/docs/command-reference/plots/show.md | 62 +++++++-
.../user-guide/glossary/metrics-and-plots.md | 24 +++-
5 files changed, 100 insertions(+), 164 deletions(-)
diff --git a/content/docs/command-reference/metrics/diff.md b/content/docs/command-reference/metrics/diff.md
index cfbecae900..6357d31e0a 100644
--- a/content/docs/command-reference/metrics/diff.md
+++ b/content/docs/command-reference/metrics/diff.md
@@ -31,6 +31,16 @@ specified, `dvc metrics diff` compares metrics currently present in the
(required). A single specified revision results in comparing the workspace and
that version.
+Unlike its `dvc plots` counterpart, `dvc metrics diff` can report the numeric
+difference between the metrics in different experiments, for example an `AUC`
+metrics that is `0.801807` and gets increase by `+0.037826`:
+
+```dvc
+$ dvc metrics diff
+ Path Metric Value Change
+summary.json AUC 0.801807 0.037826
+```
+
Another way to display metrics is the `dvc metrics show` command, which just
lists all the current metrics, without comparisons.
diff --git a/content/docs/command-reference/metrics/index.md b/content/docs/command-reference/metrics/index.md
index d34cfead1d..d450130ba0 100644
--- a/content/docs/command-reference/metrics/index.md
+++ b/content/docs/command-reference/metrics/index.md
@@ -1,6 +1,6 @@
# metrics
-A set of commands to display and compare _metrics_:
+A set of commands to display and compare metrics:
[show](/doc/command-reference/metrics/show), and
[diff](/doc/command-reference/metrics/diff).
@@ -15,38 +15,8 @@ positional arguments:
diff Show changes in metrics between commits.
```
-## Types of metrics
-
-DVC has two concepts for metrics, that represent different results of machine
-learning training or data processing:
-
-1. `dvc metrics` represent **scalar numbers** such as AUC, _true positive rate_,
- etc.
-2. `dvc plots` can be used to visualize **data series** such as AUC curves, loss
- functions, confusion matrices, etc.
-
## Description
-In order to follow the performance of machine learning experiments, DVC has the
-ability to mark a certain stage outputs as metrics. These metrics
-are project-specific floating-point or integer values e.g. AUC, ROC, false
-positives, etc.
-
-This type of metrics files are typically generated by user data processing code,
-and are tracked using the `-m` (`--metrics`) and `-M` (`--metrics-no-cache`)
-options of `dvc run`.
-
-In contrast to `dvc plots`, these metrics should be stored in hierarchical
-files. Unlike its `dvc plots` counterpart, `dvc metrics diff` can report the
-numeric difference between the metrics in different experiments, for example an
-`AUC` metrics that is `0.801807` and gets increase by `+0.037826`:
-
-```dvc
-$ dvc metrics diff
- Path Metric Value Change
-summary.json AUC 0.801807 0.037826
-```
-
`dvc metrics` subcommands by default use the metrics files specified in
`dvc.yaml` (if any), for example `summary.json` below:
diff --git a/content/docs/command-reference/plots/index.md b/content/docs/command-reference/plots/index.md
index 9a9af955bf..e0ca009bfb 100644
--- a/content/docs/command-reference/plots/index.md
+++ b/content/docs/command-reference/plots/index.md
@@ -1,7 +1,8 @@
# plots
-A set of commands to visualize and compare _plot metrics_ in structured files
-(JSON, YAML, CSV, or TSV): [show](/doc/command-reference/plots/show),
+A set of commands to visualize and compare plot metrics in
+structured files (JSON, YAML, CSV, or TSV):
+[show](/doc/command-reference/plots/show),
[diff](/doc/command-reference/plots/diff), and
[modify](/doc/command-reference/plots/modify).
@@ -17,37 +18,25 @@ positional arguments:
modify Modify plot properties associated with a target file.
```
-## Types of metrics
+## Description
-DVC has two concepts for metrics, that represent different results of machine
-learning training or data processing:
+...
-1. `dvc metrics` represent **scalar numbers** such as AUC, _true positive rate_,
- etc.
-2. `dvc plots` can be used to visualize **data series** such as AUC curves, loss
- functions, confusion matrices, etc.
+## Options
-## Description
+- `-h`, `--help` - prints the usage/help message, and exit.
+
+- `-q`, `--quiet` - do not write anything to standard output.
-DVC provides a set of commands to visualize certain metrics of machine learning
-experiments as plots. Usual plot examples are AUC curves, loss functions,
-confusion matrices, among others.
+- `-v`, `--verbose` - displays detailed tracing information.
-This type of metrics files are created by users, or generated by user data
-processing code, and can be defined in `dvc.yaml` (`plots` field) for tracking
-(optional).
+## Supported file formats
DVC generates plots as HTML files that can be open with a web browser. These
HTML files use [Vega-Lite](https://vega.github.io/vega-lite/). Vega is a
declarative grammar for defining plots using JSON. The plots can also be saved
as SVG or PNG image filed from the browser.
-In contrast to `dvc metrics`, these metrics should be stored as data series.
-Unlike its `dvc metrics` counterpart, `dvc plots diff` cannot calculate numeric
-differences between the metrics in different experiments.
-
-### Supported file formats
-
Plot metrics can be organized as data series in JSON, YAML 1.2, CSV, or TSV
files. DVC expects to see an array (or multiple arrays) of objects (usually
_float numbers_) in the file.
@@ -154,106 +143,3 @@ header (first row) are equivalent to field names.
- `` (optional) - field name to display as the X axis label
- `` (optional) - field name to display as the X axis label
-
-## Options
-
-- `-h`, `--help` - prints the usage/help message, and exit.
-
-- `-q`, `--quiet` - do not write anything to standard output.
-
-- `-v`, `--verbose` - displays detailed tracing information.
-
-## Example: Tabular data
-
-We'll use tabular metrics file `logs.csv` for this example:
-
-```
-epoch,accuracy,loss,val_accuracy,val_loss
-0,0.9418667,0.19958884770199656,0.9679,0.10217399864746257
-1,0.9763333,0.07896138601688048,0.9768,0.07310650711813942
-2,0.98375,0.05241111190887168,0.9788,0.06665669009438716
-3,0.98801666,0.03681169906261687,0.9781,0.06697812260198989
-4,0.99111664,0.027362171787042946,0.978,0.07385754839298315
-5,0.9932333,0.02069501801203781,0.9771,0.08009233058886166
-6,0.9945,0.017702101902437668,0.9803,0.07830339228538505
-7,0.9954,0.01396906608727198,0.9802,0.07247738889862157
-```
-
-Let's plot the last column (default behavior):
-
-```dvc
-$ dvc plots show logs.csv
-file:///Users/usr/src/plots/logs.csv.html
-```
-
-
-
-Difference in this metric between the current project version and the previous
-commit:
-
-```dvc
-$ dvc plots diff -d logs.csv HEAD^
-file:///Users/usr/src/plots/logs.csv.html
-```
-
-
-
-Visualize a specific field:
-
-```dvc
-$ dvc plots show -y loss logs.csv
-file:///Users/usr/src/plots/logs.html
-```
-
-
-
-## Example: Smooth plot
-
-In some cases we would like to smooth our plot. In this example we will use a
-plot with 1000 data points:
-
-```dvc
-$ dvc plots show data.csv
-file:///Users/usr/src/plots/plots.html
-```
-
-
-
-We can use the `-t` option and `smooth` template to make it less noisy:
-
-```dvc
-$ dvc plots show -t smooth data.csv
-file:///Users/usr/src/plots/plots.html
-```
-
-
-
-## Example: Confusion matrix
-
-We'll use `classes.csv` for this example:
-
-```
-actual,predicted
-cat,cat
-cat,cat
-cat,cat
-cat,dog
-cat,dinosaur
-cat,dinosaur
-cat,bird
-turtle,dog
-turtle,cat
-...
-```
-
-Let's visualize it:
-
-```dvc
-$ dvc plots show classes.csv --template confusion -x actual -y predicted
-file:///Users/usr/src/plots/classes.csv.html
-```
-
-
-
-> A confusion matrix [template](/doc/command-reference/plots#plot-templates) is
-> predefined in DVC (found in `.dvc/plots/confusion.json`).
diff --git a/content/docs/command-reference/plots/show.md b/content/docs/command-reference/plots/show.md
index 441380cb5e..ec4c1186d8 100644
--- a/content/docs/command-reference/plots/show.md
+++ b/content/docs/command-reference/plots/show.md
@@ -174,10 +174,60 @@ file:///Users/usr/src/plots/logs.csv.json
```json
{
- "$schema": "https://vega.github.io/schema/vega-lite/v4.json",
- "data": {
- "values": [
- {
- "accuracy": "0.9418667",
- ...
+ "$schema": "https://vega.github.io/schema/vega-lite/v4.json",
+ "data": {
+ "values": [{ "accuracy": "0.9418667", ... }]
+ }
+}
+```
+
+## Example: Smooth plot
+
+In some cases we would like to smooth our plot. In this example we will use a
+plot with 1000 data points:
+
+```dvc
+$ dvc plots show data.csv
+file:///Users/usr/src/plots/plots.html
```
+
+
+
+We can use the `-t` option and `smooth` template to make it less noisy:
+
+```dvc
+$ dvc plots show -t smooth data.csv
+file:///Users/usr/src/plots/plots.html
+```
+
+
+
+## Example: Confusion matrix
+
+We'll use `classes.csv` for this example:
+
+```
+actual,predicted
+cat,cat
+cat,cat
+cat,cat
+cat,dog
+cat,dinosaur
+cat,dinosaur
+cat,bird
+turtle,dog
+turtle,cat
+...
+```
+
+Let's visualize it:
+
+```dvc
+$ dvc plots show classes.csv --template confusion -x actual -y predicted
+file:///Users/usr/src/plots/classes.csv.html
+```
+
+
+
+> A confusion matrix [template](/doc/command-reference/plots#plot-templates) is
+> predefined in DVC (found in `.dvc/plots/confusion.json`).
diff --git a/content/docs/user-guide/glossary/metrics-and-plots.md b/content/docs/user-guide/glossary/metrics-and-plots.md
index 22522def60..0cd75cb08e 100644
--- a/content/docs/user-guide/glossary/metrics-and-plots.md
+++ b/content/docs/user-guide/glossary/metrics-and-plots.md
@@ -1,8 +1,11 @@
---
name: 'Metrics and Plots'
-match: ['metrics', 'plots']
+match: ['metrics', 'plots', 'plot metrics']
tooltip: >-
- '[Metrics and plots](/doc/user-guide/glossary/metrics-and-plots) tooltip...'
+ DVC [metrics and plots](/doc/user-guide/glossary/metrics-and-plots) provide
+ sets of commands to follow the performance of machine learning experiments.
+ Mark certain stage outputs as metrics and visualize metrics as
+ plots.
---
# Metrics and Plots
@@ -19,15 +22,32 @@ learning training or data processing:
+## Metrics
+
In order to follow the performance of machine learning experiments, DVC has the
ability to mark a certain stage outputs as metrics. These metrics
are project-specific floating-point or integer values e.g. AUC, ROC, false
positives, etc.
+This type of metrics file is typically generated by user data processing code,
+and is tracked using the `-m` (`--metrics`) and `-M` (`--metrics-no-cache`)
+options of `dvc run`.
+
+In contrast to `dvc plots`, these metrics should be stored in hierarchical
+files.
+
+## Plots
+
DVC provides a set of commands to visualize certain metrics of machine learning
experiments as plots. Usual plot examples are AUC curves, loss functions,
confusion matrices, among others.
+This type of metrics file is created by users, or generated by user data
+processing code, and can be defined in `dvc.yaml` (`plots` field) for tracking
+(optional).
+
+In contrast to `dvc metrics`, these metrics should be stored as data series.
+
From 9ec88c160f781bbf196b35865e45d0f89632bdc6 Mon Sep 17 00:00:00 2001
From: jeremydesroches <18587991+jeremydesroches@users.noreply.github.com>
Date: Thu, 26 Nov 2020 21:38:07 -0700
Subject: [PATCH 25/59] Move supported file formats to Plots index description.
---
content/docs/command-reference/plots/index.md | 18 +++++++++---------
1 file changed, 9 insertions(+), 9 deletions(-)
diff --git a/content/docs/command-reference/plots/index.md b/content/docs/command-reference/plots/index.md
index e0ca009bfb..7bd2c390c1 100644
--- a/content/docs/command-reference/plots/index.md
+++ b/content/docs/command-reference/plots/index.md
@@ -22,15 +22,7 @@ positional arguments:
...
-## Options
-
-- `-h`, `--help` - prints the usage/help message, and exit.
-
-- `-q`, `--quiet` - do not write anything to standard output.
-
-- `-v`, `--verbose` - displays detailed tracing information.
-
-## Supported file formats
+### Supported file formats
DVC generates plots as HTML files that can be open with a web browser. These
HTML files use [Vega-Lite](https://vega.github.io/vega-lite/). Vega is a
@@ -74,6 +66,14 @@ names in the `train` array below:
}
```
+## Options
+
+- `-h`, `--help` - prints the usage/help message, and exit.
+
+- `-q`, `--quiet` - do not write anything to standard output.
+
+- `-v`, `--verbose` - displays detailed tracing information.
+
## Plot templates
Users have the ability to change the way plots are displayed by modifying the
From 487f1fbbca893a1e92227aadec69910fea717759 Mon Sep 17 00:00:00 2001
From: jeremydesroches <18587991+jeremydesroches@users.noreply.github.com>
Date: Thu, 26 Nov 2020 21:51:00 -0700
Subject: [PATCH 26/59] Remove abbr from metrics and plots tooltip.
---
content/docs/user-guide/glossary/metrics-and-plots.md | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/content/docs/user-guide/glossary/metrics-and-plots.md b/content/docs/user-guide/glossary/metrics-and-plots.md
index 0cd75cb08e..903d79aa28 100644
--- a/content/docs/user-guide/glossary/metrics-and-plots.md
+++ b/content/docs/user-guide/glossary/metrics-and-plots.md
@@ -4,8 +4,7 @@ match: ['metrics', 'plots', 'plot metrics']
tooltip: >-
DVC [metrics and plots](/doc/user-guide/glossary/metrics-and-plots) provide
sets of commands to follow the performance of machine learning experiments.
- Mark certain stage outputs as metrics and visualize metrics as
- plots.
+ Mark certain stage outputs as metrics and visualize metrics as plots.
---
# Metrics and Plots
From e4483959eebce45a51213f423c16efa0cbe33b4a Mon Sep 17 00:00:00 2001
From: Jorge Orpinel
Date: Sat, 28 Nov 2020 19:32:42 -0600
Subject: [PATCH 27/59] Update content/docs/command-reference/dag.md
---
content/docs/command-reference/dag.md | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/content/docs/command-reference/dag.md b/content/docs/command-reference/dag.md
index 7c1943d58d..0d9f138665 100644
--- a/content/docs/command-reference/dag.md
+++ b/content/docs/command-reference/dag.md
@@ -15,8 +15,8 @@ positional arguments:
## Description
-The `dvc dag` command displays the stages of a data pipeline up to
-the target stage. If `target` is omitted, it will show the full project DAG.
+Displays the stages of a data pipeline up to the target stage. If
+`target` is omitted, it will show the full project DAG.
## Options
From 184339e382327290e77a02ceb50a59b175928e7b Mon Sep 17 00:00:00 2001
From: Jorge Orpinel
Date: Sat, 28 Nov 2020 19:47:50 -0600
Subject: [PATCH 28/59] cmd: remove the cache index since it's a basic concept
now
---
content/docs/command-reference/cache/index.md | 36 -------------------
content/docs/sidebar.json | 2 +-
2 files changed, 1 insertion(+), 37 deletions(-)
delete mode 100644 content/docs/command-reference/cache/index.md
diff --git a/content/docs/command-reference/cache/index.md b/content/docs/command-reference/cache/index.md
deleted file mode 100644
index 74764b9d48..0000000000
--- a/content/docs/command-reference/cache/index.md
+++ /dev/null
@@ -1,36 +0,0 @@
-# cache
-
-Contains a helper command to set the cache directory location:
-[dir](/doc/command-reference/cache/dir).
-
-## Synopsis
-
-```usage
-usage: dvc cache [-h] [-q] [-v] {dir} ...
-
-positional arguments:
- COMMAND
- dir Configure cache directory location.
-```
-
-## Description
-
-The DVC cache is where your data files, models, etc. (anything you
-want to version with DVC) are actually stored. The data files and directories
-visible in the workspace are links\* to (or copies of) the ones in
-cache.
-
-> \* Refer to
-> [File link types](/doc/user-guide/large-dataset-optimization#file-link-types-for-the-dvc-cache)
-> for more information on file links on different platforms.
-
-For cache configuration options, refer to `dvc config cache`.
-
-## Options
-
-- `-h`, `--help` - prints the usage/help message, and exit.
-
-- `-q`, `--quiet` - do not write anything to standard output. Exit with 0 if no
- problems arise, otherwise 1.
-
-- `-v`, `--verbose` - displays detailed tracing information.
diff --git a/content/docs/sidebar.json b/content/docs/sidebar.json
index c9d1d9b4b6..80b0b1a853 100644
--- a/content/docs/sidebar.json
+++ b/content/docs/sidebar.json
@@ -164,7 +164,7 @@
{
"label": "cache",
"slug": "cache",
- "source": "cache/index.md",
+ "source": false,
"children": [
{
"label": "cache dir",
From 784f6bb0cc0dabff33ede014d60f27f7e0217d3e Mon Sep 17 00:00:00 2001
From: Jorge Orpinel
Date: Sat, 28 Nov 2020 20:16:43 -0600
Subject: [PATCH 29/59] cmd: simplify metrics refs
---
.../docs/command-reference/metrics/diff.md | 20 +++++--------------
.../docs/command-reference/metrics/index.md | 10 ++++------
2 files changed, 9 insertions(+), 21 deletions(-)
diff --git a/content/docs/command-reference/metrics/diff.md b/content/docs/command-reference/metrics/diff.md
index c4fe3d4326..abc9570941 100644
--- a/content/docs/command-reference/metrics/diff.md
+++ b/content/docs/command-reference/metrics/diff.md
@@ -20,27 +20,17 @@ positional arguments:
## Description
-This command provides a quick way to compare metrics among experiments in the
-repository history. All metrics defined in `dvc.yaml` are used by default. The
-differences shown by this command include the new value, and numeric difference
-(delta) from the previous value of metrics (rounded to 5 digits precision).
+Provides a quick way to compare metrics among experiments in the repository
+history. All metrics defined in `dvc.yaml` are used by default. The differences
+shown by this command include the new value, and numeric difference (delta) from
+the previous value of metrics (rounded to 5 digits precision).
`a_rev` and `b_rev` are Git commit hashes, tag, or branch names. If none are
-specified, `dvc metrics diff` compares metrics currently present in the
+specified, this command compares metrics currently present in the
workspace (uncommitted changes) with the latest committed versions
(required). A single specified revision results in comparing the workspace and
that version.
-Unlike its `dvc plots` counterpart, `dvc metrics diff` can report the numeric
-difference between the metrics in different experiments, for example an `AUC`
-metrics that is `0.801807` and gets increase by `+0.037826`:
-
-```dvc
-$ dvc metrics diff
- Path Metric Value Change
-summary.json AUC 0.801807 0.037826
-```
-
Another way to display metrics is the `dvc metrics show` command, which just
lists all the current metrics, without comparisons.
diff --git a/content/docs/command-reference/metrics/index.md b/content/docs/command-reference/metrics/index.md
index d450130ba0..286a792062 100644
--- a/content/docs/command-reference/metrics/index.md
+++ b/content/docs/command-reference/metrics/index.md
@@ -17,8 +17,8 @@ positional arguments:
## Description
-`dvc metrics` subcommands by default use the metrics files specified in
-`dvc.yaml` (if any), for example `summary.json` below:
+`dvc metrics` subcommands by default use all metrics files found in `dvc.yaml`
+(if any), for example `summary.json` below:
```yaml
stages:
@@ -33,10 +33,8 @@ stages:
cache: false
```
-> `cache: false` above specifies that `summary.json` is not tracked or
-> cached by DVC (`-M` option of `dvc run`). These metrics files are
-> normally committed with Git instead. See `dvc.yaml` for more information on
-> the file format above.
+Note that metrics files are normally committed with Git (that's what
+`cache: false` above is for). See `dvc.yaml` for more information.
### Supported file formats
From 918ab49d70aca7812c3efb941ae0517bed7bdac8 Mon Sep 17 00:00:00 2001
From: Jorge Orpinel
Date: Sat, 28 Nov 2020 20:24:07 -0600
Subject: [PATCH 30/59] cmd: consistent metrics and plots index refs
---
.../docs/command-reference/metrics/index.md | 4 ++--
content/docs/command-reference/plots/index.md | 24 +++++++++++++++----
2 files changed, 22 insertions(+), 6 deletions(-)
diff --git a/content/docs/command-reference/metrics/index.md b/content/docs/command-reference/metrics/index.md
index 286a792062..eb2459e751 100644
--- a/content/docs/command-reference/metrics/index.md
+++ b/content/docs/command-reference/metrics/index.md
@@ -1,7 +1,7 @@
# metrics
-A set of commands to display and compare metrics:
-[show](/doc/command-reference/metrics/show), and
+A set of commands to display and compare metrics (JSON or YAML
+files): [show](/doc/command-reference/metrics/show), and
[diff](/doc/command-reference/metrics/diff).
## Synopsis
diff --git a/content/docs/command-reference/plots/index.md b/content/docs/command-reference/plots/index.md
index 5344781b08..e058d2bdd0 100644
--- a/content/docs/command-reference/plots/index.md
+++ b/content/docs/command-reference/plots/index.md
@@ -1,8 +1,7 @@
# plots
-A set of commands to visualize and compare plot metrics in
-structured files (JSON, YAML, CSV, or TSV):
-[show](/doc/command-reference/plots/show),
+A set of commands to visualize and compare plots (JSON, YAML, CSV,
+or TSV files): [show](/doc/command-reference/plots/show),
[diff](/doc/command-reference/plots/diff), and
[modify](/doc/command-reference/plots/modify).
@@ -20,7 +19,24 @@ positional arguments:
## Description
-...
+`dvc plots` subcommands by default use all plots files found in `dvc.yaml` (if
+any), for example `accuracy.json` below:
+
+```yaml
+stages:
+ train:
+ cmd: python train.py
+ deps:
+ - users.csv
+ outs:
+ - model.pkl
+ plots:
+ - accuracy.json:
+ cache: false
+```
+
+Note that plots files are normally committed with Git (that's what
+`cache: false` above is for). See `dvc.yaml` for more information.
### Supported file formats
From 02ac3c63750148d4aed66c834fa1cea839131e0d Mon Sep 17 00:00:00 2001
From: Jorge Orpinel
Date: Sat, 28 Nov 2020 21:04:37 -0600
Subject: [PATCH 31/59] cmd: revert some changes in plots and remote refs
---
content/docs/command-reference/plots/show.md | 11 ++++++-----
content/docs/command-reference/remote/index.md | 6 +++---
2 files changed, 9 insertions(+), 8 deletions(-)
diff --git a/content/docs/command-reference/plots/show.md b/content/docs/command-reference/plots/show.md
index ec4c1186d8..561c08d1dc 100644
--- a/content/docs/command-reference/plots/show.md
+++ b/content/docs/command-reference/plots/show.md
@@ -174,11 +174,12 @@ file:///Users/usr/src/plots/logs.csv.json
```json
{
- "$schema": "https://vega.github.io/schema/vega-lite/v4.json",
- "data": {
- "values": [{ "accuracy": "0.9418667", ... }]
- }
-}
+ "$schema": "https://vega.github.io/schema/vega-lite/v4.json",
+ "data": {
+ "values": [
+ {
+ "accuracy": "0.9418667",
+ ...
```
## Example: Smooth plot
diff --git a/content/docs/command-reference/remote/index.md b/content/docs/command-reference/remote/index.md
index 6821e628f9..eab6a0831a 100644
--- a/content/docs/command-reference/remote/index.md
+++ b/content/docs/command-reference/remote/index.md
@@ -30,9 +30,9 @@ The [add](/doc/command-reference/remote/add),
[modify](/doc/command-reference/remote/modify),
[remove](/doc/command-reference/remote/remove), and
[rename](/doc/command-reference/remote/rename) subcommands read or modify DVC
-[config files](/doc/command-reference/config), where DVC remotes
-are set up. Alternatively, `dvc config` can be used, or the config files can be
-edited manually.
+[config files](/doc/command-reference/config), where DVC remotes are set up.
+Alternatively, `dvc config` can be used, or the config files can be edited
+manually.
> For the typical process to share the project via remote, see
> [Sharing Data And Model Files](/doc/use-cases/sharing-data-and-model-files).
From 467819ab14a0897c8136208cac8acc791639af82 Mon Sep 17 00:00:00 2001
From: Jorge Orpinel
Date: Sat, 28 Nov 2020 21:24:35 -0600
Subject: [PATCH 32/59] guide: add some TODOs...
---
content/docs/user-guide/glossary/dvc-metafiles.md | 4 ++++
content/docs/user-guide/glossary/remote-storage.md | 4 ++++
2 files changed, 8 insertions(+)
diff --git a/content/docs/user-guide/glossary/dvc-metafiles.md b/content/docs/user-guide/glossary/dvc-metafiles.md
index 60a9968a02..f83812612d 100644
--- a/content/docs/user-guide/glossary/dvc-metafiles.md
+++ b/content/docs/user-guide/glossary/dvc-metafiles.md
@@ -5,6 +5,10 @@ tooltip: >-
'DVC [metafiles](/doc/user-guide/glossary/dvc-metafiles) tooltip...'
---
+
+
# DVC Metafiles
diff --git a/content/docs/user-guide/glossary/remote-storage.md b/content/docs/user-guide/glossary/remote-storage.md
index 8dbc3e7c70..c494e9ca9c 100644
--- a/content/docs/user-guide/glossary/remote-storage.md
+++ b/content/docs/user-guide/glossary/remote-storage.md
@@ -17,6 +17,10 @@ tooltip: >-
environment.
---
+
+
# Remote Storage
From e7df6755326c16bce4c598a3a834081dce37134d Mon Sep 17 00:00:00 2001
From: jeremydesroches <18587991+jeremydesroches@users.noreply.github.com>
Date: Mon, 30 Nov 2020 10:21:10 -0700
Subject: [PATCH 33/59] /glossary -> /concepts for files, nav, and js engine
---
content/docs/sidebar.json | 2 +-
.../docs/user-guide/{glossary => concepts}/data-pipelines.md | 1 -
content/docs/user-guide/{glossary => concepts}/dependency.md | 0
content/docs/user-guide/{glossary => concepts}/dvc-cache.md | 0
.../docs/user-guide/{glossary => concepts}/dvc-metafiles.md | 0
content/docs/user-guide/{glossary => concepts}/dvc-project.md | 0
.../user-guide/{glossary => concepts}/external-dependency.md | 0
.../docs/user-guide/{glossary => concepts}/import-stage.md | 0
.../user-guide/{glossary => concepts}/metrics-and-plots.md | 0
content/docs/user-guide/{glossary => concepts}/output.md | 0
content/docs/user-guide/{glossary => concepts}/parameter.md | 0
.../docs/user-guide/{glossary => concepts}/remote-storage.md | 0
content/docs/user-guide/{glossary => concepts}/workspace.md | 0
src/gatsby/models/glossary/index.js | 4 ++--
14 files changed, 3 insertions(+), 4 deletions(-)
rename content/docs/user-guide/{glossary => concepts}/data-pipelines.md (94%)
rename content/docs/user-guide/{glossary => concepts}/dependency.md (100%)
rename content/docs/user-guide/{glossary => concepts}/dvc-cache.md (100%)
rename content/docs/user-guide/{glossary => concepts}/dvc-metafiles.md (100%)
rename content/docs/user-guide/{glossary => concepts}/dvc-project.md (100%)
rename content/docs/user-guide/{glossary => concepts}/external-dependency.md (100%)
rename content/docs/user-guide/{glossary => concepts}/import-stage.md (100%)
rename content/docs/user-guide/{glossary => concepts}/metrics-and-plots.md (100%)
rename content/docs/user-guide/{glossary => concepts}/output.md (100%)
rename content/docs/user-guide/{glossary => concepts}/parameter.md (100%)
rename content/docs/user-guide/{glossary => concepts}/remote-storage.md (100%)
rename content/docs/user-guide/{glossary => concepts}/workspace.md (100%)
diff --git a/content/docs/sidebar.json b/content/docs/sidebar.json
index 80b0b1a853..e496b3d3cd 100644
--- a/content/docs/sidebar.json
+++ b/content/docs/sidebar.json
@@ -89,7 +89,7 @@
},
{
"label": "Basic Concepts",
- "slug": "glossary",
+ "slug": "concepts",
"source": false,
"children": [
"workspace",
diff --git a/content/docs/user-guide/glossary/data-pipelines.md b/content/docs/user-guide/concepts/data-pipelines.md
similarity index 94%
rename from content/docs/user-guide/glossary/data-pipelines.md
rename to content/docs/user-guide/concepts/data-pipelines.md
index ccf0d2bf9f..4e19a46b9e 100644
--- a/content/docs/user-guide/glossary/data-pipelines.md
+++ b/content/docs/user-guide/concepts/data-pipelines.md
@@ -2,7 +2,6 @@
name: 'Data Pipelines'
match: ['data pipeline', 'pipeline', 'pipelines']
tooltip: >-
- In DVC, [data pipeline](/doc/user-guide/glossary/data-pipelines) stages and
commands, inputs, outputs, interdependencies, and results (intermediate or
final) are specified in `dvc.yaml`, which can be written manually or built
using the helper command `dvc run`. This allows DVC to restore one or more
diff --git a/content/docs/user-guide/glossary/dependency.md b/content/docs/user-guide/concepts/dependency.md
similarity index 100%
rename from content/docs/user-guide/glossary/dependency.md
rename to content/docs/user-guide/concepts/dependency.md
diff --git a/content/docs/user-guide/glossary/dvc-cache.md b/content/docs/user-guide/concepts/dvc-cache.md
similarity index 100%
rename from content/docs/user-guide/glossary/dvc-cache.md
rename to content/docs/user-guide/concepts/dvc-cache.md
diff --git a/content/docs/user-guide/glossary/dvc-metafiles.md b/content/docs/user-guide/concepts/dvc-metafiles.md
similarity index 100%
rename from content/docs/user-guide/glossary/dvc-metafiles.md
rename to content/docs/user-guide/concepts/dvc-metafiles.md
diff --git a/content/docs/user-guide/glossary/dvc-project.md b/content/docs/user-guide/concepts/dvc-project.md
similarity index 100%
rename from content/docs/user-guide/glossary/dvc-project.md
rename to content/docs/user-guide/concepts/dvc-project.md
diff --git a/content/docs/user-guide/glossary/external-dependency.md b/content/docs/user-guide/concepts/external-dependency.md
similarity index 100%
rename from content/docs/user-guide/glossary/external-dependency.md
rename to content/docs/user-guide/concepts/external-dependency.md
diff --git a/content/docs/user-guide/glossary/import-stage.md b/content/docs/user-guide/concepts/import-stage.md
similarity index 100%
rename from content/docs/user-guide/glossary/import-stage.md
rename to content/docs/user-guide/concepts/import-stage.md
diff --git a/content/docs/user-guide/glossary/metrics-and-plots.md b/content/docs/user-guide/concepts/metrics-and-plots.md
similarity index 100%
rename from content/docs/user-guide/glossary/metrics-and-plots.md
rename to content/docs/user-guide/concepts/metrics-and-plots.md
diff --git a/content/docs/user-guide/glossary/output.md b/content/docs/user-guide/concepts/output.md
similarity index 100%
rename from content/docs/user-guide/glossary/output.md
rename to content/docs/user-guide/concepts/output.md
diff --git a/content/docs/user-guide/glossary/parameter.md b/content/docs/user-guide/concepts/parameter.md
similarity index 100%
rename from content/docs/user-guide/glossary/parameter.md
rename to content/docs/user-guide/concepts/parameter.md
diff --git a/content/docs/user-guide/glossary/remote-storage.md b/content/docs/user-guide/concepts/remote-storage.md
similarity index 100%
rename from content/docs/user-guide/glossary/remote-storage.md
rename to content/docs/user-guide/concepts/remote-storage.md
diff --git a/content/docs/user-guide/glossary/workspace.md b/content/docs/user-guide/concepts/workspace.md
similarity index 100%
rename from content/docs/user-guide/glossary/workspace.md
rename to content/docs/user-guide/concepts/workspace.md
diff --git a/src/gatsby/models/glossary/index.js b/src/gatsby/models/glossary/index.js
index 1bf0801ac5..4f3499ef09 100644
--- a/src/gatsby/models/glossary/index.js
+++ b/src/gatsby/models/glossary/index.js
@@ -41,8 +41,8 @@ module.exports = {
createTypes(typeDefs)
},
async onCreateMarkdownContentNode(api, { parentNode, createChildNode }) {
- // Only operate on nodes within the docs/glossary folder.
- if (parentNode.relativeDirectory !== 'docs/user-guide/glossary') return
+ // Only operate on nodes within the docs/user-guide/concepts folder.
+ if (parentNode.relativeDirectory !== 'docs/user-guide/concepts') return
const { node, createNodeId, createContentDigest } = api
From cb495420d4eab644bdeb6779be7ebe64ebf0a50d Mon Sep 17 00:00:00 2001
From: jeremydesroches <18587991+jeremydesroches@users.noreply.github.com>
Date: Mon, 30 Nov 2020 10:27:15 -0700
Subject: [PATCH 34/59] Update new links to /concepts
---
content/docs/command-reference/cache/dir.md | 2 +-
content/docs/user-guide/concepts/dvc-cache.md | 2 +-
content/docs/user-guide/concepts/dvc-metafiles.md | 2 +-
content/docs/user-guide/concepts/metrics-and-plots.md | 2 +-
content/docs/user-guide/concepts/remote-storage.md | 4 ++--
content/docs/user-guide/concepts/workspace.md | 2 +-
content/docs/user-guide/dvc-files-and-directories.md | 2 +-
7 files changed, 8 insertions(+), 8 deletions(-)
diff --git a/content/docs/command-reference/cache/dir.md b/content/docs/command-reference/cache/dir.md
index 34aba9ebbc..3251411e79 100644
--- a/content/docs/command-reference/cache/dir.md
+++ b/content/docs/command-reference/cache/dir.md
@@ -18,7 +18,7 @@ positional arguments:
## Description
Helper to set the `cache.dir` configuration option. (See
-[cache directory](/doc/user-guide/glossary/dvc-cache#structure-of-the-cache-directory).)
+[cache directory](/doc/user-guide/concepts/dvc-cache#structure-of-the-cache-directory).)
Unlike doing so with `dvc config cache`, `dvc cache dir` transform paths
(`value`) that are provided relative to the current working directory into paths
**relative to the config file location**. However, if the `value` provided is an
diff --git a/content/docs/user-guide/concepts/dvc-cache.md b/content/docs/user-guide/concepts/dvc-cache.md
index a93846dd8e..17774ddef2 100644
--- a/content/docs/user-guide/concepts/dvc-cache.md
+++ b/content/docs/user-guide/concepts/dvc-cache.md
@@ -2,7 +2,7 @@
name: 'DVC Cache'
match: ['DVC cache', 'cache', 'caches', 'cached']
tooltip: >-
- The [DVC cache](/doc/user-guide/glossary/dvc-cache) is a hidden storage (by
+ The [DVC cache](/doc/user-guide/concepts/dvc-cache) is a hidden storage (by
default located in the `.dvc/cache` directory) for files that are tracked by
DVC, and their different versions.
---
diff --git a/content/docs/user-guide/concepts/dvc-metafiles.md b/content/docs/user-guide/concepts/dvc-metafiles.md
index f83812612d..2c40aec0b6 100644
--- a/content/docs/user-guide/concepts/dvc-metafiles.md
+++ b/content/docs/user-guide/concepts/dvc-metafiles.md
@@ -2,7 +2,7 @@
name: 'DVC Metafiles'
match: ['DVC files', 'files', 'directories']
tooltip: >-
- 'DVC [metafiles](/doc/user-guide/glossary/dvc-metafiles) tooltip...'
+ 'DVC [metafiles](/doc/user-guide/concepts/dvc-metafiles) tooltip...'
---
# Remote Storage
diff --git a/content/docs/user-guide/concepts/workspace.md b/content/docs/user-guide/concepts/workspace.md
index 81ce9c032c..16e33ab70c 100644
--- a/content/docs/user-guide/concepts/workspace.md
+++ b/content/docs/user-guide/concepts/workspace.md
@@ -2,7 +2,7 @@
name: Workspace
match: [workspace]
tooltip: >-
- The [workspace](/doc/user-guide/glossary/workspace) is the directory
+ The [workspace](/doc/user-guide/concepts/workspace) is the directory
containing all your project files e.g. raw datasets, source code, ML models,
etc. Typically, it's also a Git repository. It will contain your DVC project.
---
diff --git a/content/docs/user-guide/dvc-files-and-directories.md b/content/docs/user-guide/dvc-files-and-directories.md
index 5f6a61cad7..e577ae5b88 100644
--- a/content/docs/user-guide/dvc-files-and-directories.md
+++ b/content/docs/user-guide/dvc-files-and-directories.md
@@ -250,7 +250,7 @@ Full parameters (key and value) are listed separately under
hand or with the command `dvc config --local`.
- `.dvc/cache`: The cache directory will store your data in a
- special [structure](/doc/user-guide/glossary/dvc-cache). The data files and
+ special [structure](/doc/user-guide/concepts/dvc-cache). The data files and
directories in the workspace will only contain links to the data
files in the cache. (Refer to
[Large Dataset Optimization](/doc/user-guide/large-dataset-optimization). See
From 06a9cf753c4564f1ec81138756392e764b751f37 Mon Sep 17 00:00:00 2001
From: jeremydesroches <18587991+jeremydesroches@users.noreply.github.com>
Date: Mon, 30 Nov 2020 11:42:54 -0700
Subject: [PATCH 35/59] Add parameters concept and initial content sample.
---
content/docs/sidebar.json | 3 ++-
content/docs/user-guide/concepts/parameter.md | 8 --------
content/docs/user-guide/concepts/parameters.md | 16 ++++++++++++++++
3 files changed, 18 insertions(+), 9 deletions(-)
delete mode 100644 content/docs/user-guide/concepts/parameter.md
create mode 100644 content/docs/user-guide/concepts/parameters.md
diff --git a/content/docs/sidebar.json b/content/docs/sidebar.json
index e496b3d3cd..ef01542a78 100644
--- a/content/docs/sidebar.json
+++ b/content/docs/sidebar.json
@@ -97,7 +97,8 @@
"dvc-cache",
"data-pipelines",
"remote-storage",
- "metrics-and-plots"
+ "metrics-and-plots",
+ "parameters"
]
},
"dvc-files-and-directories",
diff --git a/content/docs/user-guide/concepts/parameter.md b/content/docs/user-guide/concepts/parameter.md
deleted file mode 100644
index 0dd2a90f44..0000000000
--- a/content/docs/user-guide/concepts/parameter.md
+++ /dev/null
@@ -1,8 +0,0 @@
----
-name: Parameter
-match: [parameter, parameters, param, params, hyperparameter, hyperparameters]
----
-
-Pipeline stages (defined in `dvc.yaml`) can depend on specific values inside an
-arbitrary YAML, JSON, TOML, or Python file (`params.yaml` by default). Stages
-are invalidated when any of their parameter values change. See `dvc param`.
diff --git a/content/docs/user-guide/concepts/parameters.md b/content/docs/user-guide/concepts/parameters.md
new file mode 100644
index 0000000000..473e5562a9
--- /dev/null
+++ b/content/docs/user-guide/concepts/parameters.md
@@ -0,0 +1,16 @@
+---
+name: Parameters
+match: [parameter, parameters, param, params, hyperparameter, hyperparameters]
+tooltip: >-
+ Pipeline stages (defined in `dvc.yaml`) can depend on specific values inside
+ an arbitrary YAML, JSON, TOML, or Python file (`params.yaml` by default).
+ Stages are invalidated when any of their
+ [parameter](/doc/user-guide/concepts/parameters) values change. See `dvc
+ params`.
+---
+
+# Parameters
+
+Pipeline stages (defined in `dvc.yaml`) can depend on specific values inside an
+arbitrary YAML, JSON, TOML, or Python file (`params.yaml` by default). Stages
+are invalidated when any of their parameter values change. See `dvc params`.
From db5de08808297bcea37a3a972b59a4d13d8542ae Mon Sep 17 00:00:00 2001
From: jeremydesroches <18587991+jeremydesroches@users.noreply.github.com>
Date: Mon, 30 Nov 2020 12:20:55 -0700
Subject: [PATCH 36/59] Update dvc cache link in config cmd ref
---
content/docs/command-reference/config.md | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/content/docs/command-reference/config.md b/content/docs/command-reference/config.md
index 213a2926af..fddb6fb19f 100644
--- a/content/docs/command-reference/config.md
+++ b/content/docs/command-reference/config.md
@@ -126,9 +126,9 @@ remote. See `dvc remote` for more information.
A DVC project cache is the hidden storage (by default located in
the `.dvc/cache` directory) for files that are tracked by DVC, and their
-different versions. (See `dvc cache` and
-[DVC Files and Directories](/doc/user-guide/dvc-files-and-directories) for more
-details.) This section contains the following options:
+different versions. (See
+[DVC cache](/doc/user-guide/concepts/dvc-cache#structure-of-the-cache-directory)
+for more details.) This section contains the following options:
- `cache.dir` - set/unset cache directory location. A correct value is either an
absolute path, or a path **relative to the config file location**. The default
From 0a423fab1ab791f30903cffa9046af6d1bbbbdbb Mon Sep 17 00:00:00 2001
From: jeremydesroches <18587991+jeremydesroches@users.noreply.github.com>
Date: Mon, 30 Nov 2020 13:54:52 -0700
Subject: [PATCH 37/59] Update pipeline glossary tooltip, add meta description
---
content/docs/user-guide/concepts/data-pipelines.md | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/content/docs/user-guide/concepts/data-pipelines.md b/content/docs/user-guide/concepts/data-pipelines.md
index 4e19a46b9e..87f28da0cb 100644
--- a/content/docs/user-guide/concepts/data-pipelines.md
+++ b/content/docs/user-guide/concepts/data-pipelines.md
@@ -2,10 +2,13 @@
name: 'Data Pipelines'
match: ['data pipeline', 'pipeline', 'pipelines']
tooltip: >-
- commands, inputs, outputs, interdependencies, and results (intermediate or
- final) are specified in `dvc.yaml`, which can be written manually or built
- using the helper command `dvc run`. This allows DVC to restore one or more
- pipelines later (see `dvc repro`).
+ In DVC, a [data pipeline](/doc/user-guide/concepts/data-pipelines) is a series
+ of data processing stages (for example, console commands that take an input
+ and produce an output). A pipeline may produce intermediate data,
+ and has a final result.
+description: >-
+ In DVC, a data pipeline is a series of data processing stages. A pipeline may
+ produce intermediate data, and has a final result.
---
# Data Pipelines
From 231ee609b0ad4e0a5749eeaa61ed18487658b0d0 Mon Sep 17 00:00:00 2001
From: jeremydesroches <18587991+jeremydesroches@users.noreply.github.com>
Date: Mon, 30 Nov 2020 13:56:39 -0700
Subject: [PATCH 38/59] Remove abbr from data pipeline glossary tooltip
---
content/docs/user-guide/concepts/data-pipelines.md | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/content/docs/user-guide/concepts/data-pipelines.md b/content/docs/user-guide/concepts/data-pipelines.md
index 87f28da0cb..502ba945f0 100644
--- a/content/docs/user-guide/concepts/data-pipelines.md
+++ b/content/docs/user-guide/concepts/data-pipelines.md
@@ -4,8 +4,8 @@ match: ['data pipeline', 'pipeline', 'pipelines']
tooltip: >-
In DVC, a [data pipeline](/doc/user-guide/concepts/data-pipelines) is a series
of data processing stages (for example, console commands that take an input
- and produce an output). A pipeline may produce intermediate data,
- and has a final result.
+ and produce an output). A pipeline may produce intermediate data, and has a
+ final result.
description: >-
In DVC, a data pipeline is a series of data processing stages. A pipeline may
produce intermediate data, and has a final result.
From 80a5dffa2ade51340dd3f7de0ac36a728d5d0161 Mon Sep 17 00:00:00 2001
From: jeremydesroches <18587991+jeremydesroches@users.noreply.github.com>
Date: Mon, 30 Nov 2020 14:24:49 -0700
Subject: [PATCH 39/59] Revert plots/index examples
---
content/docs/command-reference/plots/index.md | 111 ++++++++++++++++--
1 file changed, 103 insertions(+), 8 deletions(-)
diff --git a/content/docs/command-reference/plots/index.md b/content/docs/command-reference/plots/index.md
index e058d2bdd0..367c04378d 100644
--- a/content/docs/command-reference/plots/index.md
+++ b/content/docs/command-reference/plots/index.md
@@ -82,14 +82,6 @@ names in the `train` array below:
}
```
-## Options
-
-- `-h`, `--help` - prints the usage/help message, and exit.
-
-- `-q`, `--quiet` - do not write anything to standard output.
-
-- `-v`, `--verbose` - displays detailed tracing information.
-
## Plot templates
Users have the ability to change the way plots are displayed by modifying the
@@ -159,3 +151,106 @@ header (first row) are equivalent to field names.
- `` (optional) - field name to display as the X axis label
- `` (optional) - field name to display as the X axis label
+
+## Options
+
+- `-h`, `--help` - prints the usage/help message, and exit.
+
+- `-q`, `--quiet` - do not write anything to standard output.
+
+- `-v`, `--verbose` - displays detailed tracing information.
+
+## Example: Tabular data
+
+We'll use tabular metrics file `logs.csv` for this example:
+
+```
+epoch,accuracy,loss,val_accuracy,val_loss
+0,0.9418667,0.19958884770199656,0.9679,0.10217399864746257
+1,0.9763333,0.07896138601688048,0.9768,0.07310650711813942
+2,0.98375,0.05241111190887168,0.9788,0.06665669009438716
+3,0.98801666,0.03681169906261687,0.9781,0.06697812260198989
+4,0.99111664,0.027362171787042946,0.978,0.07385754839298315
+5,0.9932333,0.02069501801203781,0.9771,0.08009233058886166
+6,0.9945,0.017702101902437668,0.9803,0.07830339228538505
+7,0.9954,0.01396906608727198,0.9802,0.07247738889862157
+```
+
+Let's plot the last column (default behavior):
+
+```dvc
+$ dvc plots show logs.csv
+file:///Users/usr/src/plots/logs.csv.html
+```
+
+
+
+Difference in this metric between the current project version and the previous
+commit:
+
+```dvc
+$ dvc plots diff -d logs.csv HEAD^
+file:///Users/usr/src/plots/logs.csv.html
+```
+
+
+
+Visualize a specific field:
+
+```dvc
+$ dvc plots show -y loss logs.csv
+file:///Users/usr/src/plots/logs.html
+```
+
+
+
+## Example: Smooth plot
+
+In some cases we would like to smooth our plot. In this example we will use a
+plot with 1000 data points:
+
+```dvc
+$ dvc plots show data.csv
+file:///Users/usr/src/plots/plots.html
+```
+
+
+
+We can use the `-t` option and `smooth` template to make it less noisy:
+
+```dvc
+$ dvc plots show -t smooth data.csv
+file:///Users/usr/src/plots/plots.html
+```
+
+
+
+## Example: Confusion matrix
+
+We'll use `classes.csv` for this example:
+
+```
+actual,predicted
+cat,cat
+cat,cat
+cat,cat
+cat,dog
+cat,dinosaur
+cat,dinosaur
+cat,bird
+turtle,dog
+turtle,cat
+...
+```
+
+Let's visualize it:
+
+```dvc
+$ dvc plots show classes.csv --template confusion -x actual -y predicted
+file:///Users/usr/src/plots/classes.csv.html
+```
+
+
+
+> A confusion matrix [template](/doc/command-reference/plots#plot-templates) is
+> predefined in DVC (found in `.dvc/plots/confusion.json`).
From 6b43953e9b27370205ce5e80c71795eaf1cd4456 Mon Sep 17 00:00:00 2001
From: jeremydesroches <18587991+jeremydesroches@users.noreply.github.com>
Date: Mon, 30 Nov 2020 14:37:15 -0700
Subject: [PATCH 40/59] Revert plots/show examples
---
content/docs/command-reference/plots/show.md | 51 --------------------
1 file changed, 51 deletions(-)
diff --git a/content/docs/command-reference/plots/show.md b/content/docs/command-reference/plots/show.md
index 561c08d1dc..ef8a828c59 100644
--- a/content/docs/command-reference/plots/show.md
+++ b/content/docs/command-reference/plots/show.md
@@ -181,54 +181,3 @@ file:///Users/usr/src/plots/logs.csv.json
"accuracy": "0.9418667",
...
```
-
-## Example: Smooth plot
-
-In some cases we would like to smooth our plot. In this example we will use a
-plot with 1000 data points:
-
-```dvc
-$ dvc plots show data.csv
-file:///Users/usr/src/plots/plots.html
-```
-
-
-
-We can use the `-t` option and `smooth` template to make it less noisy:
-
-```dvc
-$ dvc plots show -t smooth data.csv
-file:///Users/usr/src/plots/plots.html
-```
-
-
-
-## Example: Confusion matrix
-
-We'll use `classes.csv` for this example:
-
-```
-actual,predicted
-cat,cat
-cat,cat
-cat,cat
-cat,dog
-cat,dinosaur
-cat,dinosaur
-cat,bird
-turtle,dog
-turtle,cat
-...
-```
-
-Let's visualize it:
-
-```dvc
-$ dvc plots show classes.csv --template confusion -x actual -y predicted
-file:///Users/usr/src/plots/classes.csv.html
-```
-
-
-
-> A confusion matrix [template](/doc/command-reference/plots#plot-templates) is
-> predefined in DVC (found in `.dvc/plots/confusion.json`).
From ffcbd0d8b5e9acca46358b85d4da6bf5fe481eba Mon Sep 17 00:00:00 2001
From: jeremydesroches <18587991+jeremydesroches@users.noreply.github.com>
Date: Mon, 30 Nov 2020 14:39:07 -0700
Subject: [PATCH 41/59] Revert plots/show example indent
---
content/docs/command-reference/plots/show.md | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/content/docs/command-reference/plots/show.md b/content/docs/command-reference/plots/show.md
index ef8a828c59..441380cb5e 100644
--- a/content/docs/command-reference/plots/show.md
+++ b/content/docs/command-reference/plots/show.md
@@ -179,5 +179,5 @@ file:///Users/usr/src/plots/logs.csv.json
"values": [
{
"accuracy": "0.9418667",
- ...
+ ...
```
From ac7ab004b9041e4dab03b17b016b4b5facf5d151 Mon Sep 17 00:00:00 2001
From: jeremydesroches <18587991+jeremydesroches@users.noreply.github.com>
Date: Mon, 30 Nov 2020 16:52:04 -0700
Subject: [PATCH 42/59] Move metrics and plots supported file formats sections
to concepts page
---
.../docs/command-reference/metrics/index.md | 23 -------
content/docs/command-reference/plots/index.md | 44 ------------
.../user-guide/concepts/metrics-and-plots.md | 67 +++++++++++++++++++
3 files changed, 67 insertions(+), 67 deletions(-)
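
The tree-path addressing described in the moved "Supported file formats"
section for metrics can be sketched as follows. This is only an illustration
under stated assumptions, not DVC's implementation: `flatten_metrics` is a
hypothetical helper, and the JSON is the sample from the section itself.

```python
import json


def flatten_metrics(tree, prefix=""):
    """Flatten a nested metrics mapping into {dotted.path: value} pairs."""
    flat = {}
    for key, value in tree.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Descend into sub-trees, extending the dotted path.
            flat.update(flatten_metrics(value, path))
        else:
            flat[path] = value
    return flat


# Sample metrics file content, copied from the documentation example.
METRICS_JSON = """
{
  "train": {
    "accuracy": 0.9886999726295471,
    "loss": 0.041855331510305405,
    "TN": 473,
    "FP": 845
  },
  "time_real": 344.61309599876404
}
"""

metrics = flatten_metrics(json.loads(METRICS_JSON))
for path in sorted(metrics):
    print(path, metrics[path])
```

Each leaf ends up under one dotted name, which gives the five metrics the
text lists: `train.accuracy`, `train.loss`, `train.TN`, `train.FP`, and
`time_real`.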
diff --git a/content/docs/command-reference/metrics/index.md b/content/docs/command-reference/metrics/index.md
index eb2459e751..952d46e96e 100644
--- a/content/docs/command-reference/metrics/index.md
+++ b/content/docs/command-reference/metrics/index.md
@@ -36,29 +36,6 @@ stages:
Note that metrics files are normally committed with Git (that's what
`cache: false` above is for). See `dvc.yaml` for more information.
-### Supported file formats
-
-Metrics can be organized as tree hierarchies in JSON or YAML 1.2 files. DVC
-addresses specific metrics by the tree path. In the JSON example below, five
-metrics are presented: `train.accuracy`, `train.loss`, `train.TN`, `train.FP`
-and `time_real`.
-
-```json
-{
- "train": {
- "accuracy": 0.9886999726295471,
- "loss": 0.041855331510305405,
- "TN": 473,
- "FP": 845
- },
- "time_real": 344.61309599876404
-}
-```
-
-DVC itself does not ascribe any specific meaning for these numbers. Usually they
-are produced by the model training or model evaluation code and serve as a way
-to compare and pick the best performing experiment.
-
## Options
- `-h`, `--help` - prints the usage/help message, and exit.
diff --git a/content/docs/command-reference/plots/index.md b/content/docs/command-reference/plots/index.md
index 367c04378d..a6d3876fce 100644
--- a/content/docs/command-reference/plots/index.md
+++ b/content/docs/command-reference/plots/index.md
@@ -38,50 +38,6 @@ stages:
Note that metrics files are normally committed with Git (that's what
`cache: false` above is for). See `dvc.yaml` for more information.
-### Supported file formats
-
-DVC generates plots as HTML files that can be open with a web browser. These
-HTML files use [Vega-Lite](https://vega.github.io/vega-lite/). Vega is a
-declarative grammar for defining plots using JSON. The plots can also be saved
-as SVG or PNG image filed from the browser.
-
-Plot metrics can be organized as data series in JSON, YAML 1.2, CSV, or TSV
-files. DVC expects to see an array (or multiple arrays) of objects (usually
-_float numbers_) in the file.
-
-In tabular file formats such as CSV and TSV, each column is an array.
-`dvc plots` subcommands can produce plots for a specified column or a set of
-them. For example, `epoch`, `AUC`, and `loss` are the column names below:
-
-```
-epoch, AUC, loss
-34, 0.91935, 0.0317345
-35, 0.91913, 0.0317829
-36, 0.92256, 0.0304632
-37, 0.92302, 0.0299015
-```
-
-In hierarchical file formats (JSON or YAML), an array of consistent objects is
-expected: every object should have the same structure.
-
-`dvc plots` subcommands can produce plots for a specified field or a set of
-them, from the array's objects. For example, `val_loss` is one of the field
-names in the `train` array below:
-
-```
-{
- "train": [
- {"val_accuracy": 0.9665, "val_loss": 0.10757},
- {"val_accuracy": 0.9764, "val_loss": 0.07324},
- {"val_accuracy": 0.8770, "val_loss": 0.08136},
- {"val_accuracy": 0.8740, "val_loss": 0.09026},
- {"val_accuracy": 0.8795, "val_loss": 0.07640},
- {"val_accuracy": 0.8803, "val_loss": 0.07608},
- {"val_accuracy": 0.8987, "val_loss": 0.08455}
- ]
-}
-```
-
## Plot templates
Users have the ability to change the way plots are displayed by modifying the
diff --git a/content/docs/user-guide/concepts/metrics-and-plots.md b/content/docs/user-guide/concepts/metrics-and-plots.md
index 6a828ccf11..e0fb440227 100644
--- a/content/docs/user-guide/concepts/metrics-and-plots.md
+++ b/content/docs/user-guide/concepts/metrics-and-plots.md
@@ -35,6 +35,29 @@ options of `dvc run`.
In contrast to `dvc plots`, these metrics should be stored in hierarchical
files.
+### Supported file formats
+
+Metrics can be organized as tree hierarchies in JSON or YAML 1.2 files. DVC
+addresses specific metrics by the tree path. In the JSON example below, five
+metrics are presented: `train.accuracy`, `train.loss`, `train.TN`, `train.FP`
+and `time_real`.
+
+```json
+{
+ "train": {
+ "accuracy": 0.9886999726295471,
+ "loss": 0.041855331510305405,
+ "TN": 473,
+ "FP": 845
+ },
+ "time_real": 344.61309599876404
+}
+```
+
+DVC itself does not ascribe any specific meaning to these numbers. Usually they
+are produced by the model training or model evaluation code and serve as a way
+to compare and pick the best-performing experiment.
+
## Plots
@@ -49,4 +72,48 @@ processing code, and can be defined in `dvc.yaml` (`plots` field) for tracking
In contrast to `dvc metrics`, these metrics should be stored as data series.
+### Supported file formats
+
+DVC generates plots as HTML files that can be opened with a web browser. These
+HTML files use [Vega-Lite](https://vega.github.io/vega-lite/). Vega is a
+declarative grammar for defining plots using JSON. The plots can also be saved
+as SVG or PNG image files from the browser.
+
+Plot metrics can be organized as data series in JSON, YAML 1.2, CSV, or TSV
+files. DVC expects to see an array (or multiple arrays) of objects (usually
+_float numbers_) in the file.
+
+In tabular file formats such as CSV and TSV, each column is an array.
+`dvc plots` subcommands can produce plots for a specified column or a set of
+them. For example, `epoch`, `AUC`, and `loss` are the column names below:
+
+```
+epoch, AUC, loss
+34, 0.91935, 0.0317345
+35, 0.91913, 0.0317829
+36, 0.92256, 0.0304632
+37, 0.92302, 0.0299015
+```
+
+In hierarchical file formats (JSON or YAML), an array of consistent objects is
+expected: every object should have the same structure.
+
+`dvc plots` subcommands can produce plots for a specified field or a set of
+them, from the array's objects. For example, `val_loss` is one of the field
+names in the `train` array below:
+
+```json
+{
+ "train": [
+ {"val_accuracy": 0.9665, "val_loss": 0.10757},
+ {"val_accuracy": 0.9764, "val_loss": 0.07324},
+ {"val_accuracy": 0.8770, "val_loss": 0.08136},
+ {"val_accuracy": 0.8740, "val_loss": 0.09026},
+ {"val_accuracy": 0.8795, "val_loss": 0.07640},
+ {"val_accuracy": 0.8803, "val_loss": 0.07608},
+ {"val_accuracy": 0.8987, "val_loss": 0.08455}
+ ]
+}
+```
+
From 2ed367f3f8b6435ae2052c7d70a48fd0a40d0507 Mon Sep 17 00:00:00 2001
From: jeremydesroches <18587991+jeremydesroches@users.noreply.github.com>
Date: Mon, 30 Nov 2020 21:57:28 -0700
Subject: [PATCH 43/59] Extracted params -> concept, removed index, added
description, tooltip
---
content/docs/command-reference/params/diff.md | 4 +-
.../docs/command-reference/params/index.md | 252 ------------------
content/docs/sidebar.json | 2 +-
.../docs/user-guide/concepts/parameters.md | 245 ++++++++++++++++-
4 files changed, 242 insertions(+), 261 deletions(-)
delete mode 100644 content/docs/command-reference/params/index.md
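
The granular invalidation described in the relocated params text (only tracked
parameter names matter, other edits to the same file do not) can be sketched
as below. This is a rough model, not DVC's code: `lookup` and `changed_params`
are hypothetical helpers, the parameter names and values are made up, and
plain dicts stand in for a parsed `params.yaml`.

```python
import copy

_MISSING = object()  # sentinel for a name that is absent from the params file


def lookup(tree, dotted_name):
    """Return the value at a dotted tree path, or _MISSING if a key is absent."""
    node = tree
    for key in dotted_name.split("."):
        if not isinstance(node, dict) or key not in node:
            return _MISSING
        node = node[key]
    return node


def changed_params(saved, current):
    """List the tracked names whose current value differs from the saved one."""
    return [name for name, value in saved.items() if lookup(current, name) != value]


# Values recorded for one stage (hypothetical names and numbers).
saved = {"train.epochs": 10, "train.learning_rate": 0.001}

# Stand-in for the parsed parameters file shared by several stages.
params_file = {
    "train": {"epochs": 10, "learning_rate": 0.001},
    "featurize": {"max_features": 3000},
}

# An edit to a parameter this stage does not track.
untracked_edit = copy.deepcopy(params_file)
untracked_edit["featurize"]["max_features"] = 5000

# An edit to a parameter this stage does track.
tracked_edit = copy.deepcopy(params_file)
tracked_edit["train"]["epochs"] = 20

print(changed_params(saved, params_file))
print(changed_params(saved, untracked_edit))
print(changed_params(saved, tracked_edit))
```

Only the last call reports a change (`train.epochs`), which is the case where
the stage would count as invalidated; the edit to `featurize.max_features`
leaves it untouched.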
diff --git a/content/docs/command-reference/params/diff.md b/content/docs/command-reference/params/diff.md
index 02a51fa522..248f78644e 100644
--- a/content/docs/command-reference/params/diff.md
+++ b/content/docs/command-reference/params/diff.md
@@ -1,7 +1,7 @@
# params diff
-Show changes in [parameter dependencies](/doc/command-reference/params) between
-commits in the DVC repository, or between a commit and the
+Show changes in [parameter dependencies](/doc/user-guide/concepts/parameters)
+between commits in the DVC repository, or between a commit and the
workspace.
## Synopsis
diff --git a/content/docs/command-reference/params/index.md b/content/docs/command-reference/params/index.md
deleted file mode 100644
index 02a92c0e0f..0000000000
--- a/content/docs/command-reference/params/index.md
+++ /dev/null
@@ -1,252 +0,0 @@
-# params
-
-Contains a command to show changes in parameters:
-[diff](/doc/command-reference/params/diff).
-
-## Synopsis
-
-```usage
-usage: dvc params [-h] [-q | -v] {diff} ...
-
-positional arguments:
- COMMAND
- diff Show changes in params between commits in the
- DVC repository, or between a commit and the workspace.
-```
-
-## Description
-
-In order to track parameters and hyperparameters associated to machine learning
-experiments in DVC projects, DVC provides a different type of
-dependencies: _parameters_. Parameters are defined using the the `-p`
-(`--params`) option of `dvc run`, using simple names like `epochs`,
-`learning-rate`, `batch_size`, etc.
-
-In contrast to a regular dependency, a parameter is not a file (or
-directory). Instead, it consists of a _parameter name_ (or key) to find inside a
-YAML 1.2, JSON, TOML, or [Python](#examples-python-parameters-file) _parameters
-file_. Multiple parameter dependencies can be specified from one or more
-parameters files.
-
-The default parameters file name is `params.yaml`. Parameters should be
-organized as a tree hierarchy inside, as DVC will locate param names by their
-tree path. Parameters files have to be manually written, or generated, and these
-can be versioned directly with Git.
-
-Supported parameter _value_ types are: string, integer, float, and arrays. DVC
-itself does not ascribe any specific meaning for these values. They are
-user-defined, and serve as a way to generalize and parametrize an machine
-learning algorithms or data processing code.
-
-DVC saves the param names and their latest values in the `dvc.yaml` file. These
-values will be compared to the ones in the params files to determine if the
-stage is invalidated upon pipeline [reproduction](/doc/command-reference/repro).
-
-> Note that DVC does not pass the parameter values to stage commands. The
-> associated command executed by `dvc run` or `dvc repro` will have to open and
-> parse the parameters file by itself, and use the params specified with `-p`.
-
-The parameters concept helps to define [stage](/doc/command-reference/run)
-dependencies more granularly. A particular parameter or set of parameters will
-be required for the stage invalidation (see `dvc status` and `dvc repro`).
-Changes to other parts of the dependency file will not affect the stage. This
-prevents situations where several stages share a (configuration) file as a
-common dependency, and any change in this dependency invalidates all these
-stages and causes their reproduction unnecessarily.
-
-`dvc params diff` is available to show changes in parameters, displaying the
-param names as well as their current and previous values.
-
-## Options
-
-- `-h`, `--help` - prints the usage/help message, and exit.
-
-- `-q`, `--quiet` - do not write anything to standard output.
-
-- `-v`, `--verbose` - displays detailed tracing information.
-
-## Examples
-
-First, let's create a simple parameters file in YAML format, using the default
-file name `params.yaml`:
-
-```yaml
-lr: 0.0041
-
-train:
- epochs: 70
- layers: 9
-
-process:
- thresh: 0.98
- bow: 15000
-```
-
-Define a [stage](/doc/command-reference/run) that depends on params `lr`,
-`layers`, and `epochs` from the params file above. Full paths should be used to
-specify `layers` and `epochs` from the `train` group:
-
-```dvc
-$ dvc run -n train -d users.csv -o model.pkl \
- -p lr,train.epochs,train.layers \
- python train.py
-```
-
-> Note that we could use the same parameter addressing with JSON, TOML, or
-> Python parameters files.
-
-The `train.py` script will have some code to parse the needed parameters. For
-example:
-
-```py
-import yaml
-
-with open("params.yaml", 'r') as fd:
- params = yaml.safe_load(fd)
-
-lr = params['lr']
-epochs = params['train']['epochs']
-layers = params['train']['layers']
-```
-
-You can find that each parameter and it's value were saved to `dvc.yaml`. These
-values will be compared to the ones in the parameters files whenever `dvc repro`
-is used, to determine if dependency to the params file is invalidated:
-
-```yaml
-stages:
- train:
- cmd: python train.py
- deps:
- - users.csv
- params:
- - lr
- - train
- outs:
- - model.pkl
-```
-
-Alternatively, the entire group of parameters `train` can be referenced, instead
-of specifying each of the group parameters separately:
-
-```dvc
-$ dvc run -n train -d users.csv -o model.pkl \
- -p lr,train \
- python train.py
-```
-
-In the examples above, the default parameters file name `params.yaml` was used.
-This file name can be redefined with a prefix in the `-p` argument:
-
-```dvc
-$ dvc run -n train -d logs/ -o users.csv \
- -p parse_params.yaml:threshold,classes_num \
- python train.py
-```
-
-## Examples: Python parameters file
-
-Consider this Python parameters file named `params.py`:
-
-```python
-# All standard variable types are supported.
-BOOL = True
-INT = 5
-FLOAT = 0.001
-STR = 'abc'
-DICT = {'a': 1, 'b': 2}
-LIST = [1, 2, 3]
-SET = {4, 5, 6}
-TUPLE = (10, 100)
-NONE = None
-
-# DVC can retrieve class constants and variables defined in __init__
-class TrainConfig:
-
- EPOCHS = 70
-
- def __init__(self):
- self.layers = 5
- self.layers = 9 # TrainConfig.layers param will be 9
- self.sum = 1 + 2 # Will NOT be found due to the expression
- bar = 3 # Will NOT be found since it's locally scoped
-
-
-class TestConfig:
-
- TEST_DIR = 'path'
- METRICS = ['metric']
-```
-
-The following [stage](/doc/command-reference/run) depends on params `BOOL`,
-`INT`, as well as `TrainConfig`'s `EPOCHS` and `layers`:
-
-```dvc
-$ dvc run -n train -d users.csv -o model.pkl \
- -p params.py:BOOL,INT,TrainConfig.EPOCHS,TrainConfig.layers \
- python train.py
-```
-
-Resulting `dvc.yaml` and `dvc.lock` files (notice the `params` list):
-
-```yaml
-stages:
- train:
- cmd: python train.py
- deps:
- - users.csv
- params:
- - BOOL
- - INT
- - TrainConfig.EPOCHS
- - TrainConfig.layers
- outs:
- - model.pkl
-```
-
-```yaml
-train:
- cmd: python train.py
- deps:
- - path: users.csv
- md5: 23be4307b23dcd740763d5fc67993f11
- params:
- INT: 5
- BOOL: true
- TrainConfig.EPOCHS: 70
- TrainConfig.layers: 9
- outs:
- - path: model.pkl
- md5: 1c06b4756f08203cc496e4061b1e7d67
-```
-
-Alternatively, the entire `TestConfig` params group
-([class](https://docs.python.org/3/library/stdtypes.html#classes-and-class-instances))
-can be referenced
-([dictionaries](https://docs.python.org/3/library/stdtypes.html#dict) are also
-supported), instead of the parameters in it:
-
-```dvc
-$ dvc run -n train -d users.csv -o model.pkl \
- -p params.py:BOOL,INT,TestConfig \
- python train.py
-```
-
-## Examples: Print all parameters
-
-Following the previous example, we can use `dvc params diff` to list all of the
-param values available in the workspace:
-
-```dvc
-$ dvc params diff
-Path Param Old New
-params.yaml lr — 0.0041
-params.yaml process.bow — 15000
-params.yaml process.thresh — 0.98
-params.yaml train.epochs — 70
-params.yaml train.layers — 9
-```
-
-This command shows the difference in parameters between the workspace and the
-last committed version of the `params.yaml` file. In our example there's no
-previous version, which is why all `Old` values are `—`.
diff --git a/content/docs/sidebar.json b/content/docs/sidebar.json
index ef01542a78..1528aad501 100644
--- a/content/docs/sidebar.json
+++ b/content/docs/sidebar.json
@@ -270,7 +270,7 @@
{
"label": "params",
"slug": "params",
- "source": "params/index.md",
+ "source": false,
"children": [
{
"label": "params diff",
diff --git a/content/docs/user-guide/concepts/parameters.md b/content/docs/user-guide/concepts/parameters.md
index 473e5562a9..1437c761bd 100644
--- a/content/docs/user-guide/concepts/parameters.md
+++ b/content/docs/user-guide/concepts/parameters.md
@@ -2,15 +2,248 @@
name: Parameters
match: [parameter, parameters, param, params, hyperparameter, hyperparameters]
tooltip: >-
- Pipeline stages (defined in `dvc.yaml`) can depend on specific values inside
- an arbitrary YAML, JSON, TOML, or Python file (`params.yaml` by default).
- Stages are invalidated when any of their
- [parameter](/doc/user-guide/concepts/parameters) values change. See `dvc
- param`.
+ In DVC, [parameters](/doc/user-guide/concepts/parameters) and hyperparameters
+ associated with machine learning experiments and data science projects can be
+ tracked as dependencies in a data pipeline.
+description: >-
+ In DVC, parameters and hyperparameters associated with machine learning
+ experiments and data science projects can be tracked as dependencies in a data
+ pipeline.
---
# Parameters
Pipeline stages (defined in `dvc.yaml`) can depend on specific values inside an
arbitrary YAML, JSON, TOML, or Python file (`params.yaml` by default). Stages
-are invalidated when any of their parameter values change. See `dvc param`.
+are invalidated when any of their parameter values change.
+
+
+
+In order to track parameters and hyperparameters associated with
+machine learning experiments in DVC projects, DVC provides a
+different type of dependency: _parameters_. Parameters are defined using the
+`-p` (`--params`) option of `dvc run`, with simple names like `epochs`,
+`learning-rate`, `batch_size`, etc.
+
+In contrast to a regular dependency, a parameter is not a file (or
+directory). Instead, it consists of a _parameter name_ (or key) to find inside a
+YAML 1.2, JSON, TOML, or [Python](#examples-python-parameters-file) _parameters
+file_. Multiple parameter dependencies can be specified from one or more
+parameters files.
+
+The default parameters file name is `params.yaml`. Parameters should be
+organized as a tree hierarchy inside, as DVC will locate param names by their
+tree path. Parameters files have to be manually written, or generated, and these
+can be versioned directly with Git.
+
+Supported parameter _value_ types are: string, integer, float, and arrays. DVC
+itself does not ascribe any specific meaning to these values. They are
+user-defined, and serve as a way to generalize and parametrize machine
+learning algorithms or data processing code.
+
+DVC saves the param names in `dvc.yaml` and their latest values in `dvc.lock`.
+These values are compared to the ones in the params files to determine if the
+stage is invalidated upon pipeline [reproduction](/doc/command-reference/repro).
+
+> Note that DVC does not pass the parameter values to stage commands. The
+> associated command executed by `dvc run` or `dvc repro` will have to open and
+> parse the parameters file by itself, and use the params specified with `-p`.
+
+The parameters concept helps to define [stage](/doc/command-reference/run)
+dependencies more granularly. A particular parameter or set of parameters will
+be required for the stage invalidation (see `dvc status` and `dvc repro`).
+Changes to other parts of the dependency file will not affect the stage. This
+prevents situations where several stages share a (configuration) file as a
+common dependency, and any change in this dependency invalidates all these
+stages and causes their reproduction unnecessarily.
+
+`dvc params diff` is available to show changes in parameters, displaying the
+param names as well as their current and previous values.
+
+
+
+## Examples
+
+First, let's create a simple parameters file in YAML format, using the default
+file name `params.yaml`:
+
+```yaml
+lr: 0.0041
+
+train:
+ epochs: 70
+ layers: 9
+
+process:
+ thresh: 0.98
+ bow: 15000
+```
+
+Define a [stage](/doc/command-reference/run) that depends on params `lr`,
+`layers`, and `epochs` from the params file above. Full paths should be used to
+specify `layers` and `epochs` from the `train` group:
+
+```dvc
+$ dvc run -n train -d users.csv -o model.pkl \
+ -p lr,train.epochs,train.layers \
+ python train.py
+```
+
+> Note that we could use the same parameter addressing with JSON, TOML, or
+> Python parameters files.
+
+The `train.py` script will have some code to parse the needed parameters. For
+example:
+
+```py
+import yaml
+
+with open("params.yaml", 'r') as fd:
+ params = yaml.safe_load(fd)
+
+lr = params['lr']
+epochs = params['train']['epochs']
+layers = params['train']['layers']
+```
+
+You can find that each parameter name was saved to `dvc.yaml` (and its value to
+`dvc.lock`). The saved values are compared to the ones in the parameters files
+whenever `dvc repro` is used, to determine if the params dependency changed:
+
+```yaml
+stages:
+ train:
+ cmd: python train.py
+ deps:
+ - users.csv
+ params:
+ - lr
+ - train
+ outs:
+ - model.pkl
+```
+
+Alternatively, the entire group of parameters `train` can be referenced, instead
+of specifying each of the group parameters separately:
+
+```dvc
+$ dvc run -n train -d users.csv -o model.pkl \
+ -p lr,train \
+ python train.py
+```
+
+In the examples above, the default parameters file name `params.yaml` was used.
+This file name can be redefined with a prefix in the `-p` argument:
+
+```dvc
+$ dvc run -n train -d logs/ -o users.csv \
+ -p parse_params.yaml:threshold,classes_num \
+ python train.py
+```
+
+## Examples: Python parameters file
+
+Consider this Python parameters file named `params.py`:
+
+```python
+# All standard variable types are supported.
+BOOL = True
+INT = 5
+FLOAT = 0.001
+STR = 'abc'
+DICT = {'a': 1, 'b': 2}
+LIST = [1, 2, 3]
+SET = {4, 5, 6}
+TUPLE = (10, 100)
+NONE = None
+
+# DVC can retrieve class constants and variables defined in __init__
+class TrainConfig:
+
+ EPOCHS = 70
+
+ def __init__(self):
+ self.layers = 5
+ self.layers = 9 # TrainConfig.layers param will be 9
+ self.sum = 1 + 2 # Will NOT be found due to the expression
+ bar = 3 # Will NOT be found since it's locally scoped
+
+
+class TestConfig:
+
+ TEST_DIR = 'path'
+ METRICS = ['metric']
+```
+
+The following [stage](/doc/command-reference/run) depends on params `BOOL`,
+`INT`, as well as `TrainConfig`'s `EPOCHS` and `layers`:
+
+```dvc
+$ dvc run -n train -d users.csv -o model.pkl \
+ -p params.py:BOOL,INT,TrainConfig.EPOCHS,TrainConfig.layers \
+ python train.py
+```
+
+Resulting `dvc.yaml` and `dvc.lock` files (notice the `params` list):
+
+```yaml
+stages:
+ train:
+ cmd: python train.py
+ deps:
+ - users.csv
+ params:
+ - BOOL
+ - INT
+ - TrainConfig.EPOCHS
+ - TrainConfig.layers
+ outs:
+ - model.pkl
+```
+
+```yaml
+train:
+ cmd: python train.py
+ deps:
+ - path: users.csv
+ md5: 23be4307b23dcd740763d5fc67993f11
+ params:
+ INT: 5
+ BOOL: true
+ TrainConfig.EPOCHS: 70
+ TrainConfig.layers: 9
+ outs:
+ - path: model.pkl
+ md5: 1c06b4756f08203cc496e4061b1e7d67
+```
+
+Alternatively, the entire `TestConfig` params group
+([class](https://docs.python.org/3/library/stdtypes.html#classes-and-class-instances))
+can be referenced
+([dictionaries](https://docs.python.org/3/library/stdtypes.html#dict) are also
+supported), instead of the parameters in it:
+
+```dvc
+$ dvc run -n train -d users.csv -o model.pkl \
+ -p params.py:BOOL,INT,TestConfig \
+ python train.py
+```
+
+## Examples: Print all parameters
+
+Following the previous example, we can use `dvc params diff` to list all of the
+param values available in the workspace:
+
+```dvc
+$ dvc params diff
+Path Param Old New
+params.yaml lr — 0.0041
+params.yaml process.bow — 15000
+params.yaml process.thresh — 0.98
+params.yaml train.epochs — 70
+params.yaml train.layers — 9
+```
+
+This command shows the difference in parameters between the workspace and the
+last committed version of the `params.yaml` file. In our example there's no
+previous version, which is why all `Old` values are `—`.
From 7d01b7e925ce57b63cb600ae48def725e34557b0 Mon Sep 17 00:00:00 2001
From: jeremydesroches <18587991+jeremydesroches@users.noreply.github.com>
Date: Mon, 30 Nov 2020 22:21:14 -0700
Subject: [PATCH 44/59] Fix links: file-and-directories -> dvc cache concept
---
content/docs/api-reference/get_url.md | 4 ++--
content/docs/command-reference/add.md | 8 +++-----
content/docs/command-reference/fetch.md | 4 +---
content/docs/command-reference/gc.md | 2 +-
content/docs/command-reference/push.md | 6 ++----
content/docs/command-reference/run.md | 2 +-
content/docs/user-guide/dvcignore.md | 4 +---
7 files changed, 11 insertions(+), 19 deletions(-)
diff --git a/content/docs/api-reference/get_url.md b/content/docs/api-reference/get_url.md
index ad56e883fc..e5eeb3286e 100644
--- a/content/docs/api-reference/get_url.md
+++ b/content/docs/api-reference/get_url.md
@@ -36,8 +36,8 @@ URL returned depends on the
`remote` used (see the [Parameters](#parameters) section).
If the target is a directory, the returned URL will end in `.dir`. Refer to
-[Structure of cache directory](/doc/user-guide/dvc-files-and-directories#structure-of-the-cache-directory)
-and `dvc add` to learn more about how DVC handles data directories.
+[DVC cache](/doc/user-guide/concepts/dvc-cache) and `dvc add` to learn more
+about how DVC handles data directories.
⚠️ This function does not check for the actual existence of the file or
directory in the remote storage.
diff --git a/content/docs/command-reference/add.md b/content/docs/command-reference/add.md
index 1f9e51d1df..5945db7192 100644
--- a/content/docs/command-reference/add.md
+++ b/content/docs/command-reference/add.md
@@ -38,8 +38,7 @@ other DVC commands), a few actions are taken under the hood:
1. Calculate the file hash.
2. Move the file contents to the cache (by default in `.dvc/cache`), using the
file hash to form the cached file path. (See
- [Structure of cache directory](/doc/user-guide/dvc-files-and-directories#structure-of-the-cache-directory)
- for more details.)
+ [DVC cache](/doc/user-guide/concepts/dvc-cache) for more details.)
3. Attempt to replace the file with a link to the cached data (more details on
file linking further down).
4. Create a corresponding `.dvc` file to track the file, using its path and hash
@@ -81,9 +80,8 @@ used), but DVC does not produce individual `.dvc` files for each file in the
entire tree. Instead, the single `.dvc` file references a special JSON file in
the cache (with `.dir` extension), that in turn points to the added files.
-> Refer to
-> [Structure of cache directory](/doc/user-guide/dvc-files-and-directories#structure-of-the-cache-directory)
-> for more info. on `.dir` cache entries.
+> Refer to [DVC cache](/doc/user-guide/concepts/dvc-cache) for more info. on
+> `.dir` cache entries.
Note that DVC commands that use tracked data support granular targeting of files
and directories, even when contained in a parent directory added as a whole.
diff --git a/content/docs/command-reference/fetch.md b/content/docs/command-reference/fetch.md
index ededa92c63..3f1141a589 100644
--- a/content/docs/command-reference/fetch.md
+++ b/content/docs/command-reference/fetch.md
@@ -175,9 +175,7 @@ $ tree .dvc/cache
Note that the `.dvc/cache` directory was created and populated.
-> Refer to
-> [Structure of cache directory](/doc/user-guide/dvc-files-and-directories#structure-of-the-cache-directory)
-> for more info.
+> Refer to [DVC cache](/doc/user-guide/concepts/dvc-cache) for more info.
Used without arguments (as above), `dvc fetch` downloads all files and
directories needed by all `dvc.yaml` and `.dvc` files in the current branch. For
diff --git a/content/docs/command-reference/gc.md b/content/docs/command-reference/gc.md
index 33f454dfee..adb94f3b55 100644
--- a/content/docs/command-reference/gc.md
+++ b/content/docs/command-reference/gc.md
@@ -29,7 +29,7 @@ of commits (determined by reading the DVC-files in them). See the
[Options](#options) section for more details.
> Note that `dvc gc` tries to fetch any missing
-> [`.dir` files](/doc/user-guide/dvc-files-and-directories#structure-of-the-cache-directory)
+> [`.dir` files](/doc/user-guide/concepts/dvc-cache#structure-of-the-cache-directory)
> from [remote storage](/doc/command-reference/remote) to the local
> cache, in order to determine which files should exist inside
> cached directories. These files may be missing if the cache directory was
diff --git a/content/docs/command-reference/push.md b/content/docs/command-reference/push.md
index 38aa530d0b..d3f886c5ea 100644
--- a/content/docs/command-reference/push.md
+++ b/content/docs/command-reference/push.md
@@ -185,7 +185,7 @@ Finally, we used `dvc status` to double check that all data had been uploaded.
## Example: What happens in the cache?
Let's take a detailed look at what happens to the
-[cache directory](/doc/user-guide/dvc-files-and-directories#structure-of-the-cache-directory)
+[cache directory](/doc/user-guide/concepts/dvc-cache#structure-of-the-cache-directory)
as you run an experiment locally and push data to remote storage. To set the
example consider having created a workspace that contains some code
and data, and having set up a remote.
@@ -232,9 +232,7 @@ The directory `.dvc/cache` is the local cache, while `~/vault/recursive` is a
the cache having more files in it than the remote – which is what the `new`
state means.
-> Refer to
-> [Structure of cache directory](/doc/user-guide/dvc-files-and-directories#structure-of-the-cache-directory)
-> for more info.
+> Refer to [DVC cache](/doc/user-guide/concepts/dvc-cache) for more info.
Next we can copy the remaining data from the cache to the remote using
`dvc push`:
diff --git a/content/docs/command-reference/run.md b/content/docs/command-reference/run.md
index f45e86a616..bcc21342a2 100644
--- a/content/docs/command-reference/run.md
+++ b/content/docs/command-reference/run.md
@@ -97,7 +97,7 @@ Relevant notes:
- Entire directories produced by the stage can be tracked as outputs by DVC,
which generates a single `.dir` entry in the cache (refer to
- [Structure of cache directory](/doc/user-guide/dvc-files-and-directories#structure-of-the-cache-directory)
+ [DVC cache](/doc/user-guide/concepts/dvc-cache#structure-of-the-cache-directory)
for more info.)
- [external dependencies](/doc/user-guide/external-dependencies) and
diff --git a/content/docs/user-guide/dvcignore.md b/content/docs/user-guide/dvcignore.md
index d826859e03..3cd54a6840 100644
--- a/content/docs/user-guide/dvcignore.md
+++ b/content/docs/user-guide/dvcignore.md
@@ -95,9 +95,7 @@ Only the cache entries of the `data/` directory itself and one file have been
stored. Checking the hash value of the data files manually, we can see that
`data2` was cached. This means that `dvc add` did ignore `data1`.
-> Refer to
-> [Structure of cache directory](/doc/user-guide/dvc-files-and-directories#structure-of-the-cache-directory)
-> for more info.
+> Refer to [DVC cache](/doc/user-guide/concepts/dvc-cache) for more info.
## Example: Ignore file state changes
From 7899e148b756f8657c7992bce27a8b635ca7ff58 Mon Sep 17 00:00:00 2001
From: jeremydesroches <18587991+jeremydesroches@users.noreply.github.com>
Date: Mon, 30 Nov 2020 22:28:41 -0700
Subject: [PATCH 45/59] Fix dvc cache link -> concept
---
content/docs/user-guide/large-dataset-optimization.md | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/content/docs/user-guide/large-dataset-optimization.md b/content/docs/user-guide/large-dataset-optimization.md
index 7bdeb4e102..152c063ab3 100644
--- a/content/docs/user-guide/large-dataset-optimization.md
+++ b/content/docs/user-guide/large-dataset-optimization.md
@@ -4,7 +4,7 @@ In order to track the data files and directories added with `dvc add` or
`dvc run`, DVC moves all these files to the cache. A
project's cache is the hidden storage (by default located in
`.dvc/cache`) for files that are tracked by DVC, and their different versions.
-(See `dvc cache` and
+(See [DVC cache](/doc/user-guide/concepts/dvc-cache) and
[DVC Files and Directories](/doc/user-guide/dvc-files-and-directories) for more
details.)
From 2556cf89ca1fb8a083442a4162bad8b6b6982a86 Mon Sep 17 00:00:00 2001
From: jeremydesroches <18587991+jeremydesroches@users.noreply.github.com>
Date: Mon, 30 Nov 2020 22:39:42 -0700
Subject: [PATCH 46/59] Fix links to dvc params -> concept
---
content/docs/command-reference/params/diff.md | 4 ++--
content/docs/command-reference/run.md | 10 +++++-----
2 files changed, 7 insertions(+), 7 deletions(-)
diff --git a/content/docs/command-reference/params/diff.md b/content/docs/command-reference/params/diff.md
index 248f78644e..0a35d0c6bb 100644
--- a/content/docs/command-reference/params/diff.md
+++ b/content/docs/command-reference/params/diff.md
@@ -23,7 +23,7 @@ in the repository history. Requires that Git is being used to version the
project params.
> Parameter dependencies are defined with the `-p` option in `dvc run`. See also
-> `dvc params`.
+> [parameters](/doc/user-guide/concepts/parameters).
Run without arguments, this command compares parameters currently present in the
workspace (uncommitted changes) with the latest committed version.
@@ -52,7 +52,7 @@ itself does not ascribe any specific meaning for these values.
## Examples
Let's create a simple YAML parameters file named `params.yaml` (default params
-file name, see `dvc params` to learn more):
+file name, see [parameters](/doc/user-guide/concepts/parameters) to learn more):
```yaml
lr: 0.0041
diff --git a/content/docs/command-reference/run.md b/content/docs/command-reference/run.md
index bcc21342a2..6b96b645f6 100644
--- a/content/docs/command-reference/run.md
+++ b/content/docs/command-reference/run.md
@@ -193,8 +193,8 @@ $ dvc run -n my_stage './my_script.sh $MYENVVAR'
on, from a parameters file. This is done by sending a comma separated list as
argument, e.g. `-p learning_rate,epochs`. The default parameters file name is
`params.yaml`, but this can be redefined with a prefix in the argument sent to
- this option, e.g. `-p parse_params.yaml:threshold`. See `dvc params` to learn
- more about parameters.
+ this option, e.g. `-p parse_params.yaml:threshold`. See
+ [parameters](/doc/user-guide/concepts/parameters) to learn more.
- `-m <path>`, `--metrics <path>` - specify a metrics file produced by this
stage. This option behaves like `-o` but registers the file in a `metrics`
@@ -404,8 +404,8 @@ $ dvc dag
## Example: Using parameter dependencies
To use specific values inside a parameters file as dependencies, create a simple
-YAML file named `params.yaml` (default params file name, see `dvc params` to
-learn more):
+YAML file named `params.yaml` (default params file name, see
+[parameters](/doc/user-guide/concepts/parameters) to learn more):
```yaml
seed: 20180226
@@ -444,4 +444,4 @@ epochs = params['train']['epochs']
DVC will keep an eye on these param values (same as with the regular dependency
files) and know that the stage should be reproduced if/when they change. See
-`dvc params` for more details.
+[parameters](/doc/user-guide/concepts/parameters) for more details.
From 53901b27e01b5b04dd343faa8deb5d426f9aa4dc Mon Sep 17 00:00:00 2001
From: jeremydesroches <18587991+jeremydesroches@users.noreply.github.com>
Date: Mon, 30 Nov 2020 23:24:06 -0700
Subject: [PATCH 47/59] Add meta descriptions for concepts
---
content/docs/user-guide/concepts/dvc-cache.md | 3 +++
content/docs/user-guide/concepts/metrics-and-plots.md | 4 ++++
content/docs/user-guide/concepts/remote-storage.md | 4 ++++
content/docs/user-guide/concepts/workspace.md | 3 +++
4 files changed, 14 insertions(+)
diff --git a/content/docs/user-guide/concepts/dvc-cache.md b/content/docs/user-guide/concepts/dvc-cache.md
index 17774ddef2..4486f9f249 100644
--- a/content/docs/user-guide/concepts/dvc-cache.md
+++ b/content/docs/user-guide/concepts/dvc-cache.md
@@ -5,6 +5,9 @@ tooltip: >-
The [DVC cache](/doc/user-guide/concepts/dvc-cache) is a hidden storage (by
default located in the `.dvc/cache` directory) for files that are tracked by
DVC, and their different versions.
+description: >-
+ The DVC cache adds a layer of indirection between code and data to efficiently
+ version large datasets, data science features, and machine learning models.
---
# DVC Cache
diff --git a/content/docs/user-guide/concepts/metrics-and-plots.md b/content/docs/user-guide/concepts/metrics-and-plots.md
index e0fb440227..e0bfa41db0 100644
--- a/content/docs/user-guide/concepts/metrics-and-plots.md
+++ b/content/docs/user-guide/concepts/metrics-and-plots.md
@@ -5,6 +5,10 @@ tooltip: >-
DVC [metrics and plots](/doc/user-guide/concepts/metrics-and-plots) provide
sets of commands to follow the performance of machine learning experiments.
Mark certain stage outputs as metrics and visualize metrics as plots.
+description: >-
+ DVC provides sets of commands to track the performance of machine learning
+ experiments. Mark certain stage outputs as metrics and visualize metrics as
+ plots.
---
# Metrics and Plots
diff --git a/content/docs/user-guide/concepts/remote-storage.md b/content/docs/user-guide/concepts/remote-storage.md
index e275eb4d81..973652ac27 100644
--- a/content/docs/user-guide/concepts/remote-storage.md
+++ b/content/docs/user-guide/concepts/remote-storage.md
@@ -15,6 +15,10 @@ tooltip: >-
colleagues from DVC remotes without spending time and resources to build or
process them locally. Remote storage can also save space on your local
environment.
+description: >-
+ DVC remotes provide a location to store and share data and models, with
+ support for Amazon S3, Google Drive, Azure, and several other remote storage
+ providers.
---
+
# Data Pipelines
diff --git a/content/docs/user-guide/concepts/dvc-cache.md b/content/docs/user-guide/concepts/dvc-cache.md
index 4486f9f249..81f7281e95 100644
--- a/content/docs/user-guide/concepts/dvc-cache.md
+++ b/content/docs/user-guide/concepts/dvc-cache.md
@@ -10,6 +10,8 @@ description: >-
version large datasets, data science features, and machine learning models.
---
+
+
# DVC Cache
diff --git a/content/docs/user-guide/concepts/metrics-and-plots.md b/content/docs/user-guide/concepts/metrics-and-plots.md
index e0bfa41db0..f5550a6f4e 100644
--- a/content/docs/user-guide/concepts/metrics-and-plots.md
+++ b/content/docs/user-guide/concepts/metrics-and-plots.md
@@ -11,6 +11,8 @@ description: >-
plots.
---
+
+
# Metrics and Plots
diff --git a/content/docs/user-guide/concepts/parameters.md b/content/docs/user-guide/concepts/parameters.md
index 1437c761bd..36d1e672cf 100644
--- a/content/docs/user-guide/concepts/parameters.md
+++ b/content/docs/user-guide/concepts/parameters.md
@@ -11,6 +11,8 @@ description: >-
pipeline.
---
+
+
# Parameters
Pipeline stages (defined in `dvc.yaml`) can depend on specific values inside an
diff --git a/content/docs/user-guide/concepts/remote-storage.md b/content/docs/user-guide/concepts/remote-storage.md
index 973652ac27..5be4a6971b 100644
--- a/content/docs/user-guide/concepts/remote-storage.md
+++ b/content/docs/user-guide/concepts/remote-storage.md
@@ -21,6 +21,8 @@ description: >-
providers.
---
+
+
diff --git a/content/docs/user-guide/concepts/workspace.md b/content/docs/user-guide/concepts/workspace.md
index 12b23490f7..67e3a270da 100644
--- a/content/docs/user-guide/concepts/workspace.md
+++ b/content/docs/user-guide/concepts/workspace.md
@@ -10,6 +10,8 @@ description: >-
datasets, source code, ML models, etc. Typically, it's also a Git repository.
---
+
+
# Workspace
The workspace is the directory containing all your project files e.g. raw
From 4bbd393ab58cf2fed0de16301875b17211a53566 Mon Sep 17 00:00:00 2001
From: Jorge Orpinel
Date: Tue, 1 Dec 2020 19:23:04 -0600
Subject: [PATCH 50/59] Update content/docs/command-reference/config.md
---
content/docs/command-reference/config.md | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/content/docs/command-reference/config.md b/content/docs/command-reference/config.md
index fddb6fb19f..8d27d1a504 100644
--- a/content/docs/command-reference/config.md
+++ b/content/docs/command-reference/config.md
@@ -127,8 +127,8 @@ remote. See `dvc remote` for more information.
A DVC project cache is the hidden storage (by default located in
the `.dvc/cache` directory) for files that are tracked by DVC, and their
different versions. (See
-[DVC cache](/doc/user-guide/concepts/dvc-cache#structure-of-the-cache-directory)
-for more details.) This section contains the following options:
+[DVC cache](/doc/user-guide/concepts/dvc-cache) for more details.) This section
+contains the following options:
- `cache.dir` - set/unset cache directory location. A correct value is either an
absolute path, or a path **relative to the config file location**. The default
From 05d380e5d1d113398ebf624791c131713a303d58 Mon Sep 17 00:00:00 2001
From: jeremydesroches <18587991+jeremydesroches@users.noreply.github.com>
Date: Wed, 2 Dec 2020 22:58:59 -0700
Subject: [PATCH 51/59] Fix formatting in config
---
content/docs/command-reference/config.md | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/content/docs/command-reference/config.md b/content/docs/command-reference/config.md
index 8d27d1a504..a679453133 100644
--- a/content/docs/command-reference/config.md
+++ b/content/docs/command-reference/config.md
@@ -126,9 +126,8 @@ remote. See `dvc remote` for more information.
A DVC project cache is the hidden storage (by default located in
the `.dvc/cache` directory) for files that are tracked by DVC, and their
-different versions. (See
-[DVC cache](/doc/user-guide/concepts/dvc-cache) for more details.) This section
-contains the following options:
+different versions. (See [DVC cache](/doc/user-guide/concepts/dvc-cache) for
+more details.) This section contains the following options:
- `cache.dir` - set/unset cache directory location. A correct value is either an
absolute path, or a path **relative to the config file location**. The default
From 35640e6acad38a5444acdc26e4c392cbea799840 Mon Sep 17 00:00:00 2001
From: jeremydesroches <18587991+jeremydesroches@users.noreply.github.com>
Date: Wed, 2 Dec 2020 23:14:29 -0700
Subject: [PATCH 52/59] Fix #structure-of-the-cache-directory links
---
content/docs/api-reference/get_url.md | 4 ++--
content/docs/command-reference/add.md | 8 +++++---
content/docs/command-reference/fetch.md | 4 +++-
content/docs/command-reference/push.md | 4 +++-
content/docs/command-reference/run.md | 2 +-
content/docs/user-guide/dvcignore.md | 4 +++-
6 files changed, 17 insertions(+), 9 deletions(-)
diff --git a/content/docs/api-reference/get_url.md b/content/docs/api-reference/get_url.md
index e5eeb3286e..06a7ebba15 100644
--- a/content/docs/api-reference/get_url.md
+++ b/content/docs/api-reference/get_url.md
@@ -36,8 +36,8 @@ URL returned depends on the
`remote` used (see the [Parameters](#parameters) section).
If the target is a directory, the returned URL will end in `.dir`. Refer to
-[DVC cache](/doc/user-guide/concepts/dvc-cache) and `dvc add` to learn more
-about how DVC handles data directories.
+[Structure of the cache directory](/doc/user-guide/concepts/dvc-cache#structure-of-the-cache-directory)
+and `dvc add` to learn more about how DVC handles data directories.
⚠️ This function does not check for the actual existence of the file or
directory in the remote storage.
diff --git a/content/docs/command-reference/add.md b/content/docs/command-reference/add.md
index 7c9899b0f7..b05e484b9c 100644
--- a/content/docs/command-reference/add.md
+++ b/content/docs/command-reference/add.md
@@ -38,7 +38,8 @@ other DVC commands), a few actions are taken under the hood:
1. Calculate the file hash.
2. Move the file contents to the cache (by default in `.dvc/cache`), using the
file hash to form the cached file path. (See
- [DVC cache](/doc/user-guide/concepts/dvc-cache) for more details.)
+ [Structure of the cache directory](/doc/user-guide/concepts/dvc-cache#structure-of-the-cache-directory)
+ for more details.)
3. Attempt to replace the file with a link to the cached data (more details on
file linking further down).
4. Create a corresponding `.dvc` file to track the file, using its path and hash
@@ -80,8 +81,9 @@ used), but DVC does not produce individual `.dvc` files for each file in the
entire tree. Instead, the single `.dvc` file references a special JSON file in
the cache (with `.dir` extension), that in turn points to the added files.
-> Refer to [DVC cache](/doc/user-guide/concepts/dvc-cache) for more info. on
-> `.dir` cache entries.
+> Refer to
+> [Structure of the cache directory](/doc/user-guide/concepts/dvc-cache#structure-of-the-cache-directory)
+> for more info on `.dir` cache entries.
Note that DVC commands that use tracked data support granular targeting of files
and directories, even when contained in a parent directory added as a whole.
diff --git a/content/docs/command-reference/fetch.md b/content/docs/command-reference/fetch.md
index 3f1141a589..ad10dd0278 100644
--- a/content/docs/command-reference/fetch.md
+++ b/content/docs/command-reference/fetch.md
@@ -175,7 +175,9 @@ $ tree .dvc/cache
Note that the `.dvc/cache` directory was created and populated.
-> Refer to [DVC cache](/doc/user-guide/concepts/dvc-cache) for more info.
+> Refer to
+> [Structure of the cache directory](/doc/user-guide/concepts/dvc-cache#structure-of-the-cache-directory)
+> for more info.
Used without arguments (as above), `dvc fetch` downloads all files and
directories needed by all `dvc.yaml` and `.dvc` files in the current branch. For
diff --git a/content/docs/command-reference/push.md b/content/docs/command-reference/push.md
index d3f886c5ea..c0349ec3c1 100644
--- a/content/docs/command-reference/push.md
+++ b/content/docs/command-reference/push.md
@@ -232,7 +232,9 @@ The directory `.dvc/cache` is the local cache, while `~/vault/recursive` is a
the cache having more files in it than the remote – which is what the `new`
state means.
-> Refer to [DVC cache](/doc/user-guide/concepts/dvc-cache) for more info.
+> Refer to
+> [Structure of the cache directory](/doc/user-guide/concepts/dvc-cache#structure-of-the-cache-directory)
+> for more info.
Next we can copy the remaining data from the cache to the remote using
`dvc push`:
diff --git a/content/docs/command-reference/run.md b/content/docs/command-reference/run.md
index 6b96b645f6..86e594b1d9 100644
--- a/content/docs/command-reference/run.md
+++ b/content/docs/command-reference/run.md
@@ -97,7 +97,7 @@ Relevant notes:
- Entire directories produced by the stage can be tracked as outputs by DVC,
which generates a single `.dir` entry in the cache (refer to
- [DVC cache](/doc/user-guide/concepts/dvc-cache#structure-of-the-cache-directory)
+ [Structure of the cache directory](/doc/user-guide/concepts/dvc-cache#structure-of-the-cache-directory)
for more info.)
- [external dependencies](/doc/user-guide/external-dependencies) and
diff --git a/content/docs/user-guide/dvcignore.md b/content/docs/user-guide/dvcignore.md
index 3cd54a6840..834a1adb1c 100644
--- a/content/docs/user-guide/dvcignore.md
+++ b/content/docs/user-guide/dvcignore.md
@@ -95,7 +95,9 @@ Only the cache entries of the `data/` directory itself and one file have been
stored. Checking the hash value of the data files manually, we can see that
`data2` was cached. This means that `dvc add` did ignore `data1`.
-> Refer to [DVC cache](/doc/user-guide/concepts/dvc-cache) for more info.
+> Refer to
+> [Structure of the cache directory](/doc/user-guide/concepts/dvc-cache#structure-of-the-cache-directory)
+> for more info.
## Example: Ignore file state changes
From fe86a57f7b40f81b2d8623152e1ddcd9c29f8e97 Mon Sep 17 00:00:00 2001
From: Jorge Orpinel
Date: Fri, 4 Dec 2020 21:58:44 -0600
Subject: [PATCH 53/59] Update
content/docs/user-guide/concepts/data-pipelines.md
---
content/docs/user-guide/concepts/data-pipelines.md | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/content/docs/user-guide/concepts/data-pipelines.md b/content/docs/user-guide/concepts/data-pipelines.md
index 0a0055107b..0facdf2b14 100644
--- a/content/docs/user-guide/concepts/data-pipelines.md
+++ b/content/docs/user-guide/concepts/data-pipelines.md
@@ -2,10 +2,9 @@
name: 'Data Pipelines'
match: ['data pipeline', 'pipeline', 'pipelines']
tooltip: >-
- In DVC, a [data pipeline](/doc/user-guide/concepts/data-pipelines) is a series
- of data processing stages (for example, console commands that take an input
- and produce an output). A pipeline may produce intermediate data, and has a
- final result.
+ A [data pipeline](/doc/user-guide/concepts/data-pipelines) is a series of data
+ processing stages, chained by their outputs and inputs. They use some initial
+ data, may produce intermediate artifacts, and reach a final result.
description: >-
In DVC, a data pipeline is a series of data processing stages. A pipeline may
produce intermediate data, and has a final result.
From 29170fedd6dcdc927b71fa15e10d93761ea918bc Mon Sep 17 00:00:00 2001
From: Jorge Orpinel
Date: Fri, 4 Dec 2020 22:09:04 -0600
Subject: [PATCH 54/59] Update
content/docs/user-guide/large-dataset-optimization.md
---
content/docs/user-guide/large-dataset-optimization.md | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/content/docs/user-guide/large-dataset-optimization.md b/content/docs/user-guide/large-dataset-optimization.md
index 152c063ab3..552ebff738 100644
--- a/content/docs/user-guide/large-dataset-optimization.md
+++ b/content/docs/user-guide/large-dataset-optimization.md
@@ -4,7 +4,7 @@ In order to track the data files and directories added with `dvc add` or
`dvc run`, DVC moves all these files to the cache. A
project's cache is the hidden storage (by default located in
`.dvc/cache`) for files that are tracked by DVC, and their different versions.
-(See [DVC cache](/doc/user-guide/concepts/dvc-cache) and
+(See [DVC Cache](/doc/user-guide/concepts/dvc-cache) and
[DVC Files and Directories](/doc/user-guide/dvc-files-and-directories) for more
details.)
From 7e870d740efe7a519ba0774669bbdb13b09764d2 Mon Sep 17 00:00:00 2001
From: jeremydesroches <18587991+jeremydesroches@users.noreply.github.com>
Date: Wed, 9 Dec 2020 22:20:53 -0700
Subject: [PATCH 55/59] Add anchor to cache link
---
content/docs/user-guide/dvc-files-and-directories.md | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/content/docs/user-guide/dvc-files-and-directories.md b/content/docs/user-guide/dvc-files-and-directories.md
index e577ae5b88..56a2765dde 100644
--- a/content/docs/user-guide/dvc-files-and-directories.md
+++ b/content/docs/user-guide/dvc-files-and-directories.md
@@ -250,9 +250,10 @@ Full parameters (key and value) are listed separately under
hand or with the command `dvc config --local`.
- `.dvc/cache`: The cache directory will store your data in a
- special [structure](/doc/user-guide/concepts/dvc-cache). The data files and
- directories in the workspace will only contain links to the data
- files in the cache. (Refer to
+ special
+ [structure](/doc/user-guide/concepts/dvc-cache#structure-of-the-cache-directory).
+ The data files and directories in the workspace will only contain
+ links to the data files in the cache. (Refer to
[Large Dataset Optimization](/doc/user-guide/large-dataset-optimization). See
`dvc config cache` for related configuration options.
From 1c8fb6bfb1bb84de1d7294a2bd7f2eeea2d84787 Mon Sep 17 00:00:00 2001
From: jeremydesroches <18587991+jeremydesroches@users.noreply.github.com>
Date: Tue, 15 Dec 2020 21:40:21 -0700
Subject: [PATCH 56/59] Change links to abbr in params diff
---
content/docs/command-reference/params/diff.md | 13 ++++++-------
1 file changed, 6 insertions(+), 7 deletions(-)
diff --git a/content/docs/command-reference/params/diff.md b/content/docs/command-reference/params/diff.md
index 0a35d0c6bb..0dcf1ad5ca 100644
--- a/content/docs/command-reference/params/diff.md
+++ b/content/docs/command-reference/params/diff.md
@@ -1,8 +1,7 @@
# params diff
-Show changes in [parameter dependencies](/doc/user-guide/concepts/parameters)
-between commits in the DVC repository, or between a commit and the
-workspace.
+Show changes in parameter dependencies between commits in the
+DVC repository, or between a commit and the workspace.
## Synopsis
@@ -22,8 +21,8 @@ This command provides a quick way to compare parameter values among experiments
in the repository history. Requires that Git is being used to version the
project params.
-> Parameter dependencies are defined with the `-p` option in `dvc run`. See also
-> [parameters](/doc/user-guide/concepts/parameters).
+> Parameter dependencies are defined with the `-p` option in
+> `dvc run`.
Run without arguments, this command compares parameters currently present in the
workspace (uncommitted changes) with the latest committed version.
@@ -51,8 +50,8 @@ itself does not ascribe any specific meaning for these values.
## Examples
-Let's create a simple YAML parameters file named `params.yaml` (default params
-file name, see [parameters](/doc/user-guide/concepts/parameters) to learn more):
+Let's create a simple YAML parameters file named `params.yaml` (default
+params file name):
```yaml
lr: 0.0041
From de28e19a54835608676ff99c6f209f29463ca62f Mon Sep 17 00:00:00 2001
From: jeremydesroches <18587991+jeremydesroches@users.noreply.github.com>
Date: Tue, 15 Dec 2020 21:54:10 -0700
Subject: [PATCH 57/59] Add cache directory to match, replace link with abbr in
push cmd ref
---
content/docs/command-reference/push.md | 3 +--
content/docs/user-guide/concepts/dvc-cache.md | 2 +-
2 files changed, 2 insertions(+), 3 deletions(-)
diff --git a/content/docs/command-reference/push.md b/content/docs/command-reference/push.md
index c0349ec3c1..6f1072b52b 100644
--- a/content/docs/command-reference/push.md
+++ b/content/docs/command-reference/push.md
@@ -184,8 +184,7 @@ Finally, we used `dvc status` to double check that all data had been uploaded.
## Example: What happens in the cache?
-Let's take a detailed look at what happens to the
-[cache directory](/doc/user-guide/concepts/dvc-cache#structure-of-the-cache-directory)
+Let's take a detailed look at what happens to the cache directory
as you run an experiment locally and push data to remote storage. To set the
example consider having created a workspace that contains some code
and data, and having set up a remote.
diff --git a/content/docs/user-guide/concepts/dvc-cache.md b/content/docs/user-guide/concepts/dvc-cache.md
index 81f7281e95..f852bf2d5b 100644
--- a/content/docs/user-guide/concepts/dvc-cache.md
+++ b/content/docs/user-guide/concepts/dvc-cache.md
@@ -1,6 +1,6 @@
---
name: 'DVC Cache'
-match: ['DVC cache', 'cache', 'caches', 'cached']
+match: ['DVC cache', 'cache', 'caches', 'cached', 'cache directory']
tooltip: >-
The [DVC cache](/doc/user-guide/concepts/dvc-cache) is a hidden storage (by
default located in the `.dvc/cache` directory) for files that are tracked by
From 023edb95da582be17dc21997ebd20192c0b108e8 Mon Sep 17 00:00:00 2001
From: jeremydesroches <18587991+jeremydesroches@users.noreply.github.com>
Date: Tue, 15 Dec 2020 22:07:20 -0700
Subject: [PATCH 58/59] Replace parameters links with tooltips in run
---
content/docs/command-reference/run.md | 30 +++++++++++++--------------
1 file changed, 14 insertions(+), 16 deletions(-)
diff --git a/content/docs/command-reference/run.md b/content/docs/command-reference/run.md
index 86e594b1d9..851a700ee5 100644
--- a/content/docs/command-reference/run.md
+++ b/content/docs/command-reference/run.md
@@ -121,10 +121,10 @@ Relevant notes:
### For displaying and comparing data science experiments
-[parameters](/doc/command-reference/params) (`-p`/`--params` option) are a
-special type of key/value dependencies. Multiple parameter dependencies can be
-specified from within one or more YAML, JSON, TOML, or Python parameters files
-(e.g. `params.yaml`). This allows tracking experimental hyperparameters easily.
+Parameters (`-p`/`--params` option) are a special type of key/value
+dependencies. Multiple parameter dependencies can be specified from within one
+or more YAML, JSON, TOML, or Python parameters files (e.g. `params.yaml`). This
+allows tracking experimental hyperparameters easily.
Special types of output files, [metrics](/doc/command-reference/metrics) (`-m`
and `-M` options) and [plots](/doc/command-reference/plots) (`--plots` and
@@ -189,12 +189,11 @@ $ dvc run -n my_stage './my_script.sh $MYENVVAR'
outputs are not tracked by DVC.
- `-p [<path>:]<params_list>`, `--params [<path>:]<params_list>` - specify a set
- of [parameter dependencies](/doc/command-reference/params) the stage depends
- on, from a parameters file. This is done by sending a comma separated list as
- argument, e.g. `-p learning_rate,epochs`. The default parameters file name is
- `params.yaml`, but this can be redefined with a prefix in the argument sent to
- this option, e.g. `-p parse_params.yaml:threshold`. See
- [parameters](/doc/user-guide/concepts/parameters) to learn more.
+ of parameter dependencies the stage depends on, from a parameters
+ file. This is done by sending a comma separated list as argument, e.g.
+ `-p learning_rate,epochs`. The default parameters file name is `params.yaml`,
+ but this can be redefined with a prefix in the argument sent to this option,
+ e.g. `-p parse_params.yaml:threshold`.
- `-m <path>`, `--metrics <path>` - specify a metrics file produced by this
stage. This option behaves like `-o` but registers the file in a `metrics`
@@ -403,9 +402,8 @@ $ dvc dag
## Example: Using parameter dependencies
-To use specific values inside a parameters file as dependencies, create a simple
-YAML file named `params.yaml` (default params file name, see
-[parameters](/doc/user-guide/concepts/parameters) to learn more):
+To use specific values inside a parameters file as dependencies,
+create a simple YAML file named `params.yaml` (default params file name):
```yaml
seed: 20180226
@@ -442,6 +440,6 @@ lr = params['train']['lr']
epochs = params['train']['epochs']
```
-DVC will keep an eye on these param values (same as with the regular dependency
-files) and know that the stage should be reproduced if/when they change. See
-[parameters](/doc/user-guide/concepts/parameters) for more details.
+DVC will keep an eye on these param values (same as with the
+regular dependency files) and know that the stage should be reproduced if/when
+they change.
From 4834c1fb6c3bade37b8c48688df5f42fcdb0d9bd Mon Sep 17 00:00:00 2001
From: jeremydesroches <18587991+jeremydesroches@users.noreply.github.com>
Date: Tue, 15 Dec 2020 22:37:24 -0700
Subject: [PATCH 59/59] Update data pipelines concept terms and description
---
content/docs/user-guide/concepts/data-pipelines.md | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/content/docs/user-guide/concepts/data-pipelines.md b/content/docs/user-guide/concepts/data-pipelines.md
index 0facdf2b14..f8f8ec4631 100644
--- a/content/docs/user-guide/concepts/data-pipelines.md
+++ b/content/docs/user-guide/concepts/data-pipelines.md
@@ -6,11 +6,11 @@ tooltip: >-
processing stages, chained by their outputs and inputs. They use some initial
data, may produce intermediate artifacts, and reach a final result.
description: >-
- In DVC, a data pipeline is a series of data processing stages. A pipeline may
- produce intermediate data, and has a final result.
+ A data pipeline is a series of data processing stages. They use some initial
+ data, may produce intermediate artifacts, and reach a final result.
---
-
+
# Data Pipelines