From 9fea4de44e64ea4a495ccf8861b325ef2c08f00d Mon Sep 17 00:00:00 2001 From: David de la Iglesia Castro Date: Thu, 27 Jan 2022 22:12:01 +0100 Subject: [PATCH 01/11] dvctable: Add `violet` for `dep` columns. --- config/prismjs/dvctable.js | 9 +++++++++ .../Documentation/Markdown/Main/styles.module.css | 5 +++++ 2 files changed, 14 insertions(+) diff --git a/config/prismjs/dvctable.js b/config/prismjs/dvctable.js index 1db6373aae..4121762144 100644 --- a/config/prismjs/dvctable.js +++ b/config/prismjs/dvctable.js @@ -49,6 +49,15 @@ Prism.languages.dvctable = { ...boldAndItalicConfig } }, + 'bg-violet': { + pattern: getTableTextBgColorRegex('(violet|dep)'), + inside: { + hide: { + pattern: /(violet|dep):/ + }, + ...boldAndItalicConfig + } + }, ...boldAndItalicConfig } } diff --git a/src/components/Documentation/Markdown/Main/styles.module.css b/src/components/Documentation/Markdown/Main/styles.module.css index 25c61707fb..274aad58a4 100644 --- a/src/components/Documentation/Markdown/Main/styles.module.css +++ b/src/components/Documentation/Markdown/Main/styles.module.css @@ -234,6 +234,11 @@ color: #000; background-color: #d7feff; } + + .token.bg-violet { + color: #000; + background-color: #d7afff; + } } pre[class*="language-dvctable"] { From 2d73993492a52981d925548abf4c439f1447680d Mon Sep 17 00:00:00 2001 From: David de la Iglesia Castro Date: Thu, 27 Jan 2022 22:38:13 +0100 Subject: [PATCH 02/11] Add dep columns in `exp show` ref. 
--- content/docs/command-reference/exp/show.md | 192 +++++++++++---------- 1 file changed, 97 insertions(+), 95 deletions(-) diff --git a/content/docs/command-reference/exp/show.md b/content/docs/command-reference/exp/show.md index 64d2b24ef4..47be3a0d00 100644 --- a/content/docs/command-reference/exp/show.md +++ b/content/docs/command-reference/exp/show.md @@ -21,24 +21,25 @@ usage: dvc exp show [-h] [-q | -v] [-a] [-T] [-A] [-n ] Displays experiments and [checkpoints](/doc/command-reference/exp/run#checkpoints) in a detailed table -which includes their parent and name (or hash), as well as project metrics and -parameters. Only the experiments derived from the Git `HEAD` are shown by -default but all experiments can be included with the `--all-commits` option. -Example: +which includes their parent and name (or hash), as well as project +dependencies (violet), metrics (yellow) and parameters (blue). Only +the experiments derived from the Git `HEAD` are shown by default but all +experiments can be included with the `--all-commits` option. 
Example: ```dvc $ dvc exp show ``` ```dvctable - ─────────────────────────────────────────────────────────────────── - neutral:**Experiment** metric:**avg_prec** metric:**roc_auc** param:**train.n_est** param:**train.min_split** - ─────────────────────────────────────────────────────────────────── - workspace 0.56191 0.93345 50 2 - master 0.55259 0.91536 50 2 - ├── exp-bfe64 0.57833 0.95555 50 8 - └── exp-ad5b1 0.56191 0.93345 50 2 - ─────────────────────────────────────────────────────────────────── + ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────── + neutral:**Experiment** neutral:**Created** metric:**avg_prec** metric:**roc_auc** param:**featurize.max_features** dep:**model.pkl** dep:**data/features** + ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────── + workspace - 0.60405 0.9608 3000 484fab5 52c1fdd + random-forest-experiments May 29, 2021 0.60405 0.9608 3000 484fab5 52c1fdd + ├── a2efdc9 [exp-68ac9] 10:21 PM 0.55669 0.93516 1000 e2b5a9a 1b2d542 + ├── e7bd029 [exp-25e9a] 10:21 PM 0.58589 0.945 2000 7aae464 2ac217b + └── 56f3be3 [exp-8f5c4] 10:21 PM 0.51799 0.92333 500 cfbfed4 64ed644 + ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────── ``` Your terminal will enter a @@ -46,9 +47,10 @@ Your terminal will enter a which you can typically exit by typing `Q`. Use `--no-pager` to print the table to standard output. -By default, the printed experiments table will include columns for all metrics -and params from the entire project. The `--only-changed`, `--drop`, `--keep`, -and other [options](#options) can determine which ones should be displayed. +By default, the printed experiments table will include columns for all +dependencies, metrics and params from the entire project. 
The `--only-changed`, +`--drop`, `--keep`, and other [options](#options) can determine which columns +should be displayed. Experiments in the table are first grouped (by parent commit). They are then sorted inside each group, chronologically by default. The `--sort-by` and @@ -82,8 +84,8 @@ will be generated using the same data from the table. - `--param-deps` - include only parameters that are stage dependencies. -- `--only-changed` - show only parameters and metrics with values that vary - across experiments. +- `--only-changed` - show only dependencies, metrics and params with values that + vary across experiments. - `--drop ` - remove the matching columns. This option has higher priority than `--only-changed`. If both options are combined, `--drop` will @@ -138,23 +140,23 @@ will be generated using the same data from the table. Let's say we have run 3 experiments in our project. The basic usage shows the workspace (Git working tree) and experiments derived from `HEAD` (`master` -branch in this case), and all of their metrics and params (scroll right to see -all): +branch in this case), and all of their dependencies, metrics and params (scroll +right to see all): ```dvc $ dvc exp show ``` ```dvctable - ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── - neutral:**Experiment** neutral:**Created** metric:**avg_prec** metric:**roc_auc** param:**prepare.split** param:**prepare.seed** param:**featurize.max_features** param:**featurize.ngrams** param:**train.seed** param:**train.n_est** param:**train.min_split** - ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── - workspace - 0.60405 0.9608 0.2 20170428 3000 2 20170428 100 64 - master May 29, 2021 0.60405 0.9608 0.2 20170428 3000 2 20170428 100 64 - ├── 
d384680 [exp-bc055] 08:03 PM 0.51799 0.92333 0.2 20170428 500 2 20170428 100 64 - ├── 6b338f8 [exp-3315b] 08:03 PM 0.58589 0.945 0.2 20170428 2000 2 20170428 100 64 - └── d7fdde2 [exp-1b262] 08:03 PM 0.56447 0.94713 0.2 20170428 1500 2 20170428 100 64 - ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── + ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── + neutral:**Experiment** neutral:**Created** metric:**avg_prec** metric:**roc_auc** param:**prepare.split** param:**prepare.seed** param:**featurize.max_features** param:**featurize.ngrams** param:**train.seed** param:**train.n_est** param:**train.min_split** dep:**data/prepared** dep:**src/train.py** dep:**src/evaluate.py** dep:**src/prepare.py** dep:**data/features** dep:**data/data.xml** dep:**model.pkl** dep:**src/featurization.py** + ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── + workspace - 0.60405 0.9608 0.2 20170428 3000 2 20170428 100 64 20b786b 9ab9549 fb7b520 51549a1 52c1fdd a304afb 484fab5 61c5927 + random-forest-experiments May 29, 2021 0.60405 0.9608 0.2 20170428 3000 2 20170428 100 64 20b786b 9ab9549 fb7b520 51549a1 52c1fdd a304afb 484fab5 61c5927 + ├── e7bd029 [exp-25e9a] 10:21 PM 0.58589 0.945 0.2 20170428 2000 2 20170428 100 64 20b786b 9ab9549 fb7b520 51549a1 2ac217b a304afb 7aae464 61c5927 + ├── a2efdc9 [exp-68ac9] 10:21 PM 0.55669 0.93516 
0.2 20170428 1000 2 20170428 100 64 20b786b 9ab9549 fb7b520 51549a1 1b2d542 a304afb e2b5a9a 61c5927 + └── 56f3be3 [exp-8f5c4] 10:21 PM 0.51799 0.92333 0.2 20170428 500 2 20170428 100 64 20b786b 9ab9549 fb7b520 51549a1 64ed644 a304afb cfbfed4 61c5927 + ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── ``` > You can exit this screen with `Q`, typically. @@ -167,15 +169,15 @@ $ dvc exp show --only-changed ``` ```dvctable - ────────────────────────────────────────────────────────────────────────────────────── - neutral:**Experiment** neutral:**Created** metric:**avg_prec** metric:**roc_auc** param:**featurize.max_features** - ────────────────────────────────────────────────────────────────────────────────────── - workspace - 0.60405 0.9608 3000 - master May 29, 2021 0.60405 0.9608 3000 - ├── d7fdde2 [exp-1b262] 08:03 PM 0.56447 0.94713 1500 - ├── 6b338f8 [exp-3315b] 08:03 PM 0.58589 0.945 2000 - └── d384680 [exp-bc055] 08:03 PM 0.51799 0.92333 500 - ────────────────────────────────────────────────────────────────────────────────────── + ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────── + neutral:**Experiment** neutral:**Created** metric:**avg_prec** metric:**roc_auc** param:**featurize.max_features** dep:**model.pkl** dep:**data/features** + ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────── + workspace - 0.60405 0.9608 3000 484fab5 52c1fdd + random-forest-experiments May 29, 2021 0.60405 0.9608 3000 484fab5 52c1fdd + ├── a2efdc9 [exp-68ac9] 10:21 PM 0.55669 0.93516 1000 e2b5a9a 1b2d542 + ├── e7bd029 [exp-25e9a] 10:21 PM 0.58589 0.945 2000 7aae464 2ac217b + └── 56f3be3 
[exp-8f5c4] 10:21 PM 0.51799 0.92333 500 cfbfed4 64ed644 + ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────── ``` You can also use `--drop` to filter specific columns: @@ -185,15 +187,15 @@ $ dvc exp show --drop prepare ``` ```dvctable - ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── - neutral:**Experiment** neutral:**Created** metric:**avg_prec** metric:**roc_auc** param:**featurize.max_features** param:**featurize.ngrams** param:**train.seed** param:**train.n_est** param:**train.min_split** - ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── - workspace - 0.60405 0.9608 3000 2 20170428 100 64 - master May 29, 2021 0.60405 0.9608 3000 2 20170428 100 64 - ├── 6b338f8 [exp-3315b] 08:03 PM 0.58589 0.945 2000 2 20170428 100 64 - ├── d384680 [exp-bc055] 08:03 PM 0.51799 0.92333 500 2 20170428 100 64 - └── d7fdde2 [exp-1b262] 08:03 PM 0.56447 0.94713 1500 2 20170428 100 64 - ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── + ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── + neutral:**Experiment** neutral:**Created** metric:**avg_prec** metric:**roc_auc** param:**featurize.max_features** param:**featurize.ngrams** param:**train.seed** param:**train.n_est** param:**train.min_split** dep:**data/prepared** dep:**model.pkl** dep:**data/data.xml** dep:**src/prepare.py** dep:**data/features** dep:**src/evaluate.py** dep:**src/featurization.py** dep:**src/train.py** + 
───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── + workspace - 0.60405 0.9608 3000 2 20170428 100 64 20b786b 484fab5 a304afb 51549a1 52c1fdd fb7b520 61c5927 9ab9549 + random-forest-experiments May 29, 2021 0.60405 0.9608 3000 2 20170428 100 64 20b786b 484fab5 a304afb 51549a1 52c1fdd fb7b520 61c5927 9ab9549 + ├── e7bd029 [exp-25e9a] 10:21 PM 0.58589 0.945 2000 2 20170428 100 64 20b786b 7aae464 a304afb 51549a1 2ac217b fb7b520 61c5927 9ab9549 + ├── a2efdc9 [exp-68ac9] 10:21 PM 0.55669 0.93516 1000 2 20170428 100 64 20b786b e2b5a9a a304afb 51549a1 1b2d542 fb7b520 61c5927 9ab9549 + └── 56f3be3 [exp-8f5c4] 10:21 PM 0.51799 0.92333 500 2 20170428 100 64 20b786b cfbfed4 a304afb 51549a1 64ed644 fb7b520 61c5927 9ab9549 + ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── ``` You can use [regex][regex] to match columns. 
For example, to remove multiple @@ -204,15 +206,15 @@ $ dvc exp show --drop 'avg_prec|train.min_split' ``` ```dvctable - ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── - neutral:**Experiment** neutral:**Created** metric:**roc_auc** param:**prepare.split** param:**prepare.seed** param:**featurize.max_features** param:**featurize.ngrams** param:**train.seed** param:**train.n_est** - ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── - workspace - 0.9608 0.2 20170428 3000 2 20170428 100 - master May 29, 2021 0.9608 0.2 20170428 3000 2 20170428 100 - ├── d384680 [exp-bc055] Dec 17, 2021 0.92333 0.2 20170428 500 2 20170428 100 - ├── d7fdde2 [exp-1b262] Dec 17, 2021 0.94713 0.2 20170428 1500 2 20170428 100 - └── 6b338f8 [exp-3315b] Dec 17, 2021 0.945 0.2 20170428 2000 2 20170428 100 - ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── + ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── + neutral:**Experiment** neutral:**Created** metric:**roc_auc** param:**prepare.split** param:**prepare.seed** param:**featurize.max_features** param:**featurize.ngrams** param:**train.seed** param:**train.n_est** dep:**src/prepare.py** dep:**data/prepared** dep:**data/features** dep:**data/data.xml** dep:**src/evaluate.py** dep:**src/featurization.py** dep:**src/train.py** dep:**model.pkl** + 
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── + workspace - 0.9608 0.2 20170428 3000 2 20170428 100 51549a1 20b786b 52c1fdd a304afb fb7b520 61c5927 9ab9549 484fab5 + 11-random-forest-experiments May 29, 2021 0.9608 0.2 20170428 3000 2 20170428 100 51549a1 20b786b 52c1fdd a304afb fb7b520 61c5927 9ab9549 484fab5 + ├── a2efdc9 [exp-68ac9] 10:21 PM 0.93516 0.2 20170428 1000 2 20170428 100 51549a1 20b786b 1b2d542 a304afb fb7b520 61c5927 9ab9549 e2b5a9a + ├── e7bd029 [exp-25e9a] 10:21 PM 0.945 0.2 20170428 2000 2 20170428 100 51549a1 20b786b 2ac217b a304afb fb7b520 61c5927 9ab9549 7aae464 + └── 56f3be3 [exp-8f5c4] 10:21 PM 0.92333 0.2 20170428 500 2 20170428 100 51549a1 20b786b 64ed644 a304afb fb7b520 61c5927 9ab9549 cfbfed4 + ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── ``` If combined `--only-changed` has the least priority, `--drop` comes next, and @@ -223,15 +225,15 @@ $ dvc exp show --only-changed --drop Created --keep 'train.(?!seed)' ``` ```dvctable - ─────────────────────────────────────────────────────────────────────────────────────────────────────── - neutral:**Experiment** metric:**avg_prec** metric:**roc_auc** param:**featurize.max_features** param:**train.n_est** param:**train.min_split** - ─────────────────────────────────────────────────────────────────────────────────────────────────────── - workspace 0.60405 0.9608 3000 100 64 - master 0.60405 0.9608 3000 100 64 - ├── d384680 [exp-bc055] 0.51799 0.92333 500 100 64 - ├── 6b338f8 [exp-3315b] 0.58589 0.945 2000 100 64 - └── 
d7fdde2 [exp-1b262] 0.56447 0.94713 1500 100 64 - ─────────────────────────────────────────────────────────────────────────────────────────────────────── + ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── + neutral:**Experiment** metric:**avg_prec** metric:**roc_auc** param:**featurize.max_features** param:**train.n_est** param:**train.min_split** dep:**model.pkl** dep:**data/features** + ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── + workspace 0.60405 0.9608 3000 100 64 484fab5 52c1fdd + random-forest-experiments 0.60405 0.9608 3000 100 64 484fab5 52c1fdd + ├── e7bd029 [exp-25e9a] 0.58589 0.945 2000 100 64 7aae464 2ac217b + ├── a2efdc9 [exp-68ac9] 0.55669 0.93516 1000 100 64 e2b5a9a 1b2d542 + └── 56f3be3 [exp-8f5c4] 0.51799 0.92333 500 100 64 cfbfed4 64ed644 + ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── ``` Sort experiments by the `roc_auc` metric, in descending order: @@ -241,15 +243,15 @@ $ dvc exp show --only-changed --sort-by=roc_auc --sort-order desc ``` ```dvctable - ────────────────────────────────────────────────────────────────────────────────────── - neutral:**Experiment** neutral:**Created** metric:**avg_prec** metric:**roc_auc** param:**featurize.max_features** - ────────────────────────────────────────────────────────────────────────────────────── - workspace - 0.60405 0.9608 3000 - master May 29, 2021 0.60405 0.9608 3000 - ├── d7fdde2 [exp-1b262] 08:03 PM 0.56447 0.94713 1500 - ├── 6b338f8 [exp-3315b] 08:03 PM 0.58589 0.945 2000 - └── d384680 [exp-bc055] 08:03 PM 0.51799 0.92333 500 - ────────────────────────────────────────────────────────────────────────────────────── + ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── + 
neutral:**Experiment** neutral:**Created** metric:**avg_prec** metric:**roc_auc** param:**featurize.max_features** dep:**model.pkl** dep:**data/features** + ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── + workspace - 0.60405 0.9608 3000 484fab5 52c1fdd + 11-random-forest-experiments May 29, 2021 0.60405 0.9608 3000 484fab5 52c1fdd + ├── e7bd029 [exp-25e9a] 10:21 PM 0.58589 0.945 2000 7aae464 2ac217b + ├── a2efdc9 [exp-68ac9] 10:21 PM 0.55669 0.93516 1000 e2b5a9a 1b2d542 + └── 56f3be3 [exp-8f5c4] 10:21 PM 0.51799 0.92333 500 cfbfed4 64ed644 + ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── ``` To see all experiments throughout the Git history: @@ -259,27 +261,27 @@ $ dvc exp show --all-commits --only-changed --sort-by=roc_auc ``` ```dvctable - ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── - neutral:**Experiment** neutral:**Created** metric:**avg_prec** metric:**roc_auc** param:**featurize.max_features** param:**featurize.ngrams** param:**train.n_est** param:**train.min_split** - ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── - workspace - 0.60405 0.9608 3000 2 100 64 - try-large-dataset Jun 01, 2021 0.67038 0.96693 3000 2 100 64 - master May 29, 2021 0.60405 0.9608 3000 2 100 64 - ├── d384680 [exp-bc055] 08:03 PM 0.51799 0.92333 500 2 100 64 - ├── 6b338f8 [exp-3315b] 08:03 PM 0.58589 0.945 2000 2 100 64 - └── d7fdde2 [exp-1b262] 08:03 PM 0.56447 0.94713 1500 2 100 64 - cc51022 May 28, 2021 0.55259 0.91536 1500 2 50 2 - 7ab3585 May 27, 2021 0.52048 0.9032 1500 2 50 2 - 53b2d9d May 25, 2021 0.52048 0.9032 500 1 50 2 - 872cd6c May 24, 2021 - - 500 1 50 2 - 8188b34 May 23, 2021 - - 500 1 50 2 - 9244ec3 May 22, 2021 - - 500 1 50 2 - 08a3b89 May 21, 
2021 - - - - - - - 16ba2cd May 20, 2021 - - - - - - - f0c0269 May 18, 2021 - - - - - - - 3e07290 May 17, 2021 - - - - - - - 90b2aea May 16, 2021 - - - - - - - ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── + ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── + neutral:**Experiment** neutral:**Created** metric:**avg_prec** metric:**roc_auc** param:**prepare.split** param:**prepare.seed** param:**featurize.max_features** param:**featurize.ngrams** param:**train.seed** param:**train.n_est** param:**train.min_split** dep:**src/train.py** dep:**model.pkl** dep:**data/data.xml** dep:**src/evaluate.py** dep:**data/features** dep:**src/prepare.py** dep:**data/prepared** dep:**src/featurization.py** + ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── + workspace - 0.60405 0.9608 0.2 20170428 3000 2 20170428 100 64 9ab9549 484fab5 a304afb fb7b520 52c1fdd 51549a1 20b786b 61c5927 + bee447d Jun 01, 2021 0.67038 0.96693 0.2 20170428 3000 2 20170428 100 64 9ab9549 fe89bd4 c1fa36d fb7b520 7c68668 51549a1 030d866 61c5927 + 11-random-forest-experiments May 29, 2021 0.60405 0.9608 0.2 20170428 3000 2 20170428 100 64 9ab9549 484fab5 a304afb fb7b520 52c1fdd 51549a1 20b786b 61c5927 + ├── 56f3be3 [exp-8f5c4] 10:21 PM 0.51799 0.92333 0.2 20170428 500 2 20170428 100 64 9ab9549 cfbfed4 a304afb fb7b520 64ed644 51549a1 20b786b 61c5927 + ├── a2efdc9 [exp-68ac9] 10:21 PM 
0.55669 0.93516 0.2 20170428 1000 2 20170428 100 64 9ab9549 e2b5a9a a304afb fb7b520 1b2d542 51549a1 20b786b 61c5927 + └── e7bd029 [exp-25e9a] 10:21 PM 0.58589 0.945 0.2 20170428 2000 2 20170428 100 64 9ab9549 7aae464 a304afb fb7b520 2ac217b 51549a1 20b786b 61c5927 + bigrams-experiment May 28, 2021 0.55259 0.91536 0.2 20170428 1500 2 20170428 50 2 9ab9549 17b3d1e a304afb fb7b520 f237c73 51549a1 20b786b 61c5927 + 9-bigrams-model May 27, 2021 0.52048 0.9032 0.2 20170428 1500 2 20170428 50 2 9ab9549 c4c0670 a304afb fb7b520 2b5e0fd 51549a1 20b786b 61c5927 + 8-evaluation May 25, 2021 0.52048 0.9032 0.2 20170428 500 1 20170428 50 2 9ab9549 c4c0670 a304afb fb7b520 2b5e0fd 51549a1 20b786b 61c5927 + 7-ml-pipeline May 24, 2021 - - 0.2 20170428 500 1 20170428 50 2 9ab9549 - a304afb - 2b5e0fd 51549a1 20b786b 61c5927 + 6-prepare-stage May 23, 2021 - - 0.2 20170428 500 1 20170428 50 2 - - a304afb - - 51549a1 - - + 5-source-code May 22, 2021 - - 0.2 20170428 500 1 20170428 50 2 - - - - - - - - + 4-import-data May 21, 2021 - - - - - - - - - - - - - - - - - + 3-config-remote May 20, 2021 - - - - - - - - - - - - - - - - - + 2-track-data May 18, 2021 - - - - - - - - - - - - - - - - - + 1-dvc-init May 17, 2021 - - - - - - - - - - - - - - - - - + 0-git-init May 16, 2021 - - - - - - - - - - - - - - - - - + ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── ``` Note that in this example, Git commits remain in chronological order. 
The @@ -309,7 +311,7 @@ Combine with other flags for further filtering: ```dvc $ dvc exp show --all-branches --pcp --sort-by roc_auc - --exclude-metrics avg_prec + --drop avg_prec ``` ![](/img/ref_pcp_filter.png) _Excluded avg_prec column_ From 8f8d3a43eec39bc198bea67ced9db84fba24ba08 Mon Sep 17 00:00:00 2001 From: David de la Iglesia Castro Date: Thu, 27 Jan 2022 22:46:13 +0100 Subject: [PATCH 03/11] Mention dependencies in start and user-guide. --- content/docs/start/experiments.md | 13 ++++---- .../comparing-experiments.md | 31 ++++++++++--------- 2 files changed, 23 insertions(+), 21 deletions(-) diff --git a/content/docs/start/experiments.md b/content/docs/start/experiments.md index 94473306fd..4e316ac145 100644 --- a/content/docs/start/experiments.md +++ b/content/docs/start/experiments.md @@ -6,8 +6,8 @@ title: 'Get Started: Experiments' In machine learning projects, the number of experiments grows rapidly. DVC can track these experiments, list and compare their most relevant -parameters and metrics, navigate among them, and commit only the ones that we -need to Git. +dependencies, metrics and parameters, navigate among them, and commit only the +ones that we need to Git. > ⚠️This video is out-of-date and will be updated soon! Where there are > discrepancies between docs and video, please follow the docs. @@ -198,10 +198,11 @@ $ dvc exp show ───────────────────────────────────────────────────────────────────────────────────────────── ``` -By default, it shows all the parameters and the metrics with the timestamp. If -you have a large number of parameters, metrics or experiments, this may lead to -a cluttered view. You can limit the table to specific metrics, or parameters, or -hide the timestamp column (`Created`) using the `--drop` option of the command. +By default, it shows all the dependencies, metrics and parameters with the +timestamp. If you have a large number of dependencies, metrics, parameters or +experiments, this may lead to a cluttered view. 
You can limit the table to +specific dependencies, metrics, or parameters, or hide the timestamp column +(`Created`) using the `--drop` option of the command. ```dvc $ dvc exp show --drop 'Created|train|loss' diff --git a/content/docs/user-guide/experiment-management/comparing-experiments.md b/content/docs/user-guide/experiment-management/comparing-experiments.md index 6a274687ca..1865aa27b8 100644 --- a/content/docs/user-guide/experiment-management/comparing-experiments.md +++ b/content/docs/user-guide/experiment-management/comparing-experiments.md @@ -89,21 +89,22 @@ refs/tags/baseline-experiment: Experimentation is about generating many possibilities before selecting a few of them. You can get a table of experiments with `dvc exp show`, which displays all -the parameters and metrics in a nicely formatted table. +the dependencies (violet), metrics (yellow) and parameters (blue) +in a nicely formatted table. ```dvc $ dvc exp show ``` ```dvctable - ─────────────────────────────────────────────────────────────────────────────────────────── - neutral:**Experiment** neutral:**Created** metric:**loss** metric:**acc** param:**train.epochs** param:**model.conv_units** - ─────────────────────────────────────────────────────────────────────────────────────────── - workspace - 0.23657 0.9127 10 16 - baseline-experiment Sep 06, 2021 0.23657 0.9127 10 16 - ├── 6d13f33 [cnn-64] Sep 09, 2021 0.23385 0.9153 10 64 - ├── 69503c6 [cnn-128] Sep 09, 2021 0.23243 0.916 10 128 - ─────────────────────────────────────────────────────────────────────────────────────────── + ──────────────────────────────────────────────────────────────────────────────────────────────────────────────── + neutral:**Experiment** neutral:**Created** metric:**loss** metric:**acc** param:**train.epochs** param:**model.conv_units** dep:**src** dep:**data** + ──────────────────────────────────────────────────────────────────────────────────────────────────────────────── + workspace - 0.03332 0.9888 10 16 695e061 
6875529 + baseline-experiment Jan 14, 2022 0.03332 0.9888 10 16 695e061 6875529 + ├── 38d6c53 [cnn-64] Jan 19, 2022 0.038246 0.988 10 64 c77a505 6875529 + └── bc0faf5 [cnn-128] Jan 19, 2022 0.038325 0.989 10 128 bc75d6a 6875529 + ──────────────────────────────────────────────────────────────────────────────────────────────────────────────── ``` `dvc exp show` only tabulates experiments in the workspace and in `HEAD`. You @@ -111,9 +112,9 @@ can use `--all` flag to show all the experiments in the project instead. ## Customize the table of experiments -The table output may become cluttered if you have a large number of parameters -and metrics. `dvc exp show` provides several options to select the parameters -and metrics to be shown in the table. +The table output may become cluttered if you have a large number of +dependencies, metrics and parameters. `dvc exp show` provides several options to +select the columns to be shown in the table. The `--include-params` and `--include-metrics` options take a list of comma-separated parameter or metrics names (defined in `dvc.yaml`). @@ -196,9 +197,9 @@ You can also generate an interactive [parallel coordinates plot](https://en.wikipedia.org/wiki/Parallel_coordinates) with `dvc exp show --pcp`. -This plot is useful to explore the relationships between the metrics and params -used in experiments. You can reorder the columns to make some patterns more -easily visible. +This plot is useful to explore the relationships between the dependencies, +metrics and params used in experiments. You can reorder the columns to make some +patterns more easily visible. The `--pcp` flag can be combined with other options of the command. 
For example, use `--sort-by` to sort the experiments and determine the color of the lines From 783ef9fb8239d05bdaa3a086a24c77293a4fbc8e Mon Sep 17 00:00:00 2001 From: David de la Iglesia Castro Date: Fri, 28 Jan 2022 10:30:25 +0100 Subject: [PATCH 04/11] Update table --- content/docs/start/experiments.md | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/content/docs/start/experiments.md b/content/docs/start/experiments.md index 4e316ac145..539d6fbd06 100644 --- a/content/docs/start/experiments.md +++ b/content/docs/start/experiments.md @@ -89,13 +89,13 @@ $ dvc exp show ``` ```dvctable - ───────────────────────────────────────────────────────────────────────────────────────────── - white:**Experiment** white:**Created** yellow:**loss** yellow:**acc** blue:**train.epochs** blue:**model.conv_units** - ───────────────────────────────────────────────────────────────────────────────────────────── - workspace - 0.23282 0.9152 10 16 - 7317bc6 Jul 18, 2021 - - 10 16 - └── 1a1d858 [exp-6dccf] 03:21 PM 0.23282 0.9152 10 16 - ───────────────────────────────────────────────────────────────────────────────────────────── + ───────────────────────────────────────────────────────────────────────────────────────────────────────────────── + Experiment Created loss acc train.epochs model.conv_units data src + ───────────────────────────────────────────────────────────────────────────────────────────────────────────────── + workspace - 0.03247 0.9887 10 16 6875529 c5f2f29 + baseline-experiment Jan 14, 2022 0.03332 0.9888 10 16 6875529 695e061 + └── 999710f [exp-ff24d] 10:54 PM 0.03247 0.9887 10 16 6875529 c5f2f29 + ───────────────────────────────────────────────────────────────────────────────────────────────────────────────── ``` The `workspace` row in the table shows the results of the most recent experiment From 32f45a0e57af85dca23d1dc85081ff293be3b905 Mon Sep 17 00:00:00 2001 From: David de la Iglesia Castro Date: Fri, 28 Jan 2022 11:53:56 +0100 
Subject: [PATCH 05/11] Update tables in get started. --- content/docs/start/experiments.md | 65 ++++++++++++++++--------------- 1 file changed, 33 insertions(+), 32 deletions(-) diff --git a/content/docs/start/experiments.md b/content/docs/start/experiments.md index 539d6fbd06..a8f3c11b24 100644 --- a/content/docs/start/experiments.md +++ b/content/docs/start/experiments.md @@ -89,13 +89,13 @@ $ dvc exp show ``` ```dvctable - ───────────────────────────────────────────────────────────────────────────────────────────────────────────────── - Experiment Created loss acc train.epochs model.conv_units data src - ───────────────────────────────────────────────────────────────────────────────────────────────────────────────── - workspace - 0.03247 0.9887 10 16 6875529 c5f2f29 - baseline-experiment Jan 14, 2022 0.03332 0.9888 10 16 6875529 695e061 - └── 999710f [exp-ff24d] 10:54 PM 0.03247 0.9887 10 16 6875529 c5f2f29 - ───────────────────────────────────────────────────────────────────────────────────────────────────────────────── + ──────────────────────────────────────────────────────────────────────────────────────────────────────── + neutral:**Experiment** neutral:**Created** metric:**loss** metric:**acc** param:**train.epochs** param:**model.conv_units** dep:**data** + ──────────────────────────────────────────────────────────────────────────────────────────────────────── + workspace - 0.03247 0.9887 10 16 6875529 + baseline-experiment Jan 14, 2022 0.03332 0.9888 10 16 6875529 + └── 999710f [exp-ff24d] 10:54 PM 0.03247 0.9887 10 16 6875529 + ──────────────────────────────────────────────────────────────────────────────────────────────────────── ``` The `workspace` row in the table shows the results of the most recent experiment @@ -184,43 +184,44 @@ $ dvc exp show ``` ```dvctable - ───────────────────────────────────────────────────────────────────────────────────────────── - white:**Experiment** white:**Created** yellow:**loss** yellow:**acc** blue:**train.epochs** 
blue:**model.conv_units** - ───────────────────────────────────────────────────────────────────────────────────────────── - workspace - 0.23508 0.9151 10 24 - 7317bc6 Jul 18, 2021 - - 10 16 - ├── e2647ef [exp-ee8a4] 05:14 PM 0.23146 0.9145 10 64 - ├── 15c9451 [exp-a9be6] 05:14 PM 0.25231 0.9102 10 32 - ├── 9c32227 [exp-17dd9] 04:46 PM 0.23687 0.9167 10 256 - ├── 8a9cb15 [exp-29d93] 04:46 PM 0.24459 0.9134 10 128 - ├── dfc536f [exp-a1bd9] 03:35 PM 0.23508 0.9151 10 24 - └── 1a1d858 [exp-6dccf] 03:21 PM 0.23282 0.9152 10 16 - ───────────────────────────────────────────────────────────────────────────────────────────── + ──────────────────────────────────────────────────────────────────────────────────────────────────────── + neutral:**Experiment** neutral:**Created** metric:**loss** metric:**acc** param:**train.epochs** param:**model.conv_units** dep:**data** + ──────────────────────────────────────────────────────────────────────────────────────────────────────── + workspace - 0.031865 0.9897 10 24 6875529 + baseline-experiment Jan 14, 2022 0.03332 0.9888 10 16 6875529 + ├── 43a3b4f [exp-7f82e] Jan 27, 2022 0.042424 0.9874 10 256 6875529 + ├── 6d15fac [exp-75369] Jan 27, 2022 0.037164 0.989 10 128 6875529 + ├── 47896c1 [exp-76693] Jan 27, 2022 0.03845 0.9876 10 64 6875529 + ├── da84ac7 [exp-4a081] Jan 27, 2022 0.035497 0.988 10 32 6875529 + ├── 5846c68 [exp-953fa] Jan 27, 2022 0.031865 0.9897 10 24 6875529 + └── 999710f [exp-ff24d] Jan 27, 2022 0.03247 0.9887 10 16 6875529 + ──────────────────────────────────────────────────────────────────────────────────────────────────────── ``` By default, it shows all the dependencies, metrics and parameters with the timestamp. If you have a large number of dependencies, metrics, parameters or experiments, this may lead to a cluttered view. You can limit the table to specific dependencies, metrics, or parameters, or hide the timestamp column -(`Created`) using the `--drop` option of the command. 
+(`Created`) using the [`--drop`](/doc/command-reference/exp/show#--drop) option +of the command. ```dvc $ dvc exp show --drop 'Created|train|loss' ``` ```dvctable - ───────────────────────────────────────────────────── - white:**Experiment** yellow:**acc** blue:**model.conv_units** - ───────────────────────────────────────────────────── - workspace 0.9151 24 - 7317bc6 - 16 - ├── e2647ef [exp-ee8a4] 0.9145 64 - ├── 15c9451 [exp-a9be6] 0.9102 32 - ├── 9c32227 [exp-17dd9] 0.9167 256 - ├── 8a9cb15 [exp-29d93] 0.9134 128 - ├── dfc536f [exp-a1bd9] 0.9151 24 - └── 1a1d858 [exp-6dccf] 0.9152 16 - ───────────────────────────────────────────────────── + ─────────────────────────────────────────────────────────────── + neutral:**Experiment** metric:**acc** param:**model.conv_units** dep:**data** + ─────────────────────────────────────────────────────────────── + workspace 0.9897 24 6875529 + baseline-experiment 0.9888 16 6875529 + ├── 43a3b4f [exp-7f82e] 0.9874 256 6875529 + ├── 6d15fac [exp-75369] 0.989 128 6875529 + ├── 47896c1 [exp-76693] 0.9876 64 6875529 + ├── da84ac7 [exp-4a081] 0.988 32 6875529 + ├── 5846c68 [exp-953fa] 0.9897 24 6875529 + └── 999710f [exp-ff24d] 0.9887 16 6875529 + ─────────────────────────────────────────────────────────────── ``` After selecting an experiment from the table, you can create a Git branch that From 2bf9556355bc04a64107d41a2c58e20551c8cbc9 Mon Sep 17 00:00:00 2001 From: David de la Iglesia Castro Date: Fri, 28 Jan 2022 12:59:46 +0100 Subject: [PATCH 06/11] Mention column order --- content/docs/command-reference/exp/show.md | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/content/docs/command-reference/exp/show.md b/content/docs/command-reference/exp/show.md index 47be3a0d00..8f72ce847a 100644 --- a/content/docs/command-reference/exp/show.md +++ b/content/docs/command-reference/exp/show.md @@ -21,9 +21,11 @@ usage: dvc exp show [-h] [-q | -v] [-a] [-T] [-A] [-n ] Displays experiments and 
[checkpoints](/doc/command-reference/exp/run#checkpoints) in a detailed table -which includes their parent and name (or hash), as well as project -dependencies (violet), metrics (yellow) and parameters (blue). Only -the experiments derived from the Git `HEAD` are shown by default but all +which includes their parent and name (or hash), as well as colored columns for +(left to right): metrics (yellow), parameters (blue) and +dependencies (violet). + +Only the experiments derived from the Git `HEAD` are shown by default but all experiments can be included with the `--all-commits` option. Example: ```dvc From 7953e1fd51291e601496ba916fa7cd621cd2e11c Mon Sep 17 00:00:00 2001 From: David de la Iglesia Castro Date: Mon, 31 Jan 2022 12:15:05 +0100 Subject: [PATCH 07/11] Order metrics, params and dependencies. --- content/docs/command-reference/exp/show.md | 12 ++++++------ content/docs/start/experiments.md | 9 ++++----- .../experiment-management/comparing-experiments.md | 12 ++++++------ 3 files changed, 16 insertions(+), 17 deletions(-) diff --git a/content/docs/command-reference/exp/show.md b/content/docs/command-reference/exp/show.md index 8f72ce847a..fa1055760b 100644 --- a/content/docs/command-reference/exp/show.md +++ b/content/docs/command-reference/exp/show.md @@ -49,10 +49,10 @@ Your terminal will enter a which you can typically exit by typing `Q`. Use `--no-pager` to print the table to standard output. -By default, the printed experiments table will include columns for all -dependencies, metrics and params from the entire project. The `--only-changed`, -`--drop`, `--keep`, and other [options](#options) can determine which columns -should be displayed. +By default, the printed experiments table will include columns for all metrics, +params and dependencies from the entire project. The `--only-changed`, `--drop`, +`--keep`, and other [options](#options) can determine which columns should be +displayed. Experiments in the table are first grouped (by parent commit). 
They are then sorted inside each group, chronologically by default. The `--sort-by` and @@ -86,7 +86,7 @@ will be generated using the same data from the table. - `--param-deps` - include only parameters that are stage dependencies. -- `--only-changed` - show only dependencies, metrics and params with values that +- `--only-changed` - show only metrics, params and dependencies with values that vary across experiments. - `--drop ` - remove the matching columns. This option has higher @@ -142,7 +142,7 @@ will be generated using the same data from the table. Let's say we have run 3 experiments in our project. The basic usage shows the workspace (Git working tree) and experiments derived from `HEAD` (`master` -branch in this case), and all of their dependencies, metrics and params (scroll +branch in this case), and all of their metrics, params and dependencies (scroll right to see all): ```dvc diff --git a/content/docs/start/experiments.md b/content/docs/start/experiments.md index a8f3c11b24..3a9c4bee54 100644 --- a/content/docs/start/experiments.md +++ b/content/docs/start/experiments.md @@ -198,12 +198,11 @@ $ dvc exp show ──────────────────────────────────────────────────────────────────────────────────────────────────────── ``` -By default, it shows all the dependencies, metrics and parameters with the -timestamp. If you have a large number of dependencies, metrics, parameters or +By default, it shows all the metrics, parameters and dependencies with the +timestamp. If you have a large number of metrics, parameters, dependencies or experiments, this may lead to a cluttered view. You can limit the table to -specific dependencies, metrics, or parameters, or hide the timestamp column -(`Created`) using the [`--drop`](/doc/command-reference/exp/show#--drop) option -of the command. +specific columns, including the timestamp (`Created`), using the +[`--drop`](/doc/command-reference/exp/show#--drop) option of the command. 
```dvc $ dvc exp show --drop 'Created|train|loss' diff --git a/content/docs/user-guide/experiment-management/comparing-experiments.md b/content/docs/user-guide/experiment-management/comparing-experiments.md index 1865aa27b8..3b9e7e75ff 100644 --- a/content/docs/user-guide/experiment-management/comparing-experiments.md +++ b/content/docs/user-guide/experiment-management/comparing-experiments.md @@ -89,7 +89,7 @@ refs/tags/baseline-experiment: Experimentation is about generating many possibilities before selecting a few of them. You can get a table of experiments with `dvc exp show`, which displays all -the dependencies (violet), metrics (yellow) and parameters (blue) +the metrics (yellow), parameters (blue) and dependencies (violet) in a nicely formatted table. ```dvc @@ -112,9 +112,9 @@ can use `--all` flag to show all the experiments in the project instead. ## Customize the table of experiments -The table output may become cluttered if you have a large number of -dependencies, metrics and parameters. `dvc exp show` provides several options to -select the columns to be shown in the table. +The table output may become cluttered if you have a large number of metrics, +parameters and dependencies. `dvc exp show` provides several options to select +the columns to be shown in the table. The `--include-params` and `--include-metrics` options take a list of comma-separated parameter or metrics names (defined in `dvc.yaml`). @@ -197,8 +197,8 @@ You can also generate an interactive [parallel coordinates plot](https://en.wikipedia.org/wiki/Parallel_coordinates) with `dvc exp show --pcp`. -This plot is useful to explore the relationships between the dependencies, -metrics and params used in experiments. You can reorder the columns to make some +This plot is useful to explore the relationships between the metrics, params and +dependencies used in experiments. You can reorder the columns to make some patterns more easily visible. 
The `--pcp` flag can be combined with other options of the command. For example, From 37b5f37de6b1273f3139a787352e5ab2ee7da577 Mon Sep 17 00:00:00 2001 From: David de la Iglesia Castro Date: Wed, 2 Feb 2022 18:36:22 +0100 Subject: [PATCH 08/11] Replace params with parameters --- content/docs/command-reference/exp/show.md | 14 +- .../comparing-experiments.md | 14 +- scripts/create_dvctables.py | 161 ++++++++++++++++++ 3 files changed, 175 insertions(+), 14 deletions(-) create mode 100644 scripts/create_dvctables.py diff --git a/content/docs/command-reference/exp/show.md b/content/docs/command-reference/exp/show.md index fa1055760b..8fdbeddbf5 100644 --- a/content/docs/command-reference/exp/show.md +++ b/content/docs/command-reference/exp/show.md @@ -50,9 +50,9 @@ which you can typically exit by typing `Q`. Use `--no-pager` to print the table to standard output. By default, the printed experiments table will include columns for all metrics, -params and dependencies from the entire project. The `--only-changed`, `--drop`, -`--keep`, and other [options](#options) can determine which columns should be -displayed. +parameters and dependencies from the entire project. The `--only-changed`, +`--drop`, `--keep`, and other [options](#options) can determine which columns +should be displayed. Experiments in the table are first grouped (by parent commit). They are then sorted inside each group, chronologically by default. The `--sort-by` and @@ -86,8 +86,8 @@ will be generated using the same data from the table. - `--param-deps` - include only parameters that are stage dependencies. -- `--only-changed` - show only metrics, params and dependencies with values that - vary across experiments. +- `--only-changed` - show only metrics, parameters and dependencies with values + that vary across experiments. - `--drop ` - remove the matching columns. This option has higher priority than `--only-changed`. 
If both options are combined, `--drop` will @@ -142,8 +142,8 @@ will be generated using the same data from the table. Let's say we have run 3 experiments in our project. The basic usage shows the workspace (Git working tree) and experiments derived from `HEAD` (`master` -branch in this case), and all of their metrics, params and dependencies (scroll -right to see all): +branch in this case), and all of their metrics, parameters and dependencies +(scroll right to see all): ```dvc $ dvc exp show diff --git a/content/docs/user-guide/experiment-management/comparing-experiments.md b/content/docs/user-guide/experiment-management/comparing-experiments.md index 0bc24af6ac..bc66dd3f32 100644 --- a/content/docs/user-guide/experiment-management/comparing-experiments.md +++ b/content/docs/user-guide/experiment-management/comparing-experiments.md @@ -177,9 +177,9 @@ $ dvc exp show --no-timestamp --include-params=model.conv_units --exclude-metric ``` By default `dvc exp show` sorts the experiments by their timestamp. You can sort -the columns by params or metrics by the option `--sort-by` and `--sort-order`. -`--sort-by` takes a metric or parameter name, and `--sort-order` takes either -`asc` or `desc`. +the columns by parameters or metrics by the option `--sort-by` and +`--sort-order`. `--sort-by` takes a metric or parameter name, and `--sort-order` +takes either `asc` or `desc`. ```dvc $ dvc exp show --sort-by auc --sort-order desc @@ -202,8 +202,8 @@ You can also generate an interactive [parallel coordinates plot](https://en.wikipedia.org/wiki/Parallel_coordinates) with `dvc exp show --pcp`. -This plot is useful to explore the relationships between the metrics, params and -dependencies used in experiments. You can reorder the columns to make some +This plot is useful to explore the relationships between the metrics, parameters +and dependencies used in experiments. You can reorder the columns to make some patterns more easily visible. 
The `--pcp` flag can be combined with other options of the command. For example, @@ -483,8 +483,8 @@ $ dvc exp diff exp-25a26 cnn-64 --json The output is a JSON dictionary with two keys, `metrics` and `params`, which have dictionaries as values. `metrics` and `params` dictionaries has keys for -each of the metrics or params file, and for each file metrics and parameters are -listed as keys. +each of the metrics or parameters file, and for each file metrics and parameters +are listed as keys. As an example, we can get only a specific metric with [jq]: diff --git a/scripts/create_dvctables.py b/scripts/create_dvctables.py new file mode 100644 index 0000000000..2de0f098b7 --- /dev/null +++ b/scripts/create_dvctables.py @@ -0,0 +1,161 @@ +import json +import os +import subprocess +import sys +import tempfile + +from pathlib import Path + +COLORS = { + "neutral": [ + "Experiment", + "Created" + ], + "metric": [ + "avg_prec", + "roc_auc", + "loss", + "acc" + ], + "param": [ + "prepare.split", + "prepare.seed", + "featurize.max_features", + "featurize.ngrams", + "train.seed", + "train.n_est", + "train.min_split", + "train.epochs", + "model.conv_units" + ], + "dep": [ + "data/features", + "src/evaluate.py", + "src/featurization.py", + "src/prepare.py", + "data/data.xml", + "model.pkl", + "src/train.py", + "data/prepared" + ] +} + + +def _add_color_highlight(input_folder, colors): + print("Adding color highlight") + for exp_show_output in Path(input_folder).iterdir(): + text = exp_show_output.read_text() + for color, columns in colors.items(): + for column in columns: + text = text.replace( + column, + f"{color}:**{column}**" + ) + exp_show_output.write_text(text) + + +def _dump_tables(input_folder, output_folder): + tables = [] + print("Dumping to tables.js format") + for exp_show_output in Path(input_folder).iterdir(): + text = exp_show_output.read_text() + print(f"${exp_show_output.stem}") + print(text) + tables.append( + { + "placeholder": f"${exp_show_output.stem}", + 
"replacement": text + } + ) + return tables + + +def _capture_exp_show_output(show_calls, output_folder, prefix): + print("Capturing raw exp show output") + for extra_args, suffix in show_calls: + with open(f"tables/{prefix}{suffix}.md", "w") as f: + subprocess.run( + ["dvc", "exp", "show"] + extra_args, + stdout=f) + + +def _get_started_tables(): + with tempfile.TemporaryDirectory() as tmpdir: + os.chdir(tmpdir) + + subprocess.run(["git", "clone", "https://github.com/iterative/example-get-started"]) + + os.chdir("example-get-started") + os.makedirs("tables") + + subprocess.run(["dvc", "pull"]) + + subprocess.run( + ["dvc", "exp", "run", "--queue", "-S", "featurize.max_features=500"]) + subprocess.run( + ["dvc", "exp", "run", "--queue", "-S", "featurize.max_features=1000"]) + subprocess.run( + ["dvc", "exp", "run", "--queue", "-S", "featurize.max_features=2000"]) + + subprocess.run(["dvc", "exp", "run", "--run-all", "-j", "3"]) + + show_calls = [ + ([], ""), + (["--only-changed"], "-only-changed"), + (["--drop", "prepare"], "-drop-prepare"), + (["--drop", "avg_prec|train.min_split"], "-drop-regex"), + (["--only-changed", "--drop", "Created", "--keep", "train.(?!seed)"], "-combined"), + (["--only-changed", "--sort-by=roc_auc", "--sort-order", "desc"], "-sort-desc"), + (["--all-commits", "--only-changed", "--sort-by=roc_auc"], "-all-commits"), + ] + + _capture_exp_show_output(show_calls, "tables", "get-started-exp-show") + _add_color_highlight("tables", COLORS) + tables = { + exp_show_output.stem: exp_show_output.read_text() + for exp_show_output in Path("tables").iterdir() + } + + return tables + + +def _dvc_experiments_tables(): + with tempfile.TemporaryDirectory() as tmpdir: + os.chdir(tmpdir) + + subprocess.run(["git", "clone", "https://github.com/iterative/example-dvc-experiments"]) + + os.chdir("example-dvc-experiments") + os.makedirs("tables") + + subprocess.run(["dvc", "exp", "pull", "--no-cache", "origin", "cnn-64"]) + subprocess.run(["dvc", "exp", "pull", 
"--no-cache", "origin", "cnn-128"]) + + show_calls = [ + ([], ""), + ] + _capture_exp_show_output(show_calls, "tables", "dvc-experiments-exp-show") + _add_color_highlight("tables", COLORS) + tables = { + exp_show_output.stem: exp_show_output.read_text() + for exp_show_output in Path("tables").iterdir() + } + + return tables + + +def create_all_tables(output_folder): + cwd = Path.cwd() + all_tables = {} + all_tables.update(_get_started_tables()) + all_tables.update(_dvc_experiments_tables()) + os.chdir(cwd) + + print(f"Saving to {output_folder}") + Path(output_folder).mkdir(exist_ok=True, parents=True) + for k, v in all_tables.items(): + (Path(output_folder) / k).write_text(v) + + +if __name__ == "__main__": + create_all_tables(sys.argv[1]) From 30a41a12ab8be0f03d4851b3073ade6106291db1 Mon Sep 17 00:00:00 2001 From: David de la Iglesia Castro Date: Wed, 2 Feb 2022 18:38:01 +0100 Subject: [PATCH 09/11] Remove dvctables --- scripts/create_dvctables.py | 161 ------------------------------------ 1 file changed, 161 deletions(-) delete mode 100644 scripts/create_dvctables.py diff --git a/scripts/create_dvctables.py b/scripts/create_dvctables.py deleted file mode 100644 index 2de0f098b7..0000000000 --- a/scripts/create_dvctables.py +++ /dev/null @@ -1,161 +0,0 @@ -import json -import os -import subprocess -import sys -import tempfile - -from pathlib import Path - -COLORS = { - "neutral": [ - "Experiment", - "Created" - ], - "metric": [ - "avg_prec", - "roc_auc", - "loss", - "acc" - ], - "param": [ - "prepare.split", - "prepare.seed", - "featurize.max_features", - "featurize.ngrams", - "train.seed", - "train.n_est", - "train.min_split", - "train.epochs", - "model.conv_units" - ], - "dep": [ - "data/features", - "src/evaluate.py", - "src/featurization.py", - "src/prepare.py", - "data/data.xml", - "model.pkl", - "src/train.py", - "data/prepared" - ] -} - - -def _add_color_highlight(input_folder, colors): - print("Adding color highlight") - for exp_show_output in 
Path(input_folder).iterdir(): - text = exp_show_output.read_text() - for color, columns in colors.items(): - for column in columns: - text = text.replace( - column, - f"{color}:**{column}**" - ) - exp_show_output.write_text(text) - - -def _dump_tables(input_folder, output_folder): - tables = [] - print("Dumping to tables.js format") - for exp_show_output in Path(input_folder).iterdir(): - text = exp_show_output.read_text() - print(f"${exp_show_output.stem}") - print(text) - tables.append( - { - "placeholder": f"${exp_show_output.stem}", - "replacement": text - } - ) - return tables - - -def _capture_exp_show_output(show_calls, output_folder, prefix): - print("Capturing raw exp show output") - for extra_args, suffix in show_calls: - with open(f"tables/{prefix}{suffix}.md", "w") as f: - subprocess.run( - ["dvc", "exp", "show"] + extra_args, - stdout=f) - - -def _get_started_tables(): - with tempfile.TemporaryDirectory() as tmpdir: - os.chdir(tmpdir) - - subprocess.run(["git", "clone", "https://github.com/iterative/example-get-started"]) - - os.chdir("example-get-started") - os.makedirs("tables") - - subprocess.run(["dvc", "pull"]) - - subprocess.run( - ["dvc", "exp", "run", "--queue", "-S", "featurize.max_features=500"]) - subprocess.run( - ["dvc", "exp", "run", "--queue", "-S", "featurize.max_features=1000"]) - subprocess.run( - ["dvc", "exp", "run", "--queue", "-S", "featurize.max_features=2000"]) - - subprocess.run(["dvc", "exp", "run", "--run-all", "-j", "3"]) - - show_calls = [ - ([], ""), - (["--only-changed"], "-only-changed"), - (["--drop", "prepare"], "-drop-prepare"), - (["--drop", "avg_prec|train.min_split"], "-drop-regex"), - (["--only-changed", "--drop", "Created", "--keep", "train.(?!seed)"], "-combined"), - (["--only-changed", "--sort-by=roc_auc", "--sort-order", "desc"], "-sort-desc"), - (["--all-commits", "--only-changed", "--sort-by=roc_auc"], "-all-commits"), - ] - - _capture_exp_show_output(show_calls, "tables", "get-started-exp-show") - 
_add_color_highlight("tables", COLORS) - tables = { - exp_show_output.stem: exp_show_output.read_text() - for exp_show_output in Path("tables").iterdir() - } - - return tables - - -def _dvc_experiments_tables(): - with tempfile.TemporaryDirectory() as tmpdir: - os.chdir(tmpdir) - - subprocess.run(["git", "clone", "https://github.com/iterative/example-dvc-experiments"]) - - os.chdir("example-dvc-experiments") - os.makedirs("tables") - - subprocess.run(["dvc", "exp", "pull", "--no-cache", "origin", "cnn-64"]) - subprocess.run(["dvc", "exp", "pull", "--no-cache", "origin", "cnn-128"]) - - show_calls = [ - ([], ""), - ] - _capture_exp_show_output(show_calls, "tables", "dvc-experiments-exp-show") - _add_color_highlight("tables", COLORS) - tables = { - exp_show_output.stem: exp_show_output.read_text() - for exp_show_output in Path("tables").iterdir() - } - - return tables - - -def create_all_tables(output_folder): - cwd = Path.cwd() - all_tables = {} - all_tables.update(_get_started_tables()) - all_tables.update(_dvc_experiments_tables()) - os.chdir(cwd) - - print(f"Saving to {output_folder}") - Path(output_folder).mkdir(exist_ok=True, parents=True) - for k, v in all_tables.items(): - (Path(output_folder) / k).write_text(v) - - -if __name__ == "__main__": - create_all_tables(sys.argv[1]) From 31ffbe02c3b9af0facb97ada039dc4e9ce0a7d37 Mon Sep 17 00:00:00 2001 From: David de la Iglesia Castro Date: Thu, 3 Feb 2022 12:14:23 +0100 Subject: [PATCH 10/11] Apply suggestions from code review Co-authored-by: Dave Berenbaum --- content/docs/start/experiments.md | 4 ++-- .../experiment-management/comparing-experiments.md | 6 +++--- 2 files changed, 5 insertions(+), 5 deletions(-) diff --git a/content/docs/start/experiments.md b/content/docs/start/experiments.md index 6c9398f656..81e6a7357c 100644 --- a/content/docs/start/experiments.md +++ b/content/docs/start/experiments.md @@ -6,7 +6,7 @@ title: 'Get Started: Experiments' In machine learning projects, the number of experiments 
grows rapidly. DVC can track these experiments, list and compare their most relevant -dependencies, metrics and parameters, navigate among them, and commit only the +metrics, parameters, and dependencies, navigate among them, and commit only the ones that we need to Git. > ⚠️This video is out-of-date and will be updated soon! Where there are @@ -201,7 +201,7 @@ $ dvc exp show By default, it shows all the metrics, parameters and dependencies with the timestamp. If you have a large number of metrics, parameters, dependencies or experiments, this may lead to a cluttered view. You can limit the table to -specific columns, including the timestamp (`Created`), using the +specific columns using the [`--drop`](/doc/command-reference/exp/show#--drop) option of the command. ```dvc diff --git a/content/docs/user-guide/experiment-management/comparing-experiments.md b/content/docs/user-guide/experiment-management/comparing-experiments.md index bc66dd3f32..4c2fdbd1e9 100644 --- a/content/docs/user-guide/experiment-management/comparing-experiments.md +++ b/content/docs/user-guide/experiment-management/comparing-experiments.md @@ -177,7 +177,7 @@ $ dvc exp show --no-timestamp --include-params=model.conv_units --exclude-metric ``` By default `dvc exp show` sorts the experiments by their timestamp. You can sort -the columns by parameters or metrics by the option `--sort-by` and +the metrics or parameters columns by the option `--sort-by` and `--sort-order`. `--sort-by` takes a metric or parameter name, and `--sort-order` takes either `asc` or `desc`. @@ -482,8 +482,8 @@ $ dvc exp diff exp-25a26 cnn-64 --json ``` The output is a JSON dictionary with two keys, `metrics` and `params`, which -have dictionaries as values. `metrics` and `params` dictionaries has keys for -each of the metrics or parameters file, and for each file metrics and parameters +have dictionaries as values. 
`metrics` and `params` dictionaries have keys for +each of the metrics or parameters files, and for each file metrics and parameters are listed as keys. As an example, we can get only a specific metric with [jq]: From 9ce32171c1fd255c7f47898330d7c4dfca430c5a Mon Sep 17 00:00:00 2001 From: "restyled-io[bot]" <32688539+restyled-io[bot]@users.noreply.github.com> Date: Thu, 3 Feb 2022 12:15:09 +0100 Subject: [PATCH 11/11] Restyled by prettier (#3254) Co-authored-by: Restyled.io --- content/docs/start/experiments.md | 4 ++-- .../experiment-management/comparing-experiments.md | 10 +++++----- 2 files changed, 7 insertions(+), 7 deletions(-) diff --git a/content/docs/start/experiments.md b/content/docs/start/experiments.md index 81e6a7357c..debac65371 100644 --- a/content/docs/start/experiments.md +++ b/content/docs/start/experiments.md @@ -201,8 +201,8 @@ $ dvc exp show By default, it shows all the metrics, parameters and dependencies with the timestamp. If you have a large number of metrics, parameters, dependencies or experiments, this may lead to a cluttered view. You can limit the table to -specific columns using the -[`--drop`](/doc/command-reference/exp/show#--drop) option of the command. +specific columns using the [`--drop`](/doc/command-reference/exp/show#--drop) +option of the command. ```dvc $ dvc exp show --drop 'Created|train|loss' diff --git a/content/docs/user-guide/experiment-management/comparing-experiments.md b/content/docs/user-guide/experiment-management/comparing-experiments.md index 4c2fdbd1e9..b80b85d1f8 100644 --- a/content/docs/user-guide/experiment-management/comparing-experiments.md +++ b/content/docs/user-guide/experiment-management/comparing-experiments.md @@ -177,9 +177,9 @@ $ dvc exp show --no-timestamp --include-params=model.conv_units --exclude-metric ``` By default `dvc exp show` sorts the experiments by their timestamp. You can sort -the metrics or parameters columns by the option `--sort-by` and -`--sort-order`. 
`--sort-by` takes a metric or parameter name, and `--sort-order` -takes either `asc` or `desc`. +the metrics or parameters columns by the option `--sort-by` and `--sort-order`. +`--sort-by` takes a metric or parameter name, and `--sort-order` takes either +`asc` or `desc`. ```dvc $ dvc exp show --sort-by auc --sort-order desc @@ -483,8 +483,8 @@ $ dvc exp diff exp-25a26 cnn-64 --json The output is a JSON dictionary with two keys, `metrics` and `params`, which have dictionaries as values. `metrics` and `params` dictionaries have keys for -each of the metrics or parameters files, and for each file metrics and parameters -are listed as keys. +each of the metrics or parameters files, and for each file metrics and +parameters are listed as keys. As an example, we can get only a specific metric with [jq]:
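
Besides jq, the nested structure described above can also be walked from a script. The following is a minimal Python sketch, not taken from the DVC docs: the sample JSON, the file names (`metrics.json`, `params.yaml`), the `old`/`new`/`diff` leaf values and the `lookup` helper are all assumptions made for illustration, so check them against the real output of `dvc exp diff --json` in your project.

```python
import json

# Hypothetical sample standing in for `dvc exp diff <a> <b> --json` output.
# File names, metric/parameter names and the old/new/diff leaves are assumed.
SAMPLE = """
{
  "metrics": {
    "metrics.json": {"acc": {"old": 0.9888, "new": 0.988, "diff": -0.0008}}
  },
  "params": {
    "params.yaml": {"model.conv_units": {"old": 16, "new": 64, "diff": 48}}
  }
}
"""


def lookup(diff, section, filename, name):
    """Return the entry for `name` in `filename` under `section`, or None.

    `section` is "metrics" or "params"; missing keys at any level yield None.
    """
    return diff.get(section, {}).get(filename, {}).get(name)


diff = json.loads(SAMPLE)

# Top level: section -> file -> metric or parameter name.
acc = lookup(diff, "metrics", "metrics.json", "acc")
units = lookup(diff, "params", "params.yaml", "model.conv_units")
print(acc)
print(units)
```

Chained `dict.get` calls keep the lookup tolerant of a file or name that is absent from a given comparison, which is likely when only some values differ between two experiments.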