From 158be158dd376a529bf55399c7a8ca9522a4241e Mon Sep 17 00:00:00 2001
From: Emre Sahin
Date: Wed, 19 Jan 2022 18:27:18 +0300
Subject: [PATCH 1/9] use exp init instead of stage add

---
 content/docs/start/experiments.md | 53 ++++++++++++-------------------
 1 file changed, 21 insertions(+), 32 deletions(-)

diff --git a/content/docs/start/experiments.md b/content/docs/start/experiments.md
index 94473306fd..e866e793f9 100644
--- a/content/docs/start/experiments.md
+++ b/content/docs/start/experiments.md
@@ -21,39 +21,34 @@ the [`example-dvc-experiments`][ede] project.
-### ⚙️ Installing the example project
+### ⚙️ Initializing a project into DVC experiments

-These commands are run in the [`example-dvc-experiments`][ede] project. You can
-run the commands in this document after cloning the repository, installing the
-requirements, and pulling the data.
+If you already have a DVC project, that's great. You can start to use `dvc exp`
+commands right away to run experiments in your project. (See the [user's guide]
+for detailed information.) Here, we briefly discuss how to structure an ML
+project into a DVC experiments project with `dvc exp init`.

-#### Clone the project and create virtual environment
+[user's guide]: /doc/user-guide/experiment-management/

-Please clone the project and create a virtual environment.
-
-> We strongly recommend to create a virtual environment to keep the libraries we
-> use isolated from the rest of your system. This prevents version conflicts.
+A typical machine learning project has data, a set of scripts that trains a
+model, a bunch of hyperparameters that modify these models, and outputs metrics
+and plots to evaluate the models. DVC makes certain assumptions about the names
+of these elements to initialize a project with:

 ```dvc
-$ git clone https://github.com/iterative/example-dvc-experiments -b get-started
-$ cd example-dvc-experiments
-$ virtualenv .venv
-$ . .venv/bin/activate
-$ python -m pip install -r requirements.txt
+$ dvc exp init python src/train.py
 ```

-#### Get the data set
+Here, `python src/train.py` describes how you run experiments. It could be any
+other command.

-The repository we cloned doesn't contain the dataset. Instead of storing the
-data in the Git repository, we use DVC to retrieve from a shared data store. In
-this case, we use `dvc pull` to update the missing data files.
+If your project uses different names for them, you can set directories for
+source code (default: `src`), data (`data/`), models (`models/`), plots
+(`plots/`), and files for hyperparameters (`params.yaml`), metrics
+(`metrics.json`) with the options supplied to `dvc exp init`.

-```dvc
-$ dvc pull
-```
-
-The repository already contains the necessary configuration to run the
-experiments.
+You can also set these options in a dialog format with
+`dvc exp init --interactive`.
@@ -68,19 +63,13 @@ Experiment results have been applied to your workspace.
 ...
 ```

-It runs the specified command (`python train.py`) in `dvc.yaml`. That command
-writes the metrics values to `metrics.json`.
+It runs the command we specified (`python train.py`), and creates models, plots
+and metrics in respective directories.

 This experiment is then associated with the values found in the parameters file
 (`params.yaml`), and other dependencies (`data/images/`) with these produced
 metrics.

-The purpose of the `dvc exp` family of commands is to let you run, capture, and
-compare the machine learning experiments at once as you iterate on your project.
-The artifacts like models and metrics produced by each experiment are tracked by
-DVC, and the associated parameters and metrics can be committed to Git as text
-files.
-
 You can review the experiment results with `dvc exp show` and see these metrics
 and results in a nicely formatted table:

From 5451d82272d8aededfe1e3011d433fd99b8ccae4 Mon Sep 17 00:00:00 2001
From: Emre Sahin
Date: Wed, 26 Jan 2022 18:41:34 +0300
Subject: [PATCH 2/9] into -> with in the title

---
 content/docs/start/experiments.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/content/docs/start/experiments.md b/content/docs/start/experiments.md
index e866e793f9..65657b09c6 100644
--- a/content/docs/start/experiments.md
+++ b/content/docs/start/experiments.md
@@ -21,7 +21,7 @@ the [`example-dvc-experiments`][ede] project.
-### ⚙️ Initializing a project into DVC experiments
+### ⚙️ Initializing a project with DVC experiments

 If you already have a DVC project, that's great. You can start to use `dvc exp`
 commands right away to run experiments in your project. (See the [user's guide]
 for detailed information.) Here, we briefly discuss how to structure an ML
 project into a DVC experiments project with `dvc exp init`.

From ea7c4537b176ef652d9fa77f55a6ebf1094453b4 Mon Sep 17 00:00:00 2001
From: Emre Sahin
Date: Wed, 26 Jan 2022 18:42:44 +0300
Subject: [PATCH 3/9] rephrase into -> with + using

---
 content/docs/start/experiments.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/content/docs/start/experiments.md b/content/docs/start/experiments.md
index 65657b09c6..79050d739d 100644
--- a/content/docs/start/experiments.md
+++ b/content/docs/start/experiments.md
@@ -26,7 +26,7 @@ the [`example-dvc-experiments`][ede] project.
 If you already have a DVC project, that's great. You can start to use `dvc exp`
 commands right away to run experiments in your project. (See the [user's guide]
 for detailed information.) Here, we briefly discuss how to structure an ML
-project into a DVC experiments project with `dvc exp init`.
+project with DVC experiments using `dvc exp init`.

 [user's guide]: /doc/user-guide/experiment-management/

From 6b200a4c45c28873aaf0471437b77c7f1f4b680d Mon Sep 17 00:00:00 2001
From: Emre Sahin
Date: Wed, 26 Jan 2022 18:45:36 +0300
Subject: [PATCH 4/9] modify -> tune

---
 content/docs/start/experiments.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/content/docs/start/experiments.md b/content/docs/start/experiments.md
index 79050d739d..352f9f4961 100644
--- a/content/docs/start/experiments.md
+++ b/content/docs/start/experiments.md
@@ -30,10 +30,10 @@ project with DVC experiments using `dvc exp init`.

 [user's guide]: /doc/user-guide/experiment-management/

-A typical machine learning project has data, a set of scripts that trains a
-model, a bunch of hyperparameters that modify these models, and outputs metrics
-and plots to evaluate the models. DVC makes certain assumptions about the names
-of these elements to initialize a project with:
+A typical machine learning project has data, a set of scripts that train a
+model, a bunch of hyperparameters that tune training and models, and outputs
+metrics and plots to evaluate the models. DVC makes certain assumptions about
+the names of these elements to initialize a project with:

 ```dvc
 $ dvc exp init python src/train.py

From 4e8c149ce7ba27a7adf9cec16235150c4787fffc Mon Sep 17 00:00:00 2001
From: Emre Sahin
Date: Wed, 26 Jan 2022 18:47:00 +0300
Subject: [PATCH 5/9] describes -> specifies

---
 content/docs/start/experiments.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/content/docs/start/experiments.md b/content/docs/start/experiments.md
index 352f9f4961..93fee00102 100644
--- a/content/docs/start/experiments.md
+++ b/content/docs/start/experiments.md
@@ -39,7 +39,7 @@ the names of these elements to initialize a project with:
 $ dvc exp init python src/train.py
 ```

-Here, `python src/train.py` describes how you run experiments. It could be any
+Here, `python src/train.py` specifies how you run experiments. It could be any
 other command.

 If your project uses different names for them, you can set directories for

From 7209a4f323fadc01f185df0bea71dfa1ac83ff5d Mon Sep 17 00:00:00 2001
From: Emre Sahin
Date: Tue, 1 Feb 2022 14:20:42 +0300
Subject: [PATCH 6/9] makes assumptions -> has sane defaults

---
 content/docs/start/experiments.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/content/docs/start/experiments.md b/content/docs/start/experiments.md
index 93fee00102..63455b8537 100644
--- a/content/docs/start/experiments.md
+++ b/content/docs/start/experiments.md
@@ -32,8 +32,8 @@ project with DVC experiments using `dvc exp init`.

 A typical machine learning project has data, a set of scripts that train a
 model, a bunch of hyperparameters that tune training and models, and outputs
-metrics and plots to evaluate the models. DVC makes certain assumptions about
-the names of these elements to initialize a project with:
+metrics and plots to evaluate the models. `dvc exp init` has sane defaults about
+the names of these elements to initialize a project:

 ```dvc
 $ dvc exp init python src/train.py

From 6dc915400c4c273e033a24fce1ea28f6d9719b64 Mon Sep 17 00:00:00 2001
From: Emre Sahin
Date: Tue, 1 Feb 2022 14:21:14 +0300
Subject: [PATCH 7/9] h3 -> h2

---
 content/docs/start/experiments.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/content/docs/start/experiments.md b/content/docs/start/experiments.md
index 63455b8537..9203e30dec 100644
--- a/content/docs/start/experiments.md
+++ b/content/docs/start/experiments.md
@@ -21,7 +21,7 @@ the [`example-dvc-experiments`][ede] project.
-### ⚙️ Initializing a project with DVC experiments
+## ⚙️ Initializing a project with DVC experiments

 If you already have a DVC project, that's great. You can start to use `dvc exp`
 commands right away to run experiments in your project. (See the [user's guide]

From 5aac8daa6a8c6e42259671096c1abf8bdfc7a72d Mon Sep 17 00:00:00 2001
From: Jorge Orpinel
Date: Wed, 2 Feb 2022 09:14:38 -0600
Subject: [PATCH 8/9] Update content/docs/start/experiments.md

Co-authored-by: Ivan Shcheklein
---
 content/docs/start/experiments.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/content/docs/start/experiments.md b/content/docs/start/experiments.md
index 9203e30dec..8871b18ca4 100644
--- a/content/docs/start/experiments.md
+++ b/content/docs/start/experiments.md
@@ -43,7 +43,7 @@ Here, `python src/train.py` specifies how you run experiments. It could be any
 other command.

 If your project uses different names for them, you can set directories for
-source code (default: `src`), data (`data/`), models (`models/`), plots
+source code (default: `src/`), data (`data/`), models (`models/`), plots
 (`plots/`), and files for hyperparameters (`params.yaml`), metrics
 (`metrics.json`) with the options supplied to `dvc exp init`.

From aa622684f9e59eb820b8d702887415c726460b8e Mon Sep 17 00:00:00 2001
From: Jorge Orpinel
Date: Wed, 2 Feb 2022 09:18:42 -0600
Subject: [PATCH 9/9] Apply suggestions from code review

---
 content/docs/start/experiments.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/content/docs/start/experiments.md b/content/docs/start/experiments.md
index 8871b18ca4..d0ee6291a3 100644
--- a/content/docs/start/experiments.md
+++ b/content/docs/start/experiments.md
@@ -24,11 +24,11 @@ the [`example-dvc-experiments`][ede] project.
 ## ⚙️ Initializing a project with DVC experiments

 If you already have a DVC project, that's great. You can start to use `dvc exp`
-commands right away to run experiments in your project. (See the [user's guide]
+commands right away to run experiments in your project. (See the [User Guide]
 for detailed information.) Here, we briefly discuss how to structure an ML
 project with DVC experiments using `dvc exp init`.

-[user's guide]: /doc/user-guide/experiment-management/
+[user guide]: /doc/user-guide/experiment-management/experiments-overview

 A typical machine learning project has data, a set of scripts that train a
 model, a bunch of hyperparameters that tune training and models, and outputs