diff --git a/config/prismjs/dvc-commands.js b/config/prismjs/dvc-commands.js index 41edc54b68..937f9da24e 100644 --- a/config/prismjs/dvc-commands.js +++ b/config/prismjs/dvc-commands.js @@ -25,9 +25,6 @@ module.exports = [ 'plots modify', 'plots diff', 'plots', - 'pipeline show', - 'pipeline list', - 'pipeline', 'move', 'metrics show', 'metrics diff', diff --git a/content/docs/command-reference/checkout.md b/content/docs/command-reference/checkout.md index 82bd56d501..baafff3655 100644 --- a/content/docs/command-reference/checkout.md +++ b/content/docs/command-reference/checkout.md @@ -65,8 +65,8 @@ progress made by the checkout. There are two methods to restore a file missing from the cache, depending on the situation. In some cases a pipeline must be reproduced (using `dvc repro`) to -regenerate its outputs (see also `dvc pipeline`). In other cases the cache can -be pulled from remote storage using `dvc pull`. +regenerate its outputs (see also `dvc dag`). In other cases the cache can be +pulled from remote storage using `dvc pull`. ## Options diff --git a/content/docs/command-reference/dag.md b/content/docs/command-reference/dag.md new file mode 100644 index 0000000000..ff01c3f9f4 --- /dev/null +++ b/content/docs/command-reference/dag.md @@ -0,0 +1,108 @@ +# dag + +Show [stages](/doc/command-reference/run) in a pipeline that lead to the +specified stage. By default it lists +[DVC-files](/doc/user-guide/dvc-files-and-directories). + +## Synopsis + +```usage +usage: dvc dag [-h] [-q | -v] [--dot] [--full] [target] + +positional arguments: + target Stage or output to show pipeline for (optional) + Finds all stages in the workspace by default. +``` + +## Description + +A data pipeline, in general, is a series of data processing +[stages](/doc/command-reference/run) (for example console commands that take an +input and produce an output). A pipeline may produce intermediate +data, and has a final result. 
Machine learning (ML) pipelines typically start +with large raw datasets, include intermediate featurization and training stages, +and produce a final model, as well as accuracy +[metrics](/doc/command-reference/metrics). + +In DVC, pipeline stages and commands, their data I/O, interdependencies, and +results (intermediate or final) are specified with `dvc add` and `dvc run`, +among other commands. This allows DVC to restore one or more pipelines of stages +interconnected by their dependencies and outputs later. (See `dvc repro`.) + +> DVC builds a dependency graph +> ([DAG](https://en.wikipedia.org/wiki/Directed_acyclic_graph)) to do this. + +`dvc dag` displays the stages of a pipeline up to the target stage. If `target` +is omitted, it will show the full project DAG. + +## Options + +- `--dot` - show DAG in + [DOT]() + format. It can be passed to third-party visualization utilities. + +- `--full` - show full DAG that the `target` belongs to, instead of showing the + part that consists only of the target's ancestors. + +- `-h`, `--help` - prints the usage/help message, and exit. + +- `-q`, `--quiet` - do not write anything to standard output. Exit with 0 if no + problems arise, otherwise 1. + +- `-v`, `--verbose` - displays detailed tracing information. + +## Paging the output + +This command's output is automatically piped to +[Less](), if available in the +terminal. (The exact command used is `less --chop-long-lines --clear-screen`.) +If `less` is not available (e.g. on Windows), the output is simply printed out. + +> It's also possible to +> [enable Less paging on Windows](/doc/user-guide/running-dvc-on-windows#enabling-paging-with-less). + +### Providing a custom pager + +It's possible to override the default pager via the `DVC_PAGER` environment +variable. 
For example, the following command will replace the default pager with +[`more`](), for a single run: + +```dvc +$ DVC_PAGER=more dvc dag +``` + +For a persistent change, define `DVC_PAGER` in the shell configuration. For +example in Bash, we could add the following line to `~/.bashrc`: + +```bash +export DVC_PAGER=more +``` + +## Examples + +Visualize DVC pipeline: + +```dvc +$ dvc dag + +---------+ + | prepare | + +---------+ + * + * + * + +-----------+ + | featurize | + +-----------+ + ** ** + ** * + * ** ++-------+ * +| train | ** ++-------+ * + ** ** + ** ** + * * + +----------+ + | evaluate | + +----------+ +``` diff --git a/content/docs/command-reference/init.md b/content/docs/command-reference/init.md index 4965558e99..54cbf603b4 100644 --- a/content/docs/command-reference/init.md +++ b/content/docs/command-reference/init.md @@ -61,9 +61,8 @@ sub-projects to mitigate the issues of initializing in the Git repository root: download files and directories, to reproduce pipelines, etc. It can be expensive in the large repositories with a lot of projects. -- Not enough isolation/granularity - commands like `dvc metrics diff`, - `dvc pipeline show` and others by default dump all the metrics, all the - pipelines, etc. +- Not enough isolation/granularity - commands like `dvc metrics diff`, `dvc dag` + and others by default dump all the metrics, all the pipelines, etc. #### How does it affect DVC commands? diff --git a/content/docs/command-reference/pipeline/index.md b/content/docs/command-reference/pipeline/index.md deleted file mode 100644 index b9cf3eb4ae..0000000000 --- a/content/docs/command-reference/pipeline/index.md +++ /dev/null @@ -1,47 +0,0 @@ -# pipeline - -A set of commands to manage -[pipelines](/doc/tutorials/get-started/data-pipelines): -[show](/doc/command-reference/pipeline/show) and -[list](/doc/command-reference/pipeline/list). - -## Synopsis - -```usage -usage: dvc pipeline [-h] [-q | -v] {show,list} ... 
- -positional arguments: - COMMAND - show Show pipeline. - list List pipelines. -``` - -## Description - -A data pipeline, in general, is a series of data processing -[stages](/doc/command-reference/run) (for example console commands that take an -input and produce an output). A pipeline may produce intermediate -data, and has a final result. Machine learning (ML) pipelines typically start a -with large raw datasets, include intermediate featurization and training stages, -and produce a final model, as well as accuracy -[metrics](/doc/command-reference/metrics). - -In DVC, pipeline stages and commands, their data I/O, interdependencies, and -results (intermediate or final) are specified with `dvc add` and `dvc run`, -among other commands. This allows DVC to restore one or more pipelines of stages -interconnected by their dependencies and outputs later. (See `dvc repro`.) - -> DVC builds a dependency graph -> ([DAG](https://en.wikipedia.org/wiki/Directed_acyclic_graph)) to do this. - -`dvc pipeline` commands help users display the existing project pipelines in -different ways. - -## Options - -- `-h`, `--help` - prints the usage/help message, and exit. - -- `-q`, `--quiet` - do not write anything to standard output. Exit with 0 if no - problems arise, otherwise 1. - -- `-v`, `--verbose` - displays detailed tracing information. diff --git a/content/docs/command-reference/pipeline/list.md b/content/docs/command-reference/pipeline/list.md deleted file mode 100644 index 3ee8cfdb2b..0000000000 --- a/content/docs/command-reference/pipeline/list.md +++ /dev/null @@ -1,41 +0,0 @@ -# pipeline list - -List connected groups of [stages](/doc/command-reference/run) (pipelines). - -## Synopsis - -```usage -usage: dvc pipeline list [-h] [-q | -v] -``` - -## Description - -Displays a list of all existing stages in the project, grouped in -their corresponding [pipeline](/doc/command-reference/pipeline), when connected. 
- -> Note that the stages in these lists are in ascending order, that is, from last -> to first. - -## Options - -- `-h`, `--help` - prints the usage/help message, and exit. - -- `-q`, `--quiet` - do not write anything to standard output. Exit with 0 if no - problems arise, otherwise 1. - -- `-v`, `--verbose` - displays detailed tracing information. - -## Examples - -List available pipelines: - -```dvc -$ dvc pipeline list -Dvcfile -====================================================================== -raw.dvc -data.dvc -output.dvc -====================================================================== -2 pipelines total -``` diff --git a/content/docs/command-reference/pipeline/show.md b/content/docs/command-reference/pipeline/show.md deleted file mode 100644 index 57848ead1e..0000000000 --- a/content/docs/command-reference/pipeline/show.md +++ /dev/null @@ -1,156 +0,0 @@ -# pipeline show - -Show [stages](/doc/command-reference/run) in a pipeline that lead to the -specified stage. By default it lists -[DVC-files](/doc/user-guide/dvc-files-and-directories). - -## Synopsis - -```usage -usage: dvc pipeline show [-h] [-q | -v] [-c | -o] [-l] [--ascii] - [--dot] [--tree] - [targets [targets ...]] - -positional arguments: - targets DVC-files to show pipeline for. Optional. - (Finds all DVC-files in the workspace by default.) -``` - -## Description - -`dvc show` displays the stages of a pipeline up to one or more target DVC-files -(stage files). All stages are shown unless specific `targets` are specified. The -`-c` and `-o` options allow to list the corresponding commands or data file flow -instead of stages. - -> Note that the stages in these lists are in descending order, that is, from -> first to last. - -## Options - -- `-c`, `--commands` - show pipeline as a list (diagram if `--ascii` or `--dot` - is used) of commands instead of paths to DVC-files. 
- -- `-o`, `--outs` - show pipeline as a list (diagram if `--ascii` or `--dot` is - used) of stage outputs instead of paths to DVC-files. - -- `--ascii` - visualize pipeline. It will print a graph (ASCII) instead of a - list of path to DVC-files. (`less` pager may be used, see - [Paging the output](#paging-the-output) below for details). - -- `--dot` - show contents of `.dot` files with a DVC pipeline graph. It can be - passed to third party visualization utilities. - -- `--tree` - list dependencies tree like recursive directory listing. - -- `-l`, `--locked` - print frozen stages only. See `dvc freeze`. - -- `-h`, `--help` - prints the usage/help message, and exit. - -- `-q`, `--quiet` - do not write anything to standard output. Exit with 0 if no - problems arise, otherwise 1. - -- `-v`, `--verbose` - displays detailed tracing information. - -## Paging the output - -This command's output is automatically piped to -[Less](), if available in the -terminal. (The exact command used is `less --chop-long-lines --clear-screen`.) -If `less` is not available (e.g. on Windows), the output is simply printed out. - -> It's also possible to -> [enable Less paging on Windows](/doc/user-guide/running-dvc-on-windows#enabling-paging-with-less). - -### Providing a custom pager - -It's possible to override the default pager via the `DVC_PAGER` environment -variable. For example, the following command will replace the default pager with -[`more`](), for a single run: - -```bash -$ DVC_PAGER=more dvc pipeline show --ascii my-pipeline.dvc -``` - -For a persistent change, define `DVC_PAGER` in the shell configuration. 
For -example in Bash, we could add the following line to `~/.bashrc`: - -```bash -export DVC_PAGER=more -``` - -## Examples - -Default mode: show stage files that `output.dvc` recursively depends on: - -```dvc -$ dvc pipeline show output.dvc -raw.dvc -data.dvc -output.dvc -``` - -The same as previous, but show commands instead of DVC-files: - -```dvc -$ dvc pipeline show output.dvc --commands -download.py s3://mybucket/myrawdata raw -cleanup.py raw data -process.py data output -``` - -Visualize DVC pipeline: - -```dvc -$ dvc pipeline show eval.txt.dvc --ascii - .------------------------. - | data/Posts.xml.zip.dvc | - `------------------------' - * - * - * - .---------------. - | Posts.xml.dvc | - `---------------' - * - * - * - .---------------. - | Posts.tsv.dvc | - `---------------' - * - * - * - .---------------------. - | Posts-train.tsv.dvc | - `---------------------' - * - * - * - .--------------------. - | matrix-train.p.dvc | - `--------------------' - *** *** - ** *** - ** ** -.-------------. ** -| model.p.dvc | ** -`-------------' *** - *** *** - ** ** - ** ** - .--------------. - | eval.txt.dvc | - `--------------' -``` - -List dependencies recursively if the graph has a tree structure: - -```dvc -$ dvc pipeline show e.file.dvc --tree -e.file.dvc -├── c.file.dvc -│ └── b.file.dvc -│ └── a.file.dvc -└── d.file.dvc -``` diff --git a/content/docs/command-reference/repro.md b/content/docs/command-reference/repro.md index 5cf75dbde2..ea924c7a82 100644 --- a/content/docs/command-reference/repro.md +++ b/content/docs/command-reference/repro.md @@ -129,8 +129,8 @@ only execute the final stage. The stage is only executed if the user types "y". - `-p`, `--pipeline` - reproduce the entire pipelines that the stage file - `targets` belong to. Use `dvc pipeline show .dvc` to show the parent - pipeline of a target stage. + `targets` belong to. Use `dvc dag ` to show the parent pipeline of a + target stage. 
- `-P`, `--all-pipelines` - reproduce all pipelines, for all the stage files present in `DVC` repository. diff --git a/content/docs/sidebar.json b/content/docs/sidebar.json index ff700adeb2..d71a123f9a 100644 --- a/content/docs/sidebar.json +++ b/content/docs/sidebar.json @@ -155,6 +155,10 @@ "label": "config", "slug": "config" }, + { + "label": "dag", + "slug": "dag" + }, { "label": "destroy", "slug": "destroy" @@ -236,21 +240,6 @@ } ] }, - { - "label": "pipeline", - "slug": "pipeline", - "source": "pipeline/index.md", - "children": [ - { - "label": "pipeline list", - "slug": "list" - }, - { - "label": "pipeline show", - "slug": "show" - } - ] - }, { "label": "plots", "slug": "plots", diff --git a/content/docs/user-guide/running-dvc-on-windows.md b/content/docs/user-guide/running-dvc-on-windows.md index e658eee649..1d4e216a08 100644 --- a/content/docs/user-guide/running-dvc-on-windows.md +++ b/content/docs/user-guide/running-dvc-on-windows.md @@ -70,8 +70,8 @@ directory, as explained in ## Enabling paging with `less` By default, DVC tries to use [Less]() -as pager for the output of `dvc pipeline show`. Windows doesn't have the `less` -command available however. Fortunately, there is a easy way of installing it via +as pager for the output of `dvc dag`. Windows doesn't have the `less` command +available however. 
Fortunately, there is an easy way of installing `less` via [Chocolatey](https://chocolatey.org/) (please install the tool first): ```dvc diff --git a/redirects-list.json b/redirects-list.json index 391cbf27e5..9285fed991 100644 --- a/redirects-list.json +++ b/redirects-list.json @@ -33,6 +33,9 @@ "^/doc/command-reference/plot$ /doc/command-reference/plots", "^/doc/command-reference/lock$ /doc/command-reference/freeze", "^/doc/command-reference/unlock$ /doc/command-reference/unfreeze", + "^/doc/command-reference/pipeline$ /doc/command-reference/dag", + "^/doc/command-reference/pipeline/show$ /doc/command-reference/dag", + "^/doc/command-reference/pipeline/list$ /doc/command-reference/dag", "^/(.+)/$ /$1" ]
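
Review note: the three redirect entries for the removed `pipeline` pages can be sanity-checked outside the site with a short script. The sketch below is not part of this change. It assumes each entry in `redirects-list.json` is a regular expression and a replacement separated by a single space, and that the first matching rule wins; `resolveRedirect` is a hypothetical helper name, and the site's real redirect handling may differ.

```javascript
// Hypothetical check, assuming "<regex> <replacement>" entries (see note above).
const rules = [
  '^/doc/command-reference/pipeline$ /doc/command-reference/dag',
  '^/doc/command-reference/pipeline/show$ /doc/command-reference/dag',
  '^/doc/command-reference/pipeline/list$ /doc/command-reference/dag',
  '^/(.+)/$ /$1'
]

// Apply the first rule whose pattern matches; return the path unchanged otherwise.
function resolveRedirect(pathname) {
  for (const rule of rules) {
    const [pattern, replacement] = rule.split(' ')
    const regex = new RegExp(pattern)
    if (regex.test(pathname)) {
      return pathname.replace(regex, replacement)
    }
  }
  return pathname
}

// Print where each removed page would land under these assumptions.
for (const oldPath of [
  '/doc/command-reference/pipeline',
  '/doc/command-reference/pipeline/show',
  '/doc/command-reference/pipeline/list'
]) {
  console.log(oldPath, '->', resolveRedirect(oldPath))
}
```

Under these assumptions, all three old `pipeline` URLs resolve to `/doc/command-reference/dag`, and paths that match no rule pass through untouched.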