cmd ref: improve run and commit / how to add outs/deps without re-running stage? #460

@maximerischard

Often when I run a command with dvc run, I realise that I have forgotten to specify one of the outputs. I would therefore like to update the DVC file with an additional output, but without re-running the (potentially expensive) command.

With the help of @efiop on the discourse channel, I was able to figure out that this can be achieved with the following steps:

  1. dvc run with the additional output and the --no-exec flag
  2. dvc commit to add the new output to the dvc cache, compute its checksum and add it to the dvc file.
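For reference, the two steps above might look like the following sketch. The file names (train.dvc, data.csv, model.pkl, metrics.json, train.py) and the wrapper function add_output_without_rerun are hypothetical, and the flags assume a DVC release that still provides dvc run, as was the case when this issue was opened.

```shell
# Sketch only: all file names are placeholders, and the flags assume a DVC
# release that still ships `dvc run`.
add_output_without_rerun() {
    # 1. Re-declare the stage with the forgotten output (metrics.json) added.
    #    --no-exec writes the DVC-file but does not execute the command.
    dvc run -f train.dvc \
        -d data.csv \
        -o model.pkl \
        -o metrics.json \
        --no-exec \
        python train.py || return 1

    # 2. Checksum the outputs already in the workspace, record them in the
    #    DVC-file, and save them to the cache.
    dvc commit train.dvc
}
```

Calling add_output_without_rerun from the project root would run both steps in order and stop if the first one fails. Depending on the DVC version, either command may ask for confirmation before overwriting an existing DVC-file or committing changed outputs.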

This works perfectly, but from the documentation it wasn't obvious that this is what dvc commit would do. In particular, the opening line reads “Record changes to the repository by updating DVC-files and saving outputs to cache.”, and it wasn't clear to me that “updating DVC-files” meant recomputing the checksums of the outputs.

In the step-by-step explanation of what dvc commit does:

What commit means is that DVC:

  • Computes a checksum for the file/directory.
  • Enters the checksum and file name into the DVC-file.
  • Tells the SCM to ignore the file/directory (e.g. add entry to .gitignore). Note that if the workspace was initialized with no SCM support (dvc init --no-scm), this does not happen.
  • Adds the file/directory to the DVC cache.
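To make the four bullets concrete, here is a rough shell illustration of them for a single file. It is not DVC's implementation: the function name commit_sketch, the simplified DVC-file it writes, and the cache path scheme (first two checksum characters as a directory) are assumptions made for the example, and directories are not handled.

```shell
# Illustration of the four bullets for a single file; NOT DVC's actual code.
# commit_sketch, the simplified "<file>.dvc" contents, and the cache path
# scheme are assumptions. Requires md5sum (GNU coreutils).
commit_sketch() {
    file=$1

    # Compute a checksum for the file.
    sum=$(md5sum "$file" | cut -d ' ' -f 1)

    # Enter the checksum and file name into a (simplified) DVC-file.
    printf 'outs:\n- md5: %s\n  path: %s\n' "$sum" "$file" > "$file.dvc"

    # Tell the SCM to ignore the file.
    printf '/%s\n' "$file" >> .gitignore

    # Add a copy of the file to the cache, keyed by its checksum.
    prefix=$(printf '%s' "$sum" | cut -c 1-2)
    rest=$(printf '%s' "$sum" | cut -c 3-)
    mkdir -p ".dvc/cache/$prefix"
    cp "$file" ".dvc/cache/$prefix/$rest"
}
```

In the workflow described in this issue the DVC-file already exists after dvc run --no-exec, so the real command would update it rather than create it from scratch as this sketch does.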

I would suggest rewording the first bullet as “computes the checksum of each output file/directory, as well as the checksum of the DVC-file itself” (if my understanding is correct). The second bullet could then read “enters the checksums of the outputs and of the DVC-file into the DVC-file”. I'm actually still unsure what is meant by “enters the file name”. Aren't all file names already present in the DVC-file?

UPDATE (From #612)
Dependencies can also be added to a stage without re-running it, using the same steps as described above.


    Labels

    A: docs (Area: user documentation (gatsby-theme-iterative))
    C: ref (Content of /doc/*-reference)
    type: enhancement (Something is not clear, small updates, improvement suggestions)