live: Add log_artifact #4373
A few places we should think about adding it:
- api-reference/live#methods (this one is a blocker)
- get-started
- how-it-works#track-the-results
For the last one, it might be helpful to explain a bit of how to use DVC. For example:
When using `Live.log_artifact("model.pt")`, DVCLive will cache
the `model.pt` file with DVC to avoid tracking large artifacts in Git. It will
generate a `model.pt.dvc` metadata file, which you should track in Git.
You can retrieve the artifact from the Git commit.
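As a rough illustration of the caching idea described here, the sketch below copies a file into a content-addressed cache, writes a small pointer file next to it, and adds the original to `.gitignore`. This is not DVC's implementation: the function names (`cache_artifact`, `retrieve_artifact`, `demo`), the cache layout, and the pointer format are all assumptions made for the example.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def cache_artifact(path: Path, cache_dir: Path) -> Path:
    """Copy `path` into a content-addressed cache and write a pointer file.

    Illustrative only: the cache layout and pointer format are assumptions,
    not DVC's real on-disk format.
    """
    digest = hashlib.md5(path.read_bytes()).hexdigest()

    # Store the large file under a name derived from its content hash.
    cached = cache_dir / digest[:2] / digest[2:]
    cached.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(path, cached)

    # The pointer file is tiny, so it is the part that belongs in Git.
    pointer = path.with_name(path.name + ".dvc")
    pointer.write_text(
        f"md5: {digest}\nsize: {path.stat().st_size}\npath: {path.name}\n"
    )

    # Tell Git to ignore the large file itself.
    with (path.parent / ".gitignore").open("a") as f:
        f.write(f"/{path.name}\n")
    return pointer


def retrieve_artifact(pointer: Path, cache_dir: Path) -> Path:
    """Restore the artifact named in a pointer file from the cache."""
    fields = dict(line.split(": ", 1) for line in pointer.read_text().splitlines())
    digest = fields["md5"]
    target = pointer.with_name(fields["path"])
    shutil.copy2(cache_dir / digest[:2] / digest[2:], target)
    return target


def demo() -> bytes:
    """Round-trip a fake model through the cache and return the restored bytes."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        model = root / "model.pt"
        model.write_bytes(b"weights")
        pointer = cache_artifact(model, root / "cache")
        model.unlink()  # simulate a fresh checkout without the large file
        return retrieve_artifact(pointer, root / "cache").read_bytes()
```

The point of the split is that Git only ever sees the small pointer file, while the large file is looked up by its hash when it is needed again.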
I have left get-started out for now. I think it will make more sense to add it there after treeverse/dvclive#465.
When using `Live.log_artifact("model.pt")`, DVCLive will
[cache](/doc/start/data-management/data-versioning) the `model.pt` file with DVC
to avoid tracking large artifacts in Git. It will generate a `model.pt.dvc`
metadata file, which you should track in Git. You can
[retrieve](/doc/start/data-management/data-versioning#retrieving) the artifact
from the Git commit.
Thanks for including my suggestion, but I think it feels a little disconnected. Feel free to drop it and we can do a follow-up PR if that's easier.
I think we should rework this whole section. Proposed text:
# Track the results
DVCLive expects each run to be tracked by Git, so it will save each run to the
same path and overwrite the results each time. Include
[`save_dvc_exp=True`](/doc/dvclive/api-reference/live#parameters) to
auto-track in Git as a <abbr>DVC experiment</abbr>. DVC experiments
are Git commits that DVC can find but that don't clutter your Git history
or create extra branches in your repo.
### Track artifacts
Models and data are often large and aren't easily tracked in Git.
`Live.log_artifact("model.pt")` will [cache](/doc/start/data-management/data-versioning)
the `model.pt` file with DVC and make Git ignore it. It will generate a `model.pt.dvc`
metadata file, which can be tracked in Git and becomes part of the experiment.
You can [retrieve](/doc/start/data-management/data-versioning#retrieving) the
versioned artifact from the Git commit.
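
A minimal usage sketch of the two pieces described so far, assuming a DVCLive version that provides `Live.log_artifact` and the `save_dvc_exp` parameter. The `train_and_log` function, the placeholder metric, and the fake model file are hypothetical and only there to make the example self-contained; it needs to run inside a Git repository initialized with DVC.

```python
from pathlib import Path


def train_and_log(model_path: str = "model.pt", epochs: int = 3) -> None:
    """Log a few placeholder metrics and one artifact with DVCLive."""
    # Imported here so the sketch can be read and imported without dvclive installed.
    from dvclive import Live

    # save_dvc_exp=True asks DVCLive to record the run as a DVC experiment.
    with Live(save_dvc_exp=True) as live:
        for epoch in range(epochs):
            live.log_metric("loss", 1.0 / (epoch + 1))  # placeholder metric
            live.next_step()

        # Placeholder "model": any file on disk works for this sketch.
        Path(model_path).write_bytes(b"fake model weights")

        # Cache the file with DVC; the small .dvc file is what Git tracks.
        live.log_artifact(model_path)
```

Calling `train_and_log()` from a training script should leave a `model.pt.dvc` file next to the model, as described above.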
### Run experiments with `dvc exp run`
You may [run experiments](/doc/user-guide/experiment-management/running-experiments)
using DVC <abbr>pipelines</abbr>. Once you set up your pipeline, you can run it with `dvc exp run`.
This will track the inputs and outputs of your code, cache them so you never waste time repeating
steps, and enable other features like queuing, hyperparameter tuning, and
grid searches.
Related: treeverse/dvclive#468 (comment) cc @shcheklein

Per treeverse/dvclive#464