Closed
Labels
- A: docs (Area: user documentation (gatsby-theme-iterative))
- C: guide (Content of /doc/user-guide)
- ✨ epic (Placeholder ticket for multi-sprint direction, use story, improvement)
Description
UPDATE: #2856 (comment)
This is the plan for a data management trail that focuses on:
- Adding data to DVC projects & versioning data in DVC projects
- The cache (local) & shared cache (external)
- Removing data from DVC projects
- Creating remotes (link to config/remotes) & syncing with remotes. See also guide: extract `remote add/modify` details from cmd ref. #2866
- Accessing public datasets and data registries (get, import)
- External data topics (See how: use DVC when data is stored in an external drive #563 (comment))
  - probably address guide: consolidate external data mgmt guides #520
Details
Adding data to DVC projects
- Initialize a DVC repository and use `dvc add` to add files.
- We'll assume the MNIST data exists in a folder and add it.
Versioning data in DVC projects
- Overwrite Fashion-MNIST data on top of MNIST and update the dataset.
- Go back and forth in Git history to get different datasets in the same folder.
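To make the versioning idea concrete for the guide, here is a toy model of a content-addressed cache, written in Python. It is only a sketch and not DVC's implementation: the `add` and `checkout` helpers, the file name `images.bin`, and the byte strings standing in for the two datasets are all hypothetical. In a real project the digest would be recorded in a `.dvc` file tracked by Git, rather than held in a variable.

```python
from __future__ import annotations

import hashlib
import shutil
import tempfile
from pathlib import Path


def add(path: Path, cache: Path) -> str:
    """Copy `path` into `cache` under its MD5 digest and return the digest."""
    digest = hashlib.md5(path.read_bytes()).hexdigest()
    target = cache / digest[:2] / digest[2:]
    target.parent.mkdir(parents=True, exist_ok=True)
    if not target.exists():  # identical content is stored only once
        shutil.copyfile(path, target)
    return digest


def checkout(digest: str, path: Path, cache: Path) -> None:
    """Restore the cached content identified by `digest` into `path`."""
    shutil.copyfile(cache / digest[:2] / digest[2:], path)


def demo() -> tuple[bytes, bytes]:
    """Overwrite a dataset file, then switch back and forth between versions."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        cache = root / "cache"
        data = root / "images.bin"  # hypothetical dataset file

        data.write_bytes(b"mnist")          # placeholder for the first dataset
        v1 = add(data, cache)
        data.write_bytes(b"fashion-mnist")  # overwrite in the same location
        v2 = add(data, cache)

        checkout(v1, data, cache)           # go back to the first version
        first = data.read_bytes()
        checkout(v2, data, cache)           # and forward to the second
        second = data.read_bytes()
        return first, second
```

Both versions stay in the cache after the overwrite, which is why switching between them only needs the digest.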
Creating remotes
- Add a Google Drive folder as a remote.
- Make it the default.
Pushing to/pulling from remotes
- Push the cache to the remote we created
- Clone the repository to somewhere (e.g. ssh or local folder)
- Pull the cache
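The push/clone/pull steps above can be pictured as copying missing cache objects between two directories. The Python sketch below is a simplified illustration under that assumption, not how DVC talks to real remotes: the `sync` helper, the directory names `local`, `remote`, and `clone`, and the object names are hypothetical.

```python
from __future__ import annotations

import shutil
import tempfile
from pathlib import Path


def sync(src: Path, dst: Path) -> list[str]:
    """Copy every object in `src` that is missing from `dst`.

    Returns the relative paths of the objects that were copied.
    """
    copied = []
    for obj in sorted(p for p in src.rglob("*") if p.is_file()):
        rel = obj.relative_to(src)
        target = dst / rel
        if not target.exists():
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copyfile(obj, target)
            copied.append(rel.as_posix())
    return copied


def demo_sync() -> tuple[list[str], list[str], list[str]]:
    """Push a local cache to a remote, then pull it into a fresh clone."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        local, remote, clone = root / "local", root / "remote", root / "clone"

        # Two placeholder cache objects in the local cache.
        for name in ("ab/cdef", "12/3456"):
            obj = local / name
            obj.parent.mkdir(parents=True)
            obj.write_bytes(name.encode())

        pushed = sync(local, remote)        # "push": local cache -> remote
        pushed_again = sync(local, remote)  # nothing left to transfer
        pulled = sync(remote, clone)        # "pull": remote -> cloned repo cache
        return pushed, pushed_again, pulled
```

Because objects are named by their content, a second push has nothing to transfer, and a pull into a fresh clone fetches exactly what the remote holds.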
Accessing public datasets and registries
- Get the Fashion-MNIST data from dataset-registry
Removing data from DVC projects
- Remove certain folders from workspace
- Delete the corresponding cache files
UPDATE: start with a reorg, see #2856 (comment) below (may be enough).