This can be seen as revisiting feature request #1026
UPDATE: Please scroll down to #2458 (comment) for most recent, summarized requirement.
Here is the original context also (still relevant):
There are different scenarios in which it could be useful to manipulate individual files, independently of how they were committed/pushed to DVC. The problem with dvc add -R today is that it can generate a large number of .dvc files. What if a directory could be added without -R (producing a single DVC-file), while other commands (lock, update, get, etc.) could still be applied to individual files inside the added directory tree?
Example (from treeverse/dataset-registry@7476a85)
Project 1:
$ tree
.
└── tutorial
    └── nlp
        ├── Posts.xml.zip
        └── pipeline.zip
$ dvc add tutorial
...
$ dvc push
...
Project 2:
$ dvc import {project-1-url} tutorial/nlp/pipeline.zip
...
$ tree
.
├── tutorial
│   └── nlp
│       └── pipeline.zip
└── tutorial.dvc
I'm not sure where the .dvc file would have to be placed in this example, though.
And also this is how Git works, I believe. Files are tracked individually (in fact it doesn't even recognize empty dirs).
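To make the request a bit more concrete, here is a minimal sketch of the lookup it implies: given one listing for the whole added directory, pick out only the entries that a per-file command would need. This is not DVC's implementation. The manifest shape (a list of hash and relative-path pairs), the placeholder hashes, and the `select_entries` helper are all assumptions made up for illustration.

```python
import posixpath

# Hypothetical manifest for the `tutorial` directory from Project 1.
# The hashes are placeholders, not real checksums.
MANIFEST = [
    {"md5": "<hash-of-Posts.xml.zip>", "relpath": "nlp/Posts.xml.zip"},
    {"md5": "<hash-of-pipeline.zip>", "relpath": "nlp/pipeline.zip"},
]


def select_entries(manifest, root, target):
    """Return the manifest entries at or under `target`.

    `root` is the directory that was added as a whole (e.g. "tutorial");
    `target` is the path the user asked for (e.g. "tutorial/nlp/pipeline.zip").
    """
    root = posixpath.normpath(root)
    target = posixpath.normpath(target)

    # Asking for the directory itself selects everything.
    if target == root:
        return list(manifest)

    prefix = root + "/"
    if not target.startswith(prefix):
        raise ValueError(f"{target!r} is not inside {root!r}")

    # Path of the target relative to the tracked directory.
    rel = target[len(prefix):]
    return [
        entry
        for entry in manifest
        if entry["relpath"] == rel or entry["relpath"].startswith(rel + "/")
    ]


if __name__ == "__main__":
    # A single file inside the tracked directory.
    print(select_entries(MANIFEST, "tutorial", "tutorial/nlp/pipeline.zip"))
    # A subdirectory selects every file beneath it.
    print(select_entries(MANIFEST, "tutorial", "tutorial/nlp"))
```

With something like this, a command such as the dvc import in Project 2 would only need to fetch the single selected entry, while the directory as a whole stays tracked by one DVC-file.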