cmd: rewrite push/pull et al.#1602
Conversation
add granularity notes rel #1384 (review)
|
I think this is ready to merge. Needs an approval though. |
| storage. It will not upload files associated with earlier commits in the | ||
| <abbr>repository</abbr> (if using Git), nor will it upload files that have not | ||
| changed. | ||
| - The push command checks the appropriate `dvc.lock` and `.dvc` files in the |
There was a problem hiding this comment.
It's a command-reference, but I don't think this information should be here (perhaps on Dvc files?).
Also, @jorgeorpinel, dvc.yaml is a discovery file which dvc reads to find outputs/dependencies, etc. and dvc.lock is a file to correlate those outputs/deps with the exact version through the checksums. dvc.lock is never used if there's no corresponding deps/outs/stages in dvc.yaml. So, either both dvc.yaml and dvc.lock are read together or we could just say dvc.yaml is read (and consider dvc.lock as an implementation detail).
There was a problem hiding this comment.
It's a command-reference, but I don't think this information should be here
could you clarify please what information exactly do you mean?
So, either both dvc.yaml and dvc.lock are read together or we could just say dvc.yaml is read (and consider dvc.lock as an implementation detail) ..
we come back to a single term "DVC files" w/o specifying details with some tooltip that mentions that DVC files == dvc.yaml + dvc.lock + .dvc . All of those files play their role here. We can expand later and bit explicit if needed where it is needed.
There was a problem hiding this comment.
dvc.yaml is mentioned just before: "targets ... limit what to pull. It accepts ... and stage names (found in dvc.yaml)". Then, when we say "push checks the appropriate dvc.lock files", by "appropriate" we mean corresponding to what was found in dvc.yaml. But let me see if I can clarify this further... ⌛
just say dvc.yaml is read (and consider dvc.lock as an implementation detail
This is not a bad idea... We do go over implementation details in cmd refs though, in fact they're probably the lowest-level docs we have. But we could def. hide some of those details in expandable sections.
I'm just not sure dvc.lock should be considered a low-level detail; It's pretty visible to the user.
we come back to a single term "DVC files" w/o specifying details with some tooltip that mentions that DVC files
Yes, let's try to merge this one as best we can though, and follow up on that in #1663 ?
See also treeverse/dvc/issues/4393.
There was a problem hiding this comment.
let me see if I can clarify this further... ⌛
I turn this whole bullet list into a more simple paragraph in dc87fe5. PTAL
| with `git commit` and `git push`). | ||
| The `dvc push` command uploads data to | ||
| [remote storage](/doc/command-reference/remote). It doesn't save any changes to | ||
| the code, `dvc.yaml`, or `.dvc` files (that should be saved with `git commit` |
There was a problem hiding this comment.
should include dvc.lock here ...
| `.dvc` files in the <abbr>workspace</abbr>. The command options will either | ||
| limit or expand the set of stages or `.dvc` files to consult. | ||
| Without arguments, it uploads all files and directories missing from remote | ||
| storage, found as <abbr>outputs</abbr> in the stages (in `dvc.lock`) or `.dvc` |
There was a problem hiding this comment.
Hmmm yeah... Kinda. dvc.lock is where you find the output file names — that's what this meant. But probably this paragraph should be simplified and not even mention either one. See 246f446.
shcheklein
left a comment
There was a problem hiding this comment.
looks like dvc.yaml, dvc.lock, .dvc is still a problem in many places ... I really suggest to stop and find a simple solution, something similar to DVC files + expand where we explain exact mechanism (reads dvc.yaml to get tracked files and directories lists, then dvc.lock to get hashes ... etc)
I agree. But I don't think there's a simple solution if we want to be thorough in the explanations until treeverse/dvc/issues/4278 or related discussions are figured out. What we can do now is avoid going into details (where stage names are listed, where hash values are, etc.) and just talk about stages and maybe .dvc files. Users will have to know that stages are in dvc.yaml and what dvc.lock does.
Unfortunately, even if I agreed on that term, I think that we can't go back to pre 1.x simplicity. This is because DVC 1.x is more complex, since it introduced dvc.yaml and lockfiles in addition to DVC-files. But I agree that expandable sections can be our friends here. My only question is can we merge this work so far and follow the above going fwd? (even have an immediate PR on these same docs) Or do you see still specific problems in these changes @shcheklein? Thanks |
the problem we don't merge this is not the "DVC metafiles" issue, but some actual factual problems (e.g. dvc.lock -> dvc.yaml). |
|
|
||
| The default remote is used (see `dvc config core.remote`) unless the `--remote` | ||
| > Note that pulling data does not change any `dvc.yaml` or `.dvc` files, nor | ||
| > does it save any changes to the code, `dvc.lock`, or `.dvc` files (that should |
There was a problem hiding this comment.
not sure why would I expect it to save changes in the first place?
There was a problem hiding this comment.
True. I copied this from pull without editing accordingly. Updated.
I've also applied similar notes to fetch and commit, but didn't push it in this PR for now. I'll make another small PR soon.
shcheklein
left a comment
There was a problem hiding this comment.
Approving to move forward. But i'm quite lost with this PR. For push and pull I'm not sure what have we solved here. And both do not loo very well structured to be honest.
|
OK. Well, I'll merge this to move forward and come back to cleaning up all these refs. One thing that improved was the correctness of DVC metafile mentions, although it will have to be updated again soon. Another thing that improved is the consistency among these as well as fetch, and to a lesser degree also commit and status. |
Continuation of #1384 with out-of-scope details. Waiting for that PR to close.