Skip to content

repro --pull implies --allow-missing#10434

Closed
dberenbaum wants to merge 1 commit into
mainfrom
pull-allow-missing
Closed

repro --pull implies --allow-missing#10434
dberenbaum wants to merge 1 commit into
mainfrom
pull-allow-missing

Conversation

@dberenbaum
Copy link
Copy Markdown
Contributor

See #10412 (comment). When we introduced --allow-missing, we also updated the behavior of --pull, but kept them as separate flags for a couple reasons (see discussion here):

  • didn't want to change behavior of --pull too much
  • worried about too much complexity in one flag
  • wanted to wait and see how it was used

Since then, every user I have encountered expects the behavior to be what --pull --allow-missing does. I don't see any valid case where someone using --pull would not want --allow-missing (we are basically already implying this by treating missing data as needing to be pulled).

@dberenbaum dberenbaum requested a review from skshetry May 20, 2024 14:49
Comment thread dvc/commands/repro.py
"Try automatically pulling missing cache for outputs restored "
"from the run-cache."
"Try automatically pulling missing dependencies and outputs. "
"Implies --allow-missing."
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would it better to deprecate then allow-missing? and here describe the behavior in a more explicit way (by not referring to another option, but actually saying what is happening)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't want to deprecate now because:

  1. That is a clear breaking change that I don't think we should do unless we want to do a major version change
  2. Unlike --pull, --allow-missing on its own can be useful. It was introduced to solve New command: dvc verify - check that the pipeline is up to date without having to pull or run it #5369, in which case the user doesn't want to pull anything.

Open to other ideas on how to explain it but worried about overcomplicating it. I think this is how people expect --pull works now. With --pull and --allow-missing, if you have run the stage before and pushed it, it won't get run again. Without --allow-missing, the stage may still be considered changed and run again even if it can be pulled from the run-cache.

@dberenbaum
Copy link
Copy Markdown
Contributor Author

This raised a couple other issues when I started looking deeper:

  1. With this PR, --pull may skip unchanged stages without pulling them since --allow-missing is intended to skip these stages entirely. This saves time pulling that data but also may be unexpected/more breaking than if we pull unstaged changes.
  2. Regardless of this PR, --allow-missing will skip pull and checkout for unchanged stages, but it will still checkout unchanged stages from the local cache. This may also be expensive, and arguably we should skip checkout for these.

I'm working on an alternate PR that would address these questions.

@dberenbaum
Copy link
Copy Markdown
Contributor Author

This raised a couple other issues when I started looking deeper:

1. With this PR, `--pull` may skip unchanged stages without pulling them since `--allow-missing` is intended to skip these stages entirely. This saves time pulling that data but also may be unexpected/more breaking than if we pull unstaged changes.

2. Regardless of this PR, `--allow-missing` will skip pull and checkout for unchanged stages, but it will still checkout unchanged stages from the local cache. This may also be expensive, and arguably we should skip checkout for these.

I'm working on an alternate PR that would address these questions.

@dberenbaum dberenbaum closed this May 21, 2024
@skshetry skshetry deleted the pull-allow-missing branch August 20, 2024 07:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants