-
Notifications
You must be signed in to change notification settings - Fork 106
[DO NOT MERGE] Tentative vfs-2.26.0 #251
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Looks pretty good except for one commit:
That one looks incorrect, as 26f924d explicitly got rid of those conditional blocks guarded by In other words, I think this hunk of 1d62837ce08053b18fb9ad774ed2867590494991: @@ -396,7 +400,10 @@ static int check_updates(struct unpack_trees_options *o)
if (ce->ce_flags & CE_WT_REMOVE) {
display_progress(progress, ++cnt);
- unlink_entry(ce);
+ if (o->update && !o->dry_run) {
+ unlink_entry(ce);
+ sum_unlink++;
+ }
}
}
should be replaced by @@ -396,7 +400,8 @@ static int check_updates(struct unpack_trees_options *o)
if (ce->ce_flags & CE_WT_REMOVE) {
display_progress(progress, ++cnt);
unlink_entry(ce);
+ sum_unlink++;
}
}
For comparison, in the pre-rebase commit 14195c4, that hunk looked like this: @@ -391,8 +395,10 @@ static int check_updates(struct unpack_trees_options *o)
if (ce->ce_flags & CE_WT_REMOVE) {
display_progress(progress, ++cnt);
- if (o->update && !o->dry_run)
+ if (o->update && !o->dry_run) {
unlink_entry(ce);
+ sum_unlink++;
+ }
}
}
remove_marked_cache_entries(index, 0);And likewise, also in 1d62837ce08053b18fb9ad774ed2867590494991, the hunk @@ -439,7 +447,10 @@ static int check_updates(struct unpack_trees_options *o)
ce->name);
display_progress(progress, ++cnt);
ce->ce_flags &= ~CE_UPDATE;
- errs |= checkout_entry(ce, &state, NULL, NULL);
+ if (o->update && !o->dry_run) {
+ errs |= checkout_entry(ce, &state, NULL, NULL);
+ sum_checkout++;
+ }
}
}
stop_progress(&progress);should be replaced by @@ -439,7 +447,8 @@ static int check_updates(struct unpack_trees_options *o)
ce->name);
display_progress(progress, ++cnt);
ce->ce_flags &= ~CE_UPDATE;
errs |= checkout_entry(ce, &state, NULL, NULL);
+ sum_checkout++;
}
stop_progress(&progress);For comparison, in the pre-rebase commit 14195c4, that hunk looked like this: @@ -436,6 +443,7 @@ static int check_updates(struct unpack_trees_options *o)
ce->ce_flags &= ~CE_UPDATE;
if (o->update && !o->dry_run) {
errs |= checkout_entry(ce, &state, NULL, NULL);
+ sum_checkout++;
}
}
} |
dscho
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As pointed out, I would like to have two changes in 1d62837ce0. Otherwise the PR looks good to go.
Thanks for the extremely thorough description, @dscho! I'll make the suggested changes and try again. |
554ecfe to
c805465
Compare
While using the reset --stdin feature on windows path added may have a \r at the end of the path that wasn't getting removed so didn't match the path in the index and wasn't reset. Signed-off-by: Kevin Willford <kewillf@microsoft.com>
Signed-off-by: Saeed Noursalehi <sanoursa@microsoft.com>
Signed-off-by: Johannes Schindelin <johasc@microsoft.com>
This header file will accumulate GVFS-specific definitions. Signed-off-by: Kevin Willford <kewillf@microsoft.com>
This does not do anything yet. The next patches will add various values for that config setting that correspond to the various features offered/required by GVFS. Signed-off-by: Kevin Willford <kewillf@microsoft.com>
The two existing members of the run_hook*() family, run_hook_ve() and run_hook_le(), are good for callers that know the precise number of parameters already. Let's introduce a new sibling that takes an argv array for callers that want to pass a variable number of parameters. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This takes a substantial amount of time, and if the user is reasonably sure that the files' integrity is not compromised, that time can be saved. Git no longer verifies the SHA-1 by default, anyway. Signed-off-by: Kevin Willford <kewillf@microsoft.com>
The idea is to allow blob objects to be missing from the local repository, and to load them lazily on demand. After discussing this idea on the mailing list, we will rename the feature to "lazy clone" and work more on this. Signed-off-by: Ben Peart <Ben.Peart@microsoft.com>
Signed-off-by: Kevin Willford <kewillf@microsoft.com>
This adds hard-coded call to GVFS.hooks.exe before and after each Git command runs. To make sure that this is only called on repositories cloned with GVFS, we test for the tell-tale .gvfs. Signed-off-by: Ben Peart <Ben.Peart@microsoft.com>
Hydrate missing loose objects in check_and_freshen() when running virtualized. Add test cases to verify read-object hook works when running virtualized. This hook is called in check_and_freshen() rather than check_and_freshen_local() to make the hook work also with alternates. Helped-by: Kevin Willford <kewillf@microsoft.com> Signed-off-by: Ben Peart <Ben.Peart@microsoft.com>
Prevent the sparse checkout to delete files that were marked with skip-worktree bit and are not in the sparse-checkout file. This is because everything with the skip-worktree bit turned on is being virtualized and will be removed with the change of HEAD. There was only one failing test when running with these changes that was checking to make sure the worktree narrows on checkout which was expected since we would no longer be narrowing the worktree. Signed-off-by: Kevin Willford <kewillf@microsoft.com>
The use case here is to allow usage statistics to be gathered by running hooks before and after every hook, and to make that configurable via hooks. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
…ng objects This commit converts the existing read_object hook proc model for downloading missing blobs to use a background process that is started the first time git encounters a missing blob and stays running until git exits. Git and the read-object process communicate via stdin/stdout and a versioned, capability negotiated interface as documented in Documentation/technical/read-object-protocol.txt. The advantage of this over the previous hook proc is that it saves the overhead of spawning a new hook process for every missing blob. The model for the background process was refactored from the recent git LFS work. I refactored that code into a shared module (sub-process.c/h) and then updated convert.c to consume the new library. I then used the same sub-process module when implementing the read-object background process. Signed-off-by: Ben Peart <Ben.Peart@microsoft.com>
While performing a fetch with a virtual file system we know that there will be missing objects and we don't want to download them just because of the reachability of the commits. We also don't want to download a pack file with commits, trees, and blobs since these will be downloaded on demand. This flag will skip the first connectivity check and by returning zero will skip the upload pack. It will also skip the second connectivity check but continue to update the branches to the latest commit ids. Signed-off-by: Kevin Willford <kewillf@microsoft.com>
GVFS Git introduced pre-command and post-command hooks, to gather usage statistics and to be able to adjust the worktree if necessary. As run_hooks() implicitly calls setup_git_directory(), and that function does surprising things to the global state (sometimes even changing the current working directory), it cannot be used here. This commit introduces the pre-command/post-command hooks, based on the previous patches that culminate in support for running hooks early, i.e. before setup_git_directory() was called. Signed-off-by: Ben Peart <Ben.Peart@microsoft.com>
If we are going to write an object there is no use in calling the read object hook to get an object from a potentially remote source. We would rather just write out the object and avoid the potential round trip for an object that doesn't exist. This change adds a flag to the check_and_freshen() and freshen_loose_object() functions' signatures so that the hook is bypassed when the functions are called before writing loose objects. The check for a local object is still performed so we don't overwrite something that has already been written to one of the objects directories. Based on a patch by Kevin Willford. Signed-off-by: Johannes Schindelin <johasc@microsoft.com>
Ensure all filters and EOL conversions are blocked when running under GVFS so that our projected file sizes will match the actual file size when it is hydrated on the local machine. Signed-off-by: Ben Peart <Ben.Peart@microsoft.com>
Suggested by Ben Peart. Signed-off-by: Johannes Schindelin <johasc@microsoft.com>
…OST request If our POST request includes a commit ID, then the the remote will send a pack-file containing the commit and all trees reachable from its root tree. With the current implementation, this causes a failure since we call install_loose() when asking for one object. Modify the condition to check for install_pack() when the response type changes.
Earlier versions of the test always returned a packfile in response to a POST. Now we look at the number of objects in the POST request. If > 1, always send a packfile. If = 1 and it is a commit, send a packfile. Otherwise, send a loose object. This is to better model the behavior of the GVFS server/protocol which treats commits differently. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Teach gvfs-helper to support "/gvfs/prefetch" REST API. This includes a new `gvfs-helper prefetch --since=<t>` command line option. And a new `objects.prefetch` verb in `gvfs-helper server` mode. If `since` argument is omitted, `gvfs-helper` will search the local shared-cache for the most recent prefetch packfile and start from there. The <t> is usually a seconds-since-epoch, but may also be a "friendly" date -- such as "midnight", "yesterday" and etc. using the existing date selection mechanism. Add `gh_client__prefetch()` API to allow `git.exe` to easily call prefetch (and using the same long-running process as immediate and queued object fetches). Expanded t5799 unit tests to include prefetch tests. Test setup now also builds some commits-and-trees packfiles for testing purposes with well-known timestamps. Expanded t/helper/test-gvfs-protocol.exe to support "/gvfs/prefetch" REST API. Massive refactor of existing packfile handling in gvfs-helper.c to reuse more code between "/gvfs/objects POST" and "/gvfs/prefetch". With this we now properly name packfiles with the checksum SHA1 rather than a date string. Refactor also addresses some of the confusing tempfile setup and install_<result> code processing (introduced to handle the ambiguity of how POST works with commit objects). Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
When using fsmonitor the CE_FSMONITOR_VALID flag should be checked when wanting to know if the entry has been updated. If the flag is set the entry should be considered up to date and the same as if the CE_UPTODATE is set. In order to trust the CE_FSMONITOR_VALID flag, the fsmonitor data needs to be refreshed when the fsmonitor bitmap is applied to the index in tweak_fsmonitor. Since the fsmonitor data is kept up to date for every command, some tests needed to be updated to take that into account. istate->untracked->use_fsmonitor was set in tweak_fsmonitor when the fsmonitor bitmap data was loaded and is now in refresh_fsmonitor since that is being called in tweak_fsmonitor. refresh_fsmonitor will only be called once and any other callers should be setting it when refreshing the fsmonitor data so that code can use the fsmonitor data when checking untracked files. When writing the index, fsmonitor_last_update is used to determine if the fsmonitor bitmap should be created and the extension data written to the index. When running through unpack-trees this is not copied to the result index. This makes the next time a git command is ran do all the work of lstating all files to determine what is clean since all entries in the index are marked as dirty since there wasn't any fsmonitor data saved in the index extension. Copying the fsmonitor_last_update to the result index will cause the extension data for fsmonitor to be in the index for the next git command to use. Signed-off-by: Kevin Willford <Kevin.Willford@microsoft.com>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
The fsmonitor script that can be used for running all the git tests using watchman was causing some of the tests to fail because it wrote to stderr and created some files for debugging purposes. Add a new debug script to use with debugging and modify the other script to remove the code that would cause tests to fail. Signed-off-by: Kevin Willford <Kevin.Willford@microsoft.com>
test-gvfs-prococol, t5799: tests for gvfs-helper
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
fsmonitor updates for improved performance
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
The gvfs-helper allows us to download prefetch packs using a simple subprocess call. The gvfs-helper-client.h method will automatically compute the timestamp if passing 0, and passing NULL for the number of downloaded packs is valid. Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
This replaces #223. There was a strangely-subtle issue about reading the trailing hash from the downloaded packs that caused issues when reading from the origin remote. Add `gvfs-helper prefetch` command line option and `objects.prefetch` mode in `gvfs-helper server`. Sorry, but this contains a major refactor of the packfile and loose file handling to let me share it with the prefetch code. As a side benefit, I collapsed the tempfile creation before the request goes out and merged the install_ code after the result is returned. I also changed packfile code to use the packfile-checksum rather than a timestamp so that we look more like normal Git. More details are in the commit message.
Teach gvfs-helper to better support the concurrent fetching of the same packfile by multiple instances. If 2 instances of gvfs-helper did a POST and requested the same set of OIDs, they might receive the exact same packfile (same checksum SHA). Both processes would then race to install their copy of the .pack and .idx files into the ODB/pack directory. This is not a problem on Unix (because of filesystem semantics). On Windows, this can cause an EBUSY/EPERM problem for the loser while the winner is holding a handle to the target files. (The existing packfile code already handled simple the existence and/or replacement case.) The solution presented here is to silently let the loser claim victory IIF the .pack and .idx are already present in the ODB. (We can't check this in advance because we don't know the packfile SHA checksum until after we receive it and run index-pack.) We avoid using a per-packfile lockfile (or a single lockfile for the `vfs-` prefix) to avoid the usual issues with stale lockfiles. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
This is a follow-up to #227. 1. When a new flag is added to our Git config, we can run `gvfs-helper prefetch` inside of our `git fetch` calls. This will help ensure we have updated commits and trees even if the background prefetches have fallen behind (or are not running). 2. With a new `--no-update-remote-refs` we can avoid updating the `refs/remotes` namespace. This will allow us to run `git fetch --all --no-update-remote-refs +refs/heads/*:refs/hidden/*` and we will get the new refs into a local folder (that doesn't appear anywhere). The most important thing is that users will still see when their remote refs update.
When we create temp files for downloading packs, we use a name based on the current timestamp. There is no randomness in the name, so we can have collisions in the same second. Retry the temp pack names using a new "-<retry>" suffix to the name before the ".temp". Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
gvfs-helper: better support for concurrent packfile fetches
When we create temp files for downloading packs, we use a name based on the current timestamp. There is no randomness in the name, so we can have collisions in the same second. Retry the temp pack names using a new "-<retry>" suffix to the name before the ".temp". This is a follow-up to #229.
When using the GVFS protocol, we should _never_ call "git fetch-pack" to attempt downloading a pack-file via the regular Git protocol. It appears that the mechanism that prevented this in the VFS for Git world is due to the read-object hook populating the commits at the new ref tips in a different way than the gvfs-helper does. By acting as if the fetch-pack succeeds here in remote-curl, we prevent a failed fetch. Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
This reverts commit cff4e91. This is temporary until we fix this behavior upstream. For now, we need to allow the sparse-checkout command to run when the status is not clean. Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
If a user runs git update-git-for-windows, then they will upgrade to a version that does not support microsoft/vfsforgit or microsoft/scalar. Therefore, let's prevent this. This addresses #241
…g gvfs helper The `gvfs-helper` is supposed to avoid calling `git fetch-pack` by downloading objects through the GVFS protocol instead. For some reason, some `git fetch` calls still end up calling `git fetch-pack` which gets a complaint from the remote because it does not support that kind of fetch. Put a hard stop in the `fetch_git()` method to prevent this process run.
Disable `git update-git-for-windows`
When computing changed-path Bloom filters or performing a name-only diff, we do not need the blob contents before completing the diff values. Thus, we do not need to download a pack containing the blobs we do not have on-disk before completing our diff calculation. This prevents downloading every blob in a partial clone during "git log --raw" and "git diff --name-only" commands. Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
c805465 to
fd45198
Compare
See microsoft/git#251 for details. A few reaction items needed to happen: 1. We need to ignore whitespace lines when verifying Git commands. Some added an extra whitespace line. 2. The rebase backend changed, and it causes `git rebase --continue` to open an editor if the commit had conflicts. By changing `core.editor=true`, we can no-op the editor portion instead of running forever with an open `vi` editor. 3. With the updated rebase backend, the commit OID is written to output. This causes our output between the control repo and the Scalar repo to be different. Set `GIT_COMMITTER_DATE` to be constant to avoid this problem. 4. The `fsmonitor` hook version was updated, so we need to update our hook along with Git. By copying from the templates directory instead of the samples initialized in `.git/hooks`, we are future-proof and will always get the latest copy recommended by Git. Resolves #301.
See microsoft/git#251 for details. This required an update to our test infrastructure. The rebase merge backend changed in a way that it would now open an editor during `git rebase --continue`, causing a test to wait for `vim` to close. Set the editor to be a no-op. This also changes the output to include the commit oid, so use `GIT_COMMITTER_TIME` to be a constant to keep the commits the same.
Currently based on
v2.26.0-rc2.windows.1.Here is the range-diff: