Skip to content

Conversation

@derrickstolee
Copy link

@derrickstolee derrickstolee commented Nov 6, 2018

I've been looking into the performance of git push for very large repos. Our users are reporting that 60-80% of git push time is spent during the "Enumerating objects" phase of git pack-objects.

A git push process runs several processes during its run, but one includes git send-pack which calls git pack-objects and passes the known have/wants into stdin using object ids. However, the default setting for core.warnAmbiguousRefs requires git pack-objects to check for ref names matching the ref_rev_parse_rules array in refs.c. This means that every object is triggering at least six "file exists?" queries.

When there are a lot of refs, this can add up significantly! My PerfView trace for a simple push measured 3 seconds spent checking these paths.

The fix is to set the global warn_on_object_refname_ambiguity to 0 for the section that is performing these object reads.

In addition to this patch submission, we are looking into merging it into our fork sooner [1].

[1] microsoft#67

Changes in V2: Instead of using the "-c" flag from send-pack, just set the global. I left the name of the cover letter the same to not confuse anyone viewing the message without threading.

Cc: peff@peff.net

@derrickstolee
Copy link
Author

/submit

@gitgitgadget
Copy link

gitgitgadget bot commented Nov 6, 2018

Submitted as pull.68.git.gitgitgadget@gmail.com

@derrickstolee derrickstolee force-pushed the send-pack-config branch 2 times, most recently from 3cf6fa0 to a14c312 Compare November 6, 2018 20:01
@derrickstolee derrickstolee changed the title send-pack: set core.warnAmbiguousRefs=false pack-objects: ignore ambiguous object warnings Nov 6, 2018
@derrickstolee derrickstolee force-pushed the send-pack-config branch 3 times, most recently from 2e62055 to 002868e Compare November 6, 2018 20:22
@derrickstolee derrickstolee changed the title pack-objects: ignore ambiguous object warnings send-pack: set core.warnAmbiguousRefs=false Nov 6, 2018
@derrickstolee
Copy link
Author

/submit

@gitgitgadget
Copy link

gitgitgadget bot commented Nov 6, 2018

Submitted as pull.68.v2.git.gitgitgadget@gmail.com

A git push process runs several processes during its run, but one
includes git send-pack which calls git pack-objects and passes
the known have/wants into stdin using object ids. However, the
default setting for core.warnAmbiguousRefs requires git pack-objects
to check for ref names matching the ref_rev_parse_rules array in
refs.c. This means that every object is triggering at least six
"file exists?" queries.  When there are a lot of refs, this can
add up significantly! I observed a simple push spending three
seconds checking these paths.

The fix here is similar to 4c30d50 "rev-list: disable object/refname
ambiguity check with --stdin".  While the get_object_list() method
reads the objects from stdin, turn warn_on_object_refname_ambiguity
flag (which is usually true) to false.  Just for code hygiene, save
away the original at the beginning and restore it once we are done.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
@dscho dscho merged commit a4544b3 into gitgitgadget:master Nov 19, 2018
@dscho
Copy link
Member

dscho commented Dec 14, 2018

I made the mistake of hitting the "Merge" button... But this patch made it upstream, as a4544b3

@gitgitgadget gitgitgadget deleted a comment from gitgitgadget bot Dec 14, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants