[performance] beating git in `git verify-pack` ✅ 🚀

## When trying to beat `git verify-pack`

### Attempt 1
I remembered cold-cache timings of around 5:50min for git to run `verify-pack` on the Linux kernel pack. However, it turns out that in a more controlled environment git is still considerably faster than us, even though we use an LRU cache and make quite efficient use of multiple cores.

<img width="1092" alt="hard-to-beat-the-king" src="https://user-images.githubusercontent.com/63622/85982332-39355780-ba18-11ea-84b8-7cccc6033f15.png">

##### Observation

Git uses a [streaming pack approach](https://github.com/git/git/blob/f402ea68166bd77f09b176c96005ac7f8886e14b/builtin/index-pack.c#L1180:L1180) which is optimized to apply deltas in the inverse direction, starting at a base and moving on to the deltas that depend on it. It works by

* decompressing all deltas
* applying all deltas that depend on a base, recursively (and thus avoiding having to decompress deltas multiple times)
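
The two steps above can be sketched roughly as follows. This is only an illustration under simplifying assumptions, not git's implementation: a "delta" here is just bytes appended to its base (git's real delta format is different), and `resolve_base_first`, `resolve_per_object` and the entry layout are hypothetical names made up for this sketch. The second function resolves every object by walking its chain back to the base, without any cache, to show where repeated decompression comes from.

```python
import zlib
from collections import defaultdict


def resolve_base_first(entries):
    """entries: dict of id -> (base_id or None, zlib-compressed payload).

    Assumption: a delta is simply bytes appended to its base,
    which is NOT git's real delta format.
    Returns (objects, number_of_decompressions).
    """
    # Pass 1: decompress every entry exactly once and group deltas by base.
    payloads = {}
    children = defaultdict(list)
    roots = []
    for entry_id, (base_id, compressed) in entries.items():
        payloads[entry_id] = zlib.decompress(compressed)
        if base_id is None:
            roots.append(entry_id)
        else:
            children[base_id].append(entry_id)

    # Pass 2: walk from each base to its dependents, so every object is
    # built on top of an already resolved base.
    objects = {}
    stack = [(root, payloads[root]) for root in roots]
    while stack:
        entry_id, data = stack.pop()
        objects[entry_id] = data
        for child in children[entry_id]:
            stack.append((child, data + payloads[child]))
    return objects, len(payloads)


def resolve_per_object(entries):
    """Contrast: resolve each object on its own by walking back to its base."""
    decompressions = 0
    objects = {}
    for entry_id in entries:
        chain = []
        current = entry_id
        while current is not None:
            base_id, compressed = entries[current]
            chain.append(zlib.decompress(compressed))
            decompressions += 1
            current = base_id
        # The chain was collected from the object back to the base.
        objects[entry_id] = b"".join(reversed(chain))
    return objects, decompressions
```

With a chain of one base and two stacked deltas, the base-first walk decompresses each entry once, while the per-object walk decompresses the base once for every object that depends on it.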

We work with a memory-mapped file, which is optimized for random access but won't be very fast for this kind of workload.

##### How to fix

Wait until we have implemented a streaming pack as well and try again, with the same algorithmic benefits, possibly paired with more efficient memory handling.
Git for some reason limits the application to 3 threads, whereas we do benefit from having more threads, so we could be faster just because of this.
The streaming (indexing) phase of reading a pack can be parallelised if we have a pack on disk, and it should be easy to implement if the index data structure itself is thread-safe (but it might not be worth the complexity or memory overhead, let's see).
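
As a rough sketch of how resolution could use a configurable number of threads, one option is to treat each base and its dependents as an independent unit of work. This is an assumption about how the work might be split, not a description of git or of our implementation; `resolve_tree`, `resolve_parallel` and `num_threads` are hypothetical names, and a "delta" is again modelled as bytes appended to its base.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def resolve_tree(root, payloads, children):
    """Resolve one base and everything that depends on it."""
    resolved = {}
    stack = [(root, payloads[root])]
    while stack:
        entry_id, data = stack.pop()
        resolved[entry_id] = data
        # .get() avoids mutating the shared mapping from worker threads.
        for child in children.get(entry_id, ()):
            stack.append((child, data + payloads[child]))
    return resolved


def resolve_parallel(bases, payloads, num_threads=4):
    """bases: id -> base id or None; payloads: id -> decompressed bytes.

    Assumption: delta trees are independent, so each root can be handed
    to a worker without any locking on the inputs.
    """
    children = defaultdict(list)
    roots = []
    for entry_id, base_id in bases.items():
        if base_id is None:
            roots.append(entry_id)
        else:
            children[base_id].append(entry_id)

    objects = {}
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # Results are merged on the calling thread only.
        for resolved in pool.map(
            lambda root: resolve_tree(root, payloads, children), roots
        ):
            objects.update(resolved)
    return objects
```

The inputs are only read by the workers and the results are merged in one place, which keeps the sketch free of locks. It shows the structure only: in CPython, threads will not speed up pure-Python work like this.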

