in a way which is more favorable to compression ratio, though very slightly slower (~-1%). More details in the PR.
|
Just completed speed measurements. The impact on speed is a bit larger than I expected, being closer to 2% than 1%. Should it be deemed not good enough, an alternative modification could be to change insertion of |
|
Some results from
Outcome :
This variant seems only marginally better than current "default". It doesn't look like a clear "must do". |
|
Measuring again this patch, but with
As can be seen, patch's results are more favorable with Based on
So, as a global summary, when adding these results, it feels more on the positive side. |
|
Results on an Intel i9-9900K with turboboost enabled, pinned to CPU#0, compiled with gcc-9.1.0:
|
|
Testing a new variation of complementary insertion, using the same number of insertions as the current one, but organized differently (long at `ip-2`, short at `ip-1`).
As a consequence, the speed impact is negligible, barely measurable.
core i7-9700k (disabled turboboost), Ubuntu x64 19.04, gcc v9.1.0 :
edit : seems like an easier sell. Doesn't move the needle much, but is rather positive. Speed regression is imperceptible, so no bad effect for already deployed systems using level 3. |
same number of complementary insertions, just organized differently (long at `ip-2`, short at `ip-1`).
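For readers unfamiliar with the idea, here is a minimal sketch of what "long at `ip-2`, short at `ip-1`" can look like in a two-table matcher. It is not the patch's code: the `HashTables` type, the table sizes, the hash constants and the `complementaryInsert` name are all hypothetical, chosen only for illustration. The sketch assumes a "long" table hashing 8 bytes and a "short" table hashing 4 bytes, and records two extra positions just before the end of a match.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative table sizes (assumption, not taken from the patch). */
#define LONG_BITS  12
#define SHORT_BITS 12

typedef struct {
    uint32_t longTable[1u << LONG_BITS];   /* indexed by a hash of 8 bytes */
    uint32_t shortTable[1u << SHORT_BITS]; /* indexed by a hash of 4 bytes */
} HashTables;

/* Unaligned-safe reads. */
static uint64_t read64(const uint8_t* p) { uint64_t v; memcpy(&v, p, sizeof v); return v; }
static uint32_t read32(const uint8_t* p) { uint32_t v; memcpy(&v, p, sizeof v); return v; }

/* Simple multiplicative hashes; the constants are arbitrary odd multipliers. */
static uint32_t hashLong(const uint8_t* p)
{
    return (uint32_t)((read64(p) * 0x9E3779B97F4A7C15ULL) >> (64 - LONG_BITS));
}

static uint32_t hashShort(const uint8_t* p)
{
    return (uint32_t)((read32(p) * 2654435761U) >> (32 - SHORT_BITS));
}

/* After a match that ends at `ip`, record two extra positions:
 * `ip-2` in the long table and `ip-1` in the short table.
 * `base` is the start of the buffer, `iend` is one past its last byte. */
static void complementaryInsert(HashTables* t, const uint8_t* base,
                                const uint8_t* ip, const uint8_t* iend)
{
    assert(ip - base >= 2);
    if (ip + 6 <= iend)   /* 8 readable bytes starting at ip-2 */
        t->longTable[hashLong(ip - 2)] = (uint32_t)(ip - 2 - base);
    if (ip + 3 <= iend)   /* 4 readable bytes starting at ip-1 */
        t->shortTable[hashShort(ip - 1)] = (uint32_t)(ip - 1 - base);
}
```

The number of table writes per match is fixed at two in this sketch, so changing which positions go into which table alters what later searches can find without adding extra stores.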
|
Patch updated, |
|
I see no compression speed regression on my machine with gcc-9.1.0, and see a small speed boost with clang. Strangely, I can reproduce the decompression speed loss on gcc, but on clang I get a decompression speed boost. |
|
Updated |
Tweak hashing similar to : facebook/zstd#1681 Allow switching between different repeats.
in a way which is more favorable to compression ratio, though slightly slower.
Detailed benchmark :
core i7-9700k (disabled turboboost), Ubuntu x64 19.04, gcc v8.3.0 :
Note : I must re-run speed tests, as it seems there is enough difference between cores to deserve pinning all measurements to the same core, for proper comparison.
edit : completed speed measurements.
edit 2 : final heuristic changed, see later benchmarks.