Reduce external sequence producer API overhead by 25% #3471

Merged
embg merged 2 commits into facebook:dev from embg:fast_seq_parse
Feb 2, 2023

Conversation

@embg
Contributor

@embg embg commented Jan 31, 2023

Adds a cctxParam to disable repcode search during external sequence parsing. For now it applies only in explicit block delimiter mode, since that is where we currently care about parsing speed. For external matchfinders that don't explicitly search for repcode matches, this sacrifices less than 1% compression ratio on silesia.tar.
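
To make the trade-off concrete, here is a minimal sketch of what "repcode search" means when parsing a list of externally produced offsets. It is not the code in this PR: `RepHistory`, `find_repcode`, `update_history`, and `count_repcode_sequences` are hypothetical names, and the sketch ignores the special handling zstd applies when a sequence has zero literals. It only assumes zstd's three-entry repeat-offset history with initial values 1, 4, 8.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_REPS 3

/* Hypothetical, simplified history of the three most recent match offsets. */
typedef struct { unsigned rep[NUM_REPS]; } RepHistory;

/* Returns 1..3 if rawOffset equals one of the recent offsets, else 0.
 * Simplification: the zero-literal-length special case is not modeled. */
static unsigned find_repcode(const RepHistory* h, unsigned rawOffset)
{
    assert(h != NULL);
    for (unsigned i = 0; i < NUM_REPS; i++) {
        if (h->rep[i] == rawOffset) return i + 1;
    }
    return 0;
}

/* Moves the offset just used to the front of the history.
 * repcode == 0 means the offset was encoded raw, so the oldest entry drops. */
static void update_history(RepHistory* h, unsigned rawOffset, unsigned repcode)
{
    assert(h != NULL);
    if (repcode == 1) return;  /* already at the front */
    unsigned last = (repcode == 0) ? NUM_REPS - 1 : repcode - 1;
    for (unsigned i = last; i > 0; i--) h->rep[i] = h->rep[i - 1];
    h->rep[0] = rawOffset;
}

/* Walks the offsets and counts how many could be encoded as repcodes.
 * With searchRepcodes == 0 the per-sequence lookup is skipped entirely,
 * which is the speed-for-ratio trade described above. */
static size_t count_repcode_sequences(const unsigned* offsets, size_t n,
                                      int searchRepcodes)
{
    RepHistory h = {{1, 4, 8}};  /* zstd's initial repeat offsets */
    size_t hits = 0;
    for (size_t i = 0; i < n; i++) {
        unsigned rc = searchRepcodes ? find_repcode(&h, offsets[i]) : 0;
        if (rc != 0) hits++;
        update_history(&h, offsets[i], rc);
    }
    return hits;
}
```

Calling `count_repcode_sequences` on the same offsets with the flag on and off shows the effect: with search disabled, every offset is treated as a raw offset, so any repeated offsets the matchfinder happened to emit cost more bits to encode, but the parser does no comparison work per sequence.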

In general, the compression ratio trade-off is matchfinder- and data-dependent. Users should benchmark against their own data to determine whether the trade-off is worth it. I have enabled the new behavior (repcode search skipped) by default below compression level 10, because the speed improvement is large enough that I expect few practical use cases would gain enough ratio from keeping repcode search on to justify the cost.

In the future, we might be able to use SIMD to run repcode search much faster, which could change the trade-off so that enabling repcode search at low levels makes sense.

Overall external matchfinder API overhead (that is, compression CPU time spent outside the external matchfinder itself) is currently about 50% inside ZSTD_copySequencesToSeqStoreExplicitBlockDelim(), so this PR reduces overall overhead by about 25% (see perf numbers below).

Before:
Screenshot 2023-01-31 at 4 11 35 PM

After:
Screenshot 2023-01-31 at 4 12 56 PM

cc @daweiq @GarenJian-Intel

@embg embg requested a review from Cyan4973 February 1, 2023 17:11
@embg embg merged commit 31e41b3 into facebook:dev Feb 2, 2023
@embg embg changed the title from "Reduce external matchfinder API overhead by 25%" to "Reduce external sequence producer API overhead by 25%" Feb 8, 2023

4 participants