Conversation
CONTROL() works like assert() but cannot be disabled, giving a proper separation between user contract errors (CONTROL()) and internal invariant verification (assert()).
…imit by disabling continue mode when index is close to limit.
lib/compress/zstd_compress.c
Outdated
```c
 * memset() will be triggered before reduceIndex().
 */
#define ZSTD_INDEXOVERFLOW_MARGIN (16 MB)
static int ZSTD_index_valid_for_continue(ZSTD_window_t w)
```
nit: `ZSTD_indexValidForContinue()` is more in line with the naming convention.
Also, two follow-up changes:
- Minor speed optimization: shortcut to `ZSTD_reset_matchState()` rather than the full reset process. It still needs to be completed with `ZSTD_continueCCtx()` for proper initialization.
- Changed the position of the LDM hash tables in the context, so that the "regular" hash tables sit at a predictable position, hence allowing the shortcut to `ZSTD_reset_matchState()` without complex conditions.
This last update implements an optimization. It seems to work fine so far, but I'm not against a second look. In particular, I would like to be completely sure that it works well in combination with LDM. @terrelln likely has a better understanding of this part.
It should be fine. LDM manages its own window and indices, so this change shouldn't impact it. I'll take a closer look in a bit, just to be sure.
terrelln left a comment:
Sorry this fell off my radar, LGTM!
I recently received a user performance report stating that `reduceIndex()` can require a non-negligible delay to perform its job, especially when applied to a large context.

`reduceIndex()` is required in 2 scenarios:
- when indexes risk overflowing in the course of a (large) compression job;
- in `continue` mode, as a consequence of constantly keeping the same parameters.

For the first case, there's nothing new.

For the second case, the issue is that `reduceIndex()` is a measurable load compared to a single compression job. The user reported delays in the ~80 ms range. I could find similar delays on my laptop with `llvm`, in the 20-30 ms range when using large tables (~30 MB). The user's results are exacerbated by the use of Visual Studio, which is worse at auto-vectorization (`reduceIndex()` performance is tied to auto-vectorization).

The thing is, for such a case, we do not actually need a `reduceIndex()` operation. A simpler `memset()` would work just as well. And of course, a big difference is that `memset()` is actually fast (I could measure it up to 10x faster), and performance differences between compilers are unlikely to be large.

This patch tries to favor `memset()` over `reduceIndex()` by testing `currentIndex` when starting a new compression (detected by triggering `ZSTD_resetCCtx_internal()`). It will disallow `continue` mode when `currentIndex` is considered "too close" to the limit.

The limit was already defined in `zstd_compress_internal.h` as `ZSTD_CURRENT_MAX`, and "too close" has been arbitrarily defined as `< ZSTD_INDEXOVERFLOW_MARGIN` (== 16 MB).