Skip to content

Add different dict modes to compression ratio regression test, update results.csv#2559

Merged
senhuang42 merged 3 commits intofacebook:devfrom
senhuang42:add_dict_regression_tests_backup
Mar 25, 2021
Merged

Add different dict modes to compression ratio regression test, update results.csv#2559
senhuang42 merged 3 commits intofacebook:devfrom
senhuang42:add_dict_regression_tests_backup

Conversation

@senhuang42
Copy link

@senhuang42 senhuang42 commented Mar 25, 2021

On the row-hash PR, DMS was silently broken until recently fixed/discovered - our coverage of tests on different dict modes (DMS, DDS, Copy, Load) is not particularly high. This should be a good first step in ensuring we detect such things. (The row-hash PR adds a DMS unit test in fuzzer.c as well, to check for regressions)

This PR:

  • Adds ratio regression tests to clevels >= 1 for each dictMode when testing using advanced API that can honor requested parameters.
  • Updates results.csv
    • New rows with new dictmode tests
    • Adjustments for improvements introduced with block splitter and levels 16, 19
    • Minor spelling fix
  • Fixes compiler warnings when building this regression test

Testing: I tested this on row-hash with a broken DMS, and indeed the compression results were bad.

.cli_args = "-" #x, \
.param_values = PARAM_VALUES(level_##x##_param_values_dictload), \
.use_dictionary = 1, \
.advanced_api_only = 1, \
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I added the new configs to LEVEL() so we automatically get testing for all the levels we try.

An alternative approach is to have each of these just be their own config, with a fixed compression level (and if we wanted to test more clevels/strategies, we could just add more configs). But that does have the downside of not automatically hitting all the tested clevels.

I do prefer having these run on all the compression levels, in case in the future, strategies get changed in a way that might affect the dictionary strategies.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense

@senhuang42 senhuang42 force-pushed the add_dict_regression_tests_backup branch from d980071 to bbbd578 Compare March 25, 2021 18:11
@terrelln
Copy link
Contributor

Thanks for adding this Sen! Really glad to see increased coverage of dictionary compression.

@senhuang42 senhuang42 merged commit ab216bc into facebook:dev Mar 25, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants