Description
Arrow currently uses compression level 1 for ZSTD, the lowest of the compressor's standard levels. This gives very high compression speed at the expense of compression ratio.
The user should be allowed to select the compression level, since the best trade-off between speed and ratio depends on the data.
The proposed solution is to expose the knob via an environment variable such as ARROW_ZSTD_COMPRESSION_LEVEL.
Example:
export ARROW_ZSTD_COMPRESSION_LEVEL=10
./my_parquet_app
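A minimal sketch of how the variable could be read on the C++ side. The helper names, the accepted range of 1 to 22, and the fallback behaviour are assumptions made for illustration; none of this is existing Arrow code.

```cpp
#include <cerrno>
#include <cstdlib>

// Hypothetical constants for this sketch; not part of Arrow's API.
constexpr int kDefaultZstdLevel = 1;  // the level Arrow uses today, per this issue
constexpr int kMinZstdLevel = 1;      // assumed lower bound accepted here
constexpr int kMaxZstdLevel = 22;     // assumed upper bound accepted here

// Parse a raw string into a compression level. Falls back to the default
// when the value is missing, empty, not a plain integer, or out of range.
int ParseZstdLevel(const char* raw) {
  if (raw == nullptr || *raw == '\0') return kDefaultZstdLevel;
  errno = 0;
  char* end = nullptr;
  const long value = std::strtol(raw, &end, 10);
  // Reject overflow and any trailing non-numeric characters.
  if (errno != 0 || *end != '\0') return kDefaultZstdLevel;
  if (value < kMinZstdLevel || value > kMaxZstdLevel) return kDefaultZstdLevel;
  return static_cast<int>(value);
}

// Read the proposed environment variable and return the level to use.
int ZstdLevelFromEnv() {
  return ParseZstdLevel(std::getenv("ARROW_ZSTD_COMPRESSION_LEVEL"));
}
```

The returned value would then be passed as the `compressionLevel` argument of `ZSTD_compress` instead of the fixed 1. Falling back to the current default on bad input keeps existing behaviour unchanged for users who never set the variable.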
Here is a test run with compression levels of 1, 2 and 5:
Level  Time (s)  Size (MB)
1      13.02     181
2      13.10     177
5      19.44     148
Reporter: Martin Radev / @martinradev
Assignee: Martin Radev / @martinradev
Related issues:
- [C++/Python] Add ability to set codec options for lz4 codec (is related to)
PRs and other links:
Note: This issue was originally created as ARROW-6216. Please see the migration documentation for further details.