Commit a3a6c1a

Add Zstandard compression support
1 parent 3e9f09f commit a3a6c1a

44 files changed: 379 additions and 311 deletions

MANIFEST.in

Lines changed: 1 addition & 0 deletions
@@ -36,6 +36,7 @@ global-exclude *.xpt
 global-exclude *.cpt
 global-exclude *.xz
 global-exclude *.zip
+global-exclude *.zst
 global-exclude *~
 global-exclude .DS_Store
 global-exclude .git*

ci/deps/actions-38-slow.yaml

Lines changed: 1 addition & 0 deletions
@@ -34,3 +34,4 @@ dependencies:
   - xlsxwriter
   - xlwt
   - numba
+  - zstandard

ci/deps/actions-39-slow.yaml

Lines changed: 1 addition & 0 deletions
@@ -37,6 +37,7 @@ dependencies:
   - xlsxwriter
   - xlwt
   - pyreadstat
+  - zstandard
   - pip
   - pip:
     - pyxlsb

ci/deps/actions-39.yaml

Lines changed: 1 addition & 0 deletions
@@ -36,6 +36,7 @@ dependencies:
   - xlsxwriter
   - xlwt
   - pyreadstat
+  - zstandard
   - pip
   - pip:
     - pyxlsb

ci/deps/azure-macos-38.yaml

Lines changed: 1 addition & 0 deletions
@@ -30,6 +30,7 @@ dependencies:
   - xlrd
   - xlsxwriter
   - xlwt
+  - zstandard
   - pip
   - pip:
     - cython>=0.29.24

ci/deps/azure-windows-38.yaml

Lines changed: 1 addition & 0 deletions
@@ -32,3 +32,4 @@ dependencies:
   - xlrd
   - xlsxwriter
   - xlwt
+  - zstandard

ci/deps/azure-windows-39.yaml

Lines changed: 1 addition & 0 deletions
@@ -36,6 +36,7 @@ dependencies:
   - xlsxwriter
   - xlwt
   - pyreadstat
+  - zstandard
   - pip
   - pip:
     - pyxlsb

ci/deps/circle-38-arm64.yaml

Lines changed: 1 addition & 0 deletions
@@ -15,6 +15,7 @@ dependencies:
   - numpy
   - python-dateutil
   - pytz
+  - zstandard
   - pip
   - flask
   - pip:

doc/source/getting_started/install.rst

Lines changed: 10 additions & 0 deletions
@@ -402,3 +402,13 @@ qtpy Clipboard I/O
 xclip                                        Clipboard I/O on linux
 xsel                                         Clipboard I/O on linux
 ========================= ================== =============================================================
+
+
+Compression
+^^^^^^^^^^^
+
+========================= ================== =============================================================
+Dependency                Minimum Version    Notes
+========================= ================== =============================================================
+Zstandard                                    Zstandard compression
+========================= ================== =============================================================
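
Since the table above lists Zstandard as an optional dependency, calling code may want to check whether it is installed before requesting ``'zstd'`` compression. The following is a minimal sketch, assuming the package is importable under the module name ``zstandard`` (the name used in the CI dependency files in this commit). The helper ``has_zstandard`` is hypothetical and is not part of pandas.

```python
import importlib.util


def has_zstandard() -> bool:
    """Hypothetical helper: report whether ``zstandard`` can be imported."""
    # find_spec returns None when no top-level module of that name is found.
    return importlib.util.find_spec("zstandard") is not None


# Pick a compression method depending on what is installed.
# Falling back to gzip is an illustrative choice, not pandas behaviour.
preferred_method = "zstd" if has_zstandard() else "gzip"
```

Checking with ``find_spec`` avoids importing the package just to test for its presence.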

doc/source/user_guide/io.rst

Lines changed: 9 additions & 9 deletions
@@ -316,14 +316,14 @@ chunksize : int, default ``None``
 Quoting, compression, and file format
 +++++++++++++++++++++++++++++++++++++
 
-compression : {``'infer'``, ``'gzip'``, ``'bz2'``, ``'zip'``, ``'xz'``, ``None``, ``dict``}, default ``'infer'``
+compression : {``'infer'``, ``'gzip'``, ``'bz2'``, ``'zip'``, ``'xz'``, ``'zstd'``, ``None``, ``dict``}, default ``'infer'``
   For on-the-fly decompression of on-disk data. If 'infer', then use gzip,
-  bz2, zip, or xz if ``filepath_or_buffer`` is path-like ending in '.gz', '.bz2',
-  '.zip', or '.xz', respectively, and no decompression otherwise. If using 'zip',
+  bz2, zip, xz, or zstandard if ``filepath_or_buffer`` is path-like ending in '.gz', '.bz2',
+  '.zip', '.xz', '.zst', respectively, and no decompression otherwise. If using 'zip',
   the ZIP file must contain only one data file to be read in.
   Set to ``None`` for no decompression. Can also be a dict with key ``'method'``
-  set to one of {``'zip'``, ``'gzip'``, ``'bz2'``} and other key-value pairs are
-  forwarded to ``zipfile.ZipFile``, ``gzip.GzipFile``, or ``bz2.BZ2File``.
+  set to one of {``'zip'``, ``'gzip'``, ``'bz2'``, ``'zstd'``} and other key-value pairs are
+  forwarded to ``zipfile.ZipFile``, ``gzip.GzipFile``, ``bz2.BZ2File``, or ``zstandard.ZstdDecompressor``.
   As an example, the following could be passed for faster compression and to
   create a reproducible gzip archive:
   ``compression={'method': 'gzip', 'compresslevel': 1, 'mtime': 1}``.
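
The documentation text in this hunk describes two behaviours: inferring the method from the file extension, and forwarding the extra keys of a ``dict`` to the underlying compressor. The sketch below illustrates both using only the standard library. ``infer_method``, ``split_compression`` and ``gzip_bytes`` are hypothetical helpers written for this illustration and are not pandas internals. The extension table is copied from the text above.

```python
import gzip
import io
from typing import Optional

# Extension-to-method table taken from the documentation text.
EXTENSION_TO_METHOD = {
    ".gz": "gzip",
    ".bz2": "bz2",
    ".zip": "zip",
    ".xz": "xz",
    ".zst": "zstd",
}


def infer_method(path: str) -> Optional[str]:
    """Hypothetical helper: pick a method from the file extension."""
    for extension, method in EXTENSION_TO_METHOD.items():
        if path.endswith(extension):
            return method
    return None  # unknown extension: no decompression


def split_compression(compression):
    """Hypothetical helper: separate 'method' from the forwarded options."""
    if isinstance(compression, dict):
        options = dict(compression)  # copy so the caller's dict is untouched
        method = options.pop("method")
        return method, options
    return compression, {}


def gzip_bytes(data: bytes, **options) -> bytes:
    """Compress data in memory, forwarding options to gzip.GzipFile."""
    buffer = io.BytesIO()
    with gzip.GzipFile(fileobj=buffer, mode="wb", **options) as handle:
        handle.write(data)
    return buffer.getvalue()


method, options = split_compression(
    {"method": "gzip", "compresslevel": 1, "mtime": 1}
)
# A fixed mtime keeps the gzip header identical between runs.
first = gzip_bytes(b"a,b\n1,2\n", **options)
second = gzip_bytes(b"a,b\n1,2\n", **options)
```

With ``mtime`` pinned, ``first`` and ``second`` are byte-for-byte identical, which is what makes the archive reproducible.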
@@ -4032,18 +4032,18 @@ Compressed pickle files
 '''''''''''''''''''''''
 
 :func:`read_pickle`, :meth:`DataFrame.to_pickle` and :meth:`Series.to_pickle` can read
-and write compressed pickle files. The compression types of ``gzip``, ``bz2``, ``xz`` are supported for reading and writing.
+and write compressed pickle files. The compression types of ``gzip``, ``bz2``, ``xz``, ``zstd`` are supported for reading and writing.
 The ``zip`` file format only supports reading and must contain only one data file
 to be read.
 
 The compression type can be an explicit parameter or be inferred from the file extension.
-If 'infer', then use ``gzip``, ``bz2``, ``zip``, or ``xz`` if filename ends in ``'.gz'``, ``'.bz2'``, ``'.zip'``, or
-``'.xz'``, respectively.
+If 'infer', then use ``gzip``, ``bz2``, ``zip``, ``xz``, ``zstd`` if filename ends in ``'.gz'``, ``'.bz2'``, ``'.zip'``,
+``'.xz'``, or ``'.zst'``, respectively.
 
 The compression parameter can also be a ``dict`` in order to pass options to the
 compression protocol. It must have a ``'method'`` key set to the name
 of the compression protocol, which must be one of
-{``'zip'``, ``'gzip'``, ``'bz2'``}. All other key-value pairs are passed to
+{``'zip'``, ``'gzip'``, ``'bz2'``, ``'xz'``, ``'zstd'``}. All other key-value pairs are passed to
 the underlying compression library.
 
 .. ipython:: python
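
The idea behind compressed pickle files, as described in the hunk above, can be sketched with the standard library alone: choose an opener from the file extension, then pickle through it. This is an illustration of the concept, not pandas' implementation. ``open_by_extension``, ``dump_compressed`` and ``load_compressed`` are hypothetical names. ``zstd`` is left out because it needs the third-party ``zstandard`` package.

```python
import bz2
import gzip
import lzma
import os
import pickle
import tempfile

# Standard-library openers only; '.zst' would require ``zstandard``.
OPENERS = {".gz": gzip.open, ".bz2": bz2.open, ".xz": lzma.open}


def open_by_extension(path, mode):
    """Hypothetical helper: choose an opener from the file extension."""
    _, extension = os.path.splitext(path)
    # Fall back to the built-in open for uncompressed files.
    return OPENERS.get(extension, open)(path, mode)


def dump_compressed(obj, path):
    """Pickle obj to path, compressing according to the extension."""
    with open_by_extension(path, "wb") as handle:
        pickle.dump(obj, handle)


def load_compressed(path):
    """Load a pickle from path, decompressing according to the extension."""
    with open_by_extension(path, "rb") as handle:
        return pickle.load(handle)


record = {"a": [1, 2, 3], "b": "text"}
restored = {}
with tempfile.TemporaryDirectory() as tmp:
    for extension in (".gz", ".bz2", ".xz", ""):
        target = os.path.join(tmp, "data.pkl" + extension)
        dump_compressed(record, target)
        restored[extension] = load_compressed(target)
```

Each entry in ``restored`` should equal ``record``, whichever compression the extension selected.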
