
[R] Add support for more compression codecs in Windows build #23278

@asfimport

Description


When I try to write a Parquet file with lz4, zstd, or brotli compression using R arrow 0.15.0, the write fails because support for those codecs is not built (example below).

 

> arrow::write_parquet(payout_strategy, sink = "records_test_lz4.parquet", compression = "lz4")
Error in parquet___arrow___FileWriter__WriteTable(self, table, chunk_size) : 
 Arrow error: IOError: Arrow error: NotImplemented: LZ4 codec support not built

 

I believe that the error is generated through https://github.com/apache/arrow/blob/master/cpp/src/arrow/util/compression.cc#L124-L145, but I am not sure how to call 

install.packages("arrow")

in R so that the ARROW_WITH_ZSTD/LZ4/BROTLI flags are enabled, or whether I should install zstd separately from arrow and then do something pre- or post-install to link zstd with arrow. From #1209, it appears that zstd support has been added to arrow and parquet in general, and the R package README (https://github.com/apache/arrow/tree/master/r) notes "On macOS and Windows, installing a binary package from CRAN will handle Arrow's C++ dependencies for you", but I get the sense that this does not apply to zstd.

 

Is there guidance as to how to enable zstd and other compression codecs prior to or after downloading the R arrow package? Could this be added to the R documentation somewhere for future reference?
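
In the meantime, one possible workaround is to try the preferred codecs in order and fall back to one that the installed build supports. This is a minimal sketch, not anything provided by arrow: `first_working_codec` and `write_fn` are hypothetical names, and it assumes that a write with an unbuilt codec raises an R error, as in the example above.

```r
# Hypothetical helper (not part of arrow): try each codec in order and
# return the name of the first one for which write_fn() succeeds.
# write_fn must be a function(codec) that raises an error on failure.
first_working_codec <- function(codecs, write_fn) {
  for (codec in codecs) {
    ok <- tryCatch({
      write_fn(codec)
      TRUE
    }, error = function(e) {
      # Report why this codec was skipped, then move on to the next one.
      message("codec '", codec, "' failed: ", conditionMessage(e))
      FALSE
    })
    if (ok) return(codec)
  }
  stop("none of the requested codecs could be used")
}

# Usage with arrow, only if the package is installed.
if (requireNamespace("arrow", quietly = TRUE)) {
  df <- data.frame(x = 1:3)
  out <- tempfile(fileext = ".parquet")
  used <- first_working_codec(
    c("zstd", "lz4", "snappy"),
    function(codec) arrow::write_parquet(df, sink = out, compression = codec)
  )
  message("wrote ", out, " with compression = ", used)
}
```

This only works around the missing codecs at write time; it does not enable zstd, lz4, or brotli in the Windows build.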

Environment: Windows 10
Reporter: Grant Nguyen / @gnguy
Assignee: Grant Nguyen / @gnguy


Note: This issue was originally created as ARROW-6960. Please see the migration documentation for further details.
