[JAVA][C++]Support Parquet Read and Write in Java #279

@asfimport

Description

We added a new Java interface that supports reading and writing Parquet files on HDFS or the local file system.

The motivation is that loading and dumping Parquet data in Java currently has to go through row-based put and get methods. Arrow already has a C++ implementation for loading and dumping Parquet, so we wrapped that code as Java APIs.

In our tests, performance on our workload improved by more than 2x compared with row-based load and dump, so we would like to contribute this code to Arrow.

Since this is a completely independent change, it does not modify any existing Arrow code. We added two folders: java/adapter/parquet and cpp/src/jni/parquet.

Reporter: Chendi.Xue

Related issues:

PRs and other links:

Note: This issue was originally created as ARROW-6720. Please see the migration documentation for further details.
