Unable to Stream Parquet File using JDBC #460

@rickysaltzer

I seem to have hit a roadblock when trying to stream a Parquet file into ClickHouse using the JDBC library. Although the library doesn't contain a ClickHouseFormat.PARQUET, I attempted to specify the format via the sql parameter.

I'm curious whether this is a known limitation or whether I'm just doing something wrong.

Query:

        connection.createStatement()
            .write()
            .sql("INSERT INTO `${studio.getDBName()}`.`${table}` FORMAT Parquet")
            .data(File("/tmp/pq/dump.parquet"))
            .send()

Error:

DB::Exception: Error while reading Parquet data: IOError: Couldn't deserialize thrift: TProtocolException: Invalid data
Deserializing page header failed.
 (version 20.4.4.18 (official build))

I've verified the Parquet file's correctness by using the clickhouse-client to import it, which went without issue.

cat /tmp/pq/dump.parquet | clickhouse-client ...
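
A cheap sanity check on the file itself, independent of either client, is to confirm the Parquet magic bytes: a Parquet file starts and ends with the 4-byte ASCII marker `PAR1`. The sketch below is only that, a sketch: `hasParquetMagic` is a hypothetical helper, not part of the JDBC library, and passing it shows only that the file on disk is framed correctly. It does not validate page headers or say anything about what the driver sends over the wire.

```kotlin
import java.io.File
import java.io.RandomAccessFile

// Parquet files begin and end with the 4-byte ASCII magic "PAR1".
private val PARQUET_MAGIC = "PAR1".toByteArray(Charsets.US_ASCII)

// Hypothetical helper (not part of clickhouse-jdbc): returns true when the
// file carries the Parquet magic bytes at both the start and the end.
fun hasParquetMagic(file: File): Boolean {
    // Smallest possible framing: header magic + 4-byte footer length + footer magic.
    if (file.length() < 12) return false
    RandomAccessFile(file, "r").use { raf ->
        val head = ByteArray(4)
        val tail = ByteArray(4)
        raf.readFully(head)            // first 4 bytes
        raf.seek(raf.length() - 4)     // jump to the last 4 bytes
        raf.readFully(tail)
        return head.contentEquals(PARQUET_MAGIC) && tail.contentEquals(PARQUET_MAGIC)
    }
}

fun main() {
    // Path taken from the issue; adjust as needed.
    val file = File("/tmp/pq/dump.parquet")
    println("${file.path}: Parquet magic present = ${hasParquetMagic(file)}")
}
```

If this returns true for the file and the insert still fails, that would be consistent with the bytes being altered somewhere between the file and the server rather than with a bad file, though it does not prove it.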
