[Format] Add an Arrow Canonical Extension Type for Parquet Variant #46908

@alamb

Describe the enhancement requested

Parquet has added a new type for semi-structured data called Variant, which is defined here:

Because engines commonly read data from Parquet into Arrow for in-memory processing, it is useful to have support for Variant in Arrow. @CurtHagenlocher proposes adding native Variant support in the Arrow format itself here:

An alternate approach is to add a Canonical Extension Type

@zeroshade wrote up a proposal and implemented it in Go

This ticket tracks the idea of adding Variant as an official extension type
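To make the extension-type idea concrete, here is a minimal, library-free sketch of how a Variant column could be described: an ordinary storage type plus the two field-metadata keys Arrow uses to mark extension types (`ARROW:extension:name` and `ARROW:extension:metadata`). The `Field` class, the helper functions, and the extension name `example.parquet.variant` are hypothetical placeholders, not names taken from the proposal. The storage layout (a struct of two binary fields, `metadata` and `value`) is an assumption based on the unshredded Parquet Variant representation.

```python
from dataclasses import dataclass, field

# Field-metadata keys Arrow uses to mark a field as an extension type.
EXT_NAME_KEY = "ARROW:extension:name"
EXT_META_KEY = "ARROW:extension:metadata"

# Assumption: placeholder name for illustration, not one settled in this ticket.
HYPOTHETICAL_VARIANT_NAME = "example.parquet.variant"


@dataclass
class Field:
    """Simplified stand-in for an Arrow field (hypothetical, not a real API)."""
    name: str
    storage_type: str
    children: list = field(default_factory=list)
    metadata: dict = field(default_factory=dict)


def variant_field(name: str) -> Field:
    """Build a struct<metadata: binary, value: binary> field tagged as an extension."""
    storage_children = [Field("metadata", "binary"), Field("value", "binary")]
    return Field(
        name=name,
        storage_type="struct",
        children=storage_children,
        metadata={EXT_NAME_KEY: HYPOTHETICAL_VARIANT_NAME, EXT_META_KEY: ""},
    )


def extension_name(f: Field):
    """Return the extension name, or None for a plain (non-extension) field."""
    return f.metadata.get(EXT_NAME_KEY)
```

A reader that does not recognize the extension name can still treat the column as its plain struct storage, which is the main appeal of the extension-type route over a new native type.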

See also @neilechao's PR to add Variant read support to Parquet

Component(s)

Format
