Skip to content

Conversation

@jhorstmann
Copy link
Contributor

Which issue does this PR close?

Related to #5338.

Rationale for this change

This marks Encoding::BIT_PACKED as deprecated, so that users will get notified if they explicitly use it for a v1 LevelEncoder.

We probably can't remove or fix the encoding without a lengthy deprecation cycle since users might want to roundtrip existing files using the incorrect bit order, but they should at least be made aware of the compatibility issues.

The encoding module should be behind the experimental feature flag, so hopefully not many people are using this api and we might be able to create a spec compliant implementation in the future.

What changes are included in this PR?

Are there any user-facing changes?

@github-actions github-actions bot added the parquet Changes to the parquet crate label Jan 30, 2024
@mapleFU
Copy link
Member

mapleFU commented Jan 30, 2024

Seems that BIT_PACKED can be used for boolean data, would that also be deprecated?

@jhorstmann
Copy link
Contributor Author

Seems that BIT_PACKED can be used for boolean data, would that also be deprecated?

Boolean columns use either PLAIN or RLE encoding, where the PLAIN encoding is actually using bitpacking starting from the least-significant bit. I think the documentation for this is confusing since PLAIN Boolean links to the deprecated bit-packed encoding, followed by the note "LSB first", making this a quite different encoding.

@mapleFU
Copy link
Member

mapleFU commented Jan 30, 2024

Ah you're right, boolean PLAIN will pack into bits, but encoding is PLAIN, my bad.

@tustvold tustvold merged commit 93c7a12 into apache:master Jan 31, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

parquet Changes to the parquet crate

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants