Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mark Encoding::BIT_PACKED as deprecated and document its compatibility issues #5348

Merged
merged 2 commits into from
Jan 31, 2024

Conversation

jhorstmann
Copy link
Contributor

Which issue does this PR close?

Related to #5338.

Rationale for this change

This marks Encoding::BIT_PACKED as deprecated, so that users will get notified if they explicitly use it for a v1 LevelEncoder.

We probably can't remove or fix the encoding without a lengthy deprecation cycle since users might want to roundtrip existing files using the incorrect bit order, but they should at least be made aware of the compatibility issues.

The encoding module should be behind the experimental feature flag, so hopefully not many people are using this api and we might be able to create a spec compliant implementation in the future.

What changes are included in this PR?

Are there any user-facing changes?

@github-actions github-actions bot added the parquet Changes to the parquet crate label Jan 30, 2024
@mapleFU
Copy link
Member

mapleFU commented Jan 30, 2024

Seems that BIT_PACKED can be used for boolean data, would that also be deprecated?

@jhorstmann
Copy link
Contributor Author

Seems that BIT_PACKED can be used for boolean data, would that also be deprecated?

Boolean columns use either PLAIN or RLE encoding, where the PLAIN encoding is actually using bitpacking starting from the least-significant bit. I think the documentation for this is confusing since PLAIN Boolean links to the deprecated bit-packed encoding, followed by the note "LSB first", making this a quite different encoding.

@mapleFU
Copy link
Member

mapleFU commented Jan 30, 2024

Ah you're right, boolean PLAIN will pack into bits, but encoding is PLAIN, my bad.

@tustvold tustvold merged commit 93c7a12 into apache:master Jan 31, 2024
16 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
parquet Changes to the parquet crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants