fix endianness related issue with complex metric compression #17391
Closed
clintropolis wants to merge 1 commit into apache:master from
gianm reviewed Oct 22, 2024
}

/**
 * Preferred byte order to read values from a buffer with {@link #fromByteBuffer(ByteBuffer, int)} and similar methods
Contributor
Should also update fromByteBuffer contract to say that the caller must pass in a buffer whose order matches getByteOrder(). Please double-check all callers of fromByteBuffer to make sure they do that properly.
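One way a caller could honor such a contract is sketched below. This is only an illustration under assumptions: OrderedStrategy, LITTLE_ENDIAN_LONG, and read are hypothetical names standing in for the real types, not Druid's actual ObjectStrategy or its callers. The sketch relies on the documented java.nio behavior that ByteBuffer.duplicate() always returns a big endian buffer, so the order has to be re-applied on every duplicated view.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class CallerContractSketch {
    // Hypothetical stand-in for a strategy that declares its preferred order.
    // This is NOT Druid's ObjectStrategy; it only mirrors the idea discussed here.
    public interface OrderedStrategy<T> {
        // Contract: 'buffer' must already be in the order returned by getByteOrder().
        T fromByteBuffer(ByteBuffer buffer, int numBytes);

        // Big endian default, matching the java.nio default.
        default ByteOrder getByteOrder() {
            return ByteOrder.BIG_ENDIAN;
        }
    }

    // Hypothetical strategy whose values are stored little endian.
    public static final OrderedStrategy<Long> LITTLE_ENDIAN_LONG = new OrderedStrategy<Long>() {
        @Override
        public Long fromByteBuffer(ByteBuffer buffer, int numBytes) {
            // Absolute read; trusts the caller to have set the order.
            return buffer.getLong(buffer.position());
        }

        @Override
        public ByteOrder getByteOrder() {
            return ByteOrder.LITTLE_ENDIAN;
        }
    };

    // Caller side: duplicate() resets the order to big endian, so re-apply
    // the strategy's order before handing the view to fromByteBuffer.
    public static <T> T read(OrderedStrategy<T> strategy, ByteBuffer source, int numBytes) {
        ByteBuffer view = source.duplicate().order(strategy.getByteOrder());
        return strategy.fromByteBuffer(view, numBytes);
    }
}
```

Centralizing the order() call in one caller-side helper like read keeps the check in a single place instead of spreading it across every call site.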
@@ -34,17 +34,29 @@ public class CompressedVariableSizedBlobColumnSupplier implements Supplier<Compr
public static CompressedVariableSizedBlobColumnSupplier fromByteBuffer(
Contributor
This method seems like a bug waiting to happen, where someone should provide both compressionOrder and valueOrder but only provides compressionOrder because this method exists. Remove it?
Member
Author
closing in favor of #17422, though i will likely resurrect some of the ideas in this PR later
This PR is a follow-up to #16863 to fix a problem for any complex metrics which do not specify the byte order of the underlying buffer and were counting on the Java big endian default (such as approxHistogram).

The underlying CompressedBlockReader already had the ability to specify an order on the buffer for the values it reads, but it was just specifying the same order that the compression was using, which is native order by default for performance reasons.

To push this into the block reader, I have added ObjectStrategy.getByteOrder() with a default implementation returning big endian. It probably would have been safe to hard-code big endian, since the things that need little endian were already explicitly setting the order on value read. However, thinking forward, it seems like it would be nice to not have to re-order the values again in the object strategy if we can know the order ahead of time.

I have added implementations for the ObjectStrategy types which I knew were little endian, even though they already explicitly order values on read due to the behavior of the uncompressed complex columns.
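The kind of mismatch described above can be reproduced with plain java.nio, independent of Druid. The following is a minimal sketch, and the class and method names (ByteOrderDemo, writeBigEndian, readWithOrder) are made up for illustration. It writes a value without setting an order, which means big endian, and then reads it back through a buffer whose order was forced to something else, as would happen when the value buffer inherits the native order used for compression on a little endian machine.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class ByteOrderDemo {
    // Serializes an int without setting an order, so the java.nio default
    // (big endian) applies, as for a metric that never specifies an order.
    public static byte[] writeBigEndian(int value) {
        return ByteBuffer.allocate(Integer.BYTES).putInt(value).array();
    }

    // Reads the int back from a view of the same bytes with the given order forced on it.
    public static int readWithOrder(byte[] bytes, ByteOrder order) {
        return ByteBuffer.wrap(bytes).order(order).getInt(0);
    }

    public static void main(String[] args) {
        byte[] bytes = writeBigEndian(1); // bytes are 00 00 00 01

        // Matching order: the value round-trips.
        System.out.println(readWithOrder(bytes, ByteOrder.BIG_ENDIAN));    // prints 1

        // Mismatched order: the same bytes decode as 0x01000000.
        System.out.println(readWithOrder(bytes, ByteOrder.LITTLE_ENDIAN)); // prints 16777216
    }
}
```

Nothing fails loudly in the mismatched case. The read succeeds and returns a different number, which is why the reader has to know the order the values were written with.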