-
Notifications
You must be signed in to change notification settings - Fork 4.8k
HIVE-29372: Iceberg: [V3] Fix inconsistencies when vectorizartion is enabled in case of variant shredding. #6245
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| } | ||
|
|
||
| public static boolean shouldUseVariantShredding(Properties tableProps, Schema tableSchema) { | ||
| return isVariantShreddingEnabled(tableProps) && hasVariantFields(tableSchema); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe use Maps.fromProperties to avoid code duplication. or introduce a generic signature UnaryOperator<String> tableProps -> tableProps::getProperty
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
…enabled in case of variant shredding.
deniskuzZ
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, pending tests
| (1, parse_json('{"a": 1}')), | ||
| (2, parse_json('{"b": 2}')); | ||
|
|
||
| SELECT |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does it cover some specific test-case? with qtest you can't test the physical layout anyways.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
earlier it was giving wrong results:
NULL 2
NULL NULL
now it is correct, I added the scenario that I tried in #6152 (comment)
I didn't put like all cases to show output is same, just it is correct, the previous original case was throwing exception, this was giving wrong result.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wonder if we same issue could be reproduced on existing table tbl_shredded_variant by inserting (1, parse_json('{"name": "John", "age": 30, "active": true}')) with complete schema last
SELECT
variant_get(data, '$.name') as name;
variant_get(data, '$.age', 'int') as age;
FROM tbl_shredded_variant
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI, added vectorization support in #6224
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No Denys, it throws the same MALFORMED_VARIANT exception like now, I tried pulling the full one below, added a new record but any query is giving the same exception as in the previous thread.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IDK, works fine for me: 1062dc5#diff-779297369ca87b0f30855401004c6819e0b64fb82167861f08dcba08e2ae3859
Execution mode: vectorized
1 NULL
NULL 2
|
…enabled in case of variant shredding. (apache#6245)



What changes were proposed in this pull request?
As of now Vectorization isn't supported for Shredded Variants, so we fallback to non-vectorized mode in that case
Why are the changes needed?
#6152 (comment)
Does this PR introduce any user-facing change?
Vectorization enabled/disabled doesn't bother.
How was this patch tested?
UT