fix: Cast string to boolean not compatible with Spark #107
Merged
sunchao merged 5 commits into apache:main from erenavsarogullari:comet-issue-17 on Feb 26, 2024
Conversation
viirya
reviewed
Feb 24, 2024
Member
Comet doesn't support the ANSI-enabled case for now. I think we can ignore it for the time being.
viirya
reviewed
Feb 24, 2024
viirya
reviewed
Feb 25, 2024
viirya
reviewed
Feb 25, 2024
viirya
reviewed
Feb 25, 2024
Comment on lines +80 to +83
(DataType::Utf8, DataType::Boolean) => Self::spark_cast_utf8_to_boolean::<i32>(&array),
(DataType::LargeUtf8, DataType::Boolean) => {
    Self::spark_cast_utf8_to_boolean::<i64>(&array)
}
Member
We probably need to handle dictionary types too. Maybe as a follow up.
Member
Author
Is this for casting from Dictionary.value_type to Boolean?
Member
i.e., Dictionary(_, DataType::Utf8) and Dictionary(_, DataType::LargeUtf8)
Member
Author
Sure, please feel free to create an issue so I can have a look after this PR. Thanks.
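The dictionary follow-up discussed in this thread could be approached along the lines below. This is a minimal sketch using only the standard library, not Comet's or Arrow's actual API: `DictColumn`, `parse_bool`, and `cast_dict_to_bool` are hypothetical names, and the accepted literals are assumed to match the Spark `StringUtils` reference linked in the PR description. The idea is to cast each distinct dictionary value once and then resolve every row through its key.

```rust
/// Hypothetical stand-in for a dictionary-encoded string column:
/// `keys[i]` indexes into `values`, and a `None` key represents a NULL row.
struct DictColumn {
    keys: Vec<Option<usize>>,
    values: Vec<String>,
}

/// Assumed Spark-style rule (non-ANSI): unrecognized strings become NULL (`None`).
fn parse_bool(input: &str) -> Option<bool> {
    match input.trim().to_ascii_lowercase().as_str() {
        "t" | "true" | "y" | "yes" | "1" => Some(true),
        "f" | "false" | "n" | "no" | "0" => Some(false),
        _ => None,
    }
}

/// Casts each distinct dictionary value once, then looks results up by key.
/// A NULL key or an out-of-range key yields NULL in this sketch.
fn cast_dict_to_bool(col: &DictColumn) -> Vec<Option<bool>> {
    let parsed: Vec<Option<bool>> = col.values.iter().map(|s| parse_bool(s)).collect();
    col.keys
        .iter()
        .map(|&key| key.and_then(|k| parsed.get(k).copied().flatten()))
        .collect()
}
```

Casting the values once keeps the work proportional to the number of distinct strings rather than the number of rows, which is the usual reason to special-case dictionary input.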
viirya
approved these changes
Feb 25, 2024
sunchao
approved these changes
Feb 26, 2024
Member
Merged, thanks!
Which issue does this PR close?
Closes #17.
Rationale for this change
Comet's results for string-to-boolean casts need to match Spark's.
What changes are included in this PR?
Spark accepts a specific set of string values when casting a string to boolean, and Comet needs to be compatible with that behavior.
Spark Reference API:
https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/StringUtils.scala#L74
https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala#L585
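As an illustration of the behavior the referenced Spark code defines, the sketch below maps a single string to a boolean the way Spark does with ANSI mode off: the input is trimmed and lowercased, `t`, `true`, `y`, `yes`, `1` become true, `f`, `false`, `n`, `no`, `0` become false, and anything else becomes NULL. The function names are hypothetical and this is not the code added by the PR. It also assumes that Rust's `trim` and ASCII lowercasing are close enough to Spark's `trimAll().toLowerCase()` for these literals.

```rust
/// Hypothetical helper (not Comet's actual function): maps one string to
/// Spark's non-ANSI cast result, where `None` stands for SQL NULL.
fn spark_str_to_bool(input: &str) -> Option<bool> {
    // Assumption: `trim` + ASCII lowercasing approximates Spark's
    // `trimAll().toLowerCase()` for the accepted literals.
    let normalized = input.trim().to_ascii_lowercase();
    match normalized.as_str() {
        "t" | "true" | "y" | "yes" | "1" => Some(true),
        "f" | "false" | "n" | "no" | "0" => Some(false),
        _ => None,
    }
}

/// Applies the rule to a nullable column, keeping NULL inputs as NULL.
fn cast_column(values: &[Option<&str>]) -> Vec<Option<bool>> {
    values
        .iter()
        .map(|&v| v.and_then(spark_str_to_bool))
        .collect()
}
```

With `spark.sql.ansi.enabled` set, Spark raises an error for unrecognized input instead of returning NULL; that case is not covered by this sketch.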
Note: I will also add support for the spark.sql.ansi.enabled case.
How are these changes tested?
Added a new unit test.