[SPARK-15706] [SQL] Fix Wrong Answer when using IF NOT EXISTS in INSERT OVERWRITE for DYNAMIC PARTITION #13447

Test build #59753 has finished for PR 13447 at commit

Test build #59755 has finished for PR 13447 at commit

val tableIdent = visitTableIdentifier(ctx.tableIdentifier)
val partitionKeys = Option(ctx.partitionSpec).map(visitPartitionSpec).getOrElse(Map.empty)

if (ctx.EXISTS != null && ctx.partitionSpec == null) {
We could enforce this in grammar.
Sure, will do it. Thanks!

@gatorsmile the overall approach seems good to me. We are currently fixing this for the SQL codepath. I was wondering if there are other codepaths that can cause this unwanted behavior? If there are, then we should move the check to the

I think your concern is valid. Will add an So far, dynamic partitioning is used by the `data.write.mode(SaveMode.Overwrite).partitionBy("part").insertInto("partitioned")` The default value is always
However, there still exist other bugs in

Test build #59865 has finished for PR 13447 at commit

Test build #59873 has finished for PR 13447 at commit

retest this please

Test build #60098 has finished for PR 13447 at commit

@hvanhovell Do the latest code changes resolve your comment? Thanks!

The behavior of the following query seems weird (if it works). (the case without IF NOT EXISTS) For this case, the number of columns produced by the query (1) is less than the required number of columns (2). There should be an exception, right?
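
For readers following the column-count point, here is a minimal sketch of the arithmetic in plain Scala. It assumes a hypothetical table with one data column and a partition spec in which a value of `None` marks a dynamic partition column; `ColumnCountCheck` and its methods are illustrative names, not Spark APIs, and the query referred to in the comment above is not reproduced here.

```scala
object ColumnCountCheck {
  // Hypothetical helper: the SELECT must supply every data column plus one
  // column per dynamic partition (static partition values come from the spec).
  def requiredQueryColumns(dataColumns: Int, partition: Map[String, Option[String]]): Int =
    dataColumns + partition.count { case (_, value) => value.isEmpty }

  // Returns Left(message) when the query's column count does not match.
  def check(
      queryColumns: Int,
      dataColumns: Int,
      partition: Map[String, Option[String]]): Either[String, Unit] = {
    val required = requiredQueryColumns(dataColumns, partition)
    if (queryColumns == required) Right(())
    else Left(s"query produces $queryColumns column(s) but the insert requires $required")
  }
}
```

Under these assumptions, a spec such as `partition (p1='a', p2)` over a one-column table requires two query columns, so a one-column query would be rejected.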

@yhuai You are right. This is not a good test case for verifying this. I will add a case like

sql(
  """
    |INSERT OVERWRITE TABLE table_with_partition
    |partition (p1='a',p2) IF NOT EXISTS
    |SELECT 'blarr3', 'newPartition'
  """.stripMargin)

The above statement returns an empty result set, which is still a wrong answer. Regarding your follow-up question, I also realized this is another bug we should capture, as shown above: #13447 (comment). My original plan was to submit a PR addressing that issue after #12313 is merged. It sounds like we do not plan to do #12313 in 2.0, so I can do it now. Thanks!

Test build #60344 has finished for PR 13447 at commit

retest this please

Test build #60619 has finished for PR 13447 at commit

cc @hvanhovell @yhuai Could you please review this PR again? Thanks!

Thanks. LGTM. Merging to master and branch 2.0.

…T OVERWRITE for DYNAMIC PARTITION

#### What changes were proposed in this pull request?
`IF NOT EXISTS` in `INSERT OVERWRITE` should not support dynamic partitions. If we specify `IF NOT EXISTS`, the inserted statement is not shown in the table.

This PR is to issue an exception in this case, just like what Hive does. Also issue an exception if users specify `IF NOT EXISTS` if users do not specify any `PARTITION` specification.

#### How was this patch tested?
Added test cases into `PlanParserSuite` and `InsertIntoHiveTableSuite`

Author: gatorsmile <gatorsmile@gmail.com>

Closes #13447 from gatorsmile/insertIfNotExist.

(cherry picked from commit e5d703b)
Signed-off-by: Yin Huai <yhuai@databricks.com>

What changes were proposed in this pull request?
`IF NOT EXISTS` in `INSERT OVERWRITE` should not support dynamic partitions. If we specify `IF NOT EXISTS`, the inserted data is not shown in the table.

This PR issues an exception in this case, just as Hive does. It also issues an exception if users specify `IF NOT EXISTS` without any `PARTITION` specification.

How was this patch tested?
Added test cases to `PlanParserSuite` and `InsertIntoHiveTableSuite`.
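
The two rules described above can be sketched in plain Scala, independent of Spark. This is only an illustration under stated assumptions: `InsertSpec` and `InsertValidation` are hypothetical names, a partition value of `None` stands for a dynamic partition column, and the error messages are made up for the example rather than taken from the patch.

```scala
// Hypothetical stand-in for the parser's view of an INSERT OVERWRITE statement.
// Some(value) is a static partition value; None marks a dynamic partition column.
final case class InsertSpec(
    table: String,
    partition: Map[String, Option[String]],
    ifNotExists: Boolean)

object InsertValidation {
  // Returns Left(message) when IF NOT EXISTS is combined with an unsupported
  // partition specification, and Right(spec) otherwise.
  def validate(spec: InsertSpec): Either[String, InsertSpec] = {
    val dynamicKeys = spec.partition.collect { case (key, None) => key }.toSeq.sorted
    if (spec.ifNotExists && spec.partition.isEmpty) {
      // Rule 2: IF NOT EXISTS without any PARTITION specification.
      Left("IF NOT EXISTS requires a PARTITION specification")
    } else if (spec.ifNotExists && dynamicKeys.nonEmpty) {
      // Rule 1: IF NOT EXISTS together with dynamic partition columns.
      Left("Dynamic partitions do not support IF NOT EXISTS: " +
        dynamicKeys.mkString("[", ",", "]"))
    } else {
      Right(spec)
    }
  }
}
```

With this sketch, `partition (p1='a', p2) IF NOT EXISTS` is rejected because `p2` is dynamic, while a fully static spec such as `partition (p1='a') IF NOT EXISTS` passes.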