[SPARK-19881][SQL] Support Dynamic Partition Inserts params with SET command #17223
Conversation
Test build #74247 has started for PR 17223 at commit
| "hive.exec.max.dynamic.partitions", | ||
| "hive.exec.max.dynamic.partitions.pernode", | ||
| "hive.exec.max.created.files", | ||
| "hive.error.on.empty.partition" |
We need only 4 of the 6 parameters in https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML#LanguageManualDML-DynamicPartitionInserts.
Retest this please
Test build #74255 has finished for PR 17223 at commit
Could you review this when you have some time, @gatorsmile?
Hi, @cloud-fan.
| "hive.error.on.empty.partition" | ||
| ).foreach { param => | ||
| if (sqlConf.contains(param)) { | ||
| client.runSqlHive(s"set $param=${sqlConf.getConfString(param)}") |
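Pieced together, the two diff excerpts above amount to roughly the following. This is a reconstruction for readability, not the verbatim patch; `sqlConf` and `client` stand for the session's `SQLConf` and the Hive client used by the insert path.

```scala
// Reconstruction of the diff excerpts above: forward the dynamic-partition
// parameters a user has explicitly set in the Spark session to Hive via
// "set <key>=<value>", so the Hive side sees the same values before the
// dynamic partition insert runs.
Seq(
  "hive.exec.max.dynamic.partitions",
  "hive.exec.max.dynamic.partitions.pernode",
  "hive.exec.max.created.files",
  "hive.error.on.empty.partition"
).foreach { param =>
  if (sqlConf.contains(param)) {
    client.runSqlHive(s"set $param=${sqlConf.getConfString(param)}")
  }
}
```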
Should we do it when users issue the SET command? Is it a general issue?
That would be the best approach, since this is a general issue for all unhandled Hive param options. The reason to do it here is that `SetCommand` lives in sql/core and does not interact with the Hive client. Is there a way to invoke `runSqlHive` there?
Since the Hive client is shared among all sessions, we can't set Hive conf dynamically while keeping session isolation. I think we should treat Hive conf as static SQL conf, and throw an exception when users try to change it.
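For illustration only, a minimal standalone sketch of that "treat Hive conf as static" idea; the `StaticHiveConfGuard` object and the `hive.` prefix check are assumptions made for this sketch, not Spark's actual implementation.

```scala
import scala.collection.mutable

// Hypothetical sketch: ordinary session settings can be changed freely, but
// hive.* settings are rejected at SET time, because the single Hive client is
// shared by all sessions and cannot be reconfigured per session.
object StaticHiveConfGuard {
  private val sessionConf = mutable.Map[String, String]()

  def set(key: String, value: String): Unit = {
    // Assumed heuristic for this sketch; a real implementation would more
    // likely keep an explicit list of static configuration keys.
    if (key.startsWith("hive.")) {
      throw new UnsupportedOperationException(
        s"Cannot modify the value of a static Hive config at runtime: $key")
    }
    sessionConf(key) = value
  }

  def get(key: String): Option[String] = sessionConf.get(key)
}
```

With such a guard, setting `hive.exec.max.dynamic.partitions` at runtime would fail fast instead of silently having no effect.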
I see. That's the reason not to support this. Thank you, @cloud-fan.
I'll close this PR and the JIRA issue.
What changes were proposed in this pull request?
Since Spark 2.0.0, `SET` commands do not pass their values to `HiveClient`. In most cases Spark handles this well. However, for dynamic partition inserts, users meet a misleading situation (sketched below): after raising the limit, the last error is the same as the previous one, because `HiveClient` does not know the new value 1001. While users can control `hive.exec.dynamic.partition.mode`, there is no way to change the default value of `hive.exec.max.dynamic.partitions` in `HiveClient` with a `SET` command.
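As a hedged sketch of that situation (table names `t` and `src` are made up here, and the exact error text may differ):

```scala
// Hypothetical spark-shell session illustrating the misleading behavior.
spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict")

// Fails when the insert needs more dynamic partitions than Hive's limit allows.
spark.sql("INSERT INTO TABLE t PARTITION(p) SELECT value, key AS p FROM src")

// This SET appears to succeed, but it only updates Spark's SQLConf; the new
// value is never passed on to the HiveClient.
spark.sql("SET hive.exec.max.dynamic.partitions=1001")

// So retrying produces exactly the same error as before.
spark.sql("INSERT INTO TABLE t PARTITION(p) SELECT value, key AS p FROM src")
```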
The root cause is that `hive` parameters are passed to `HiveClient` at creation time. So, the workaround is to use `--hiveconf` when starting `spark-shell`; however, the value is still unchangeable from inside `spark-shell`. We had better handle this case without misleading error messages that end in an infinite loop.

How was this patch tested?
Manual.