@gatorsmile
Member

What changes were proposed in this pull request?

In Spark 2.0, saveAsTable does not work when the source DataFrame is built on a Hive table, although the same code works in Spark 1.6.

Spark 1.6

scala> sql("create table sample.sample stored as SEQUENCEFILE as select 1 as key, 'abc' as value")
res2: org.apache.spark.sql.DataFrame = []

scala> val df = sql("select key, value as value from sample.sample")
df: org.apache.spark.sql.DataFrame = [key: int, value: string]

scala> df.write.mode("append").saveAsTable("sample.sample")

scala> sql("select * from sample.sample").show()
+---+-----+
|key|value|
+---+-----+
|  1|  abc|
|  1|  abc|
+---+-----+

Spark 2.0

scala> df.write.mode("append").saveAsTable("sample.sample")
org.apache.spark.sql.AnalysisException: Saving data in MetastoreRelation sample, sample
 is not supported.;

This PR adds support for this case using by-name resolution. Spark 1.6 resolved the columns by position, and that behavior is wrong. The column order needs to be adjusted to match the target table.
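
To make the difference between the two resolution modes concrete, here is a minimal sketch in plain Scala. It is not the code in this patch and does not use any Spark API; `ByNameResolution` and `reorderByName` are hypothetical names introduced only for illustration, and exact (case-sensitive) name matching is an assumption.

```scala
// Hypothetical illustration only, not Spark's implementation.
// Reorders one row of source values so that it lines up with the
// target table's column order, matching columns by name.
object ByNameResolution {
  def reorderByName(
      sourceCols: Seq[String],
      targetCols: Seq[String],
      row: Seq[Any]): Either[String, Seq[Any]] = {
    // Map each source column name to its position in the source row.
    val positionOf: Map[String, Int] = sourceCols.zipWithIndex.toMap
    // Target columns that have no same-named column in the source.
    val missing = targetCols.filterNot(positionOf.contains)

    if (sourceCols.length != targetCols.length)
      Left(s"expected ${targetCols.length} columns, got ${sourceCols.length}")
    else if (missing.nonEmpty)
      Left(s"cannot resolve by name: ${missing.mkString(", ")}")
    else
      // Pick the source values in the order the target table declares them.
      Right(targetCols.map(c => row(positionOf(c))))
  }
}
```

With source columns (value, key) and a target table declared as (key, value), by-position resolution would write the string into key, while the sketch above swaps the two values before the write and reports an error when a target column has no match.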

How was this patch tested?

Test cases are added.

@SparkQA

SparkQA commented Aug 12, 2016

Test build #63645 has finished for PR 14612 at commit 71399f1.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 3, 2016

Test build #66252 has finished for PR 14612 at commit d2b2d91.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gatorsmile
Member Author

cc @cloud-fan @yhuai

@gatorsmile
Member Author

retest this please

@SparkQA

SparkQA commented Nov 6, 2016

Test build #68248 has finished for PR 14612 at commit d2b2d91.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

Contributor

Is it possible to extract the similar logic from CreateDataSourceTableAsSelectCommand and reuse it here?

@cloud-fan
Contributor

Hi @gatorsmile, do you have time to work on this? I'd like to get it into 2.1.

@gatorsmile
Member Author

Sure, will finish this tomorrow.

@SparkQA

SparkQA commented Nov 11, 2016

Test build #68495 has finished for PR 14612 at commit 3035cfe.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

case c @ CreateTable(tableDesc, mode, Some(query))
    if mode == SaveMode.Append && isHiveSerdeTable(tableDesc.identifier) =>

Contributor

Can we consolidate the Hive and data source table cases here?

Member Author

Let me try another way to fix it. Will submit a new PR.

Member Author

@gatorsmile Nov 15, 2016

uh... Actually, I found another bug in our write path.

@gatorsmile closed this Nov 24, 2016