Conversation

@liancheng
Contributor

What changes were proposed in this pull request?

When inserting into an existing partitioned table, the partitioning columns should always be determined by the catalog metadata of the target table. Extra partitionBy() calls don't make sense and can corrupt existing data, because the newly inserted data may end up with a wrong partition directory layout.
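
For illustration, a minimal sketch of the intended usage, assuming a hypothetical existing table `sales` partitioned by `ds` (both names are made up for this example):

val df = spark.table("staged_sales")  // hypothetical source data

// Correct: insertInto() derives the partitioning from the catalog metadata
// of `sales`; no partitionBy() call is needed.
df.write.mode("append").insertInto("sales")

// After this change, combining partitionBy() with insertInto() fails fast with
// an AnalysisException instead of silently writing a wrong directory layout:
// df.write.partitionBy("ds").insertInto("sales")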

How was this patch tested?

New test case added in InsertIntoHiveTableSuite.
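
A rough sketch of what such a test might look like (the table name, schema, and assertion here are assumptions for illustration, not the actual test):

test("insertInto() can't be used together with partitionBy()") {
  sql("CREATE TABLE partitioned (a INT) PARTITIONED BY (b INT)")
  val df = spark.range(10).selectExpr("CAST(id AS INT) AS a", "CAST(id AS INT) AS b")
  // The write should be rejected during analysis, before any data is written.
  val e = intercept[AnalysisException] {
    df.write.partitionBy("b").insertInto("partitioned")
  }
  assert(e.getMessage.contains("partitionBy()"))
}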

@liancheng
Contributor Author

cc @cloud-fan @clockfly @yhuai

if (partitioningColumns.isDefined) {
  throw new AnalysisException(
    "insertInto() can't be used together with partitionBy(). " +
      "Partition columns are defined by the table into which data is being inserted.")
}

Contributor

Mention that partitionBy is not needed?

@SparkQA

SparkQA commented Jun 18, 2016

Test build #60736 has finished for PR 13747 at commit d24ed58.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@yhuai
Contributor

yhuai commented Jun 18, 2016

LGTM. Merging to master and branch 2.0.

asfgit pushed a commit that referenced this pull request Jun 18, 2016
…By()

## What changes were proposed in this pull request?

When inserting into an existing partitioned table, the partitioning columns should always be determined by the catalog metadata of the target table. Extra `partitionBy()` calls don't make sense and can corrupt existing data, because the newly inserted data may end up with a wrong partition directory layout.

## How was this patch tested?

New test case added in `InsertIntoHiveTableSuite`.

Author: Cheng Lian <lian@databricks.com>

Closes #13747 from liancheng/spark-16033-insert-into-without-partition-by.

(cherry picked from commit 10b6714)
Signed-off-by: Yin Huai <yhuai@databricks.com>
@asfgit asfgit closed this in 10b6714 Jun 18, 2016
@liancheng liancheng deleted the spark-16033-insert-into-without-partition-by branch June 18, 2016 04:59