[SPARK-49152][SQL] V2SessionCatalog should use V2Command #47660

Closed
amaliujia wants to merge 13 commits into apache:master from amaliujia:create_table_v2

@amaliujia

What changes were proposed in this pull request?

V2SessionCatalog should use V2Command when possible.

Why are the changes needed?

The session catalog can be overridden, and the overriding catalog should use v2 commands; otherwise, V1Command will still call the Hive metastore or the built-in session catalog.
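
To illustrate the motivation, here is a minimal sketch in plain Scala with no Spark dependency. Every name in it (`ToyCatalog`, `BuiltinCatalog`, `CustomCatalog`, `v1CreateTable`, `v2CreateTable`) is a hypothetical stand-in rather than a Spark class. It only models the idea that a v1-style command is hard-wired to the built-in catalog, while a v2-style command goes through whichever catalog it is given.

```scala
import scala.collection.mutable

object CatalogOverrideSketch {
  // Hypothetical stand-in for a catalog plugin interface (not a Spark API).
  trait ToyCatalog {
    val created: mutable.ArrayBuffer[String] = mutable.ArrayBuffer.empty[String]
    def createTable(name: String): Unit = created += name
  }

  // Stand-in for the built-in session catalog (for example, one backed by a Hive metastore).
  object BuiltinCatalog extends ToyCatalog

  // Stand-in for a user-provided catalog that overrides the session catalog.
  object CustomCatalog extends ToyCatalog

  // v1-style command: always talks to the built-in catalog.
  def v1CreateTable(name: String): Unit = BuiltinCatalog.createTable(name)

  // v2-style command: talks to whichever catalog it is handed.
  def v2CreateTable(catalog: ToyCatalog, name: String): Unit = catalog.createTable(name)

  def main(args: Array[String]): Unit = {
    v1CreateTable("t1")                // bypasses the override
    v2CreateTable(CustomCatalog, "t2") // reaches the override
    println(s"builtin: ${BuiltinCatalog.created.mkString(",")}")
    println(s"custom: ${CustomCatalog.created.mkString(",")}")
  }
}
```

In this toy model, the table created through the v1-style path never reaches the overriding catalog, which is the situation the PR description warns about.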

Does this PR introduce any user-facing change?

No

How was this patch tested?

Existing tests.

Was this patch authored or co-authored using generative AI tooling?

NO

@github-actions github-actions bot added the SQL label Aug 8, 2024
@amaliujia
Author

@cloud-fan

@amaliujia amaliujia changed the title from "[SPARK-49152][SQL] V2SessionCatalog should use V2Command when possible" to "[SPARK-49152][SQL] V2SessionCatalog should use V2Command" Aug 8, 2024
Comment thread sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2SQLSuite.scala Outdated
import DataSourceV2Implicits._
import org.apache.spark.sql.connector.catalog.CatalogV2Implicits._

lazy private val hadoopConf = session.sparkContext.hadoopConfiguration
Contributor

This shouldn't be a lazy val; we should call SessionState.newHadoopConf instead.
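
The reviewer does not spell out the reason. One plausible reading, stated here as an assumption, is that a `lazy val` captures the configuration once, while a method that builds a fresh configuration on each call picks up later changes to session settings. The sketch below shows only that general Scala behavior; `sessionSettings`, `cachedConf`, and `newConf` are hypothetical names, and plain maps stand in for Hadoop's `Configuration`.

```scala
import scala.collection.mutable

object HadoopConfSketch {
  // Hypothetical stand-in for mutable session-level settings.
  val sessionSettings: mutable.Map[String, String] =
    mutable.Map("fs.defaultFS" -> "file:///")

  // Evaluated once on first access; later setting changes are not seen.
  lazy val cachedConf: Map[String, String] = sessionSettings.toMap

  // Rebuilt on every call, so it reflects the settings at call time.
  def newConf(): Map[String, String] = sessionSettings.toMap

  def main(args: Array[String]): Unit = {
    // First access forces the lazy val and freezes its contents.
    println(s"initial: ${cachedConf("fs.defaultFS")}")
    // Change a setting after the lazy val has been evaluated.
    sessionSettings("fs.defaultFS") = "hdfs://nn:8020"
    println(s"cached: ${cachedConf("fs.defaultFS")}")
    println(s"fresh: ${newConf()("fs.defaultFS")}")
  }
}
```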

Comment thread sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2SQLSuite.scala Outdated
Rui Wang and others added 2 commits August 11, 2024 19:22
…/ResolveSessionCatalog.scala

Co-authored-by: Wenchen Fan <cloud0fan@gmail.com>
@cloud-fan
Contributor

cloud-fan commented Aug 12, 2024

thanks, merging to master!

@cloud-fan cloud-fan closed this in 2465cb0 Aug 12, 2024
amaliujia pushed a commit to amaliujia/spark that referenced this pull request Aug 12, 2024
Closes apache#47660 from amaliujia/create_table_v2.

Authored-by: Rui Wang <rui.wang@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
 private def qualifyLocInTableSpec(tableSpec: TableSpec): TableSpec = {
-  tableSpec.withNewLocation(tableSpec.location.map(makeQualifiedDBObjectPath(_)))
+  tableSpec.withNewLocation(tableSpec.location.map(loc => CatalogUtils.makeQualifiedPath(
+    CatalogUtils.stringToURI(loc), hadoopConf).toString))
Contributor

We should follow the v1 command code path and call CatalogUtils.URIToString to get the path string.

Contributor

Fixing in #47759.

dongjoon-hyun pushed a commit that referenced this pull request Aug 14, 2024
…ath string

### What changes were proposed in this pull request?

This is a follow-up to #47660 that restores the previous behavior. The table location string should be a Hadoop Path string rather than a URL string, which escapes all special characters.
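
The difference between the two string forms can be seen with `java.net.URI` alone. This is a minimal sketch rather than the code from the fix: it uses `URI.getPath` as a stand-in for the unescaped form so that the example needs no Hadoop dependency, and the path `/tmp/my table` is an invented example.

```scala
import java.net.URI

object LocationStringSketch {
  // Build a URI from an unescaped path. The multi-argument constructor
  // percent-encodes characters that are illegal in a URI, such as spaces.
  val uri: URI = new URI("file", null, "/tmp/my table", null)

  // URI string form: special characters are escaped.
  val escaped: String = uri.toString

  // Decoded path component: the original characters are preserved.
  val decodedPath: String = uri.getPath

  def main(args: Array[String]): Unit = {
    println(escaped)     // file:/tmp/my%20table
    println(decodedPath) // /tmp/my table
  }
}
```

Storing the escaped form as a table location would change what users see for any path containing spaces or other special characters, which is the behavior change this follow-up undoes.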

### Why are the changes needed?

To undo an unintentional behavior change.

### Does this PR introduce _any_ user-facing change?

No, the behavior change has not been released yet.

### How was this patch tested?

new test

### Was this patch authored or co-authored using generative AI tooling?

no

Closes #47759 from cloud-fan/fix.

Authored-by: Wenchen Fan <wenchen@databricks.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
cloud-fan added a commit to cloud-fan/spark that referenced this pull request Aug 15, 2024
cloud-fan added a commit that referenced this pull request Aug 15, 2024
…oop Path string

Closes #47765 from cloud-fan/fix.

Authored-by: Wenchen Fan <wenchen@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
cloud-fan added a commit that referenced this pull request Sep 5, 2024
…se V1 commands

### What changes were proposed in this pull request?

This is a follow-up to #47660. If users override `spark_catalog` with a `DelegatingCatalogExtension`, we should still use v1 commands, because `DelegatingCatalogExtension` forwards requests to the Hive metastore (HMS) and there are still behavior differences between v1 and v2 commands targeting HMS.

This PR also forces the use of v1 commands for certain commands that do not have a v2 version.
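
The resulting rule can be sketched as a small decision function. This is a toy model, not Spark's implementation: the types `SessionCatalogKind` and `CommandPath` and the function `choosePath` are hypothetical, and treating a session catalog with no override as a v1 case is an assumption that the description above does not state.

```scala
object CommandSelectionSketch {
  // Hypothetical classification of what `spark_catalog` is configured as.
  sealed trait SessionCatalogKind
  case object BuiltIn extends SessionCatalogKind             // no override (assumed to stay on v1)
  case object DelegatingExtension extends SessionCatalogKind // forwards requests to HMS
  case object FullyCustom extends SessionCatalogKind         // independent implementation

  sealed trait CommandPath
  case object V1 extends CommandPath
  case object V2 extends CommandPath

  // hasV2Version: whether a v2 implementation of the command exists at all.
  def choosePath(kind: SessionCatalogKind, hasV2Version: Boolean): CommandPath =
    kind match {
      case _ if !hasV2Version  => V1 // no v2 version: forced to v1
      case BuiltIn             => V1 // assumption, see the note above
      case DelegatingExtension => V1 // keep v1 behavior for HMS-backed plugins
      case FullyCustom         => V2 // the override handles the command itself
    }

  def main(args: Array[String]): Unit = {
    println(choosePath(DelegatingExtension, hasV2Version = true))
    println(choosePath(FullyCustom, hasV2Version = true))
    println(choosePath(FullyCustom, hasV2Version = false))
  }
}
```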

### Why are the changes needed?

Avoid introducing behavior changes to Spark plugins that implement `DelegatingCatalogExtension` to override `spark_catalog`.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

new test case

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #47995 from amaliujia/fix_catalog_v2.

Lead-authored-by: Wenchen Fan <wenchen@databricks.com>
Co-authored-by: Rui Wang <rui.wang@databricks.com>
Co-authored-by: Wenchen Fan <cloud0fan@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
cloud-fan added a commit that referenced this pull request Sep 5, 2024
…se V1 commands

(cherry picked from commit f7cfeb5)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
sunchao pushed a commit that referenced this pull request Mar 10, 2026
…oop Path string
sunchao pushed a commit that referenced this pull request Mar 10, 2026
…se V1 commands
2 participants