[SPARK-49152][SQL][FOLLOWUP] DelegatingCatalogExtension should also use V1 commands #47995
amaliujia wants to merge 4 commits into apache:master
case RepairTable(ResolvedV1TableIdentifier(ident), addPartitions, dropPartitions) =>
case RepairTable(
    ResolvedTableIdentifierInSessionCatalog(ident), addPartitions, dropPartitions) =>
private def supportsV1Command(catalog: CatalogPlugin): Boolean = {
  isSessionCatalog(catalog) &&
    SQLConf.get.getConf(SQLConf.V2_SESSION_CATALOG_IMPLEMENTATION).isEmpty
  (isSessionCatalog(catalog) &&
I think isSessionCatalog(catalog) should always be checked.
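One way to read this comment is that the session-catalog check should guard every branch of the predicate, so that a catalog registered under any other name never falls back to v1 commands. The sketch below models that shape in plain Scala. It is not the merged Spark code: `Catalog`, `BuiltinSessionCatalog`, `DelegatingExtension`, `CustomV2Catalog`, and `V1Fallback` are hypothetical stand-ins for Spark's catalog types, and `v2SessionCatalogImpl` stands in for the value of `SQLConf.V2_SESSION_CATALOG_IMPLEMENTATION`.

```scala
// Hypothetical stand-ins for Spark's catalog types; these are NOT the real Spark classes.
sealed trait Catalog { def name: String }
final case class BuiltinSessionCatalog(name: String) extends Catalog
final case class DelegatingExtension(name: String) extends Catalog
final case class CustomV2Catalog(name: String) extends Catalog

object V1Fallback {
  val SessionCatalogName = "spark_catalog"

  // True when the catalog is registered under the session catalog name.
  def isSessionCatalog(c: Catalog): Boolean =
    c.name.equalsIgnoreCase(SessionCatalogName)

  // The session-catalog check is applied first and unconditionally.
  // v1 commands are then allowed if no v2 session catalog implementation
  // is configured, or if the configured one is a delegating extension.
  def supportsV1Command(c: Catalog, v2SessionCatalogImpl: Option[String]): Boolean =
    isSessionCatalog(c) &&
      (v2SessionCatalogImpl.isEmpty || c.isInstanceOf[DelegatingExtension])
}
```

With this shape, a delegating extension registered as `spark_catalog` keeps v1 commands, while the same class registered under another catalog name does not.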
val v2Catalog = catalog("spark_catalog").asTableCatalog
val table = v2Catalog.loadTable(Identifier.of(Array("default"), "tbl"))
assert(table.properties().get(TableCatalog.PROP_PROVIDER) == classOf[SimpleScanSource].getName)
val e = intercept[AnalysisException] {
thanks, merging to master/3.5!
…se V1 commands

### What changes were proposed in this pull request?

This is a followup of #47660. If users override `spark_catalog` with `DelegatingCatalogExtension`, we should still use v1 commands as `DelegatingCatalogExtension` forwards requests to HMS and there are still behavior differences between v1 and v2 commands targeting HMS.

This PR also forces to use v1 commands for certain commands that do not have a v2 version.

### Why are the changes needed?

Avoid introducing behavior changes to Spark plugins that implements `DelegatingCatalogExtension` to override `spark_catalog`.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

new test case

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #47995 from amaliujia/fix_catalog_v2.

Lead-authored-by: Wenchen Fan <wenchen@databricks.com>
Co-authored-by: Rui Wang <rui.wang@databricks.com>
Co-authored-by: Wenchen Fan <cloud0fan@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(cherry picked from commit f7cfeb5)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
test("SPARK-49152: partition columns should be put at the end") {
The test case name sounds a little mismatched. Is this a correct reproducer for the `DelegatingCatalogExtension` issue, @amaliujia?
This is the Spark behavior with the built-in catalog. A `DelegatingCatalogExtension` shouldn't change it.
Strictly speaking, this shouldn't be the only issue, as there might be other subtle differences between the v1 and v2 CREATE TABLE commands, or other commands.
Yeah, as @cloud-fan mentioned, this verifies a specific case so we make sure the behavior is not changed. There might be a longer list of cases to verify, though.
Got it. It makes sense. Thank you for the explanation.
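The behavior named by the test in this thread, partition columns placed after the data columns, can be illustrated with a small sketch. This is a simplified model and not Spark's implementation: `Column`, `PartitionColumnOrder`, and `reorder` are hypothetical names, name matching is case-sensitive here for simplicity, and the order of partition columns among themselves (the order given in `partitionBy`) is an assumption.

```scala
// Hypothetical, simplified model of a table column; not Spark's StructField.
final case class Column(name: String, dataType: String)

object PartitionColumnOrder {
  // Returns the columns with all data columns first (original order kept)
  // followed by the partition columns in the order listed in partitionBy.
  def reorder(columns: Seq[Column], partitionBy: Seq[String]): Seq[Column] = {
    val byName = columns.map(c => c.name -> c).toMap
    val partitionCols = partitionBy.map { p =>
      byName.getOrElse(p, throw new IllegalArgumentException(s"Unknown partition column: $p"))
    }
    val dataCols = columns.filterNot(c => partitionBy.contains(c.name))
    dataCols ++ partitionCols
  }
}
```

A regression test in this style would declare the partition column first and then check that it ends up last in the resulting schema, regardless of which catalog implementation backs `spark_catalog`.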
…be changed by falling back to v1 command

### What changes were proposed in this pull request?

This is a followup of #47772. The behavior of SaveAsTable should not be changed by switching v1 to v2 command. This is similar to #47995. For the case of `DelegatingCatalogExtension` we need it goes to V1 commands to be consistent with previous behavior.

### Why are the changes needed?

Behavior regression.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

UT

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #48019 from amaliujia/regress_v2.

Lead-authored-by: Wenchen Fan <wenchen@databricks.com>
Co-authored-by: Rui Wang <rui.wang@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(cherry picked from commit 37b39b4)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
### What changes were proposed in this pull request?

This is a followup of #47660. If users override `spark_catalog` with `DelegatingCatalogExtension`, we should still use v1 commands, because `DelegatingCatalogExtension` forwards requests to HMS and there are still behavior differences between v1 and v2 commands targeting HMS.

This PR also forces the use of v1 commands for certain commands that do not have a v2 version.

### Why are the changes needed?

Avoid introducing behavior changes to Spark plugins that implement `DelegatingCatalogExtension` to override `spark_catalog`.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

new test case

### Was this patch authored or co-authored using generative AI tooling?

No