[SPARK-49246][SQL] TableCatalog#loadTable should indicate if it's for writing#47772
cloud-fan wants to merge 6 commits into apache:master
Conversation
Speaking of table-level privileges, it's not simply rw actually,

@yaooqinn good point! I'll change it to

operationType shall be a collection; these types are not always used in isolation, e.g., some operations might be upsert, or both read and write against the same table.

also cc @huaxingao
On the `UnresolvedRelation` change:

```scala
object UnresolvedRelation {
  // An internal option of `UnresolvedRelation` to specify the required write privileges when
```

We can add a new field to `UnresolvedRelation` but it may break third-party catalyst rules.
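One way to avoid a new constructor field, per the comment above, is to carry the required write privileges through the relation's existing string-to-string options map, so third-party rules that construct or match the relation keep working. A minimal self-contained sketch of such an encode/decode round trip (the option key and comma-separated encoding are made up for illustration, not Spark's internals):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: encode required write privileges into an existing options map
// instead of adding a new field to the relation, so existing third-party
// code is unaffected. Key name and encoding are illustrative only.
class WritePrivilegeOptions {
    static final String KEY = "__required_write_privileges__";

    // Return a copy of the options with the privileges recorded.
    static Map<String, String> withPrivileges(Map<String, String> options,
                                              String... privileges) {
        Map<String, String> copy = new HashMap<>(options);
        copy.put(KEY, String.join(",", privileges));
        return copy;
    }

    // Decode the privileges back out; empty array if none were recorded.
    static String[] getPrivileges(Map<String, String> options) {
        String encoded = options.get(KEY);
        return encoded == null ? new String[0] : encoded.split(",");
    }
}
```

The round trip is lossless for the privilege list, and code that ignores the extra key behaves exactly as before.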
On `TableWritePrivilege`:

```java
 * @since 4.0.0
 */
public enum TableWritePrivilege {
```

I only include write privileges, as the full privileges include ALTER, REFERENCE, etc., which is not what we need for `loadTable`.
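Based on that comment, the enum deliberately covers only write-side privileges. A sketch of what such an enum looks like (the values shown are the write operations discussed in this thread, and may not match Spark's exact final set):

```java
// Sketch only: a write-privilege enum as discussed above, deliberately
// excluding non-write privileges such as ALTER and REFERENCE.
// Spark's actual enum may differ in its exact value set.
enum TableWritePrivilege {
    INSERT,  // add new rows, e.g. INSERT INTO / appends
    UPDATE,  // change existing rows, e.g. UPDATE / MERGE
    DELETE   // remove existing rows, e.g. DELETE FROM / overwrites
}
```

Passing a collection of these values (rather than a single flag) accommodates operations like upsert that need more than one privilege at once.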
@cloud-fan Thanks for pinging me! The new

The last commit just resolves a trivial merge conflict; I'm merging this to master/3.5, thanks for reviewing!
[SPARK-49246][SQL] TableCatalog#loadTable should indicate if it's for writing

For custom catalogs that have access control, read and write permissions can be different. However, currently Spark always calls `TableCatalog#loadTable` to look up the table, no matter whether it's for read or write. This PR adds a variant of `loadTable`: `loadTableForWrite`, in `TableCatalog`. All the write commands will call this new method to look up tables instead. This new method has a default implementation that just calls `loadTable`, so there is no breaking change. This allows more fine-grained access control for custom catalogs. No user-facing change; tested with new tests; no generative AI tooling was used.

Closes #47772 from cloud-fan/write.

Lead-authored-by: Wenchen Fan <wenchen@databricks.com>
Co-authored-by: Wenchen Fan <cloud0fan@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(cherry picked from commit b6164e6)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
…be changed by falling back to v1 command

### What changes were proposed in this pull request?

This is a followup of #47772. The behavior of SaveAsTable should not be changed by switching from a v1 to a v2 command. This is similar to #47995. For the case of `DelegatingCatalogExtension`, we need it to go to V1 commands to be consistent with previous behavior.

### Why are the changes needed?

Behavior regression.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

UT

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #48019 from amaliujia/regress_v2.

Lead-authored-by: Wenchen Fan <wenchen@databricks.com>
Co-authored-by: Rui Wang <rui.wang@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(cherry picked from commit 37b39b4)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
What changes were proposed in this pull request?
For custom catalogs that have access control, read and write permissions can be different. However, currently Spark always calls `TableCatalog#loadTable` to look up the table, no matter whether it's for read or write.

This PR adds a variant of `loadTable` that indicates the required write privileges. All the write commands will call this new method to look up tables instead. This new method has a default implementation that just calls `loadTable`, so there is no breaking change.

Why are the changes needed?
Allow more fine-grained access control for custom catalogs.
Does this PR introduce any user-facing change?
No
How was this patch tested?
new tests
Was this patch authored or co-authored using generative AI tooling?
no
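The compatibility pattern described in the summary — a new privilege-aware lookup whose default implementation simply delegates to the existing `loadTable` — can be sketched as a toy, self-contained interface (names and types simplified for illustration; this is not Spark's actual `TableCatalog`):

```java
import java.util.Set;

// Toy sketch of the default-method compatibility pattern from the PR
// summary: the new privilege-aware lookup delegates to the old method by
// default, so existing catalog implementations keep working unchanged.
// Names and shapes are illustrative, not Spark's actual interface.
interface TableCatalogSketch {
    String loadTable(String ident);  // pre-existing lookup

    // New variant: callers state the write privileges they need.
    // Default behavior: ignore the privileges, behave exactly as before.
    default String loadTable(String ident, Set<String> writePrivileges) {
        return loadTable(ident);
    }
}

// A catalog with access control can override the new variant to check
// permissions before returning the table.
class CheckingCatalog implements TableCatalogSketch {
    @Override
    public String loadTable(String ident) {
        return "table:" + ident;
    }

    @Override
    public String loadTable(String ident, Set<String> writePrivileges) {
        if (writePrivileges.contains("DELETE")) {
            throw new SecurityException("DELETE not allowed on " + ident);
        }
        return loadTable(ident);
    }
}
```

A catalog that never overrides the two-argument variant is unaffected, which is why the PR introduces no breaking change; only catalogs that opt in see the requested privileges.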