[SPARK-49246][SQL] TableCatalog#loadTable should indicate if it's for writing#47772
cloud-fan wants to merge 6 commits into apache:master
Conversation
Speaking of table-level privileges, it's not simply rw actually,

@yaooqinn good point! I'll change it to

operationType shall be a collection; these types are not always used in isolation, e.g., some operations might be upsert, or both read and write against the same table.

also cc @huaxingao
On the `UnresolvedRelation` change:

```scala
object UnresolvedRelation {
  // An internal option of `UnresolvedRelation` to specify the required write privileges when
```

We can add a new field to `UnresolvedRelation` but it may break third-party catalyst rules.
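One way to avoid a new constructor field, per the comment above, is to carry the required write privileges through the relation's existing string-to-string options map, so third-party rules that construct or match the relation keep working. A minimal self-contained sketch of such an encode/decode round trip (the option key and comma-separated encoding are made up for illustration, not Spark's internals):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: encode required write privileges into an existing options map
// instead of adding a new field to the relation, so existing third-party
// code is unaffected. Key name and encoding are illustrative only.
class WritePrivilegeOptions {
    static final String KEY = "__required_write_privileges__";

    // Return a copy of the options with the privileges recorded.
    static Map<String, String> withPrivileges(Map<String, String> options,
                                              String... privileges) {
        Map<String, String> copy = new HashMap<>(options);
        copy.put(KEY, String.join(",", privileges));
        return copy;
    }

    // Decode the privileges back out; empty array if none were recorded.
    static String[] getPrivileges(Map<String, String> options) {
        String encoded = options.get(KEY);
        return encoded == null ? new String[0] : encoded.split(",");
    }
}
```

The round trip is lossless for the privilege list, and code that ignores the extra key behaves exactly as before.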
On `TableWritePrivilege`:

```java
 * @since 4.0.0
 */
public enum TableWritePrivilege {
```

I only include write privileges, as the full privileges include ALTER, REFERENCE, etc., which is not what we need for `loadTable`.
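Based on that comment, the enum deliberately covers only write-side privileges. A sketch of what such an enum looks like (the values shown are the write operations discussed in this thread, and may not match Spark's exact final set):

```java
// Sketch only: a write-privilege enum as discussed above, deliberately
// excluding non-write privileges such as ALTER and REFERENCE.
// Spark's actual enum may differ in its exact value set.
enum TableWritePrivilege {
    INSERT,  // add new rows, e.g. INSERT INTO / appends
    UPDATE,  // change existing rows, e.g. UPDATE / MERGE
    DELETE   // remove existing rows, e.g. DELETE FROM / overwrites
}
```

Passing a collection of these values (rather than a single flag) accommodates operations like upsert that need more than one privilege at once.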
@cloud-fan Thanks for pinging me! The new

The last commit just resolves a trivial merge conflict; I'm merging this to master/3.5, thanks for reviewing!
[SPARK-49246][SQL] TableCatalog#loadTable should indicate if it's for writing

For custom catalogs that have access control, read and write permissions can be different. However, currently Spark always calls `TableCatalog#loadTable` to look up the table, no matter whether it's for read or write. This PR adds a variant of `loadTable`: `loadTableForWrite`, in `TableCatalog`. All the write commands will call this new method to look up tables instead. This new method has a default implementation that just calls `loadTable`, so there is no breaking change. This allows more fine-grained access control for custom catalogs. No user-facing change; tested with new tests; no generative AI tooling was used.

Closes #47772 from cloud-fan/write.

Lead-authored-by: Wenchen Fan <wenchen@databricks.com>
Co-authored-by: Wenchen Fan <cloud0fan@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(cherry picked from commit b6164e6)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
…be changed by falling back to v1 command

### What changes were proposed in this pull request?

This is a followup of #47772. The behavior of SaveAsTable should not be changed by switching from a v1 to a v2 command. This is similar to #47995. For the case of `DelegatingCatalogExtension`, we need it to go to V1 commands to be consistent with previous behavior.

### Why are the changes needed?

Behavior regression.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

UT

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #48019 from amaliujia/regress_v2.

Lead-authored-by: Wenchen Fan <wenchen@databricks.com>
Co-authored-by: Rui Wang <rui.wang@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(cherry picked from commit 37b39b4)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
What changes were proposed in this pull request?
For custom catalogs that have access control, read and write permissions can be different. However, currently Spark always calls `TableCatalog#loadTable` to look up the table, no matter whether it's for read or write.

This PR adds a variant of `loadTable` that indicates the required write privileges. All the write commands will call this new method to look up tables instead. This new method has a default implementation that just calls `loadTable`, so there is no breaking change.

Why are the changes needed?
Allow more fine-grained access control for custom catalogs.
Does this PR introduce any user-facing change?
No
How was this patch tested?
new tests
Was this patch authored or co-authored using generative AI tooling?
no
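The compatibility pattern described in the summary — a new privilege-aware lookup whose default implementation simply delegates to the existing `loadTable` — can be sketched as a toy, self-contained interface (names and types simplified for illustration; this is not Spark's actual `TableCatalog`):

```java
import java.util.Set;

// Toy sketch of the default-method compatibility pattern from the PR
// summary: the new privilege-aware lookup delegates to the old method by
// default, so existing catalog implementations keep working unchanged.
// Names and shapes are illustrative, not Spark's actual interface.
interface TableCatalogSketch {
    String loadTable(String ident);  // pre-existing lookup

    // New variant: callers state the write privileges they need.
    // Default behavior: ignore the privileges, behave exactly as before.
    default String loadTable(String ident, Set<String> writePrivileges) {
        return loadTable(ident);
    }
}

// A catalog with access control can override the new variant to check
// permissions before returning the table.
class CheckingCatalog implements TableCatalogSketch {
    @Override
    public String loadTable(String ident) {
        return "table:" + ident;
    }

    @Override
    public String loadTable(String ident, Set<String> writePrivileges) {
        if (writePrivileges.contains("DELETE")) {
            throw new SecurityException("DELETE not allowed on " + ident);
        }
        return loadTable(ident);
    }
}
```

A catalog that never overrides the two-argument variant is unaffected, which is why the PR introduces no breaking change; only catalogs that opt in see the requested privileges.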