Spark: Support altering view properties #9582
verifyColumnCount(ident, columnAliases, query)
SchemaUtils.checkColumnNameDuplication(query.schema.fieldNames, SQLConf.get.resolver)

case UnsetViewProperties(ResolvedV2View(catalog, ident), propertyKeys, ifExists) =>
Check rules are intended to ensure that SQL is correct and valid, not to check the execution. If you have this check here, it would fail when you run EXPLAIN ALTER VIEW ... because this is doing execution work (checking for the property) in the planner. Another way to think about this is that this isn't an analysis failure, it is a runtime failure due to data. The SQL is perfectly valid.
This should either be done in AlterV2ViewExec or not at all. I would lean toward not doing this check at all. Shouldn't this be idempotent? Is there value in failing to remove a property if it isn't there? I don't see value so I'd ignore this and let the catalog throw an exception if it chooses to.
Okay, I missed the ifExists part. It looks like this behavior is required by Spark.
In that case, this should be moved to the Exec plan.
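To illustrate the point above, here is a minimal sketch of an execution-time check that honors `ifExists`. It is not the actual `AlterV2ViewExec` code: the class name, the method, the plain `Map` standing in for view properties, and the use of `IllegalArgumentException` are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the execution step; not the actual AlterV2ViewExec code.
final class UnsetViewPropertiesSketch {
  private UnsetViewPropertiesSketch() {}

  // Returns a copy of `current` with `keys` removed.
  // A missing key fails only when ifExists is false, and it fails here,
  // while the command runs, rather than during analysis.
  static Map<String, String> unset(
      Map<String, String> current, List<String> keys, boolean ifExists) {
    Map<String, String> updated = new HashMap<>(current);
    for (String key : keys) {
      if (!ifExists && !current.containsKey(key)) {
        // Exception type is an assumption; Spark itself reports an AnalysisException.
        throw new IllegalArgumentException("Cannot remove property that is not set: " + key);
      }
      updated.remove(key);
    }
    return updated;
  }
}
```

Because nothing is checked until `unset` is called, a plan-only statement such as `EXPLAIN ALTER VIEW ...` would never reach this failure path.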
...k-extensions/src/main/scala/org/apache/spark/sql/catalyst/analysis/RewriteViewCommands.scala
if (null != asViewCatalog) {
  try {
    org.apache.iceberg.view.View view = asViewCatalog.loadView(buildIdentifier(ident));
    UpdateViewProperties updateViewProperties = view.updateProperties();
Can you also reject reserved properties? I don't think that you should be able to set format-version, location, or provider.
This is a miss on my end. For some reason I manually checked provider / location (which Spark would prevent from being set/unset) and assumed Spark would do this for all reserved properties.
I've added some tests and fixed this. Note that I'm rejecting both setting/removing reserved properties as it seems a bit confusing from a user's perspective if Spark rejects setting/removing certain properties but we only reject setting those.
It also appears that the behavior for tables is slightly different, where we're only rejecting setting reserved table properties, but we're not rejecting the removal of reserved table properties.
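A rough sketch of the symmetric rule described above, where both setting and removing a reserved property are rejected. The reserved keys are the three named in this review; the class, method names, and message wording are hypothetical and are not the code in this PR.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper; the real validation lives elsewhere and may differ.
final class ReservedViewPropertiesSketch {
  private ReservedViewPropertiesSketch() {}

  // Keys named in this review; the actual reserved list is an assumption.
  static final Set<String> RESERVED = Set.of("format-version", "location", "provider");

  // Rejects an attempt to set any reserved key.
  static void validateSet(Map<String, String> changes) {
    for (String key : changes.keySet()) {
      if (RESERVED.contains(key)) {
        throw new UnsupportedOperationException("Cannot set reserved property: " + key);
      }
    }
  }

  // Rejects an attempt to remove any reserved key, mirroring validateSet.
  static void validateUnset(List<String> keys) {
    for (String key : keys) {
      if (RESERVED.contains(key)) {
        throw new UnsupportedOperationException("Cannot unset reserved property: " + key);
      }
    }
  }
}
```

Keeping the two checks side by side makes the view behavior consistent for users, in contrast to the table behavior noted above, where only setting is rejected.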
rdblue
left a comment
Mostly looks good. Just two major things:
- Checking if a property to unset exists should be done in the exec node
- Iceberg should fail if the caller attempts to set reserved properties
06461fe to 54a6abe
    .hasMessageContaining(
        "The feature is not supported: location is a reserved table property");

sql("ALTER VIEW %s UNSET TBLPROPERTIES ('format-version')", viewName);
@rdblue it feels like maybe we should also prevent removing reserved properties to be more explicit (even though they would be set by default when the view is loaded)? But I also see that we don't do this when removing reserved properties from a table.
I agree. You should not be able to unset these properties.
a2e312a to 32bc1d3
32bc1d3 to 646a645
        () -> sql("ALTER VIEW %s SET TBLPROPERTIES ('location' = 'random_location')", viewName))
    .isInstanceOf(AnalysisException.class)
    .hasMessageContaining(
        "The feature is not supported: location is a reserved table property");
Nit: Just because Spark has a bad error message doesn't mean Iceberg should. We could make this conform to our standard, which is more clear: "Cannot set reserved property: location"
Ah, I see this is coming from Spark. I think the other messages (that we generate) could also improve.
yeah we don't have much control over the error msg coming from Spark unfortunately
        () -> sql("ALTER VIEW %s SET TBLPROPERTIES ('format-version' = '99')", viewName))
    .isInstanceOf(UnsupportedOperationException.class)
    .hasMessageContaining(
        "Cannot specify 'format-version' because it's a reserved view property");
How about Cannot set reserved property: %s? That's more direct and also uses the same verb that the caller used, "set".
assertThatThrownBy(() -> sql("ALTER VIEW %s UNSET TBLPROPERTIES ('unknown-key')", viewName))
    .isInstanceOf(AnalysisException.class)
    .hasMessageContaining("Attempted to unset non-existing property 'unknown-key'");
If this isn't Spark, we should fix the message: Cannot remove property that is not set: %s
assertThatThrownBy(() -> sql("ALTER VIEW %s UNSET TBLPROPERTIES ('format-version')", viewName))
    .isInstanceOf(UnsupportedOperationException.class)
    .hasMessageContaining(
        "Cannot specify 'format-version' because it's a reserved view property");
Same as above, it would be nice to improve the error message.
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkView.java
rdblue
left a comment
Looks great! I have some minor style comments mostly about error messages.
This adds support for the below `ALTER` cases as defined in https://spark.apache.org/docs/latest/sql-ref-syntax-ddl-alter-view.html:
* `ALTER VIEW <viewName> SET TBLPROPERTIES (...)`
* `ALTER VIEW <viewName> UNSET TBLPROPERTIES (...)`
646a645 to 9e3eb5a
Thanks for the review @rdblue, I've updated the error msgs and also renamed the internal property to |