[SPARK-28948][SQL] Support passing all Table metadata in TableProvider #25651
Conversation
   * @throws UnsupportedOperationException
   */
  default Table getTable(CaseInsensitiveStringMap options, StructType schema) {
  default Table getTable(
I'll refine the classdoc of this interface after we reach an agreement on the proposal.
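For context, a hedged Scala mirror of the overload under discussion; the real interface is a Java file, and the parameter list below is taken from the diff in this thread, not from any released API.

```scala
import org.apache.spark.sql.connector.catalog.Table
import org.apache.spark.sql.connector.expressions.Transform
import org.apache.spark.sql.types.StructType
import org.apache.spark.sql.util.CaseInsensitiveStringMap

// Illustrative trait, not the shipped interface.
trait ProposedTableProvider {
  // Existing form: only options plus an optional user-specified schema.
  def getTable(options: CaseInsensitiveStringMap, schema: StructType): Table

  // Form proposed in this PR: the caller can also pass the partitioning it
  // already knows, so the source does not have to re-infer it.
  def getTable(
      options: CaseInsensitiveStringMap,
      schema: StructType,
      partitions: Array[Transform]): Table
}
```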
Test build #110014 has finished for PR 25651 at commit

Test build #110027 has finished for PR 25651 at commit

retest this please.

Test build #110035 has finished for PR 25651 at commit
}

private[sql] object V2SessionCatalog {
  case class SchemaChangedException(
How about just `SchemaException`?
+1 with the proposal.
    val dsOptions = new CaseInsensitiveStringMap(finalOptions.asJava)
    val table = userSpecifiedSchema match {
      case Some(schema) => provider.getTable(dsOptions, schema)
      case Some(schema) => provider.getTable(dsOptions, schema, Array.empty)
This isn't correct. The DataFrameReader does not know that the table is unpartitioned.
Yeah, this is one of the most annoying and confusing behaviors of DataSource V1: being able to provide a schema but not partitioning information, which leads to minutes of partition schema inference.
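A minimal sketch of the behavior being described, assuming a partitioned Parquet dataset at a made-up path: even though the schema is supplied, the file source still walks the directory layout to discover partition columns.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types.{LongType, StringType, StructField, StructType}

val spark = SparkSession.builder().master("local[*]").getOrCreate()

val schema = StructType(Seq(
  StructField("id", LongType),
  StructField("value", StringType)))

// The schema is user-specified, but nothing says whether /data/events is
// partitioned, so partition discovery still lists the directory tree.
val df = spark.read.schema(schema).parquet("/data/events")
```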
  override def getTable(
      options: CaseInsensitiveStringMap,
      schema: StructType,
      partitions: Array[Transform]): Table = {
Why is this not identical to the metadata passed to createTable? Is there a reason not to pass the table properties as well as the read options?
I agree, this should look similar to createTable
Read options should be passed in `Table.newScanBuilder`. The options here are the table properties.
But we do have a problem here. Table properties are case-sensitive while scan options are case-insensitive. Think about two cases:

1. `spark.read.format("myFormat").options(...).schema(...).load()`. We need to get the table with the user-specified options and schema. When scanning the table, we need to use the user-specified options as scan options. The problem is that `DataFrameReader.options` specifies both table properties and scan options in this case.
2. `CREATE TABLE t USING myFormat TBLPROPERTIES ...` and then `spark.read.options(...).table("t")`. In this case, `DataFrameReader.options` only specifies scan options.

Ideally, `TableProvider.getTable` takes table properties, which should be case-sensitive. However, `DataFrameReader.options` also specifies scan options, which should be case-insensitive.

I don't have a good idea now. Maybe it's OK to treat this as a special table which accepts case-insensitive table properties.
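A small illustration of the mismatch described above, using classes that already exist: scan options arrive wrapped in a `CaseInsensitiveStringMap`, while table properties are ordinarily a plain, case-sensitive string map.

```scala
import scala.collection.JavaConverters._
import org.apache.spark.sql.util.CaseInsensitiveStringMap

val raw = Map("PATH" -> "/tmp/events").asJava

// Scan options: key lookup ignores case.
val scanOptions = new CaseInsensitiveStringMap(raw)
scanOptions.get("path") // "/tmp/events"

// Table properties as a plain map: the original casing matters.
val tableProps = new java.util.HashMap[String, String](raw)
tableProps.get("path") // null
```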
Or we can make table properties case insensitive.
This interface should pass the table properties. There is no need to pass read or write options at this point, unless they can't be separated from table properties (as in the `DataFrameReader` case). The read options and write options should be passed to the logical plan -- this is added in #25681: https://github.com/apache/spark/pull/25681/files#diff-94fbd986b04087223f53697d4b6cab24R275
I propose passing table properties as a string map (java.util) through this interface. When the properties come from the metastore, this is fine. When the properties come from `DataFrameReader.option` (or the write equivalent), the original case-sensitive map should be passed. Then the read options should additionally be passed to the correct plan node so that the physical plan can push them into the scan or the write.
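A hedged sketch of the split proposed here: the table is built from metadata (properties, schema, partitioning), while per-query read options only show up later via `SupportsRead.newScanBuilder`. The class name and constructor are hypothetical; the connector interfaces are the real ones.

```scala
import java.util
import org.apache.spark.sql.connector.catalog.{SupportsRead, Table, TableCapability}
import org.apache.spark.sql.connector.expressions.Transform
import org.apache.spark.sql.connector.read.ScanBuilder
import org.apache.spark.sql.types.StructType
import org.apache.spark.sql.util.CaseInsensitiveStringMap

class MyTable(
    tableSchema: StructType,
    tablePartitioning: Array[Transform],
    tableProperties: util.Map[String, String]) extends Table with SupportsRead {

  override def name(): String = tableProperties.getOrDefault("name", "my_table")
  override def schema(): StructType = tableSchema
  override def partitioning(): Array[Transform] = tablePartitioning
  override def properties(): util.Map[String, String] = tableProperties
  override def capabilities(): util.Set[TableCapability] =
    util.EnumSet.of(TableCapability.BATCH_READ)

  // Read options are a per-scan concern, so they arrive here, not in getTable.
  override def newScanBuilder(readOptions: CaseInsensitiveStringMap): ScanBuilder = {
    ??? // placeholder: build a ScanBuilder from the metadata above plus readOptions
  }
}
```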
    val partitions = new mutable.ArrayBuffer[Transform]()

    v1Table.partitionColumnNames.foreach { col =>
      partitions += LogicalExpressions.identity(col)
Nit: This parses the column's name as a multi-part identifier, which is subtly incorrect. (It'll cause issues if the column name contains special characters like ':'.)
Aren't column names with special characters supported? I thought you could escape any identifier using back-ticks.
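A hedged illustration of the nit (using a dot rather than a colon, since a dot is what multi-part parsing actually splits on), assuming the catalyst parser's multi-part identifier entry point: a column literally named `a.b` is broken into two name parts unless it is back-ticked first.

```scala
import org.apache.spark.sql.catalyst.parser.CatalystSqlParser

CatalystSqlParser.parseMultipartIdentifier("a.b")   // Seq("a", "b")
CatalystSqlParser.parseMultipartIdentifier("`a.b`") // Seq("a.b")
```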
@cloud-fan, can you update the PR title and description? I think that clarifying the problem statement will also help clean up the proposed changes. For example, this passes some -- but not all -- table metadata.
      throw new NoSuchTableException(ident)
    }

    tryResolveTableProvider(V1Table(catalogTable))
This is the core change of this PR.
Test build #110919 has finished for PR 25651 at commit
Test build #111290 has finished for PR 25651 at commit
Test build #111293 has finished for PR 25651 at commit
Test build #111298 has finished for PR 25651 at commit
@cloud-fan, thanks for working on this. I plan to review it tomorrow. Looks like this is huge and touches about 50 files. Is there a way to make it smaller?
@rdblue yeah, it's possible. In this PR, I try to adopt your suggestion to make it clear that

To make it consistent, I change it to use

We can still keep the old method signature with a TODO to change it later, so that this PR can be much smaller.
Test build #111905 has finished for PR 25651 at commit
retest this please
Test build #112044 has finished for PR 25651 at commit
Test build #112108 has finished for PR 25651 at commit
I should have time to review this on Thursday if it is ready. Until then, I commented on some of the open threads.
@rdblue, when I try to have separated methods, the major problems hit so far:

I feel that the existing API (
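For context, a hedged Scala sketch of the two API shapes being weighed in this thread; the trait names and signatures are illustrative, not the interface that eventually shipped.

```scala
import org.apache.spark.sql.connector.catalog.Table
import org.apache.spark.sql.connector.expressions.Transform
import org.apache.spark.sql.types.StructType
import org.apache.spark.sql.util.CaseInsensitiveStringMap

// Shape A: a single entry point that takes whatever metadata the caller has.
trait CombinedGetTable {
  def getTable(
      options: CaseInsensitiveStringMap,
      schema: StructType,
      partitions: Array[Transform]): Table
}

// Shape B: separate inference hooks, so Spark can ask only for what is missing,
// at the cost of the source possibly repeating work across the two calls.
trait SeparatedInference {
  def inferSchema(options: CaseInsensitiveStringMap): StructType
  def inferPartitioning(options: CaseInsensitiveStringMap): Array[Transform]
  def getTable(
      options: CaseInsensitiveStringMap,
      schema: StructType,
      partitions: Array[Transform]): Table
}
```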
Test build #112405 has finished for PR 25651 at commit
Why should this internal concern of one source affect the API? Partitioning inference does not require listing all the files in a table, and I doubt that it is a good idea to do that for schema inference either. If a table is small, it doesn't matter if this work is done twice (before it is fixed); and if a table is really large, then it isn't a good idea to do this for schema inference anyway.
This statement makes assumptions about the behavior of path-based tables, and that behavior hasn't been clearly defined yet. Can you be more specific about the case and how you think path-based tables will behave?

I disagree that no schema or partition inference should be done for writing. Maybe it isn't done today, but if there is existing data, Spark shouldn't allow writing new data that will break a table by using an incompatible schema or partition layout. In that case, we would want to infer the schema and partitioning.

Also, if it isn't necessary to infer schema and partitioning, then this information still needs to be passed to the table. When running a CTAS operation, Spark might be called with
I was not trying to define the behavior, but talking about the existing behavior.
If you really think that schema or partition inference should be done for writing, we should disable file source v2 by default to not surprise users. |
I'm not saying that it is what we should do. That should be covered by a design doc for path-based tables. My point is that the claim that it won't be done is not necessarily true and makes assumptions about how these tables will behave.
Do you mean we should block this PR until we figure out the behavior of path-based tables? This PR simply makes

I'm OK to adopt the new API style if it doesn't break the existing behavior. But it seems now we are unable to keep the file source skipping schema/partition inference during writes. Shall we discuss the new API style later?
No, and sorry for the misunderstanding! My point is that your claim that inference won't be used in the write path is not necessarily correct and depends on the behavior we decide for path-based tables.
I think it's an exaggeration to say "unable". Partition inference in particular can be done much more easily and efficiently than depending on a recursive directory listing to find all data files. Granted, the current implementation would need to change, but do you really think that "unable" is an accurate description?

The problem is that this needs to be decided because it affects the API that will go into Spark 3.0. I think we should go with what we agreed was a good solution for the API -- adding the
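A hedged sketch of the point above: discovering partition columns only needs the `col=value` directory names along a single root-to-leaf path, not a recursive listing of every data file. The helper name and the layout assumption (one uniform Hive-style directory tree) are mine, not the PR's.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path

def inferPartitionColumns(root: Path, conf: Configuration): Seq[String] = {
  val fs = root.getFileSystem(conf)
  val cols = scala.collection.mutable.ArrayBuffer.empty[String]
  var current = root
  var done = false
  while (!done) {
    // Follow one "col=value" subdirectory per level; no recursive file listing.
    fs.listStatus(current)
      .filter(_.isDirectory)
      .map(_.getPath)
      .find(_.getName.contains("=")) match {
      case Some(dir) =>
        cols += dir.getName.split("=", 2)(0)
        current = dir
      case None =>
        done = true
    }
  }
  cols.toSeq
}
```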
We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
Test build #122766 has finished for PR 25651 at commit
What changes were proposed in this pull request?
Currently Data Source V2 has 2 major use cases:

1. Catalog integration: plug in an external catalog and interact with storage systems that have their own catalogs.
2. Non-catalog sources: implement `TableProvider` and use it via `DataFrameReader`/`Writer`, or register it as a table in Spark.

Use case 1 is newly introduced in the master branch, and it greatly improves the user experience when interacting with external storage systems that have catalogs, e.g. Cassandra, JDBC, etc.
Use case 2 is the main use case of Data Source V1, which works well if the external storage system doesn't have a catalog, e.g. Parquet files on S3.
However, use case 2 is not well supported. For example, registering a v2 `TableProvider` implementation such as `com.abc.MyTableProvider` as a table and then querying it (sketched below) fails with:

`AnalysisException: com.abc.MyTableProvider is not a valid Spark SQL Data Source.`

The session catalog always treats the table provider as a v1 source.
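A hypothetical reproduction of the failure, assuming `com.abc.MyTableProvider` is a class that only implements the v2 `TableProvider` interface; the table name and schema are made up.

```scala
// spark is an existing SparkSession.
spark.sql("CREATE TABLE t (id BIGINT, data STRING) USING com.abc.MyTableProvider")
spark.table("t").show()
// org.apache.spark.sql.AnalysisException:
//   com.abc.MyTableProvider is not a valid Spark SQL Data Source.
```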
To support it, this PR updates `TableProvider#getTable` to accept additional table metadata info. The expected behaviors are defined in https://docs.google.com/document/d/1oaS0eIVL1WsCjr4CqIpRv6CGkS5EoMQrngn3FsY1d-Q/edit?usp=sharing

Why are the changes needed?
Make Data Source V2 support the use case that is supported by Data Source V1.
Does this PR introduce any user-facing change?
Yes, it's a new feature
How was this patch tested?
A new test suite.