Add sort order to writer classes #2214
Conversation
The code change looks good to me, but I am a bit confused about the scope of this PR (I only read the Spark path, but the same applies to Flink). In
Thanks Jack for reviewing the PR!
I think
Yes, in this PR I was trying to add the capability of passing sortOrder info to writers within the Iceberg library. I did think about expanding the support to
Yes I was referring to the appender. Thanks for the clarification.
Okay I see, in that case this should be fine.
```java
    .withPartition(partition)
    .equalityFieldIds(equalityFieldIds)
    .withKeyMetadata(file.keyMetadata())
    .withSortOrder(sortOrder)
```
@yyanyy, I think we cannot mark files with the current sort order id by default as there is no guarantee the incoming records are sorted. If we had an API to request a specific distribution and ordering in a query engine (will be possible in Spark 3.2, for example), only then we could annotate the files.
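The constraint described above can be sketched as metadata that records a sort order id only when the caller has explicitly guaranteed ordering. The class and method names below are hypothetical illustrations, not Iceberg's actual API:

```java
// Hypothetical sketch: file metadata that carries a sort order id only when
// the caller has guaranteed the incoming rows are actually sorted.
class DataFileMetadata {
    private final Integer sortOrderId; // null means "unsorted / unknown"

    private DataFileMetadata(Integer sortOrderId) {
        this.sortOrderId = sortOrderId;
    }

    // Default path: no ordering guarantee, so no sort order id is recorded.
    static DataFileMetadata unsorted() {
        return new DataFileMetadata(null);
    }

    // Only appropriate once an engine-side API (e.g. Spark 3.2's requested
    // distribution and ordering) has guaranteed the rows arrive sorted.
    static DataFileMetadata sortedBy(int sortOrderId) {
        return new DataFileMetadata(sortOrderId);
    }

    Integer sortOrderId() {
        return sortOrderId;
    }
}
```

The point of the two factory methods is that annotating a file with the table's current sort order id is an opt-in assertion, never a default.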
Thank you for the review! Yes, I wasn't sure when sortOrder would be available from the engines, and it is currently always null. I assumed that this information could be available when the engine constructs the appender factory, like SparkAppenderFactory, since the factory seems to be created at the task level in Spark, so sortOrder could be assigned per task. But since I don't know the details of each engine, I'll revert the changes to each appender factory for now.
cc @szehon-ho @RussellSpitzer @rdblue who participated in #2240. This PR has been open for a while. I think it is a great basis for propagating the sort order to writers, but I am afraid we cannot mark written files with the current sort order unless we are sure those files are sorted. That will require a query engine API to request a distribution and ordering. For Spark, it will be possible only in 3.2. What about the other query engine integrations, @openinx @pvary? If my understanding is correct, what about limiting the scope of this PR to just propagating the sort order to writers?
Thanks for updating the PR, @yyanyy! Let me take a look now.
aokolnychyi
left a comment
The change looks correct to me. I had just a couple of questions.
```java
    return this;
  }

  public DeleteWriteBuilder withSortOrder(SortOrder newSortOrder) {
```
We add an API for setting the sort order for equality deletes. Shall we also add the ability to set an order for regular data files? We won't immediately use it but it will make the code more consistent.
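The suggested symmetry could look roughly like the fluent sketch below. This is a simplified stand-in class, not Iceberg's real write builder, and it uses a plain string in place of SortOrder for illustration:

```java
// Hypothetical sketch: both the data-file and delete-file builders would
// expose withSortOrder in the same fluent style as the diff above.
class SketchWriteBuilderWithOrder {
    private String sortOrder; // stand-in for Iceberg's SortOrder; null = unsorted

    SketchWriteBuilderWithOrder withSortOrder(String newSortOrder) {
        this.sortOrder = newSortOrder;
        return this; // fluent, so it chains like .withKeyMetadata(...)
    }

    String build() {
        // a real builder would construct a writer; here we just surface the state
        return sortOrder == null ? "unsorted" : "sorted by " + sortOrder;
    }
}
```

Even if the data path never sets an order today, having the same entry point on both builders keeps the two APIs consistent.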
If we decide to do that, we should probably cover ORC.
Thank you for the review! I think setting an order for regular data files is covered by the changes in the DataWriter file: for Avro/Parquet/ORC, what WriteBuilder builds is actually a FileAppender that doesn't track file properties, and it's the DataWriter wrapping that file appender that actually populates this information into a DataFile when it closes. I do think this is a bit confusing, and we may want to refactor these classes so that delete and data writers are constructed in a more similar way.
Regarding ORC: the reason I did not change ORC is that it does not seem to support a delete writer. I wasn't sure if that was intentional, or just because no one has had time to do it. If the latter is the case, I'm happy to create a separate PR to add it.
Oh, yeah, you are right. I think we should refactor this to make it clear (independently of this PR).
There are a few points to discuss.
The first one is FileAppenderFactory. It was originally created to produce FileAppender instances only, which made sense. Now it creates data and delete writers too, which makes less sense to me. I'd consider deprecating the writer-creation methods in FileAppenderFactory and creating a WriterFactory instead:
```
WriterFactory {
  newDataWriter(...)
  newEqualityDeleteWriter(...)
  newPositionDeleteWriter(...)
}
```
Second, I'd consider adding a method called buildWriter to our Parquet, Avro, ORC classes that would create and initialize DataWriter, making it consistent with our delete writers. Otherwise, it looks a bit weird.
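A hedged sketch of what a buildWriter step like this could look like, using simplified stand-in types (Appender, SketchDataWriter, SketchWriteBuilder are illustrations, not Iceberg's real Parquet/Avro/ORC builders):

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-ins for FileAppender and DataWriter; not Iceberg's
// real classes, just an illustration of the proposed shape.
interface Appender<T> {
    void add(T row);
    long length();
}

class SketchDataWriter<T> {
    private final Appender<T> appender;

    SketchDataWriter(Appender<T> appender) {
        this.appender = appender;
    }

    void write(T row) {
        appender.add(row);
    }

    long length() {
        return appender.length();
    }
}

// The proposed buildWriter() would create the appender and wrap it in a
// data writer in one step, mirroring how the delete writers are built today.
class SketchWriteBuilder<T> {
    Appender<T> buildAppender() {
        List<T> rows = new ArrayList<>();
        return new Appender<T>() {
            public void add(T row) { rows.add(row); }
            public long length() { return rows.size(); }
        };
    }

    SketchDataWriter<T> buildWriter() {
        return new SketchDataWriter<>(buildAppender());
    }
}
```

With buildWriter on each format's builder, callers would never assemble a DataWriter from a raw appender themselves, matching the delete-writer entry points.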
Thoughts, @yyanyy @rdblue @openinx @jackye1995?
W.r.t ORC, I think we just did not have time to implement it. I'd be happy to review if there is a PR.
I think the first sounds good to me! Both DataWriter and *DeleteWriter are wrappers around an appender, but today, to construct the latter, we have to specify everything in the appender factories for each file format, and the Parquet and Avro ones actually pass everything directly to the underlying appender. I wonder if we can have separate newAppender and newDeleteAppender methods in FileAppenderFactory (or even generalize them into one) so that we don't do this delegation per file format, and move the three methods you mentioned to WriterFactory to separate the two factories. In this case WriterFactory may always rely on a FileAppenderFactory in its constructor, which might still be fine.
For adding buildWriter to the Parquet, Avro, and ORC classes, I'm not entirely sure, since DataWriter does not differ much among the three file formats, so adding it to each file format may actually increase code duplication. I haven't looked into the code deeply, but I wonder if we can abstract appender creation out of the delete builders (similar to the above), so that the new WriterFactory can create delete writers as simply as it creates data writers today, and we would just have this logic in each WriterFactory.
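The composition being suggested here could be sketched as follows; all type and method names (SketchAppender, SketchAppenderFactory, SketchWriterFactory) are hypothetical stand-ins, not Iceberg's API:

```java
// Hypothetical sketch: a format-agnostic WriterFactory that delegates
// appender creation to a per-format FileAppenderFactory, so the
// writer-wrapping logic is written once rather than per file format.
interface SketchAppender<T> {
    void add(T row);
}

interface SketchAppenderFactory<T> {
    SketchAppender<T> newAppender();        // for data files
    SketchAppender<T> newDeleteAppender();  // for delete files
}

class SketchWriterFactory<T> {
    private final SketchAppenderFactory<T> appenders;

    SketchWriterFactory(SketchAppenderFactory<T> appenders) {
        this.appenders = appenders;
    }

    SketchAppender<T> newDataWriter() {
        // file-level bookkeeping (partition, key metadata, sort order)
        // would be added here, once, for every file format
        return appenders.newAppender();
    }

    SketchAppender<T> newEqualityDeleteWriter() {
        return appenders.newDeleteAppender();
    }
}
```

The appeal of this split is that adding a new file format only requires a new appender factory, while the writer logic stays shared.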
I'd need to take a closer look too. However, I do like the idea of abstracting appenders from writers. Let's discuss this in a separate issue, @yyanyy.
```java
    this(appender, format, location, spec, partition, keyMetadata, null);
  }

  public DataWriter(FileAppender<T> appender, FileFormat format, String location,
```
We instantiate DataWriter in FileAppenderFactory implementations. It is absolutely correct that we don't propagate the sort order id now. However, should we consider adding a comment that explains why? Like we cannot guarantee the incoming records are sorted as needed?
Sorry, just to make sure I understand the suggestion correctly: do you mean adding a comment here to discourage people from using this new constructor for now?
I was referring to the place where, for example, SparkAppenderFactory instantiates DataWriter:
```java
@Override
public DataWriter<InternalRow> newDataWriter(EncryptedOutputFile file, FileFormat format, StructLike partition) {
  return new DataWriter<>(newAppender(file.encryptingOutputFile(), format), format,
      file.encryptingOutputFile().location(), spec, partition, file.keyMetadata());
}
```
Here we are using the constructor without sort order. I thought about a comment explaining why but I think we should do that later. We will eventually add sort order to classes like SparkAppenderFactory and then can add the comment.
To sum up, let's ignore my original comment for now.
```java
    this(appender, format, location, spec, partition, keyMetadata, null, equalityFieldIds);
  }

  public EqualityDeleteWriter(FileAppender<T> appender, FileFormat format, String location,
```
Do we want to deprecate the old constructor? Not a big deal, but I am a bit worried about maintaining a new constructor each time we modify this class. At the same time, converting this into a builder is probably overkill. This is a pretty internal API too, so I'd be fine with breaking it.
Good point. I think we can remove the old constructor in this case: this class supports v2 tables and thus shouldn't have many production dependencies outside the library packages, so we should be able to do it now. I will update the PR to change the existing constructor unless people have further comments on this thread.
I'd be fine keeping it if there is a valid use case, but I'd hope folks use a higher-level API for writing deletes.
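For illustration, the two options debated in this thread look roughly like the sketch below. SketchEqualityDeleteWriter is a simplified stand-in, not the real EqualityDeleteWriter signature, and a string stands in for SortOrder:

```java
// Simplified stand-in illustrating the trade-off: keep a deprecated
// delegating constructor, or delete it in favor of the new signature.
class SketchEqualityDeleteWriter {
    private final String sortOrder; // stand-in for SortOrder; null = none

    // New constructor: callers pass the sort order explicitly (possibly null).
    SketchEqualityDeleteWriter(String location, String sortOrder) {
        this.sortOrder = sortOrder;
    }

    // Option A: keep the old constructor but deprecate it, delegating with a
    // null sort order. Option B, which the thread leans toward for this
    // internal API, is simply removing this constructor.
    @Deprecated
    SketchEqualityDeleteWriter(String location) {
        this(location, null);
    }

    String sortOrder() {
        return sortOrder;
    }
}
```

The delegating pattern keeps one source of truth for initialization either way, so dropping the old constructor later is a one-line deletion.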
aokolnychyi
left a comment
We may avoid keeping the old constructors but this change looks good to me.
Looks like some CI job hung. Triggering another testing round.
Thanks for working on this, @yyanyy! And thanks for the review, @aokolnychyi. I agree that we can probably improve the appender factory or replace it with something for all writers. It's a little messy there, so I'm glad you're thinking about how to clean it up. Thanks!
Follow-up of #1975.