Push down limit to sort #3530
Conversation
  .collect::<Result<Vec<SortColumn>>>()?;

- let indices = lexsort_to_indices(&sort_columns, None)?;
+ let indices = lexsort_to_indices(&sort_columns, fetch)?;
The key optimization: this returns only n indices after the change.
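For illustration, here is a minimal standalone sketch (using the arrow crate's lexsort_to_indices and take kernels; the top_n helper is hypothetical, not code from this PR) of what passing fetch buys: only the first fetch indices of the sorted order are produced, so take materializes only those rows instead of the whole batch.

```rust
use std::sync::Arc;
use arrow::array::{ArrayRef, Int32Array};
use arrow::compute::{lexsort_to_indices, take, SortColumn};
use arrow::error::ArrowError;

// Hypothetical helper: sort a single column and keep at most `fetch` rows.
fn top_n(values: ArrayRef, fetch: Option<usize>) -> Result<ArrayRef, ArrowError> {
    let sort_columns = vec![SortColumn { values: values.clone(), options: None }];
    // With `fetch = Some(n)`, only the first n indices of the sorted order are returned.
    let indices = lexsort_to_indices(&sort_columns, fetch)?;
    // `take` then copies just those rows, not the entire batch.
    take(values.as_ref(), &indices, None)
}

fn main() -> Result<(), ArrowError> {
    let col: ArrayRef = Arc::new(Int32Array::from(vec![5, 1, 4, 2, 3]));
    let top2 = top_n(col, Some(2))?;
    println!("{:?}", top2); // [1, 2] with the default ascending sort options
    Ok(())
}
```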
Force-pushed from 269e2b0 to 8c19b9b
Force-pushed from bf01caf to 4306134
Codecov Report
@@            Coverage Diff            @@
##           master    #3530     +/-  ##
=========================================
  Coverage   85.80%   85.81%
=========================================
  Files         300      300
  Lines       55382    55424     +42
=========================================
+ Hits        47520    47561     +41
- Misses       7862     7863      +1
Force-pushed from 4306134 to 2959049
  .as_any()
  // SortExec (preserve_partitioning=False, fetch=Some(n))
  //   -> SortPreservingMergeExec (SortExec preserve_partitioning=True, fetch=Some(n))
  let parallel_sort = plan_any.downcast_ref::<SortExec>().is_some()
As we now have the pushdown, we can use fetch and support more than just a limit directly after the sort.
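As context for the diff above, a minimal sketch of the downcast check it relies on (module paths assumed from the DataFusion tree of that time; this is not code from the PR itself):

```rust
use std::sync::Arc;
use datafusion::physical_plan::sorts::sort::SortExec;
use datafusion::physical_plan::ExecutionPlan;

// Before:  SortExec(preserve_partitioning=false, fetch=Some(n)) over a multi-partition input
// After:   SortPreservingMergeExec on top of SortExec(preserve_partitioning=true, fetch=Some(n)),
//          so each partition sorts (and truncates to n rows) in parallel before a single merge.
//
// The rewrite only applies when the node really is a SortExec, which is what
// this downcast checks:
fn is_sort(plan: &Arc<dyn ExecutionPlan>) -> bool {
    plan.as_any().downcast_ref::<SortExec>().is_some()
}
```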
Support skip, fix test
Fmt
Add limit directly after sort
Update comment
Simplify parallel sort by using new pushdown
Clippy
Force-pushed from 2959049 to 4b1a86a
alamb left a comment
This is a really neat idea @Dandandan -- very beautiful implementation
  metrics_set: CompositeMetricsSet,
  /// Preserve partitions of input plan
  preserve_partitioning: bool,
  /// Fetch highest/lowest n results
I see -- this seems like it now has the information plumbed to the SortExec to implement "TopK" within the physical operator's implementation. 👍
Very cool
  .collect::<Result<Vec<SortColumn>>>()?;

- let indices = lexsort_to_indices(&sort_columns, None)?;
+ let indices = lexsort_to_indices(&sort_columns, fetch)?;
nice
I wonder if this will effectively get us much of the benefit of a special TopK operator, as we don't have to copy the entire input -- we only copy the fetch limit, if specified.
Although I suppose SortExec still buffers all of its input, where a TopK could buffer only the top fetch rows.
In fact, I wonder if you could also apply the limit here, as part of sorting each batch -- rather than keeping the entire input batch, we only need to keep at most fetch rows from each batch.
lexsort_to_indices already returns only fetch indices per batch; this is used to take that number of indices per batch, throwing away the rest of the rows.
The remaining optimization, I think, is tweaking SortPreservingMergeStream to only maintain fetch records in the heap, instead of the top fetch records for each batch in the partition, as mentioned in #3516 (comment). After this I think we have a full TopK implementation that only needs to keep n rows in memory (per partition).
I would like to do this in a separate PR.
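For illustration, a minimal standalone sketch of that heap idea (this is not SortPreservingMergeStream itself, and a plain i64 stands in for real sort keys): only the best fetch values are retained as rows stream past, so memory stays proportional to fetch rather than to the partition size.

```rust
use std::collections::BinaryHeap;

/// Hypothetical bounded top-k accumulator: keeps the `fetch` smallest values.
struct BoundedTopK {
    fetch: usize,
    // Max-heap of the current best `fetch` values; the largest retained value
    // sits on top and is evicted whenever a smaller one arrives.
    heap: BinaryHeap<i64>,
}

impl BoundedTopK {
    fn new(fetch: usize) -> Self {
        Self { fetch, heap: BinaryHeap::new() }
    }

    fn push(&mut self, v: i64) {
        if self.heap.len() < self.fetch {
            self.heap.push(v);
        } else if let Some(&max) = self.heap.peek() {
            if v < max {
                self.heap.pop();
                self.heap.push(v);
            }
        }
    }

    /// Drain into ascending order, i.e. the final top-k output.
    fn into_sorted(self) -> Vec<i64> {
        self.heap.into_sorted_vec()
    }
}

fn main() {
    let mut topk = BoundedTopK::new(3);
    for v in [7, 3, 9, 1, 8, 2] {
        topk.push(v);
    }
    assert_eq!(topk.into_sorted(), vec![1, 2, 3]);
}
```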
A separate PR is a great idea 👍
> lexsort_to_indices already returns only fetch indices per batch; this is used to take that number of indices per batch, throwing away the rest of the rows.

Right, the point I was trying to make is that there are 2 calls to lexsort_to_indices in sort.rs. I think this PR only pushed fetch to one of them. The second is https://github.com/apache/arrow-datafusion/blob/3a9e0d0/datafusion/core/src/physical_plan/sorts/sort.rs#L826 and I think it is correct to push fetch there too.
I was thinking if we applied fetch to the second call, we could get close to the same effect without changing SortPreservingMergeStream.
- After this PR, sort buffers `num_input_batches * input_batch_size` rows.
- Adding `fetch` to the other call to `lexsort_to_indices` would buffer `num_input_batches * limit` rows.
- Extending `SortPreservingMergeStream` would allow us to buffer only `limit` rows.
So clearly extending SortPreservingMergeStream is optimal in terms of rows buffered, but it likely requires a bit more effort.
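For a sense of scale, with assumed numbers (say 100 input batches of 8,192 rows each and a limit of 10), those three expressions work out to 819,200 rows, 1,000 rows, and 10 rows respectively.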
Ah, I didn't look too much at the rest of the implementation. I think you're right that providing fetch to the other lexsort_to_indices would be beneficial as well. I will create an issue for this and issue a PR later.
I think the current change already buffers num_input_batches * limit by the way, as the limit is applied before adding the batches to the buffer. As far as I can see, adding fetch to the second lexsort_to_indices call will mainly reduce the output of the individual sorts to fetch rows - which is of course beneficial too, as it reduces the time to sort and limits both the input to take and the input to SortPreservingMergeExec.
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Benchmark runs are scheduled for baseline = c7f3a70 and contender = 81b5794. 81b5794 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Which issue does this PR close?
Closes: #3528
Rationale for this change
vs after #3527
What changes are included in this PR?
Are there any user-facing changes?