-
Notifications
You must be signed in to change notification settings - Fork 0
Simplify code for LIMIT in aggregate stream #1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| let elapsed_compute = self.baseline_metrics.elapsed_compute().clone(); | ||
|
|
||
| // common function for signalling end of processing of the input stream | ||
| fn set_input_done_and_produce_output( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function only modifies fields on GroupedHashAggregateStream so we can reduce a level of indirection and follow the pattern in other functionality in this module by moving it to the impl GroupedHashAggregateStream { ... }
| // make sure the exec_state just set is not overwritten below | ||
| break 'reading_input; | ||
| } | ||
| if self.hit_soft_group_limit() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I propose moving this check into its own function to try and keep the core group aggregate logic as simple as possible
| } | ||
| if self.hit_soft_group_limit() { | ||
| timer.done(); | ||
| extract_ok!(self.set_input_done_and_produce_output()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think changing the signature of set_input_done_and_produce_output to return just a Result<()> simplifies handling and makes it clearer what can be returned
| } | ||
|
|
||
| // common function for signalling end of processing of the input stream | ||
| fn set_input_done_and_produce_output(&mut self) -> Result<()> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The original signature implies this can return early as well as may return a RecordBatch, neither of which is true
|
Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or this will be closed in 7 days. |
This is a proposed way to simplify the logic in https://github.com/apache/arrow-datafusion/pull/8038/files
Note it currently targets apache#8038 but we can also apply it as a PR directly to DataFusion after apache#8038 is merged
I will comment inline on the rationale and changes
cc @msirek