-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Memory limited nested-loop join #5564
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Thank you @korowa -- I plan to review this tomorrow |
alamb
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @korowa -- I went through this PR commit by commit and I think it looks great. Breaking it into two commits made it much easier to review.
cc @liukun4515 who contributed this operator originally in #4562
| /// Operator-level memory reservation for left data | ||
| reservation: OperatorMemoryReservation, | ||
| /// Execution metrics | ||
| metrics: ExecutionPlanMetricsSet, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thank you for unifying the metrics at the same time
| ); | ||
| let filter = prepare_join_filter(); | ||
|
|
||
| let join_types = vec![ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❤️
|
Benchmark runs are scheduled for baseline = 26eb406 and contender = 612eb1d. 612eb1d is a master commit associated with this PR. Results will be available as each benchmark for each run completes. |
Which issue does this PR close?
Part of #5220.
Rationale for this change
Control over memory allocations in
NestedLoopJoinExecWhat changes are included in this PR?
First commit modifies join operator:
NestedLoopJoinExec(for build-side data) and toNestedLoopJoinStream(forvisited_left_side)NestedLoopJoinExecnow usesBuildProbeJoinMetricsand is able to expose metrics viaexplain analyzeSecond commit is an attempt to avoid nested mutexes (discussion) -- it's solved via structs for
SharedMemoryReservation&SharedOptionalMemoryReservation(instead ofOperatorMemoryReservation), andTryGrowtrait, used in build-side data collection inHashJoinExec(both types of reservations potentially could be passed intocollect_leftclosure).Are these changes tested?
There are test cases for
MemoryReservationwrappersAre there any user-facing changes?
Nested-loop joins should respect memory pool limits