-
Notifications
You must be signed in to change notification settings - Fork 1.9k
chore: refactor BuildProbeJoinMetrics to use BaselineMetrics
#16500
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
2010YOUY01
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you! I left some suggestions, looking forward to your thoughts.
| pub(crate) output_rows: metrics::Count, | ||
| } | ||
|
|
||
| impl Drop for BuildProbeJoinMetrics { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it's the best way but I guess it's okay to merge 🤔 If we forget to count a specific time period in some join operator, we can fix it in the future.
Could you also add a brief comment to explain this drop?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. This is not intuitive at all. When I came back to this PR to act on your suggestions, it took me a while to remember how this works :)
I have added a comment for now. I would be open to other ways of handling this!
6fb44a6 to
b5e87fb
Compare
|
Rebased on latest main |
|
Thank you! this implementation looks correct to me. Since the state transition in joins are tricky, could you add a test (or ensure there are some existing tests), to double-check this refactor won't change the result of the related metrics? |
Closes apache#16495 Here's an example of an `explain analyze` of a hash join showing these metrics: ``` [(WatchID@0, WatchID@0)], metrics=[output_rows=100, elapsed_compute=2.313624ms, build_input_batches=1, build_input_rows=100, input_batches=1, input_rows=100, output_batches=1, build_mem_used=3688, build_time=865.832µs, join_time=1.369875ms] ``` Notice `output_rows=100, elapsed_compute=2.313624ms` in the above.
|
I have added asserts for metrics in the existing join tests. The ones in hash_join and cross_join are working. The asserts in nested_loop_join are currently failing due to a mismatch in |
|
Actually, I see the same behavior on latest main. Looks like the output_rows metric in nested loop join is currently wrong? |
Thank you for the catch! Here might also need |
Thanks for the pointer! It makes sense now. I have added a record_poll there and also updated the |
This was needed because ExhaustedProbeSide state can also return output rows - in certain types of joins. Without this, the output_rows metric for nested loop join was wrong!
|
Fixed the formatting issues |
2010YOUY01
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks again.
|
Thanks @Samyak2 and @2010YOUY01 |
Which issue does this PR close?
Closes #16495
Rationale for this change
What changes are included in this PR?
Add a
BaselineMetricsand removeoutput_rowsfromBuildProbeJoinMetrics.elapsed_computein baseline is populated in the Drop trait implementation by summing upjoin_timeandbuild_time.Are these changes tested?
Here's an example of an
explain analyzeof a hash join showing these metrics:Notice
output_rows=100, elapsed_compute=2.313624msin the above.Are there any user-facing changes?
No