minor: SortExec measure elapsed_compute time when sorting#12099
minor: SortExec measure elapsed_compute time when sorting#12099alamb merged 2 commits intoapache:mainfrom
Conversation
Whilst investigating query execution performance I noticed that some SortExec nodes were reporting suspiciously short elapsed_compute times. It appears that the SortExec node wasn't running the elapsed_compute timer when it doing the actual sorting operation.
alamb
left a comment
There was a problem hiding this comment.
Thank you @mhilton -- I think this PR is good and an improvement over the current state of affairs and could be merged as is
However I think there are other potential places that the CPU is not accounted for either that we may want to investigate
| let fetch = self.fetch; | ||
| let expressions = Arc::clone(&self.expr); | ||
| let stream = futures::stream::once(futures::future::lazy(move |_| { | ||
| let timer = metrics.elapsed_compute().timer(); |
There was a problem hiding this comment.
I looked at the rest of this code, and I think there are other non trivial consumers of compute (e.g. the call to concat_batches above)
What I would recommend is moving this timer up into the SortExec::in_mem_sort and add one in SortExec::in_mem_sort_stream
There was a problem hiding this comment.
I did double check and the streaming_merge code does seem to track elapsed compute reasonably:
datafusion/datafusion/physical-plan/src/sorts/merge.rs
Lines 170 to 173 in 37428bb
There was a problem hiding this comment.
Where I have put the timer is part of the stream, so would not execute until the stream is polled, that means it wouldn't be tracked if the timer was moved to in_mem_sort_stream. I agree there should probably be another one there too though. in_mem_sort looks to be reasonably trivial to me, but I could add timers in the non-async blocks too if you think it's worthwhile.
There was a problem hiding this comment.
Ah, I think I missed the fact the timer is in the future 👍
| let fetch = self.fetch; | ||
| let expressions = Arc::clone(&self.expr); | ||
| let stream = futures::stream::once(futures::future::lazy(move |_| { | ||
| let timer = metrics.elapsed_compute().timer(); |
There was a problem hiding this comment.
Ah, I think I missed the fact the timer is in the future 👍
Whilst investigating query execution performance I noticed that some SortExec nodes were reporting suspiciously short elapsed_compute times. It appears that the SortExec node wasn't running the elapsed_compute timer when it doing the actual sorting operation.
Which issue does this PR close?
Closes #.
Rationale for this change
What changes are included in this PR?
Are these changes tested?
Are there any user-facing changes?