Result of GroupByQueryRunner should not be considered to be ordered #2142
navis wants to merge 3 commits into apache:master
|
I only see the changes in GroupByQueryEngine for removing the ordering, |
|
The sorting process is expensive and it's not that useful for group-by query, I think. For merging, group-by query adds all values into I also think |
|
Sounds good. Can you also share any performance numbers showing how much improvement is expected from this change? |
|
@nishantmonu51 The point is that the result of the group-by query runner should not be regarded as sorted. I think we can add an option so that sorting is done only in the last stage of group-by. Opinion? |
3b751d3 to d90b647
87555da to c1d6028
|
@nishantmonu51 #2571 is for using a hash map rather than a treemap in the incremental index. This PR is for sorting the aggregated result (from the incremental index) only in the final stage. A similar but different patch. |
Why does this need to be true?
|
@navis I think the changes can be simplified by not supporting the sort at all in the engine. I can't see any scenario where it needs to return sorted results. |
|
@himanshug Yes, the first patch was just like that. But removing the sort makes all the group-by tests fail. Even worse, it will produce different results with different JDK versions. |
|
@navis Do you mean the true at #2142 (comment) is there to make the unit tests pass? But it could use unsorted results, and it runs as part of the main code. |
|
@himanshug It was like that, but it seems to have changed over the months of changes. Then we can remove the sort param. But can we keep the option in RowIterator? I think it could be useful in some cases. |
c1d6028 to 5764362
|
@navis With the groupBy v2 changes, is this PR still necessary? |
|
@fjy It seems not, and it is hard to maintain. Let's close this. |
Currently, GroupByQueryEngine emits results sorted by index value and assumes that they are therefore also ordered by the actual values. But for IncrementalIndex, the index value is a temporary id and cannot be compared directly without referencing the dictionary.
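
To illustrate why ordering by a temporary id is not the same as ordering by value, here is a minimal sketch in Java. It is not Druid code: `ArrivalOrderDictionary`, `sortById`, and `sortByValue` are hypothetical names, and the sketch simply assumes a dictionary that hands out ids in arrival order, as the description above suggests for IncrementalIndex.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DictionaryOrderSketch {
    // Hypothetical stand-in for an incremental dictionary (not a Druid class):
    // each new value gets the next free id, so ids reflect arrival order only.
    static final class ArrivalOrderDictionary {
        private final Map<String, Integer> idsByValue = new HashMap<>();
        private final List<String> valuesById = new ArrayList<>();

        int add(String value) {
            Integer existing = idsByValue.get(value);
            if (existing != null) {
                return existing;
            }
            int id = valuesById.size();
            idsByValue.put(value, id);
            valuesById.add(value);
            return id;
        }

        String lookup(int id) {
            return valuesById.get(id);
        }
    }

    // Sorts by raw id, then resolves to values: the output follows arrival order.
    static List<String> sortById(ArrivalOrderDictionary dict, List<Integer> ids) {
        List<Integer> sorted = new ArrayList<>(ids);
        sorted.sort((a, b) -> Integer.compare(a, b));
        List<String> out = new ArrayList<>();
        for (int id : sorted) {
            out.add(dict.lookup(id));
        }
        return out;
    }

    // Sorts by the dictionary value behind each id: the output follows value order.
    static List<String> sortByValue(ArrivalOrderDictionary dict, List<Integer> ids) {
        List<Integer> sorted = new ArrayList<>(ids);
        sorted.sort((a, b) -> dict.lookup(a).compareTo(dict.lookup(b)));
        List<String> out = new ArrayList<>();
        for (int id : sorted) {
            out.add(dict.lookup(id));
        }
        return out;
    }

    public static void main(String[] args) {
        ArrivalOrderDictionary dict = new ArrivalOrderDictionary();
        List<Integer> ids = new ArrayList<>();
        ids.add(dict.add("zebra")); // id 0
        ids.add(dict.add("apple")); // id 1
        ids.add(dict.add("mango")); // id 2

        System.out.println(sortById(dict, ids));    // [zebra, apple, mango]
        System.out.println(sortByValue(dict, ids)); // [apple, mango, zebra]
    }
}
```

In this sketch, a consumer that treats the id-sorted output as value-sorted would get `zebra` before `apple`. Comparing by value needs a dictionary lookup per comparison, which is one way to read the argument in the thread for sorting only in the final stage.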