Skip to content

Result of GroupByQueryRunner should not be considered to be ordered#2142

Closed
navis wants to merge 3 commits intoapache:masterfrom
navis:groupby-is-not-ordered
Closed

Result of GroupByQueryRunner should not be considered to be ordered#2142
navis wants to merge 3 commits intoapache:masterfrom
navis:groupby-is-not-ordered

Conversation

@navis
Copy link
Copy Markdown
Contributor

@navis navis commented Dec 22, 2015

Currently, GroupByQueryEngine emits result by sorting on index value and regard that it's ordered in values also. But for IncrementalIndex, index value is temporary id and cannot be used for compare directly without referencing dictionary.

@nishantmonu51
Copy link
Copy Markdown
Member

I only see the changes in GroupByQueryEngine for removing the ordering,
should this be accompanied by any more changes where the ordering is used while merging ?

@navis
Copy link
Copy Markdown
Contributor Author

navis commented Dec 23, 2015

The sorting process is expensive and it's not that useful for group-by query, I think. For merging, group-by query adds all values into IncrementalIndex not using any sequence merging iterators. And group-by query has own ordering spec in it.

I also think mergeSequences and mergeSequencesUnordered in GroupByQueryQueryToolChest should throw exception.

@nishantmonu51
Copy link
Copy Markdown
Member

sounds good, can you also share any performance numbers on how much improvements are expected by this change ?

@navis
Copy link
Copy Markdown
Contributor Author

navis commented Jan 5, 2016

@nishantmonu51 The point it that the result of group-by query runner should not be regarded as sorted. I think we can add option for it(sorting) to be done in just last stage of group by. Opinion?

@navis navis force-pushed the groupby-is-not-ordered branch from 3b751d3 to d90b647 Compare January 15, 2016 02:30
@navis navis force-pushed the groupby-is-not-ordered branch 2 times, most recently from 87555da to c1d6028 Compare March 16, 2016 00:09
@navis
Copy link
Copy Markdown
Contributor Author

navis commented Mar 22, 2016

@nishantmonu51 #2571 is for using hash map rather than treemap in incremental index. This is for sorting aggregated result (from incremental index) only in the final stage. Similar but different patch.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why does this need to be true?

@himanshug
Copy link
Copy Markdown
Contributor

@navis i think changes can be simplified by not supporting the sort at all in the engine, can't see any scenario when it needs to return sorted results.

@navis
Copy link
Copy Markdown
Contributor Author

navis commented Apr 25, 2016

@himanshug Yes, the first patch was just like that. But removing sort part makes whole group-by tests to be failed. And even worse, it will make different results with different JDK versions.

@himanshug
Copy link
Copy Markdown
Contributor

@navis do you mean #2142 (comment) place's true is to make unit tests pass? but it could use unsorted results and it runs as part of main code.

@navis
Copy link
Copy Markdown
Contributor Author

navis commented Apr 25, 2016

@himanshug It was like that but it seemed changed by changes of months. Then we can remove sort param. But can we remain the option in RowIterator? I think it can be exploited in some use.

@navis navis force-pushed the groupby-is-not-ordered branch from c1d6028 to 5764362 Compare April 25, 2016 01:00
@fjy
Copy link
Copy Markdown
Contributor

fjy commented Aug 29, 2016

@navis with groupBy v2 changes, is this PR still necessary?

@navis
Copy link
Copy Markdown
Contributor Author

navis commented Sep 12, 2016

@fjy Seemed not and hard to maintain. Let's close this.

@navis navis closed this Sep 12, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants