Skip to content

GroupBy: Validation of output names, and a gross hack for v1 subqueries.#3686

Merged
nishantmonu51 merged 1 commit intoapache:masterfrom
gianm:groupby-validation
Nov 29, 2016
Merged

GroupBy: Validation of output names, and a gross hack for v1 subqueries.#3686
nishantmonu51 merged 1 commit intoapache:masterfrom
gianm:groupby-validation

Conversation

@gianm
Copy link
Copy Markdown
Contributor

@gianm gianm commented Nov 12, 2016

Prevents having dimensions, aggregators, or postaggregators with the same name as each other. In the past they would have clobbered each other, which would lead to unexpected query results.

@gianm gianm added the Bug label Nov 12, 2016
@gianm gianm added this to the 0.9.3 milestone Nov 12, 2016
@gianm
Copy link
Copy Markdown
Contributor Author

gianm commented Nov 13, 2016

Closing for now since this will conflict with #3685.

@gianm gianm closed this Nov 13, 2016
@gianm gianm reopened this Nov 13, 2016
@gianm gianm force-pushed the groupby-validation branch from 33eab40 to 8048b9b Compare November 13, 2016 00:42
@gianm gianm changed the title GroupBy: Validation of non-colliding output names. GroupBy: Validation of output names, and a gross hack for v1 subqueries. Nov 13, 2016
@gianm
Copy link
Copy Markdown
Contributor Author

gianm commented Nov 13, 2016

Don't merge until #3685 is merged; this one is rebased off that.

@gianm gianm force-pushed the groupby-validation branch from 8048b9b to 0791c49 Compare November 13, 2016 01:15
@fjy
Copy link
Copy Markdown
Contributor

fjy commented Nov 14, 2016

@gianm this now has conflicts

@fjy
Copy link
Copy Markdown
Contributor

fjy commented Nov 14, 2016

👍 but FWIW, groupby v1 is dead to me

Copy link
Copy Markdown
Member

@leventov leventov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A small thing: toArray(new T[0]) is a better default choice now, see https://shipilev.net/blog/2016/arrays-wisdom-ancients/#_conclusion, "Bottom line" section.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, that's kind of surprising and good to know.

@gianm gianm force-pushed the groupby-validation branch from 0791c49 to 0841689 Compare November 14, 2016 21:55
@gianm
Copy link
Copy Markdown
Contributor Author

gianm commented Nov 14, 2016

Rebased

v1 subqueries try to use aggregators to "transfer" values from the inner
results to an incremental index, but aggregators can't transfer all kinds of
values (strings are a common one). This is a workaround that selectively
ignores what the outer aggregators ask for and instead assumes that we know
best.

These are in the same commit because the name validation changed the kinds of
errors that were thrown by v1 subqueries.
@gianm gianm force-pushed the groupby-validation branch from 0841689 to 6bff8aa Compare November 14, 2016 22:03
Copy link
Copy Markdown
Member

@nishantmonu51 nishantmonu51 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

few comments for log messages, otherwise seems good.


for (AggregatorFactory aggregator : aggregators) {
if (!outputNames.add(aggregator.getName())) {
throw new IAE("Duplicate output name[%s]", aggregator.getName());
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

change log message to "Duplicate Aggregator name[%s]"

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that message would be misleading - the duplicate name might be a dimension and aggregator with the same name, not two aggregators.


for (PostAggregator postAggregator : postAggregators) {
if (!outputNames.add(postAggregator.getName())) {
throw new IAE("Duplicate output name[%s]", postAggregator.getName());
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

duplicate postagg name ?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar comment to above

@fjy
Copy link
Copy Markdown
Contributor

fjy commented Nov 17, 2016

@nishantmonu51 any more comments?

Copy link
Copy Markdown
Member

@nishantmonu51 nishantmonu51 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@nishantmonu51 nishantmonu51 merged commit 6922d68 into apache:master Nov 29, 2016
dgolitsyn pushed a commit to metamx/druid that referenced this pull request Feb 14, 2017
…es. (apache#3686)

v1 subqueries try to use aggregators to "transfer" values from the inner
results to an incremental index, but aggregators can't transfer all kinds of
values (strings are a common one). This is a workaround that selectively
ignores what the outer aggregators ask for and instead assumes that we know
best.

These are in the same commit because the name validation changed the kinds of
errors that were thrown by v1 subqueries.
@gianm gianm deleted the groupby-validation branch September 23, 2022 19:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants