
Fix bugs in query builders and in TimeBoundaryQuery.getFilter()#4131

Merged
himanshug merged 8 commits into apache:master from metamx:query-queryMetrics-property
Apr 25, 2017
Conversation

@leventov
Member

@leventov leventov commented Mar 29, 2017

Add Query.getQueryMetrics() and Query.withQueryMetrics() for use in MetricsEmittingQueryRunner and CPUTimeMetricQueryRunner. This is needed to emit some dimensions/metrics during query execution in query engines.

  • Fix most query builders' copy(query) methods: they were non-static (which doesn't make sense) and "abandoned", i.e. they didn't copy some of the query fields.
  • In query.withXxx() methods, use builders instead of calling constructors for brevity.
  • Fix a bug in TimeBoundaryQuery.getFilter(): was always returning null.

A follow-up of #3954, a part of #3798.
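The getFilter() bug fixed here follows a common pattern: a constructor accepts a value but never assigns it to the field, so the getter silently returns the field's default. A minimal sketch of that pattern (hypothetical class names, not the actual Druid code):

```java
// Hypothetical illustration of the getFilter()-returns-null bug pattern.
class BuggyQuery {
  private Object filter; // never assigned, so always null

  BuggyQuery(Object filter) {
    // bug: assignment missing, the parameter is silently dropped
  }

  Object getFilter() {
    return filter;
  }
}

class FixedQuery {
  private final Object filter;

  FixedQuery(Object filter) {
    this.filter = filter; // fix: actually store the value
  }

  Object getFilter() {
    return filter;
  }
}

public class GetFilterDemo {
  public static void main(String[] args) {
    Object f = "dim = 'x'";
    System.out.println(new BuggyQuery(f).getFilter()); // null
    System.out.println(new FixedQuery(f).getFilter()); // dim = 'x'
  }
}
```

Declaring the field `final`, as in `FixedQuery`, makes this class of bug a compile error rather than a silent null.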

@leventov leventov added this to the 0.10.1 milestone Mar 29, 2017
Contributor

@drcrallen drcrallen left a comment


These changes make sense overall. There is a signature change on Query which I'm scratching my brain to see if it is really needed.

It feels like something like the query tool chest should be ensuring the query is in the right kind of shape for being run, and I'm not convinced having the runners modify the Query object is the right kind of pattern in the long run.

Thoughts?

  boolean descending,
- Map<String, Object> context
+ Map<String, Object> context,
+ QueryMetrics<?> queryMetrics
Contributor


can we keep a constructor with the same signature for the sake of query extensions?

Member Author


Added a compatibility constructor.

)
{
final Sequence<T> baseSequence = delegate.run(query, responseContext);
QueryMetrics<? super Query<T>> queryMetrics = queryToolChest.makeMetrics(query);
Contributor


Should this only be done if query.getQueryMetrics() is null?

Member Author


Makes sense, changed


applyCustomDimensions.accept(queryMetrics);

final Query<T> queryWithMetrics = query.withQueryMetrics(queryMetrics);
Contributor


same question, does this still need to be done if the query already has a QueryMetrics set in it?

Member Author


Changed

@leventov
Member Author

@drcrallen

These changes make sense overall. There is a signature change on Query which I'm scratching my brain to see if it is really needed.

It feels like something like the query tool chest should be ensuring the query is in the right kind of shape for being run, and I'm not convinced having the runners modify the Query object is the right kind of pattern in the long run.

Didn't clearly understand, could you please elaborate?

@drcrallen
Contributor

@leventov what I mean is that if you look at the implementations of QueryRunner, they tend to not modify the query object EXCEPT a few cases where a special function in the QueryToolChest does modification.

As such, I'm questioning if modification of the query in arbitrary QueryRunners AND in the QueryToolChest makes sense. It feels like a bad precedent to set. I suggest looking into making the QueryToolChest the place to create the metrics items and attach them to the query, and the query runners simply consume the set values.

@leventov
Member Author

@drcrallen the query object is already changed in the same way in some QueryRunners, e.g. FinalizeResultsQueryRunner and GroupByMergingQueryRunnerV2, via withOverriddenContext().

Generally when I was preparing this PR, I was thinking that currently the Query abstraction is overloaded, and should be split into:

  1. Query -- a set of configs that affect the result data. Cacheable.
  2. UniqueQuery/QueryInstance/? -- Query + the identity of the query sent to the Druid cluster, i.e. queryId. Maybe it doesn't need to be a separate abstraction and should be merged with Query, or with QueryWithContext?
  3. QueryWithContext/RuntimeQuery/RichQuery/? -- UniqueQuery + something attached to it that makes sense only during query execution within the Druid cluster, i.e. queryMetrics and most of the things the Query.context property is currently used for, except (maybe) queryId.

Then QueryRunner.run() accepts QueryWithContext and is free to modify it, but not the underlying Query.

But anyway, I think it shouldn't be done as part of this PR, for compatibility reasons it might need to be delayed to 0.11.0.
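The withOverriddenContext() copy-on-write pattern mentioned above can be sketched like this (simplified, hypothetical class; not the actual Druid implementation):

```java
import java.util.HashMap;
import java.util.Map;

// Simplified sketch of the withOverriddenContext() copy-on-write pattern:
// a runner never mutates the incoming query's context map, it builds a
// new query carrying a merged context instead.
final class SketchQuery {
  private final Map<String, Object> context;

  SketchQuery(Map<String, Object> context) {
    this.context = new HashMap<>(context);
  }

  Map<String, Object> getContext() {
    return new HashMap<>(context); // defensive copy
  }

  SketchQuery withOverriddenContext(Map<String, Object> overrides) {
    Map<String, Object> merged = new HashMap<>(context);
    merged.putAll(overrides); // overrides win on key collisions
    return new SketchQuery(merged); // the original query is untouched
  }
}

public class ContextDemo {
  public static void main(String[] args) {
    Map<String, Object> ctx = new HashMap<>();
    ctx.put("finalize", true);
    SketchQuery q = new SketchQuery(ctx);

    Map<String, Object> overrides = new HashMap<>();
    overrides.put("finalize", false);
    SketchQuery q2 = q.withOverriddenContext(overrides);

    System.out.println(q.getContext().get("finalize"));  // true
    System.out.println(q2.getContext().get("finalize")); // false
  }
}
```

Because each runner gets back a fresh immutable-style query, downstream runners can safely "modify" the query without racing with concurrent readers of the original.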

@himanshug
Contributor

@leventov It appears that QueryMetrics is added to Query so that it can be passed around to various runners... how do you feel about changing QueryRunner.run(..) to accept a ResponseContext instead of Map<String, Object> and keeping QueryMetrics in ResponseContext? ResponseContext might look like:

class ResponseContext {
  QueryMetrics queryMetrics;
  Map<String, Object> others; // the current responseContext map
}

@leventov
Member Author

leventov commented Mar 31, 2017

@himanshug as we discussed in #4113, responseContext shouldn't be used to pass anything to downstream query runners. Query.withOverriddenContext()/Query.getContext() should be used for that; it's a part of the Query object. So I made QueryMetrics a part of the Query object too.

On the other hand, the Query abstraction should probably be refactored into Query and QueryWithContext: #4131 (comment). But anyway, not as part of this PR and not in Druid 0.10.1.

@leventov
Member Author

leventov commented Apr 4, 2017

@himanshug do you have other comments?

@himanshug
Contributor

@leventov sorry, I was away for a while.
However, I thought the conclusion of #4113 was to change the type of responseContext into something that is safer to use concurrently. responseContext is indeed there to pass around and gather various things.
Query context is more for various "flags/configuration" that are indications to the query processing engine; these may or may not be explicitly specified by the user.

That said, I might be wrong or we might want to change things in direction you're suggesting. So, please remind us to discuss this in next dev sync up and we can conclude it.

@leventov
Member Author

leventov commented Apr 7, 2017

@himanshug I don't see what we disagree about. #4113 and #4131 (comment) couldn't be done in a minor Druid version update like 0.10.1, because they break custom user query types and query runners. This PR adds QueryMetrics in a compatible way for 0.10.x.

I agree that Query is not a very intuitive/suitable place for QueryMetrics, but responseContext is worse. I used to pass some equivalent of QueryMetrics via responseContext when I initially implemented this functionality several months ago, and got concurrency issues: #3803. So I think Query is the most suitable place for QueryMetrics for now.

I won't be able to participate in next week's dev sync.

@gianm
Contributor

gianm commented Apr 10, 2017

@leventov, in reply to:

I agree that Query is not a very intuitive/suitable place for QueryMetrics, but responseContext is worse. This PR adds QueryMetrics in a compatible way for 0.10.x.

This patch won't preserve extension compatibility anyway, since Query gained withQueryMetrics but BaseQuery doesn't provide a default implementation.

But also: is there a nice migration path from this change now, to something that would be a "better" design for 0.11.0? From your discussion with @himanshug, it sounds like there isn't, and in 0.11.0 we'd just want to remove these methods we're adding now and replace them with something else. I think if that's the case, it's fine to make the "better" change now and have the next release after 0.10.0 be 0.11.0.

@leventov
Member Author

@gianm uff, ok.

Could you please comment on #4131 (comment)? Also @drcrallen and @himanshug. I don't want to start implementing a change that people will dislike later, because there is a lot of Query and QueryRunner code to change. (It can't be guaranteed that nobody will disagree when looking at the actual PR, but I want to minimize the probability of that.)

@gianm
Contributor

gianm commented Apr 10, 2017

I'll take a closer look at that in a bit. I guess you would propose that query endpoints like QueryResource should accept a QueryWithContext? My first thought is that there's not a serious need to reorganize Query to split out the context. Some considerations:

  1. IMO, we don't want to change the query API /druid/v2/ as part of this change. So queries submitted there should still include a context.
  2. Sometimes "internal" context parameters are useful for users to be able to provide, like finalize (if a user wants to get complex metrics in raw form), queryId (if a user wants to use a specific query id).
  3. We need a way for users to provide parameters that affect execution but not results. For example, priority, timeout, useCache, populateCache, groupByStrategy.
  4. We need a way for brokers to pass down information to compute nodes, which needs to be part of the serialized form of a query. For example, when queries are issued from the broker, the groupBy query populates groupByStrategy if it wasn't set by the user, to ensure the same strategy is used on all nodes.
  5. Maybe not the best, but, sometimes query context parameters that users provide do affect the results and need to be included in the cache key. For example: the timeseries parameter skipEmptyBuckets. That probably should have originally been a top level thing rather than a context parameter, but we should retain it in context for query API compatibility.

These points, taken as a whole, to me suggest it makes sense to keep the current design of Query. It satisfies all these needs well and does that in a relatively simple way.

It still makes sense to me to add a QueryPlus object and change QueryRunner to take that, but the "plus" wouldn't be query context, it would be probably response context and queryMetrics.
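The QueryPlus idea gianm describes can be sketched as a thin immutable wrapper: the Query itself stays unchanged and cacheable, while per-execution extras (such as queryMetrics) travel alongside it. This is a hypothetical sketch with illustrative names, not the actual Druid API:

```java
import java.util.Objects;

// Hypothetical sketch of the "QueryPlus" wrapper: the query is never
// mutated; attaching per-execution state produces a new wrapper.
final class QueryPlus<Q, M> {
  private final Q query;
  private final M queryMetrics; // null until a runner attaches metrics

  private QueryPlus(Q query, M queryMetrics) {
    this.query = Objects.requireNonNull(query);
    this.queryMetrics = queryMetrics;
  }

  static <Q, M> QueryPlus<Q, M> wrap(Q query) {
    return new QueryPlus<>(query, null);
  }

  Q getQuery() { return query; }

  M getQueryMetrics() { return queryMetrics; }

  // Copy-on-write: a new wrapper is returned, the old one is untouched.
  QueryPlus<Q, M> withQueryMetrics(M metrics) {
    return new QueryPlus<>(query, metrics);
  }
}

public class QueryPlusDemo {
  public static void main(String[] args) {
    QueryPlus<String, String> plus = QueryPlus.wrap("timeseries-query");
    QueryPlus<String, String> withMetrics = plus.withQueryMetrics("metrics");
    System.out.println(plus.getQueryMetrics());        // null
    System.out.println(withMetrics.getQueryMetrics()); // metrics
    System.out.println(withMetrics.getQuery());        // timeseries-query
  }
}
```

The wrapper also future-proofs the QueryRunner.run() signature: new per-execution fields can be added to the wrapper without changing the method signature again.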

@himanshug
Contributor

@leventov @gianm I would say keep Query the way it is now with query context in it.
Regarding QueryRunner.run(QueryPlus), I'm still not sure why it can't be QueryRunner.run(Query, ResponseContext), where ResponseContext has everything in QueryPlus except the Query itself. You do have to pass something in addition to Query to all the runners.

also @leventov do you agree with above but not in favor because it can't be made backward compatible?

@leventov I do agree that we need to minimize disagreements after large code body is written so before writing any code , let us get to some conclusion first.

@gianm
Contributor

gianm commented Apr 10, 2017

Regarding QueryRunner.run(QueryPlus), I'm still not sure why it can't be QueryRunner.run(Query, ResponseContext), where ResponseContext has everything in QueryPlus except the Query itself. You do have to pass something in addition to Query to all the runners.

QueryRunner.run(QueryPlus) and QueryRunner.run(Query, ExtraQueryStuff) are equivalent. The only reason I would suggest not naming ExtraQueryStuff as "ResponseContext" is future proofing. It might have things in it that aren't response context, like QueryMetrics, or future unforeseen uses. That way we don't have to change the API again if we want to add something else.

@leventov
Member Author

@gianm your comment #4131 (comment) makes sense to me. But talking about QueryPlus, responseContext shouldn't be part of it; the response context should be returned from QueryRunner.run(), see #4113 (comment).

@leventov leventov changed the title from "Add Query.queryMetrics property (part of #3798)" to "Fix bugs in query builders and in TimeBoundaryQuery.getFilter()" Apr 19, 2017
@leventov leventov removed this from the 0.10.1 milestone Apr 19, 2017
@leventov leventov added the Bug label Apr 19, 2017
@leventov
Member Author

@gianm @himanshug After yesterday's dev sync, I created #4184 and removed query.queryMetrics property, as part of this PR. Only bug fixing / refactoring part of this PR is remaining.

Contributor

@jihoonson jihoonson left a comment


@leventov the patch looks good to me. I left a trivial comment.

}

limitFn = postProcFn;
private static LimitSpec nullToNoopLimitSpec(LimitSpec limitSpec)
Contributor


It looks to be useful in other places as well. How about moving it to LimitSpec and making it public?

Member Author


Moved

Contributor

@gianm gianm left a comment


Left some comments about the groupBy builder. The rest looks good to me.


public Builder setLimit(Integer limit)
{
this.limit = limit;
Contributor


I think this needs to clear limitFn and limitSpec in order to force them to be recomputed; otherwise code like this would ignore the setLimit(10) part:

new GroupByQuery.Builder(query).setLimit(10).build();
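The caching pitfall described here can be sketched with a deliberately simplified builder (hypothetical class, not the actual GroupByQuery.Builder): if build() caches a derived value and a setter fails to invalidate that cache, a copy of the builder silently reuses the stale value.

```java
// Hypothetical, simplified illustration of the builder-caching bug:
// build() lazily derives limitSpec from limit, and the copy constructor
// copies the cached value. If setLimit() did not clear the cache, a
// copied builder would ignore the new limit.
final class SimpleBuilder {
  private Integer limit;
  private String limitSpec; // derived, lazily computed in build()

  SimpleBuilder() {}

  SimpleBuilder(SimpleBuilder other) {
    this.limit = other.limit;
    this.limitSpec = other.limitSpec; // cached value is copied too
  }

  SimpleBuilder setLimit(Integer limit) {
    this.limit = limit;
    this.limitSpec = null; // fix: invalidate so build() recomputes
    return this;
  }

  String build() {
    if (limitSpec == null) {
      limitSpec = "LIMIT " + limit;
    }
    return limitSpec;
  }
}

public class BuilderDemo {
  public static void main(String[] args) {
    SimpleBuilder original = new SimpleBuilder().setLimit(5);
    original.build(); // caches "LIMIT 5"
    // Without the invalidation line in setLimit(), this would
    // still print "LIMIT 5".
    System.out.println(new SimpleBuilder(original).setLimit(10).build());
  }
}
```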

Contributor


Similar comment for any other function that writes to anything else that might modify limitFn, including orderByColumnSpecs, limit, havingSpec, or limitSpec. Some should clear both limitSpec and limitFn, some should only clear limitFn.

Contributor

@himanshug himanshug Apr 24, 2017


I think it would be less error-prone to just remove the limitSpec == null check and recreate it every time.

Member Author


@gianm setLimit() was duplicating limit(), which had a saner implementation. Removed setLimit() and renamed limit() to setLimit(). Added postProcessingFn = null to some methods. Also, constructor checks are never skipped now.

@himanshug if you mean recreate limitSpec every time in Builder.build(), it couldn't be done because limitSpec could be set directly.

Contributor


Yeah, I meant to always recreate limitSpec. But I see, it can't be done because limitSpec can be set explicitly.

return postProcFn;
}

limitFn = postProcFn;
Contributor


While you're at it, this should probably be called this.postProcFn since it's doing both LIMIT and HAVING.

Member Author


Renamed to postProcessingFn, and renamed applyLimit() method to postProcess().

Contributor

@gianm gianm left a comment


👍

@gianm
Contributor

gianm commented Apr 25, 2017

@drcrallen @himanshug able to take another look?

@himanshug himanshug merged commit ee9b5a6 into apache:master Apr 25, 2017
@himanshug himanshug added this to the 0.10.1 milestone Apr 25, 2017
}

- protected Map<String, Object> computeOverridenContext(Map<String, Object> overrides)
+ protected static Map<String, Object> computeOverriddenContext(
Member Author


This PR breaks compatibility of BaseQuery by refactoring this protected instance method into a static one. Is it a compatibility bug?

Contributor


I'm not sure if BaseQuery is one of the "supposed to be stable" interfaces or if it's just Query / QueryRunner / QueryToolChest. Good question. We should probably have an annotation or something to make it clear.

I guess since all built-in queries extend BaseQuery, it's likely that extensions would too, so it would be kinder to them to keep compatibility there.
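One standard way to keep such compatibility, sketched here with hypothetical simplified classes (not the actual Druid code or the exact contents of the follow-up PR), is to retain the old protected instance method as a deprecated delegate to the new static one, so extensions compiled against the old signature keep working:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a binary-compatibility shim: the refactored
// static method carries the logic, and the old instance method (with
// its original misspelled name) stays as a deprecated delegate.
abstract class BaseQuerySketch {
  private final Map<String, Object> context = new HashMap<>();

  BaseQuerySketch(Map<String, Object> context) {
    if (context != null) {
      this.context.putAll(context);
    }
  }

  Map<String, Object> getContext() {
    return context;
  }

  // New static form: explicit about its inputs, usable from builders.
  protected static Map<String, Object> computeOverriddenContext(
      Map<String, Object> context,
      Map<String, Object> overrides
  ) {
    Map<String, Object> merged = new HashMap<>(context);
    merged.putAll(overrides);
    return merged;
  }

  // Old form kept for extension compatibility; delegates to the new one.
  @Deprecated
  protected Map<String, Object> computeOverridenContext(Map<String, Object> overrides) {
    return computeOverriddenContext(getContext(), overrides);
  }
}

public class CompatDemo extends BaseQuerySketch {
  CompatDemo(Map<String, Object> context) {
    super(context);
  }

  public static void main(String[] args) {
    Map<String, Object> ctx = new HashMap<>();
    ctx.put("timeout", 1000);
    Map<String, Object> overrides = new HashMap<>();
    overrides.put("timeout", 2000);
    // The deprecated delegate still works for old callers.
    System.out.println(new CompatDemo(ctx).computeOverridenContext(overrides).get("timeout")); // 2000
  }
}
```

Removing an inherited protected method (or changing it from instance to static) breaks subclasses at link time, which is why the delegate matters for extensions that extend BaseQuery.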

Member Author


@drcrallen asked me to keep compatibility of BaseQuery in earlier review of this PR: #4131 (comment)

I found this incompatibility because our extension broke :)

For annotation, there is an issue: #4044

OK, I'll make a PR that fixes the incompatibility.

Member Author


PR: #4241

Contributor


Sounds good.

@leventov leventov deleted the query-queryMetrics-property branch July 14, 2017 13:51
gianm added a commit to gianm/druid that referenced this pull request Mar 14, 2018
PR apache#4131 introduced a new copy builder for segmentMetadata that did
not retain the value of usingDefaultInterval. This led to it being
dropped and the default-interval handling not working as expected.
Instead of using the default 1 week history when intervals are not
provided, the segmentMetadata query would query _all_ segments,
incurring an unexpected performance hit.

This patch fixes the bug and adds a test for the copy builder.
gianm added a commit that referenced this pull request Mar 15, 2018
* SegmentMetadataQuery: Fix default interval handling.
gianm added a commit to implydata/druid-public that referenced this pull request Mar 15, 2018
gianm added a commit to implydata/druid-public that referenced this pull request Mar 15, 2018
gianm added a commit to gianm/druid that referenced this pull request Apr 9, 2018
fjy pushed a commit that referenced this pull request Apr 10, 2018