Fix bugs in query builders and in TimeBoundaryQuery.getFilter() #4131
himanshug merged 8 commits into apache:master
…used code in Druids
…ry.getDimensionsFilter()
drcrallen left a comment
These changes make sense overall. There is a signature change on Query which I'm scratching my brain to see if it is really needed.
It feels like something like the query tool chest should be ensuring the query is in the right kind of shape for being run, and I'm not convinced having the runners modify the Query object is the right kind of pattern in the long run.
Thoughts?
  boolean descending,
- Map<String, Object> context
+ Map<String, Object> context,
+ QueryMetrics<?> queryMetrics
can we keep a constructor with the same signature for the sake of query extensions?
Added compatibility constructor
)
{
final Sequence<T> baseSequence = delegate.run(query, responseContext);
QueryMetrics<? super Query<T>> queryMetrics = queryToolChest.makeMetrics(query);
Should this only be done if query.getQueryMetrics() is null?
applyCustomDimensions.accept(queryMetrics);
final Query<T> queryWithMetrics = query.withQueryMetrics(queryMetrics);
same question, does this still need to be done if the query already has a QueryMetrics set in it?
Didn't clearly understand, could you please elaborate?
…icQueryRunner and MetricsEmittingQueryRunner
@leventov what I mean is that if you look at the implementations of As such, I'm questioning if modification of the query in arbitrary

@drcrallen query object is already changed in the same way in some QueryRunners, e. g. Generally when I was preparing this PR, I was thinking that currently the
Then But anyway, I think it shouldn't be done as part of this PR, for compatibility reasons it might need to be delayed to 0.11.0.

@leventov It appears that QueryMetrics is added to Query so that it could be passed around to various runners... how do you feel about changing

@himanshug as we discuss in #4113, responseContext shouldn't be used to pass anything to downstream query runners. On the other hand, the Query abstraction should probably be refactored into

@himanshug do you have other comments?

@leventov sorry, I was away for a while. That said, I might be wrong or we might want to change things in the direction you're suggesting. So, please remind us to discuss this in the next dev sync-up and we can conclude it.
@himanshug I don't see what we disagree about. #4113 and #4131 (comment) couldn't be done in a minor Druid version update like 0.10.1, because they break custom user query types and query runners. This PR adds QueryMetrics in a compatible way for 0.10.x. I agree that I won't be able to participate next week's dev sync

@leventov, in reply to:
This patch won't preserve extension compatibility anyway, since Query gained But also: is there a nice migration path from this change now, to something that would be a "better" design for 0.11.0? From your discussion with @himanshug, it sounds like there isn't, and in 0.11.0 we'd just want to remove these methods we're adding now and replace them with something else. I think if that's the case, it's fine to make the "better" change now and have the next release after 0.10.0 be 0.11.0.

@gianm uff, ok. Could you please comment on #4131 (comment)? Also @drcrallen and @himanshug. I don't want to start implementing the change that people will dislike later. Because there is a lot of

I'll take a closer look at that in a bit. I guess you would propose that query endpoints like QueryResource should accept a QueryWithContext? My first thought is that there's not a serious need to reorganize Query to split out the context. Some considerations:
These points, taken as a whole, to me suggest it makes sense to keep the current design of Query. It satisfies all these needs well and does that in a relatively simple way. It still makes sense to me to add a QueryPlus object and change QueryRunner to take that, but the "plus" wouldn't be query context, it would be probably response context and queryMetrics.
@leventov @gianm I would say keep Query the way it is now with query context in it. Also @leventov do you agree with above but not in favor because it can't be made backward compatible? @leventov I do agree that we need to minimize disagreements after large code body is written so before writing any code, let us get to some conclusion first.

QueryRunner.run(QueryPlus) and QueryRunner(Query, ExtraQueryStuff) are equivalent. The only reason I would suggest not naming ExtraQueryStuff as "ResponseContext" is for future proofing. It might have things in it that aren't response context. Like QueryMetrics, or future unforeseen uses. That way we don't have to change the API again if we want to add something else.
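To make the equivalence claim concrete, here is a minimal sketch. Every type in it (SketchQuery, ExtraQueryStuff, SketchQueryPlus, TwoArgRunner, PlusRunner, RunnerAdapters) is a hypothetical stand-in invented for illustration, not one of Druid's classes. It only shows that a runner written in either shape can be adapted to the other without losing information.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a query; not Druid's Query interface.
final class SketchQuery {
    final String dataSource;

    SketchQuery(String dataSource) {
        this.dataSource = dataSource;
    }
}

// Hypothetical "ExtraQueryStuff": per-execution state carried next to the query.
// New kinds of state (metrics, future uses) become new entries, not new parameters.
final class ExtraQueryStuff {
    final Map<String, Object> attributes = new HashMap<>();
}

// Hypothetical QueryPlus: bundles the query and its extras into one argument.
final class SketchQueryPlus {
    final SketchQuery query;
    final ExtraQueryStuff extras;

    SketchQueryPlus(SketchQuery query, ExtraQueryStuff extras) {
        this.query = query;
        this.extras = extras;
    }
}

interface TwoArgRunner {
    String run(SketchQuery query, ExtraQueryStuff extras);
}

interface PlusRunner {
    String run(SketchQueryPlus queryPlus);
}

final class RunnerAdapters {
    private RunnerAdapters() {}

    // Unpack the bundle and call the two-argument form.
    static PlusRunner toPlus(TwoArgRunner runner) {
        return queryPlus -> runner.run(queryPlus.query, queryPlus.extras);
    }

    // Pack the two arguments into a bundle and call the single-argument form.
    static TwoArgRunner toTwoArg(PlusRunner runner) {
        return (query, extras) -> runner.run(new SketchQueryPlus(query, extras));
    }
}
```

In this sketch, adding another kind of per-execution state touches only ExtraQueryStuff, which is the future-proofing argument made in the comment above.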
@gianm your comment: #4131 (comment) makes sense to me. But talking about QueryPlus, responseContext shouldn't be part of it, response context should be returned from QueryRunner.run(), see #4113 (comment).

@gianm @himanshug After yesterday's dev sync, I created #4184 and removed query.queryMetrics property, as part of this PR. Only bug fixing / refactoring part of this PR is remaining.
}
limitFn = postProcFn;
private static LimitSpec nullToNoopLimitSpec(LimitSpec limitSpec)
It looks to be useful in other places as well. How about moving it to LimitSpec and making it public?
gianm left a comment
Left some comments about the groupBy builder. The rest looks good to me.
public Builder setLimit(Integer limit)
{
this.limit = limit;
I think this needs to clear limitFn and limitSpec in order to force them to be recomputed, otherwise code like this would ignore the setLimit(10) part:
new GroupByQuery.Builder(query).setLimit(10).build();
Similar comment for any other function that writes to anything else that might modify limitFn, including orderByColumnSpecs, limit, havingSpec, or limitSpec. Some should clear both limitSpec and limitFn, some should only clear limitFn.
i think it would be less error prone to just remove limitSpec == null check and recreate it every time.
@gianm setLimit() was duplicating limit(), which had a saner implementation. Removed setLimit() and renamed limit() to setLimit(). Added postProcessingFn = null to some methods. Also never skip constructor checks now.
@himanshug if you mean recreate limitSpec every time in Builder.build(), it couldn't be done because limitSpec could be set directly.
yeah i meant to always recreate limitSpec . but i see, it can't be done due to explicit limitSpec set.
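The stale-cache problem discussed in this thread can be sketched as follows. SketchGroupByBuilder is a hypothetical, heavily simplified builder written for illustration; it is not Druid's GroupByQuery.Builder, and it models only a single limit field plus one cached function derived from it. The point is that any setter feeding the derived value has to drop the cached copy so that build() recomputes it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical, heavily simplified builder; not Druid's GroupByQuery.Builder.
final class SketchGroupByBuilder {
    private Integer limit; // null means "no limit"

    // Derived from `limit` and cached; must be dropped whenever `limit` changes.
    private Function<List<String>, List<String>> postProcessingFn;

    SketchGroupByBuilder setLimit(Integer limit) {
        this.limit = limit;
        this.postProcessingFn = null; // invalidate the cached function
        return this;
    }

    Function<List<String>, List<String>> build() {
        if (postProcessingFn == null) {
            final Integer capturedLimit = limit;
            // Recompute the derived function from the current limit.
            postProcessingFn = rows -> capturedLimit == null
                ? new ArrayList<String>(rows)
                : new ArrayList<String>(rows.subList(0, Math.min(capturedLimit, rows.size())));
        }
        return postProcessingFn;
    }
}
```

If the assignment to null in setLimit() were omitted, a second build() after changing the limit would keep returning the function computed for the old limit, which is the behaviour the review comment warns about.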
return postProcFn;
}
limitFn = postProcFn;
While you're at it, this should probably be called this.postProcFn since it's doing both LIMIT and HAVING.
Renamed to postProcessingFn, and renamed applyLimit() method to postProcess().
@drcrallen @himanshug able to take another look?
}
- protected Map<String, Object> computeOverridenContext(Map<String, Object> overrides)
+ protected static Map<String, Object> computeOverriddenContext(
This PR breaks compatibility of BaseQuery by refactoring this protected instance method into a static one. Is it a compatibility bug?
I'm not sure if BaseQuery is one of the "supposed to be stable" interfaces or if it's just Query / QueryRunner / QueryToolChest. Good question. We should probably have an annotation or something to make it clear.
I guess since all built-in queries extend BaseQuery, it's likely that extensions would too, so it would be kinder to them to keep compatibility there.
@drcrallen asked me to keep compatibility of BaseQuery in earlier review of this PR: #4131 (comment)
I found this incompatibility because our extension broke :)
For annotation, there is an issue: #4044
Ok, I'll make a PR that fixes incompatibility
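One way such an incompatibility could be repaired is sketched below, under stated assumptions. Because the old instance method and the new static helper have different names, both can coexist: the old one stays as a deprecated shim that delegates to the new one. SketchBaseQuery and SketchExtensionQuery are hypothetical classes, and the two-argument parameter list of the static helper is an assumption, since the snippet above truncates the real signature. This is not necessarily what the follow-up PR did.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a base query class; not Druid's BaseQuery.
abstract class SketchBaseQuery {
    private final Map<String, Object> context;

    SketchBaseQuery(Map<String, Object> context) {
        this.context = context == null ? new HashMap<>() : new HashMap<>(context);
    }

    Map<String, Object> getContext() {
        return context;
    }

    // New static helper. The parameter list is assumed for this sketch.
    protected static Map<String, Object> computeOverriddenContext(
        Map<String, Object> context,
        Map<String, Object> overrides
    ) {
        Map<String, Object> merged = new HashMap<>();
        if (context != null) {
            merged.putAll(context);
        }
        if (overrides != null) {
            merged.putAll(overrides); // overrides win on key collisions
        }
        return merged;
    }

    // Old instance method, kept under its original name so subclasses written
    // against the previous API still compile and link.
    @Deprecated
    protected Map<String, Object> computeOverridenContext(Map<String, Object> overrides) {
        return computeOverriddenContext(getContext(), overrides);
    }
}

// Hypothetical extension query that still calls the old method.
final class SketchExtensionQuery extends SketchBaseQuery {
    SketchExtensionQuery(Map<String, Object> context) {
        super(context);
    }

    Map<String, Object> withOverrides(Map<String, Object> overrides) {
        return computeOverridenContext(overrides);
    }
}
```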
PR apache#4131 introduced a new copy builder for segmentMetadata that did not retain the value of usingDefaultInterval. This led to it being dropped and the default-interval handling not working as expected. Instead of using the default 1 week history when intervals are not provided, the segmentMetadata query would query _all_ segments, incurring an unexpected performance hit. This patch fixes the bug and adds a test for the copy builder.
* SegmentMetadataQuery: Fix default interval handling.
* Intervals
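The copy-builder bug described above can be illustrated with a small model. SketchMetadataQuery is hypothetical and simplified: intervals are plain strings, ALL_TIME is a placeholder for "every segment", and the constructor logic is an assumption about how a flag like usingDefaultInterval might be derived. It is not Druid's SegmentMetadataQuery. The sketch contrasts a copy that forwards the flag with one that drops it.

```java
// Hypothetical, simplified model; not Druid's SegmentMetadataQuery.
final class SketchMetadataQuery {
    static final String ALL_TIME = "all-time"; // placeholder for "every segment"

    final String dataSource;
    final String intervals;
    final boolean usingDefaultInterval;

    SketchMetadataQuery(String dataSource, String intervals, Boolean usingDefaultInterval) {
        this.dataSource = dataSource;
        if (intervals == null) {
            // No intervals given: record that, so execution can narrow to the default window.
            this.intervals = ALL_TIME;
            this.usingDefaultInterval = true;
        } else {
            this.intervals = intervals;
            this.usingDefaultInterval = usingDefaultInterval != null && usingDefaultInterval;
        }
    }

    // Faithful copy: forwards every field, including the flag.
    static SketchMetadataQuery copy(SketchMetadataQuery q) {
        return new SketchMetadataQuery(q.dataSource, q.intervals, q.usingDefaultInterval);
    }

    // Lossy copy, shown for contrast: the flag is not forwarded, so it silently becomes false
    // and the copied query looks as if the caller had asked for all of time explicitly.
    static SketchMetadataQuery copyDroppingFlag(SketchMetadataQuery q) {
        return new SketchMetadataQuery(q.dataSource, q.intervals, null);
    }
}
```

A round-trip test of the kind the patch mentions would build a query without intervals, copy it, and check that the flag is still set on the copy.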
* Add Query.getQueryMetrics() and Query.withQueryMetrics() for use in MetricsEmittingQueryRunner and CPUTimeMetricQueryRunner. This is needed to emit some dimensions/metrics during query execution in query engines.
* copy(query) methods, they were non-static (that doesn't make sense) and "abandoned": didn't copy some of the query fields.
* TimeBoundaryQuery.getFilter(): was always returning null.

A follow-up of #3954, a part of #3798.