Skip to content

fix timewarp query results when using timezones and crossing DST transitions#5157

Merged
jon-wei merged 13 commits intoapache:masterfrom
metamx:timewarp-timezone
Jan 11, 2018
Merged

fix timewarp query results when using timezones and crossing DST transitions#5157
jon-wei merged 13 commits intoapache:masterfrom
metamx:timewarp-timezone

Conversation

@clintropolis
Copy link
Copy Markdown
Member

@clintropolis clintropolis commented Dec 13, 2017

Fixes #5156

changes:

  • TimewarpOperator will now compensate for daylight savings time shifts between date translation ranges for queries using a PeriodGranularity with a timezone defined
  • The Query interface has been expanded with 2 new methods, getGranularity and getTimezone.
  • BaseQuery now has a Granularity and a new constructor that takes an additional argument to set it. When using the original constructor, granularity will be set to Granularities.ALL.
  • The getTimezone method in BaseQuery will return the PeriodGranularity timezone if it is used, else it will default to DateTimeZone.UTC.
  • GroupByQuery, SearchQuery, SelectQuery, TimeseriesQuery, and TopNQuery all individually had a granularity, so this change to Query and BaseQuery cuts down on some duplicate code in addition providing a mechanism for TimewarpOperator (and anything else) that needs to be aware of query granularity or timezone.
  • The changes to BaseQuery result in serialization differences for DataSourceMetadataQuery, SegmentMetadataQuery, ScanQuery, and TimeBoundaryQuery, which previously did not have a granularity, but are now set to Granularities.ALL.

changes:
* `TimewarpOperator` will now compensate for daylight savings time shifts between date translation ranges for queries using a `PeriodGranularity` with a timezone defined
* introduces a new abstract query type `TimeBucketedQuery` for all queries which have a `Granularity` (100% not attached to this name). `GroupByQuery`, `SearchQuery`, `SelectQuery`, `TimeseriesQuery`, and `TopNQuery` all extend `TimeBucke
tedQuery`, cutting down on some duplicate code and providing a mechanism for `TimewarpOperator` (and anything else) that needs to be aware of granularity
@leventov
Copy link
Copy Markdown
Member

Labelling Design Review because this PR introduces a new @ExtensionPoint.

for (DimensionSpec spec : this.dimensions) {
Preconditions.checkArgument(spec != null, "dimensions has null DimensionSpec");
}
Preconditions.checkNotNull(granularity, "Must specify a granularity");
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check should be done in TimeBucketedQuery's constructor

)
{
super(dataSource, querySegmentSpec, descending, context);
super(dataSource, querySegmentSpec, descending, context, granularity == null ? Granularities.ALL : granularity);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest to extract method Granularities.nullToAll()

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

)
{
super(dataSource, querySegmentSpec, descending, context);
super(dataSource, querySegmentSpec, descending, context, granularity);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do timeseries queries behave with null granularity?

Copy link
Copy Markdown
Member Author

@clintropolis clintropolis Dec 17, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just tested and it throws a null pointer exception, moving the precondition check from group by into TimeBucketedQuery constructor should fix this.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It appears TimeseriesQueryBuilder defaults to Granularities.ALL which is why all of the tests which do not specify a granularity do not explode.

* @return the offset between the mapped time and time t
*/
protected long computeOffset(final long t)
protected long computeOffset(final long t, final DateTimeZone tz)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to refactor this method? Maybe it's code could be simplified?

@leventov
Copy link
Copy Markdown
Member

@clintropolis could you please also give this PR a title that looks more sensible in git commit history.

@clintropolis clintropolis changed the title timewarp and timezones fix timewarp query results when using timezones and crossing DST transitions Dec 18, 2017
@leventov leventov added this to the 0.12.0 milestone Dec 19, 2017
@drcrallen
Copy link
Copy Markdown
Contributor

All queries are time bucketed.

descending is already a property in BaseQuery

Would it make sense for BaseQuery to be modified to accept granularity as a constructor, instead of making another class layer that every query will want to use anyways?

@clintropolis
Copy link
Copy Markdown
Member Author

@drcrallen The reason I added TimeBucketedQuery is because DataSourceMetadataQuery, SegmentMetadataQuery, ScanQuery, and TimeBoundaryQuery all extend BaseQuery but did not have a granularity. If you think it is preferable to make BaseQuery always have a granularity, I can merge TimeBucketedQuery into it and change these other query types to have an implicit 'All' or 'None', I just wasn't sure if conceptually that made sense (or would have any other unintended side effects).

Thanks for the review!

@gianm
Copy link
Copy Markdown
Contributor

gianm commented Jan 4, 2018

All queries are time bucketed.

They don't all take a granularity parameter, but if you count "hardcoded 'all' granularity" as "time bucketed" then, okay, sure :)

If you think it is preferable to make BaseQuery always have a granularity, I can merge TimeBucketedQuery into it and change these other query types to have an implicit 'All' or 'None', I just wasn't sure if conceptually that made sense (or would have any other unintended side effects).

If this is the way you go, it should be 'all', since 'none' really means 'millis'.

Clint Wylie added 2 commits January 5, 2018 13:52
* add 'getGranularity' and 'getTimezone' to 'Query' interface
* merge 'TimeBucketedQuery' into 'BaseQuery'
* fixup tests from resulting serialization changes
@clintropolis clintropolis reopened this Jan 6, 2018
@clintropolis clintropolis reopened this Jan 8, 2018
@leventov
Copy link
Copy Markdown
Member

leventov commented Jan 8, 2018

@clintropolis there is a true error at least with intellij inspections, closing/reopening a pr won't help

@clintropolis
Copy link
Copy Markdown
Member Author

clintropolis commented Jan 8, 2018

Oops indeed. But the tests should pass, I think.

@clintropolis
Copy link
Copy Markdown
Member Author

The inspection indicates that of the getGranularity and getTimezone methods I added to BaseQuery, only getTimezone should be on the Query interface, which I'm not sure makes a lot of sense without the granularity since it supplies the timezone. Should I rework to either remove both from Query interface, or does it make sense for them to be there and ignore the inspection somehow?

@jon-wei
Copy link
Copy Markdown
Contributor

jon-wei commented Jan 10, 2018

I personally think the inspection complaining about getGranularity is excessive, but I guess you could move the two methods to BaseQuery from Query, there aren't any Query implementations that aren't also BaseQuery, I think that would fix the inspection?

@jon-wei
Copy link
Copy Markdown
Contributor

jon-wei commented Jan 11, 2018

@clintropolis You can also add @SuppressWarnings("unused") to getGranularity() to suppress that inspection error, I would favor that over moving the methods to BaseQuery

@jon-wei jon-wei merged commit 491f8cc into apache:master Jan 11, 2018
jon-wei pushed a commit to jon-wei/druid that referenced this pull request Jan 11, 2018
…sitions (apache#5157)

* timewarp and timezones
changes:
* `TimewarpOperator` will now compensate for daylight savings time shifts between date translation ranges for queries using a `PeriodGranularity` with a timezone defined
* introduces a new abstract query type `TimeBucketedQuery` for all queries which have a `Granularity` (100% not attached to this name). `GroupByQuery`, `SearchQuery`, `SelectQuery`, `TimeseriesQuery`, and `TopNQuery` all extend `TimeBucke
tedQuery`, cutting down on some duplicate code and providing a mechanism for `TimewarpOperator` (and anything else) that needs to be aware of granularity

* move precondition check to TimeBucketedQuery, add Granularities.nullToAll, add getTimezone to TimeBucketQuery

* formatting

* more formatting

* unused import

* changes:
* add 'getGranularity' and 'getTimezone' to 'Query' interface
* merge 'TimeBucketedQuery' into 'BaseQuery'
* fixup tests from resulting serialization changes

* dedupe

* fix after merge

* suppress warning
leventov pushed a commit that referenced this pull request Jan 12, 2018
…ng DST tran… (#5252)

* fix timewarp query results when using timezones and crossing DST transitions (#5157)

* timewarp and timezones
changes:
* `TimewarpOperator` will now compensate for daylight savings time shifts between date translation ranges for queries using a `PeriodGranularity` with a timezone defined
* introduces a new abstract query type `TimeBucketedQuery` for all queries which have a `Granularity` (100% not attached to this name). `GroupByQuery`, `SearchQuery`, `SelectQuery`, `TimeseriesQuery`, and `TopNQuery` all extend `TimeBucke
tedQuery`, cutting down on some duplicate code and providing a mechanism for `TimewarpOperator` (and anything else) that needs to be aware of granularity

* move precondition check to TimeBucketedQuery, add Granularities.nullToAll, add getTimezone to TimeBucketQuery

* formatting

* more formatting

* unused import

* changes:
* add 'getGranularity' and 'getTimezone' to 'Query' interface
* merge 'TimeBucketedQuery' into 'BaseQuery'
* fixup tests from resulting serialization changes

* dedupe

* fix after merge

* suppress warning

* Fix compile issue
NirajaM pushed a commit to NirajaM/druid that referenced this pull request Jan 16, 2018
…sitions (apache#5157)

* timewarp and timezones
changes:
* `TimewarpOperator` will now compensate for daylight savings time shifts between date translation ranges for queries using a `PeriodGranularity` with a timezone defined
* introduces a new abstract query type `TimeBucketedQuery` for all queries which have a `Granularity` (100% not attached to this name). `GroupByQuery`, `SearchQuery`, `SelectQuery`, `TimeseriesQuery`, and `TopNQuery` all extend `TimeBucke
tedQuery`, cutting down on some duplicate code and providing a mechanism for `TimewarpOperator` (and anything else) that needs to be aware of granularity

* move precondition check to TimeBucketedQuery, add Granularities.nullToAll, add getTimezone to TimeBucketQuery

* formatting

* more formatting

* unused import

* changes:
* add 'getGranularity' and 'getTimezone' to 'Query' interface
* merge 'TimeBucketedQuery' into 'BaseQuery'
* fixup tests from resulting serialization changes

* dedupe

* fix after merge

* suppress warning
@FrankChen021 FrankChen021 mentioned this pull request Nov 30, 2020
8 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants