Indicating time trends in topN results #7161

@leventov

This issue elaborates on one of the aspects mentioned in the discussion "Error bounds / probabilities / skewness as first-class Druid query results": time trends for single-valued query types such as topN and groupBy. Results may differ between time intervals, for example:

interval=00:00_01:00
country    website_visits
China      2000
Argentina  1000
USA        900      

...

interval=05:00_06:00
country    website_visits
Argentina  1500
USA        1300
China      1000    

-- China goes to bed, Americas wake up.

When we aggregate just those six hours (00:00 to 06:00), there is a relative trend for each dimension value (country). We may detect this when aggregating results on the Broker and send an enum along with the value for each dimension value or grouping key:

  • consistent upward trend
  • consistent downward trend
  • no trend (the relative weight of the key stays relatively stable), or a cycling trend: the relative weight goes up and down and finishes approximately where it started.
  • big variance, but no particular trend.
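
The four categories above could be derived roughly as follows. This is only a minimal sketch, not a proposal for the actual Broker code: the names `Trend`, `relative_weights` and `classify_trend`, the thresholds `min_change` and `cycle_tolerance`, and the rule "every step goes in the same direction" as the meaning of "consistent" are all assumptions made for illustration. Relative weight is taken here as the key's share of the total among the listed keys in each interval, and only the two intervals shown in the example are used.

```python
from enum import Enum
from typing import List, Mapping, Sequence


class Trend(Enum):
    UPWARD = "consistent upward trend"
    DOWNWARD = "consistent downward trend"
    NO_TREND = "no trend or cycling trend"
    HIGH_VARIANCE = "big variance, but no particular trend"


def relative_weights(per_interval: Sequence[Mapping[str, float]], key: str) -> List[float]:
    """Share of `key` in each interval's total (0.0 if the key is absent)."""
    weights = []
    for counts in per_interval:
        total = sum(counts.values())
        weights.append(counts.get(key, 0.0) / total if total else 0.0)
    return weights


def classify_trend(weights: Sequence[float],
                   min_change: float = 0.05,       # assumed threshold
                   cycle_tolerance: float = 0.02,  # assumed threshold
                   ) -> Trend:
    """Classify a series of relative weights, ordered by time."""
    if len(weights) < 2:
        return Trend.NO_TREND
    steps = [b - a for a, b in zip(weights, weights[1:])]
    net = weights[-1] - weights[0]
    spread = max(weights) - min(weights)
    if spread < min_change:
        return Trend.NO_TREND        # weight stays relatively stable
    if all(s >= 0 for s in steps):
        return Trend.UPWARD          # never decreases, and moved enough
    if all(s <= 0 for s in steps):
        return Trend.DOWNWARD        # never increases, and moved enough
    if abs(net) <= cycle_tolerance:
        return Trend.NO_TREND        # cycling: ends about where it started
    return Trend.HIGH_VARIANCE


# The two intervals shown in the example above (the elided hours are skipped).
intervals = [
    {"China": 2000, "Argentina": 1000, "USA": 900},   # 00:00_01:00
    {"Argentina": 1500, "USA": 1300, "China": 1000},  # 05:00_06:00
]
for country in intervals[0]:
    print(country, classify_trend(relative_weights(intervals, country)).name)
# China DOWNWARD
# Argentina UPWARD
# USA UPWARD
```

With only two intervals every key with a large enough change is trivially "consistent"; a real implementation would need more buckets and probably a less strict definition (for example a regression slope) to be useful.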

Then in query UIs, consistent upward and downward trends may be represented as small green and red arrows (those that are used in stock market interfaces :)

Big variance with no particular trend may be indicated by something like an asterisk, a different tone of the row, or an exclamation mark in a circle, signalling that the total aggregate value probably has low statistical significance, or that there is some problem with the data (see #7160).

FYI @mistercrunch @julianhyde @vogievetsky
