Zero filling does not work in timeseries queries. #2106

@KenjiTakahashi

This is with Druid 0.8.2.

An example query and result:

(virtualenv)2015-12-16 23:38:27
monitowl-dev:root:/vagrant> cat query2.json
{
    "filter": {
        "fields": [
            {
                "type": "selector",
                "dimension": "<cut>",
                "value": "<cut>"
            },
            {
                "type": "selector",
                "dimension": "<cut>",
                "value": "<cut>"
            }
        ],
        "type": "and"
    },
    "intervals": "2015-11-04T03:41:58.042000/2015-11-11T22:20:14.927000",
    "dataSource": "logdb_data",
    "granularity": {
        "origin": "1970-01-01T00:00:00",
        "type": "period",
        "period": "PT50000S"
    },
    "postAggregations": [
        {
            "fields": [
                {
                    "fieldName": "sum",
                    "type": "fieldAccess",
                    "name": "sum"
                },
                {
                    "fieldName": "count",
                    "type": "fieldAccess",
                    "name": "count"
                }
            ],
            "type": "arithmetic",
            "name": "result",
            "fn": "/"
        }
    ],
    "queryType": "timeseries",
    "aggregations": [
        {
            "type": "count",
            "name": "count"
        },
        {
            "fieldName": "data_num",
            "type": "doubleSum",
            "name": "sum"
        }
    ]
}
2015-12-16 23:38:30 cat query2.json
(virtualenv)2015-12-16 23:38:30
monitowl-dev:root:/vagrant> curl -X POST 'http://localhost:10000/druid/v2/?pretty' -H 'content-type: application/json' -d@query2.json
[ {
  "timestamp" : "2015-11-04T01:20:00.000Z",
  "result" : {
    "result" : 0.0,
    "count" : 0,
    "sum" : 0.0
  }
}, {
  "timestamp" : "2015-11-04T15:13:20.000Z",
  "result" : {
    "result" : 0.0,
    "count" : 0,
    "sum" : 0.0
  }
}, {
  "timestamp" : "2015-11-10T10:06:40.000Z",
  "result" : {
    "result" : 0.0,
    "count" : 0,
    "sum" : 0.0
  }
}, {
  "timestamp" : "2015-11-11T00:00:00.000Z",
  "result" : {
    "result" : 0.0,
    "count" : 0,
    "sum" : 0.0
  }
}, {
  "timestamp" : "2015-11-11T13:53:20.000Z",
  "result" : {
    "result" : 0.0,
    "count" : 0,
    "sum" : 0.0
  }
} ]
2015-12-16 23:38:32 curl -X POST 'http://localhost:10000/druid/v2/?pretty' -H 'content-type: application/json' -d@query2.json

This uses a period of PT50000S, which is just under 14 hours.
Note that between the 2nd and 3rd results there is a gap of almost 6 days (500000 seconds, i.e. 9 buckets) with no data, and it is not filled with zeroes.
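
To make the gap concrete, here is a minimal Python sketch that recomputes which bucket timestamps one would expect and which are absent from the response. It assumes buckets are aligned to the query's `origin` in steps of `PT50000S` and that the first bucket is the one containing the interval start (which matches the first timestamp returned). The helper names `parse` and `expected_buckets` are made up for illustration and are not part of Druid.

```python
from datetime import datetime, timedelta, timezone

PERIOD = timedelta(seconds=50000)  # "PT50000S" from the query
ORIGIN = datetime(1970, 1, 1, tzinfo=timezone.utc)  # "origin" from the query


def parse(ts):
    # Handles the "...Z" timestamps of the response and the zone-less ones
    # of the query; zone-less values are assumed to be UTC.
    return datetime.fromisoformat(ts.replace("Z", "+00:00")).replace(tzinfo=timezone.utc)


def expected_buckets(start, end):
    # Floor the interval start to a bucket boundary counted from ORIGIN,
    # then step by PERIOD until the interval end is reached.
    t = ORIGIN + ((start - ORIGIN) // PERIOD) * PERIOD
    buckets = []
    while t < end:
        buckets.append(t)
        t += PERIOD
    return buckets


interval = "2015-11-04T03:41:58.042000/2015-11-11T22:20:14.927000"
start, end = (parse(part) for part in interval.split("/"))

# Timestamps copied from the response above.
returned = {parse(ts) for ts in [
    "2015-11-04T01:20:00.000Z",
    "2015-11-04T15:13:20.000Z",
    "2015-11-10T10:06:40.000Z",
    "2015-11-11T00:00:00.000Z",
    "2015-11-11T13:53:20.000Z",
]}

expected = expected_buckets(start, end)
missing = [b for b in expected if b not in returned]

print(len(expected), "buckets expected,", len(missing), "missing")
for b in missing:
    print(b.isoformat())
```

Under these assumptions, 14 buckets are expected for the queried interval and 9 of them are missing, all of them between the 2nd and 3rd results.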

Explicitly adding

"context": {
    "skipEmptyBuckets": "false"
}

does not change anything.

And, strangely enough, setting "skipEmptyBuckets": "true" returns... nothing (i.e. an empty array). Maybe that's because it discards "our" zeroes as well? I'll need to find (or produce) a gap within data that does not evaluate to 0 and test it further.
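
The hypothesis above can be illustrated client-side. This is only a sketch of the guess, not of how Druid implements `skipEmptyBuckets`: it assumes that a bucket counts as "empty" when its `count` aggregation is 0, and `drop_empty` is a hypothetical helper.

```python
def drop_empty(rows, count_field="count"):
    # Keep only buckets whose count aggregation is non-zero.
    # Assumption: "empty" means count == 0; Druid's own rule may differ.
    return [row for row in rows if row["result"].get(count_field, 0) != 0]


# The five rows from the response above; all of them have count == 0.
rows = [
    {"timestamp": ts, "result": {"result": 0.0, "count": 0, "sum": 0.0}}
    for ts in [
        "2015-11-04T01:20:00.000Z",
        "2015-11-04T15:13:20.000Z",
        "2015-11-10T10:06:40.000Z",
        "2015-11-11T00:00:00.000Z",
        "2015-11-11T13:53:20.000Z",
    ]
]

print(drop_empty(rows))
```

Every returned bucket has a count of 0, so a filter of this kind leaves an empty array, which would be consistent with the guess. It does not explain why buckets with a zero count are returned at all while the gap is left unfilled.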

BTW: this was also reported on the mailing list recently (https://groups.google.com/forum/#!topic/druid-user/3SfgJ7t001s), but that thread reached no conclusion.
