Include chunk end in rangeKey #298

@tomwilkie

Include the chunk end time in the range key, and use it during queries to minimise the amount of data fetched from DynamoDB.

Observations:

  • The critical path for queries is DynamoDB:

[Screenshot, 2017-02-20 at 11:20:26]

  • Our most common query is count(count by (instance)(up)) over the last 5 mins, run as an instant query. This query is issued automatically by the UI, but in general I think metric-name queries will be common.

  • This resolves to a DynamoDB query that reads all chunks in the last 24hrs for the up timeseries. That is at minimum two chunks per target (12hr chunk max age), and we have >100 targets, so at least 200 chunks.

  • By the looks of the traces, this query normally only reads ~1 chunk, as most chunks are older than the time range, and the query is answered by ingesters.

  • Queries to DynamoDB get slower as the day goes on, as the daily-rows "fill up":

[Screenshot, 2017-02-20 at 11:25:57]
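To put rough numbers on the observations above, here is a small Python sketch of the lower bound on chunks covered by the daily query. It assumes one up series per target and uses 100 as a floor for the ">100 targets" figure; the function name is hypothetical.

```python
import math


def min_chunks_per_day(targets: int, chunk_max_age_hours: int) -> int:
    # Each series needs at least ceil(24 / max_age) chunks to cover one day.
    return targets * math.ceil(24 / chunk_max_age_hours)


# Assumed floor of 100 targets, with the 12hr chunk max age from the issue.
lower_bound = min_chunks_per_day(targets=100, chunk_max_age_hours=12)
print(lower_bound)
```

Against that lower bound, the traces suggest only around one chunk is actually needed for a 5 minute query, which is the gap this issue is trying to close.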

The idea is to include the chunk end time in the range value in DynamoDB, and then query only for chunks that end after the query start time. This should reduce the amount of work DynamoDB does and the number of results it has to return, hopefully making these queries quicker.
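A minimal sketch of the idea in Python, not the actual schema: it assumes the range value is the chunk end time encoded as fixed-width hex followed by the chunk ID, so that lexicographic order matches time order. The sorted in-memory list stands in for the entries under one DynamoDB hash key, and encode_range_value, chunks_ending_after and the key layout are hypothetical names chosen for illustration.

```python
import bisect
from typing import List

HOUR_MS = 3_600_000


def encode_range_value(chunk_end_ms: int, chunk_id: str) -> str:
    # Fixed-width, zero-padded hex so that string order equals numeric order.
    return f"{chunk_end_ms:016x}:{chunk_id}"


def chunks_ending_after(sorted_range_values: List[str], query_start_ms: int) -> List[str]:
    # Stand-in for a range-key condition "range value >= encoded query start":
    # every entry before the lower bound ends too early and is never returned.
    lower_bound = f"{query_start_ms:016x}"
    first = bisect.bisect_left(sorted_range_values, lower_bound)
    return sorted_range_values[first:]


# Hypothetical entries for one series in one daily row (end time, chunk ID).
row = sorted(
    encode_range_value(end_ms, chunk_id)
    for end_ms, chunk_id in [
        (2 * HOUR_MS, "chunk-a"),
        (6 * HOUR_MS, "chunk-b"),
        (11 * HOUR_MS, "chunk-c"),
        (12 * HOUR_MS, "chunk-d"),
    ]
)

# A 5 minute query starting at 10:55 skips the chunks that ended at 02:00 and 06:00.
matches = chunks_ending_after(row, 11 * HOUR_MS - 5 * 60_000)
print(matches)
```

In DynamoDB itself the same cut would be expressed as a greater-than-or-equal condition on the range key, so entries for chunks that ended before the query start are skipped rather than returned and discarded client-side.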
