Skip to content

Split incoming requests by day, run them in parallel, and cache query results.#1029

Merged
tomwilkie merged 7 commits intocortexproject:masterfrom
grafana:query-results-cache
Sep 29, 2018
Merged

Split incoming requests by day, run them in parallel, and cache query results.#1029
tomwilkie merged 7 commits intocortexproject:masterfrom
grafana:query-results-cache

Conversation

@tomwilkie
Copy link
Contributor

@tomwilkie tomwilkie commented Sep 23, 2018

Includes #995, fixes #977, fixes #963, fixes #266

Continuation of the work proposed in https://docs.google.com/document/d/1lsvSkv0tiAMPQv-V8vI2LZ8f4i9JuTRsuPI_i-XcAqY/edit?usp=drive_web&ouid=103586900408483314805.

  • Generic code to parse incoming query_range requests, mutate them and round trip them.
  • Align incoming queries with their step to make the results more cachable.
  • Split queries along day boundaries, modulo step.
  • Run queries in parallel and combine their results.
  • Cache query results by (userid, promql query, step, day).
  • Work out intersection of query and cached result and only query for difference.

@tomwilkie tomwilkie force-pushed the query-results-cache branch 3 times, most recently from 8b29dca to 266cb9c Compare September 24, 2018 10:20
@tomwilkie tomwilkie changed the title [WIP] Query results caching in the frontend Query results caching in the frontend Sep 24, 2018
@tomwilkie
Copy link
Contributor Author

This is rebased on the new caching interfaces and should be ready for review now.

Copy link
Contributor

@csmarchbanks csmarchbanks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you mind creating some docs for the new flags/behavior?

Gopkg.toml Outdated
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like the referenced pr has been merged, so this can all be removed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo Aling -> Align

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be a separate flag from caching? Seems like caching results wouldn't work very well without this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main criticism I've heard against trickster is that it mutates the request. Also, we've put the step alignment into Grafana now. So I wanted to make this optional as I think it shouldn't be necessary in some cases. Its a niche case, for sure.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These responses don't look empty

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Parititon -> Partition

@tomwilkie tomwilkie changed the title Query results caching in the frontend Split incoming requests by day and run them in parallel; also cache query results. Sep 26, 2018
@tomwilkie tomwilkie force-pushed the query-results-cache branch 5 times, most recently from bb805ef to 7315796 Compare September 27, 2018 16:24
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
Split incoming requests by day and run them in parallel.

- Generic code to parse incoming query_range requests, mutate them and round trip them.
- Split queries along day boundaries, modulo step.
- Run queries in parallel and combine their results.
- Ensure we propagate org ids correctly; add e2e tests.
- Take care to ensure we propagate trace IDs correctly; involved updating weaveworks/common.

Query results caching.

- Align incoming queries with their step to make the results more cachable.
- Work out intersection of query and cached result and only query for difference.
- Cache query results using the key `hash(userid, promql query, step, day)`.
- Cache multiple, non-overlapping time ranges per day.  Don't cache the last 1 minute.
- Hash the key for the query results cache.
- Don't cache the last minute, as results can still be in flux.
- Make sure the trimmed results should also align with step.

Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
dep couldn't find a solution so I had to encourage it to use the later grpc and proto versions - I think this is because the constraint added into common on gogo/status.

Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
@tomwilkie
Copy link
Contributor Author

Would you mind creating some docs for the new flags/behavior?

Done!

@tomwilkie tomwilkie changed the title Split incoming requests by day and run them in parallel; also cache query results. Split incoming requests by day, run them in parallel, and cache query results. Sep 27, 2018
…m the upstream repo.

Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
Copy link
Contributor

@csmarchbanks csmarchbanks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the Check endpoint on an ingester still going to work? Seems like client.RegisterIngesterServer will no longer register the Check method, so another register will need to be added to cmd/ingester/main.go?

Gopkg.toml Outdated
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these new constraints necessary? Or were they just needed for when you were pointing to a branch?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I needed them to get dep to successfully find a solution, but now it has, I don't seem to need them. Removed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like weaveworks/common is still new and unneeded

@tomwilkie
Copy link
Contributor Author

Is the Check endpoint on an ingester still going to work? Seems like client.RegisterIngesterServer will no longer register the Check method, so another register will need to be added to cmd/ingester/main.go?

Good catch! I had fixed but forgot to push...

Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
@tomwilkie
Copy link
Contributor Author

BTW we've deployed this in production and it makes a huge difference to query performance.

@csmarchbanks
Copy link
Contributor

One last thing to remove in Gopkg.toml, then this LGTM

Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
@tomwilkie
Copy link
Contributor Author

Thanks!

@tomwilkie tomwilkie merged commit ee74728 into cortexproject:master Sep 29, 2018
@tomwilkie tomwilkie deleted the query-results-cache branch September 29, 2018 10:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

2 participants