Conversation
Force-pushed 91ab6d2 to c1bd9b3
jml left a comment:
A couple of questions, but nothing that needs to change.
If you want to bounce back for another round of review, feel free, but no need to do so.
pkg/chunk/aws_storage_client.go (outdated)
I'm more familiar with choosing the level of parallelism (e.g. how many concurrent goroutines) than choosing the size of each concurrent job, as you're doing here. I don't have opinions on which is better. Why did you decide to do it this way?
Choosing the level of parallelism requires a global queue of work across all queries. I'd like to do that, but don't feel I can complete it in the current sprint.
Computing the "gang size" to target a certain parallelism per query is harder to tune (since we want to keep the batches sent to DynamoDB fairly large), and the end result, given many queries running in parallel, would still have highly variable overall parallelism.
pkg/chunk/aws_storage_client.go (outdated)
Obligatory type theory nerdery: this is more accurately chunksTimesError. I'm not actually suggesting you change this.
pkg/chunk/aws_storage_client.go (outdated)
We have a few other implementations of "parallel map a function that returns (a, err)". Some of the others use a buffered channel. I don't think it's necessary here, because the loop immediately following will drain `results` as quickly as possible.
No action required, just flagging in case my reading is wrong and you notice something on a second glance.
Yes, a buffer would be extra memory management overhead for no obvious benefit.
Run multiple goroutines in parallel to drive DynamoDB harder. Group chunk fetches into "gangs" to allow some configuration over how hard we hit DynamoDB
Force-pushed c1bd9b3 to bec45d9
Run multiple goroutines in parallel to drive DynamoDB harder: group chunk fetches into "gangs" to allow some configuration over how hard we hit DynamoDB
This is a very simple scheme; it could be done much more intelligently, but I think it's worth deploying something simple.
Example query shown below, run against dev. Although the overall time is only a little shorter, this run includes one DynamoDB batch fetch that took 3 seconds, similar to what is described in #602. In the presence of throttling, it becomes more efficient to run several queries in parallel, which may hit different tables or shards.
Query is:

```
sum(rate(container_cpu_usage_seconds_total{image!="",namespace!=""}[5m])) BY (namespace)
```

with `start=1510770800`, `end=1510770830`.

Trace before: [trace screenshot omitted]

and after: [trace screenshot omitted]