
Parallel chunk fetching from DynamoDB#603

Merged
bboreham merged 1 commit into master from parallel-dynamodb-chunk-fetch on Nov 22, 2017

Conversation

@bboreham (Contributor)

Run multiple goroutines in parallel to drive DynamoDB harder: group chunk fetches into "gangs" to allow some configuration over how hard we hit DynamoDB.

This is a very simple scheme; it could be done much more intelligently, but I think it's worth deploying something simple.

Example query shown below, run against dev: although the overall time is only a little shorter, this run includes one DynamoDB batch fetch that took 3 seconds, similar to what is described in #602. In the presence of throttling it becomes more efficient to run several fetches in parallel, since they may hit different tables or shards.

Query is `sum(rate(container_cpu_usage_seconds_total{image!="",namespace!=""}[5m])) BY (namespace)` with `start=1510770800 end=1510770830`.
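The gang scheme described above can be sketched as follows. This is a minimal illustration, not the PR's actual code: `Chunk`, `fetchGang`, `gangSize`, and `fetchChunksInGangs` are stand-ins for the real Cortex types and configuration.

```go
package main

import (
	"fmt"
	"sync"
)

// Chunk is a placeholder for the real Cortex chunk type.
type Chunk struct{ ID int }

// gangSize bounds how many chunks each goroutine requests at once,
// giving some configuration over how hard we hit DynamoDB.
const gangSize = 3

// fetchGang stands in for one DynamoDB BatchGetItem round trip.
func fetchGang(gang []Chunk) ([]Chunk, error) {
	return gang, nil
}

// fetchChunksInGangs splits the request into fixed-size gangs and
// fetches each gang in its own goroutine, so a query over N chunks
// runs with ceil(N/gangSize) parallel DynamoDB fetches.
func fetchChunksInGangs(chunks []Chunk) ([]Chunk, error) {
	numGangs := (len(chunks) + gangSize - 1) / gangSize
	out := make([][]Chunk, numGangs) // each goroutine writes its own slot
	errs := make([]error, numGangs)
	var wg sync.WaitGroup
	for g := 0; g < numGangs; g++ {
		wg.Add(1)
		go func(g int) {
			defer wg.Done()
			start := g * gangSize
			end := start + gangSize
			if end > len(chunks) {
				end = len(chunks)
			}
			out[g], errs[g] = fetchGang(chunks[start:end])
		}(g)
	}
	wg.Wait()
	var all []Chunk
	for g := range out {
		if errs[g] != nil {
			return nil, errs[g]
		}
		all = append(all, out[g]...)
	}
	return all, nil
}

func main() {
	got, err := fetchChunksInGangs(make([]Chunk, 7))
	fmt.Println(len(got), err) // 7 <nil>
}
```

With 7 chunks and a gang size of 3, this runs three concurrent fetches of sizes 3, 3, and 1.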

Trace before:

[trace screenshot: before]

and after:

[trace screenshot: after]

@bboreham force-pushed the parallel-dynamodb-chunk-fetch branch 2 times, most recently from 91ab6d2 to c1bd9b3, on November 20, 2017 at 17:44
@bboreham bboreham requested a review from jml November 21, 2017 10:37
@jml (Contributor) left a comment:

A couple of questions, but nothing that needs to change.

If you want to bounce back for another round of review, feel free, but no need to do so.

Contributor:

I'm more familiar with choosing the level of parallelism (e.g. how many concurrent goroutines) than with choosing the size of each concurrent job, as you're doing here. I don't have opinions on which is better. Why did you decide to do it this way?

Contributor (Author):

Choosing the level of parallelism requires a global queue of work across all queries. I'd like to do that, but I don't feel I can complete it in the current sprint.

Computing the "gang size" to target a certain parallelism per query is harder to tune (we want to keep the batches sent to DynamoDB fairly large), and the end result, given many queries running in parallel, will still have highly variable overall parallelism.
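The trade-off above can be made concrete with a little arithmetic. This is our illustration, not the PR's code; `gangSizeForParallelism` is a hypothetical helper. Deriving the gang size from a target per-query parallelism gives small DynamoDB batches for small queries (well below BatchGetItem's 100-item ceiling), whereas a fixed gang size keeps batches large and lets the parallelism vary instead.

```go
package main

import "fmt"

// gangSizeForParallelism (hypothetical) derives a gang size that would
// split numChunks across targetParallelism concurrent fetches.
func gangSizeForParallelism(numChunks, targetParallelism int) int {
	return (numChunks + targetParallelism - 1) / targetParallelism
}

func main() {
	// Targeting 4 concurrent fetches for an 8-chunk query yields
	// gangs of only 2 chunks each -- tiny DynamoDB batches.
	fmt.Println(gangSizeForParallelism(8, 4)) // 2

	// A fixed gang size of, say, 32 instead keeps batches large:
	// the same 8-chunk query uses ceil(8/32) = 1 goroutine, while a
	// 320-chunk query would use 10.
	fmt.Println((8 + 32 - 1) / 32) // 1
}
```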

Contributor:

Obligatory type theory nerdery: this is more accurately chunksTimesError. I'm not actually suggesting you change this.

Contributor:

We have a few other implementations of "parallel map a function that returns (a, err)". Some of the others use a buffered channel. I don't think it's necessary here, because the loop immediately following will drain all of the results as quickly as possible.

No action required, just flagging in case my reading is wrong and you notice something on a second glance.

Contributor (Author):

Yes, a buffer would be extra memory management overhead for no obvious benefit.
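The pattern under discussion, sketched with illustrative names (this is not the PR's code): senders write results to an unbuffered channel, and because the receiving loop starts draining immediately, a buffer would only add allocation overhead without reducing blocking in any meaningful way.

```go
package main

import (
	"fmt"
	"sync"
)

// parallelMap applies f to each input concurrently and collects
// (value, error) results over an unbuffered channel.
func parallelMap(inputs []int, f func(int) (int, error)) ([]int, error) {
	type result struct {
		val int
		err error
	}
	results := make(chan result) // unbuffered: the loop below drains it
	var wg sync.WaitGroup
	for _, in := range inputs {
		wg.Add(1)
		go func(in int) {
			defer wg.Done()
			v, err := f(in)
			results <- result{v, err}
		}(in)
	}
	// Close the channel once all senders are done, ending the range loop.
	go func() { wg.Wait(); close(results) }()
	var out []int
	var firstErr error
	for r := range results { // drains as fast as senders produce
		if r.err != nil && firstErr == nil {
			firstErr = r.err
		}
		out = append(out, r.val)
	}
	return out, firstErr
}

func main() {
	out, err := parallelMap([]int{1, 2, 3}, func(x int) (int, error) {
		return x * x, nil
	})
	fmt.Println(len(out), err) // 3 <nil>
}
```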

Run multiple goroutines in parallel to drive DynamoDB harder.
Group chunk fetches into "gangs" to allow some configuration over how
hard we hit DynamoDB
@bboreham force-pushed the parallel-dynamodb-chunk-fetch branch from c1bd9b3 to bec45d9 on November 22, 2017 at 17:50
@bboreham merged commit 1636bd8 into master on Nov 22, 2017
This was referenced Dec 22, 2017
@tomwilkie mentioned this pull request Dec 26, 2017
@bboreham deleted the parallel-dynamodb-chunk-fetch branch on January 4, 2018 at 20:25