Skip to content

Add sequential sketch merging to MSQ#13205

Merged
cryptoe merged 34 commits intoapache:masterfrom
adarshsanjeev:sequential-sketch-merging
Nov 22, 2022
Merged

Add sequential sketch merging to MSQ#13205
cryptoe merged 34 commits intoapache:masterfrom
adarshsanjeev:sequential-sketch-merging

Conversation

@adarshsanjeev
Copy link
Copy Markdown
Contributor

@adarshsanjeev adarshsanjeev commented Oct 11, 2022

Current Implementation

In the current MSQ implementation, each worker maintains a ClusterByStatisticsCollector. If we are clustering by time, one sketch is maintained for each time chunk. During the merge, we sent the entire ClusterByStatisticsCollector from the worker to the controller. These are merged together, downsampling if memory taken is too much and partitions are generated from this.

Potential Improvements

Taking all worker sketches into controller memory is an unneeded step, which leads to downsampling of sketches and generating partitions outside the targeted weight. Since we have a primary partitioning on time, we only need to maintain the sketch from all workers for a particular time partition, generate the partition and remove the sketches. This is the sequential sketch merging approach.

Since this increases time taken however, this has been moved to a separate WorkerSketchFetcher. This will use an executor service to avoid blocking the main controller thread. This will not remove the current functionality, it will only uses sequential merging under cases that it is likely to provide greater benefits. For small sketches, it continues to fetch the entire sketch for the speed.

Key changes

  • Add a new phase to controller: MERGING_STATISTICS and necessary transitions
  • Change worker to now send a WorkerReport to the controller, which contains information necessary to fetch sketches later.
  • Change the sketch fetching a pull from the controller instead of a push from the worker.
  • Add WorkerSketchFetcher to fetch sketches in the background.

Context changes

  • Add a new parameter clusterStatisticsMergeMode. It controls whether to parallel or sequential merging of worker sketches. Can be PARALLEL, SEQUENTIAL or AUTO. On AUTO tries to find the best approach based on number of workers and size of input rows.

Release note

  • Removes an API on ControllerChatHandler /keyStatistics/{queryId}/{stageNumber}/{workerNumber}
  • Adds an API on ControllerChatHandler /partialKeyStatisticsInformation/{queryId}/{stageNumber}/{workerNumber} that accepts an object that is deserializable to PartialKeyStatisticsInformation.

Key changed/added classes in this PR
  • ClusterByStatisticsCollector
  • WorkerClient
  • ControllerClient
  • StageDefinition
  • WorkerSketchFetcher
  • ControllerStageTracker
  • ClusterByStatisticsWorkerReport

This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

@adarshsanjeev adarshsanjeev marked this pull request as ready for review October 14, 2022 03:17
@kfaraz kfaraz added the Area - MSQ For multi stage queries - https://github.com/apache/druid/issues/12262 label Oct 14, 2022
Copy link
Copy Markdown
Contributor

@cryptoe cryptoe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A partial review. Will try to get this finished tomorrow.

Copy link
Copy Markdown
Contributor

@LakshSingla LakshSingla left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution! Left a few comments on the same.

Comment thread docs/multi-stage-query/reference.md Outdated
*/
public class WorkerSketchFetcher
{
private static final int DEFAULT_THREAD_COUNT = 10;
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Going through this again, with regards to memory usage, we should limit the threads in relation to the number of sketches the controller can maintain at a time in memory.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch. Yeah, we need to limit the number of parallel threads here. Lets start with 4 ?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to 4

Copy link
Copy Markdown
Contributor

@cryptoe cryptoe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some comments.

*/
public class WorkerSketchFetcher
{
private static final int DEFAULT_THREAD_COUNT = 10;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch. Yeah, we need to limit the number of parallel threads here. Lets start with 4 ?

Copy link
Copy Markdown
Contributor

@cryptoe cryptoe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left my final review. Thanks for waiting on this @adarshsanjeev .

Copy link
Copy Markdown
Contributor

@cryptoe cryptoe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall lgtm. Some comments here and there.

@cryptoe cryptoe merged commit 280a0f7 into apache:master Nov 22, 2022
@cryptoe
Copy link
Copy Markdown
Contributor

cryptoe commented Nov 22, 2022

Thanks for the contribution @adarshsanjeev.

@cryptoe cryptoe added this to the 25.0 milestone Nov 22, 2022
adarshsanjeev added a commit to adarshsanjeev/druid that referenced this pull request Nov 22, 2022
* Add sketch fetching framework

* Refactor code to support sequential merge

* Update worker sketch fetcher

* Refactor sketch fetcher

* Refactor sketch fetcher

* Add context parameter and threshold to trigger sequential merge

* Fix test

* Add integration test for non sequential merge

* Address review comments

* Address review comments

* Address review comments

* Resolve maxRetainedBytes

* Add new classes

* Renamed key statistics information class

* Rename fetchStatisticsSnapshotForTimeChunk function

* Address review comments

* Address review comments

* Update documentation and add comments

* Resolve build issues

* Resolve build issues

* Change worker APIs to async

* Address review comments

* Resolve build issues

* Add null time check

* Update integration tests

* Address review comments

* Add log messages and comments

* Resolve build issues

* Add unit tests

* Add unit tests

* Fix timing issue in tests
@cryptoe cryptoe added Backport and removed Backport labels Nov 22, 2022
cryptoe pushed a commit that referenced this pull request Nov 23, 2022
* Add sketch fetching framework

* Refactor code to support sequential merge

* Update worker sketch fetcher

* Refactor sketch fetcher

* Refactor sketch fetcher

* Add context parameter and threshold to trigger sequential merge

* Fix test

* Add integration test for non sequential merge

* Address review comments

* Address review comments

* Address review comments

* Resolve maxRetainedBytes

* Add new classes

* Renamed key statistics information class

* Rename fetchStatisticsSnapshotForTimeChunk function

* Address review comments

* Address review comments

* Update documentation and add comments

* Resolve build issues

* Resolve build issues

* Change worker APIs to async

* Address review comments

* Resolve build issues

* Add null time check

* Update integration tests

* Address review comments

* Add log messages and comments

* Resolve build issues

* Add unit tests

* Add unit tests

* Fix timing issue in tests
gianm added a commit to gianm/druid that referenced this pull request Mar 13, 2024
Since apache#13205, a special deserializer module has no longer been necessary
to read key collector snapshots. This patch removes the unnecessary code.
vogievetsky pushed a commit that referenced this pull request Mar 18, 2024
Since #13205, a special deserializer module has no longer been necessary
to read key collector snapshots. This patch removes the unnecessary code.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Area - MSQ For multi stage queries - https://github.com/apache/druid/issues/12262

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants