Add basic tuning guide, getting started page, updated clustering docs #7629
fjy merged 19 commits into apache:master
Conversation
| To estimate total memory usage of the Historical under these guidelines:
|
| - Heap: `(0.5GB * number of CPU cores) + (2 * total size of lookup maps) + druid.cache.sizeInBytes`
| - Direct Memory: `(druid.processing.numThreads + druid.processing.numMergeBuffers + 1) * druid.processing.buffer.sizeBytes`

Free space for page cache seems important enough that it should be listed here, I think, as going into the total calculation of Historical usage.

Added a note on the importance of free system memory here.
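As a hedged worked example of these formulas (all values hypothetical, not taken from the docs under review), consider an 8-vCPU Historical with no lookups and a 1GB query cache:

```
# Hypothetical 8-vCPU Historical; adjust for your own hardware
druid.processing.numThreads=7              # cores - 1
druid.processing.numMergeBuffers=2
druid.processing.buffer.sizeBytes=500000000
druid.cache.sizeInBytes=1000000000

# Heap: (0.5GB * 8 cores) + (2 * 0 lookup bytes) + 1GB cache = ~5GB -> -Xms5g -Xmx5g
# Direct: (7 + 2 + 1) * 500MB = 5GB -> -XX:MaxDirectMemorySize=5g
```

Per the review note above, additional free system memory should be left over for the page cache on top of these totals.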
| The Broker heap requirements scale based on the number of segments in the cluster, and the total data size of the segments.
|
| The heap size will vary based on data size and usage patterns, but 4G to 8G is a good starting point for a small or medium cluster. For a rough estimate of memory requirements on the high end, very large clusters with a node count on the order of ~100 nodes may need Broker heaps of 30GB-60GB.
Should we clarify what we consider a "small or medium cluster"?

Added clarification here (~15 nodes or less).
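As a concrete (hypothetical) starting point for such a small cluster, the 4G-8G guidance might translate into a Broker `jvm.config` along these lines:

```
# Hypothetical small-cluster (~15 nodes or less) Broker heap; grow it with segment counts
-Xms6g
-Xmx6g
```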
| The biggest contributions to heap usage on Brokers are:
| - Partial unmerged query results from Historicals and Tasks
| - The segment timeline
It might be worth including that this also covers the locations of all the segments on all Historicals and realtime tasks, but I'm not sure how to do that concisely.

I added an explanation of the timeline and cached metadata.
| #### Number of Brokers
|
| A 1:15 ratio of Brokers to Historicals is a reasonable starting point (this is not a hard rule).
It might be obvious, but it may be worth calling out the exception that if you need HA for queries, you should apply that ratio only after the first 2 Brokers.
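To make the arithmetic concrete (counts hypothetical): a cluster with 45 Historicals would start with 3 Brokers under the 1:15 ratio, while per the HA note above, a cluster with only 10 Historicals would still run 2 Brokers, applying the ratio only beyond that floor.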
| The heap requirements of the Overlord scale with the number of servers, segments, and tasks in the cluster.
|
| You can set the Overlord heap to the same size as your Broker heap, or slightly smaller: both services have to process cluster-wide state and answer API requests about this state.
This doesn't sound right to me; an Overlord isn't as heavyweight as a Coordinator or Broker.

Adjusted the Overlord recommendation to 25%-50% of the Coordinator heap.
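A hedged illustration of the adjusted guidance (sizes hypothetical): with a 10GB Coordinator heap, 25%-50% puts the Overlord heap at roughly 2.5GB-5GB, e.g. in the Overlord `jvm.config`:

```
-Xms4g
-Xmx4g
```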
| Please see the [General Connection Pool Guidelines](#general-connection-pool-guidelines) section for an overview of connection pool configuration.
|
| On the Brokers, please ensure that the sum of `druid.broker.http.numConnections` across all the Brokers is slightly lower than the value of `druid.server.http.numThreads` on your Historicals.
Should this be just historicals or also include realtime tasks?
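A hedged worked example of the Historical side (values hypothetical): with 2 Brokers each configured with `druid.broker.http.numConnections=25`, the cluster-wide sum is 50, so each Historical would set its server thread count slightly above that:

```
# On each Broker (2 Brokers -> cluster-wide sum of 50 connections)
druid.broker.http.numConnections=25

# On each Historical: slightly higher than the cluster-wide connection sum
druid.server.http.numThreads=60
```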
| `druid.processing.buffer.sizeBytes` is a closely related property that controls the size of the off-heap buffers allocated to the processing threads.
|
| One buffer is allocated for each processing thread. A size between 500MB and 1GB (the default) is a reasonable choice for general use.
1G isn't the default; it is currently auto-calculated from the amount of direct memory, if that can be detected, according to the formula (for Java 8 at least).

Removed mention of the default.
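Whatever the effective default, the buffer size feeds directly into `-XX:MaxDirectMemorySize` via the formula shown earlier. A hedged illustration (values hypothetical): raising the buffer size from 500MB to 1GB on a node with `druid.processing.numThreads=7` and `druid.processing.numMergeBuffers=2` doubles the direct memory requirement from (7 + 2 + 1) * 500MB = 5GB to (7 + 2 + 1) * 1GB = 10GB.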
| For Historicals, `druid.server.http.numThreads` should be set to a value slightly higher than the sum of `druid.broker.http.numConnections` across all the Brokers in the cluster.
|
| Tuning the cluster so that each Historical can accept 50 queries and 10 non-queries is a reasonable starting point.
To me this statement is not clear. What is the definition of queries vs. non-queries, and how do you configure a number for each?

I updated the "General Connection Pool Guidelines" section with a definition of queries vs. non-queries, and added a note that exact tuning depends on your inflow vs. drain rate for requests, which in turn depends on the specific queries you're running.
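One hedged reading of the 50/10 guideline (my illustration, consistent with the hypothetical numbers above): keep the cluster-wide Broker connection sum near 50 so that roughly 10 of a Historical's HTTP server threads stay free for non-query requests such as status and segment-management calls:

```
# On each Historical (hypothetical): ~50 threads absorbed by Broker query
# connections, ~10 left over for non-query API requests
druid.server.http.numThreads=60
```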
| - A Query server, hosting the Druid Broker and Router processes
|
| In production, we recommend deploying multiple Master servers with Coordinator and Overlord processes in a fault-tolerant configuration as well.
| In production, we recommend deploying multiple Master servers with Coordinator and Overlord processes in a fault-tolerant configuration.
Probably multiple Query servers too, for fault tolerance on the user side of things?

Adjusted this to recommend multiple Master/Query servers in production while dropping the example to one Master server.
| -XX:MaxDirectMemorySize=12g
| -Xms12g
| -Xmx12g
| -XX:MaxDirectMemorySize=6g
I think you've sized this to run on an 8-core 32GB machine. Should you add some more merge buffers or larger buffer sizes to use up a bit more of the space, and give any of the rest to heap for breathing room?

I doubled the number of merge buffers.
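A hedged arithmetic check of that change (the thread and buffer values are my assumptions): with 7 processing threads and 500MB buffers, doubling merge buffers from 2 to 4 raises the direct memory requirement from (7 + 2 + 1) * 500MB = 5GB to (7 + 4 + 1) * 500MB = 6GB, which lines up with the `-XX:MaxDirectMemorySize=6g` setting in the diff above.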
| The example cluster above is chosen as a single example out of many possible ways to size a Druid cluster.
|
| You can choose smaller/larger hardware or fewer/more servers for your specific needs and constraints.
I think it might also be worth mentioning here that advanced use cases can choose not to co-locate services, to scale out the workload. It doesn't need to be very detailed; it's compatible with the following part that directs the reader to the basic tuning guide, which describes how all the processes use resources independently, so there seems to be no reason to leave it out since this section is about making choices.

Added a note on not colocating.
| - A query server, hosting the Druid Broker and Router processes
|
| In production, we recommend deploying multiple Master servers with Coordinator and Overlord processes in a fault-tolerant configuration as well.
| In production, we recommend deploying multiple Master servers and multiple Query servers in a fault-tolerant configuration based on your specific fault-tolerance needs, but you can get started quickly with one Master and one Query server and add more servers later.
What about migrating the metadata store from local Derby to something else? I think we should talk about HA and the metadata store.

Never mind, it is covered later.
👍
| Segments are memory-mapped by Historical processes using any available free system memory (i.e., memory not used by the Historical JVM and heap/direct memory buffers or other processes on the system). Segments that are not currently in memory will be paged from disk when queried.
|
| Therefore, `druid.server.maxSize` should be set such that a Historical is not allocated an excessive amount of segment data. As the value of (`free system memory` / `druid.server.maxSize`) increases, a greater proportion of segments can be kept in memory, allowing for better query performance.
Should we also mention that druid.server.maxSize should be the sum of druid.segmentCache.locations across all cache locations?

Added info about druid.segmentCache.locations and how it relates to druid.server.maxSize.
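A hedged sketch of how the two properties relate (paths and sizes hypothetical):

```
# druid.server.maxSize should equal the sum of maxSize across all cache locations
druid.segmentCache.locations=[{"path":"/mnt/disk1/druid/segment-cache","maxSize":150000000000},{"path":"/mnt/disk2/druid/segment-cache","maxSize":150000000000}]
druid.server.maxSize=300000000000
```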
| Tuning the cluster so that each Task can accept 50 queries and 10 non-queries is a reasonable starting point.
|
| #### SSD storage
How about moving this section above Task Configurations, since this is also a configuration for MiddleManagers?

Moved to the MM section.
| The TopN and GroupBy queries use these buffers to store intermediate computed results. As the buffer size increases, more data can be processed in a single pass.
|
| ## GroupBy Merging Buffers
Should it be groupBy v2 because it's the only one using merging buffers?
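For reference, a hedged sketch of the properties involved (values illustrative): groupBy v2 is the strategy that draws on the merge buffer pool, which comes out of the same off-heap sizing discussed above:

```
druid.query.groupBy.defaultStrategy=v2        # the engine that uses merge buffers
druid.processing.numMergeBuffers=4            # off-heap buffers reserved for result merging
druid.processing.buffer.sizeBytes=500000000   # size of each processing/merge buffer
```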
Oh wait, this is a doc change. I'm changing my vote to just +1.
I'm backporting this to 0.15.0-incubating since it contains changes for some scripts and tutorial configurations. |
Merge and backport commits (#7629, backported as #7684):

* Add basic tuning guide, getting started page, updated clustering docs
* Add note about caching, fix tutorial paths
* Adjust hadoop wording
* Add license
* Tweak
* Shrink overlord heaps, fix tutorial urls
* Tweak xlarge peon, update peon sizing
* Update Data peon buffer size
* Fix cluster start scripts
* Add upper level _common to classpath
* Fix cluster data/query confs
* Address PR comments
* Elaborate on connection pools
* PR comments
* Increase druid.broker.http.maxQueuedBytes
* Add guidelines for broker backpressure
* PR comments
This PR:

- Adds `operations/basic-cluster-tuning.md`
- Updates `tutorials/cluster.md` with information on migrating from a single-server deployment, and changes the examples to use the new `bin/start-cluster-*` scripts and configurations under `examples/conf/druid/cluster`