Add master/data/query server concepts to docs/packaging #6916
Changes from all commits: 29bc1e2, bef9068, b112df0, 6d9b701, 38e1139, 6172c77, 4f8a2f9
````diff
@@ -61,7 +61,7 @@ The realtime node uses several of the global configs in [Configuration](../confi
 |Property|Description|Default|
 |--------|-----------|-------|
 |`druid.processing.buffer.sizeBytes`|This specifies a buffer size for the storage of intermediate results. The computation engine in both the Historical and Realtime nodes will use a scratch buffer of this size to do all of their intermediate computations off-heap. Larger values allow for more aggregations in a single pass over the data while smaller values can require more passes depending on the query that is being executed.|auto (max 1GB)|
-|`druid.processing.formatString`|Realtime and historical nodes use this format string to name their processing threads.|processing-%s|
+|`druid.processing.formatString`|Realtime and Historical nodes use this format string to name their processing threads.|processing-%s|
 |`druid.processing.numMergeBuffers`|The number of direct memory buffers available for merging query results. The buffers are sized by `druid.processing.buffer.sizeBytes`. This property is effectively a concurrency limit for queries that require merging buffers. If you are using any queries that require merge buffers (currently, just groupBy v2) then you should have at least two of these.|`max(2, druid.processing.numThreads / 4)`|
 |`druid.processing.numThreads`|The number of processing threads to have available for parallel processing of segments. Our rule of thumb is `num_cores - 1`, which means that even under heavy load there will still be one core available to do background tasks like talking with ZooKeeper and pulling down segments. If only one core is available, this property defaults to the value `1`.|Number of cores - 1 (or 1)|
 |`druid.processing.columnCache.sizeBytes`|Maximum size in bytes for the dimension value lookup cache. Any value greater than `0` enables the cache. It is currently disabled by default. Enabling the lookup cache can significantly improve the performance of aggregators operating on dimension values, such as the JavaScript aggregator, or cardinality aggregator, but can slow things down if the cache hit rate is low (i.e. dimensions with few repeating values). Enabling it may also require additional garbage collection tuning to avoid long GC pauses.|`0` (disabled)|
````

> **Contributor:** have we not deprecated real-time processes yet?
>
> **Contributor (Author):** They've been deprecated but not removed from the docs yet
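To illustrate how the processing properties in the diffed table fit together, a `runtime.properties` fragment for a Historical node might look like the following. The specific values are assumptions for a hypothetical 8-core machine, not recommendations from this patch; note that with `numThreads=7`, the `numMergeBuffers` default of `max(2, 7 / 4)` works out to 2.

```properties
# Example values only -- assumed 8-core host; tune for your hardware.
# Off-heap scratch buffer per processing thread (512 MB here; default is auto, max 1GB).
druid.processing.buffer.sizeBytes=536870912
# Thread-name format string (the default shown in the table).
druid.processing.formatString=processing-%s
# Rule of thumb from the docs: num_cores - 1.
druid.processing.numThreads=7
# Matches the documented default max(2, numThreads / 4) = 2.
druid.processing.numMergeBuffers=2
# Dimension value lookup cache disabled (the documented default).
druid.processing.columnCache.sizeBytes=0
```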
````diff
@@ -29,7 +29,7 @@ title: "Cassandra Deep Storage"
 Druid can use Cassandra as a deep storage mechanism. Segments and their metadata are stored in Cassandra in two tables:
 `index_storage` and `descriptor_storage`. Underneath the hood, the Cassandra integration leverages Astyanax. The
 index storage table is a [Chunked Object](https://github.com/Netflix/astyanax/wiki/Chunked-Object-Store) repository. It contains
-compressed segments for distribution to historical nodes. Since segments can be large, the Chunked Object storage allows the integration to multi-thread
+compressed segments for distribution to Historical nodes. Since segments can be large, the Chunked Object storage allows the integration to multi-thread
 the write to Cassandra, and spreads the data across all the nodes in a cluster. The descriptor storage table is a normal C* table that
 stores the segment metadata.
@@ -52,7 +52,7 @@ CREATE TABLE descriptor_storage(key varchar,
 First create the schema above. I use a new keyspace called `druid` for this purpose, which can be created using the
 [Cassandra CQL `CREATE KEYSPACE`](http://www.datastax.com/documentation/cql/3.1/cql/cql_reference/create_keyspace_r.html) command.

-Then, add the following to your historical and realtime runtime properties files to enable a Cassandra backend.
+Then, add the following to your Historical and realtime runtime properties files to enable a Cassandra backend.

 ```properties
 druid.extensions.loadList=["druid-cassandra-storage"]
````

> **Contributor:** I think we should delete this page. I'm pretty sure C* doesn't even work as a deep storage.
>
> **Contributor:** IMO if we're going to delete this page we should delete the extension too. So it's not relevant to this patch.
>
> **Contributor:** Okay - we can do this as part of a later update.
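For context on the diffed instructions, the `druid` keyspace mentioned in the Cassandra page could be created with a CQL statement along these lines. This is a sketch, not part of the patch: the replication settings are an assumption for a single-node test setup, and the two tables themselves still need the schema the page refers to.

```sql
-- Assumed single-node test setup; use NetworkTopologyStrategy and a
-- higher replication_factor for any real cluster.
CREATE KEYSPACE IF NOT EXISTS druid
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
```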
> **Reviewer:** @gianm @jon-wei what do you think of universally replacing "node" with "process" everywhere?
>
> **Reviewer:** I think it would help but would rather do it in a different PR.
>
> **Reviewer:** that sounds good to me, but I'll do that in a later patch to keep this PR scoped down