Add master/data/query server concepts to docs/packaging by jon-wei · Pull Request #6916 · apache/druid

jon-wei · 2019-01-25T23:43:09Z

This is a PR for the changes described in proposal #6838.

Areas changed:

New architecture diagram in design/index.md
Added Master/Data/Query server type concepts to design/index.md
Changed process-specific config and API docs to be subsections under Master/Data/Query groupings
Adjusted "clustering" doc to use server types
Structured the conf and quickstart/tutorial configurations to have master/data/query directories
Added page on process colocation

gianm

A comment on "docs/img/druid-architecture.png": try boxes instead of cylinders, since cylinders in architecture diagrams are typically used for databases, which these processes are not. (Until you combine them Voltron style)

On the directory layout changes (conf/historical -> conf/data/historical):

I think node.sh would need changes to work with the new structure.
The supervise program or its conf files would need changes too.
I'm wondering if it'd be best to revert the layout to the old structure (e.g. conf/historical). Colocation of processes into master/query/data servers is a suggested simplification but is not required, and the colocated configs would look weird if the processes are not actually colocated. And the necessary changes to scripts like supervise and node.sh mean the upgrade path would be somewhat bumpy.

There were a few places I commented about capitalizing server and process names. I don't think I caught them all, so if you see any others I didn't catch please adjust those too.

gianm · 2019-01-28T12:06:16Z

+  ~ under the License.
+  -->
+
+# Process Colocation


IMO, it'd be better to remove this doc and either incorporate its contents into design/index.md or to shorten what's in design/index.md and break out most of the content from the "Server Types" section into a new doc (maybe design/processes.md), and incorporate this content into that new page. The rationale for this suggestion is that it would be easier for users to grasp the concepts around server and process types if a single page spelled out all the info they need.

There's enough stuff here that probably the second makes sense: shortening what's in design/index.md and creating a design/processes.md that discusses server types and processes in more detail.

I restructured this to go with the second suggestion; the new page is at design/processes.md.

gianm · 2019-01-28T12:11:49Z

+
+With higher levels of ingestion or query load, it can make sense to deploy the Historical and MiddleManager processes on separate nodes to to avoid CPU and memory contention. 
+
+The historical also benefits from having free memory for memory mapped segments, which can be another reason to deploy the Data Server processes separately.


IMO clearer to replace "the Data Server processes" with "the Historical and MiddleManager processes".

Replaced this with specific process references

gianm · 2019-01-28T12:13:11Z

+
+This simple cluster will feature:
+ - Scalable, fault-tolerant Data servers running Historical and MiddleManager processes
+ - Query servers, hosting Druid broker processes


Capitalize Broker.

Capitalized

gianm · 2019-01-28T12:13:52Z

-configuration as well.
+your needs. 
+
+This simple cluster will feature:


Consider reordering these to be in the same order as the sections below.

gianm · 2019-01-28T12:56:55Z

-## MiddleManager and Peons
+## Data Server
+
+This section contains the configuration options for the processes that reside on data servers (middle managers/peons and historicals).


Some suggested style adjustments and clarifications:

This section contains the configuration options for the processes that reside on Data servers (MiddleManagers and Historicals) in the suggested [three-server configuration](../design/index.html#server-types).

I'd suggest similar changes for the Master and Query sections in this doc.

Added suggested clarification

gianm · 2019-01-28T13:10:32Z

+
+A master server manages data ingestion and storage: it is responsible for starting new ingestion jobs and coordinating availability of data on the "Data servers" described below.
+
+Within a master server, functionality is split between two processes, the coordinator and overlord.


Capitalize Coordinator and Overlord.

Capitalized

gianm · 2019-01-28T13:11:35Z

+## Server Types

-* [**Historical**](../design/historical.html) processes are the workhorses that handle storage and querying on "historical" data
+A Druid cluster is organized into 3 server types:


The rule to follow with intro docs should be: guide people down a simple/recommended path, but don't mislead. "Druid cluster is organized into 3 server types" is somewhat misleading; consider this or similar instead:

Druid processes can be deployed any way you like, but for ease of deployment we suggest organizing them into three server types: Master, Query, and Data.

Reworded as suggested
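
For illustration, the suggested grouping discussed in this thread can be written down as a small lookup. This is only a sketch: the names `SERVER_TYPES` and `server_type_for` are hypothetical and not part of Druid, and placing the optional Router in the Query group follows the Druid design docs rather than anything stated in this comment.

```python
# Hypothetical sketch of the suggested (not required) three-server grouping.
# Nothing here is a Druid API; the names exist only for this illustration.
SERVER_TYPES = {
    "Master": ["Coordinator", "Overlord"],
    "Query": ["Broker", "Router"],  # Router is optional
    "Data": ["Historical", "MiddleManager"],
}


def server_type_for(process: str) -> str:
    """Return the suggested server type for a Druid process name."""
    for server_type, processes in SERVER_TYPES.items():
        if process in processes:
            return server_type
    raise ValueError(f"unknown process: {process}")
```

Because the grouping is a recommendation, a deployment that runs each process on its own host is still valid; the lookup only records the default suggestion.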

gianm · 2019-01-28T13:22:30Z

+
+A data server executes ingestion jobs and stores queryable data.
+
+Within a data server, functionality is split between two processes, the historical and middle manager.


Capitalize Data, Historical, and MiddleManager.

Capitalized

gianm · 2019-01-28T13:23:01Z

-In addition to these process types, Druid also has three external dependencies. These are intended to be able to
+### External dependencies
+
+In addition to these server and process types, Druid also has three external dependencies. These are intended to be able to


In addition to its built-in process types

Might be more clear.

Reworded as suggested

gianm · 2019-01-28T13:28:08Z

-## MiddleManager
+## Data Server
+
+This section documents the API endpoints for the processes that reside on data servers (middle managers/peons and historicals).


Some suggested style adjustments and clarifications:

This section contains the API endpoints for the processes that reside on Data servers (MiddleManagers and Historicals) in the suggested [three-server configuration](../design/index.html#server-types).

I'd suggest similar changes for the Master and Query sections in this doc.

Added suggested clarification

jon-wei · 2019-01-29T02:11:36Z

@gianm

I addressed the line comments, updated the diagram to use boxes instead of cylinders, and did a pass through the docs to capitalize the Druid process names in all areas.

I reverted the conf directory changes.

I also went ahead and removed the "Indexing Service" concept from the TOC and elsewhere in the docs, replacing it with direct references to Overlord or Overlord/MiddleManager. I felt it was too much to have another high-level process organization.

fjy · 2019-01-29T02:12:11Z

 ### Data distribution model

-Druid’s data distribution is segment-based and leverages a highly available "deep" storage such as S3 or HDFS. Scaling up (or down) does not require massive copy actions or downtime; in fact, losing any number of historical nodes does not result in data loss because new historical nodes can always be brought up by reading data from "deep" storage.
+Druid’s data distribution is segment-based and leverages a highly available "deep" storage such as S3 or HDFS. Scaling up (or down) does not require massive copy actions or downtime; in fact, losing any number of Historical nodes does not result in data loss because new Historical nodes can always be brought up by reading data from "deep" storage.


@gianm @jon-wei what do you think of universally replacing "node" with "process" everywhere?

I think it would help but would rather do it in a different PR.

That sounds good to me, but I'll do that in a later patch to keep this PR scoped down.

fjy · 2019-01-29T02:12:47Z

 |--------|-----------|-------|
 |`druid.processing.buffer.sizeBytes`|This specifies a buffer size for the storage of intermediate results. The computation engine in both the Historical and Realtime nodes will use a scratch buffer of this size to do all of their intermediate computations off-heap. Larger values allow for more aggregations in a single pass over the data while smaller values can require more passes depending on the query that is being executed.|auto (max 1GB)|
-|`druid.processing.formatString`|Realtime and historical nodes use this format string to name their processing threads.|processing-%s|
+|`druid.processing.formatString`|Realtime and Historical nodes use this format string to name their processing threads.|processing-%s|


Have we not deprecated real-time processes yet?

They've been deprecated but not removed from the docs yet.

fjy · 2019-01-29T02:13:13Z

 `index_storage` and `descriptor_storage`.  Underneath the hood, the Cassandra integration leverages Astyanax.  The
 index storage table is a [Chunked Object](https://github.com/Netflix/astyanax/wiki/Chunked-Object-Store) repository. It contains
-compressed segments for distribution to historical nodes.  Since segments can be large, the Chunked Object storage allows the integration to multi-thread
+compressed segments for distribution to Historical nodes.  Since segments can be large, the Chunked Object storage allows the integration to multi-thread


I think we should delete this page. I'm pretty sure C* doesn't even work as a deep storage.

IMO if we're going to delete this page we should delete the extension too. So it's not relevant to this patch.

Okay - we can do this as part of a later update.

fjy · 2019-01-29T02:14:17Z

+* [Coordinator](../design/coordinator.html)
+* [Overlord](../design/overlord.html)
+* [Broker](../design/broker.html)
+* [Router (Optional)](../development/router.html) 


optional processes should be at the end

but I can understand why the order is what it is

moved Router to the end

fjy · 2019-01-30T18:30:08Z

👍

gianm

Looking great so far!! Just a couple of line comments. In addition to those, in the conf files, you spelled "coordinator" wrong (as "coordiator").

gianm · 2019-01-30T18:56:37Z


 conf/druid:
-_common       broker        coordinator   historical    middleManager overlord
+_common data    master  query


If you meant to revert the conf/data/historical sort of changes, please revert this part too.

Ah thanks, reverted this and fixed the "coordiator" path
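
To make the reverted layout concrete, here is a rough sketch of how the flat directories would map onto the grouped ones proposed earlier in the PR (conf/historical -> conf/data/historical). The helper `grouped_conf_dir` and the `GROUP_OF` table are hypothetical, written only for this illustration; the directory names come from the listing above.

```python
from pathlib import PurePosixPath

# Hypothetical table: flat conf directory name -> proposed group directory.
GROUP_OF = {
    "coordinator": "master",
    "overlord": "master",
    "broker": "query",
    "historical": "data",
    "middleManager": "data",
}


def grouped_conf_dir(flat_dir: str) -> str:
    """Translate a flat conf path into the grouped layout that was proposed."""
    path = PurePosixPath(flat_dir)
    group = GROUP_OF.get(path.name)
    if group is None:
        # Directories such as _common have no group and stay where they are.
        return str(path)
    return str(path.parent / group / path.name)
```

Any script that hard-codes a flat path would need an equivalent translation, which is the upgrade concern raised in the review.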

gianm · 2019-01-30T19:00:19Z

+
+[**Peon**](../design/peons.html) processes are task execution engines spawned by MiddleManagers. Each Peon runs a separate JVM and is responsible for executing a single task. Peons always run on the same host as the MiddleManager that spawned them.
+
+## Process Colocation


The title here is counter-intuitive: it's mostly talking about when you would not want to colocate. (The previous section details the case for colocation.) How about re-titling it to "Pros and cons of colocation"?

Renamed to "Pros and cons of colocation"

gianm

LGTM

* Add master/data/query server concepts to docs/packaging
* PR comments
* TOC and markdown fix
* Update image legend
* PR comment
* More PR comments

Add master/data/query server concepts to docs/packaging

29bc1e2

jon-wei added the Area - Documentation label Jan 25, 2019

fjy added this to the 0.14.0 milestone Jan 25, 2019

gianm reviewed Jan 28, 2019

jon-wei added 4 commits January 28, 2019 17:02

PR comments

bef9068

Merge remote-tracking branch 'upstream/master' into server_type_docs

b112df0

TOC and markdown fix

6d9b701

Update image legend

38e1139

fjy reviewed Jan 29, 2019

PR comment

6172c77

gianm reviewed Jan 30, 2019

More PR comments

4f8a2f9

gianm approved these changes Jan 31, 2019

gianm merged commit 8213787 into apache:master Jan 31, 2019

This was referenced Feb 14, 2019

Time Ordering Option on Small-Result-Set Scan Queries #7024

Closed

[Proposal] Introduce concept of master/data/query servers in docs and packaging #6838

Closed

