[Proposal] Introduce concept of master/data/query servers in docs and packaging #6838

Motivation
--------------
Druid currently has 6 process types:
- coordinator
- overlord
- broker
- router
- historical
- middle manager

For someone getting started with Druid, this can be a lot of new and distinct concepts to understand. Additionally, it is not always clear from the process names what each process type is responsible for. For example, "coordinator", "overlord", and "middle manager" (and possibly even "broker") all suggest some kind of cluster management functionality.

As a result, some initial confusion is not uncommon when trying to understand Druid's cluster architecture.

Such concerns re: process organization and naming have been discussed in the Druid community in the past, e.g.:
* [Proposal] eventual removal of separate overlord node: https://github.com/apache/incubator-druid/issues/3696
* Discussion thread on process naming: https://groups.google.com/d/msg/druid-development/6Lz7CGgkgBI/M98t7ok_DwAJ

Proposed Changes
--------------
This proposal suggests that we introduce the following "server type" concepts to the Druid documentation and default packaging, where a "server type" is a deployment grouping of the existing Druid processes:

* Master server
  - Coordinator + Overlord processes
  - Manages data ingestion and storage: responsible for starting new ingestion jobs and coordinating availability of data on the "Data servers" described below

* Query server
  - Broker + Router processes
  - The endpoints that users and client applications interact with; they route queries to data servers or other query servers (and can optionally proxy master server requests as well)

* Data server
  - Historical + Middle manager processes
  - Executes ingestion jobs and stores all queryable data
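
The grouping above can be written down as a simple lookup table. The following is only an illustrative sketch: the names `SERVER_TYPES` and `server_type_of` are hypothetical and are not part of Druid or its packaging.

```python
# Hypothetical illustration of the proposed grouping; these names are
# assumptions made for this sketch and are not Druid APIs or config keys.
SERVER_TYPES = {
    "master": ("coordinator", "overlord"),
    "query": ("broker", "router"),
    "data": ("historical", "middle manager"),
}


def server_type_of(process: str) -> str:
    """Return the proposed server type that a Druid process belongs to."""
    for server_type, processes in SERVER_TYPES.items():
        if process in processes:
            return server_type
    raise ValueError(f"unknown Druid process type: {process!r}")


if __name__ == "__main__":
    for name in ("coordinator", "router", "middle manager"):
        print(f"{name} -> {server_type_of(name)} server")
```

Each of the six existing process types appears in exactly one group, so the mapping is a deployment grouping only and does not change what any process does.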

We have been using this master/query/data server organization in our docs and default packaging at Imply, and we've found in practice that this structure helps users grasp Druid's architecture more quickly.

The Druid docs and packaging would be updated to guide a new user in thinking of a cluster in terms of these larger process groupings:
- Introduce a new page or section describing these server types at a high level
- Rework docs to relate discussion of specific processes to larger "server type" grouping where appropriate
- Update quickstart and config templates
  - Create a "master" server config template, running a combined Coordinator and Overlord with `druid.coordinator.asOverlord.enabled` set to true
  - Create a "query" server config template, including the broker and possibly a colocated router too (perhaps once we move the router out of experimental status?)
  - Create a "data" server config template, including a colocated historical and middle manager
- For users with more complex resource allocation requirements, the documentation should clearly describe how/why the processes within these "server types" can be deployed/scaled individually. The docs would frame deployments with separated processes as a more "advanced" architecture, suggesting the simpler consolidated deployments for most users.

New or Changed Public Interfaces
--------------
No public interfaces are changed.

Compatibility
--------------
These are conceptual changes to the docs and packaged templates only; existing clusters would not be affected.

Potential future work
--------------
- These suggested doc and packaging changes align well with the earlier proposal (https://github.com/apache/incubator-druid/issues/3696) for consolidating the coordinator and overlord at the code level.
- For simplicity, the broker and router could be consolidated into a single process as well.
- More hypothetically, we could consider consolidating the historical and middle manager into one process; this could, for example, help enable better dynamic resource allocation decisions.

Alternatives
--------------
None

