44 changes: 22 additions & 22 deletions docs/configuration/index.md

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion docs/configuration/logging.md
@@ -23,7 +23,7 @@ title: "Logging"
-->


-Apache Druid processes will emit logs that are useful for debugging to the console. Druid processes also emit periodic metrics about their state. For more about metrics, see [Configuration](../configuration/index.html#enabling-metrics). Metric logs are printed to the console by default, and can be disabled with `-Ddruid.emitter.logging.logLevel=debug`.
+Apache Druid processes will emit logs that are useful for debugging to the console. Druid processes also emit periodic metrics about their state. For more about metrics, see [Configuration](../configuration/index.md#enabling-metrics). Metric logs are printed to the console by default, and can be disabled with `-Ddruid.emitter.logging.logLevel=debug`.

Druid uses [log4j2](http://logging.apache.org/log4j/2.x/) for logging. Logging can be configured with a log4j2.xml file. Add the path to the directory containing the log4j2.xml file (e.g. the _common/ dir) to your classpath if you want to override default Druid log configuration. Note that this directory should be earlier in the classpath than the druid jars. The easiest way to do this is to prefix the classpath with the config dir.
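
A minimal sketch of the "prefix the classpath with the config dir" approach (the directory layout and the broker invocation are illustrative assumptions, not a prescribed command):

```sh
# Hypothetical layout: the customized log4j2.xml lives in conf/druid/_common.
# Listing that directory before the Druid jars lets it override the defaults.
java -cp conf/druid/_common:lib/* org.apache.druid.cli.Main server broker
```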

2 changes: 1 addition & 1 deletion docs/dependencies/metadata-storage.md
@@ -63,7 +63,7 @@ druid.metadata.storage.connector.dbcp.maxConnLifetimeMillis=1200000
druid.metadata.storage.connector.dbcp.defaultQueryTimeout=30000
```

-See [BasicDataSource Configuration](https://commons.apache.org/proper/commons-dbcp/configuration.html) for full list.
+See [BasicDataSource Configuration](https://commons.apache.org/proper/commons-dbcp/configuration) for full list.

## Metadata storage tables

12 changes: 6 additions & 6 deletions docs/design/architecture.md
@@ -118,7 +118,7 @@ to the [metadata store](#metadata-storage). This entry is a self-describing bit
things like the schema of the segment, its size, and its location on deep storage. These entries are what the
Coordinator uses to know what data *should* be available on the cluster.

-For details on the segment file format, please see [segment files](segments.html).
+For details on the segment file format, please see [segment files](segments.md).

For details on modeling your data in Druid, see [schema design](../ingestion/schema-design.md).

@@ -229,9 +229,9 @@ publish in an all-or-nothing manner:
that has not yet been published can be rolled back if ingestion tasks fail. In this case, partially-ingested data is
discarded, and Druid will resume ingestion from the last committed set of stream offsets. This ensures exactly-once
publishing behavior.
-- [Hadoop-based batch ingestion](../ingestion/hadoop.html). Each task publishes all segment metadata in a single
+- [Hadoop-based batch ingestion](../ingestion/hadoop.md). Each task publishes all segment metadata in a single
transaction.
-- [Native batch ingestion](../ingestion/native-batch.html). In parallel mode, the supervisor task publishes all segment
+- [Native batch ingestion](../ingestion/native-batch.md). In parallel mode, the supervisor task publishes all segment
metadata in a single transaction after the subtasks are finished. In simple (single-task) mode, the single task
publishes all segment metadata in a single transaction after it is complete.

Expand All @@ -244,11 +244,11 @@ ingestion will not cause duplicate data to be ingested:
- Supervised "seekable-stream" ingestion methods like [Kafka](../development/extensions-core/kafka-ingestion.md) and
[Kinesis](../development/extensions-core/kinesis-ingestion.md) are idempotent due to the fact that stream offsets and
segment metadata are stored together and updated in lock-step.
-- [Hadoop-based batch ingestion](../ingestion/hadoop.html) is idempotent unless one of your input sources
+- [Hadoop-based batch ingestion](../ingestion/hadoop.md) is idempotent unless one of your input sources
is the same Druid datasource that you are ingesting into. In this case, running the same task twice is non-idempotent,
because you are adding to existing data instead of overwriting it.
-- [Native batch ingestion](../ingestion/native-batch.html) is idempotent unless
-[`appendToExisting`](../ingestion/native-batch.html) is true, or one of your input sources is the same Druid datasource
+- [Native batch ingestion](../ingestion/native-batch.md) is idempotent unless
+[`appendToExisting`](../ingestion/native-batch.md) is true, or one of your input sources is the same Druid datasource
that you are ingesting into. In either of these two cases, running the same task twice is non-idempotent, because you
are adding to existing data instead of overwriting it.
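
For reference, a sketch of where `appendToExisting` sits in a native batch ingestion spec (surrounding required fields such as the input source are omitted; the `type` value assumes parallel mode):

```json
{
  "ioConfig": {
    "type": "index_parallel",
    "appendToExisting": true
  }
}
```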

4 changes: 2 additions & 2 deletions docs/design/broker.md
@@ -25,11 +25,11 @@ title: "Broker"

### Configuration

-For Apache Druid Broker Process Configuration, see [Broker Configuration](../configuration/index.html#broker).
+For Apache Druid Broker Process Configuration, see [Broker Configuration](../configuration/index.md#broker).

### HTTP endpoints

-For a list of API endpoints supported by the Broker, see [Broker API](../operations/api-reference.html#broker).
+For a list of API endpoints supported by the Broker, see [Broker API](../operations/api-reference.md#broker).

### Overview

12 changes: 6 additions & 6 deletions docs/design/coordinator.md
@@ -25,11 +25,11 @@ title: "Coordinator Process"

### Configuration

-For Apache Druid Coordinator Process Configuration, see [Coordinator Configuration](../configuration/index.html#coordinator).
+For Apache Druid Coordinator Process Configuration, see [Coordinator Configuration](../configuration/index.md#coordinator).

### HTTP endpoints

-For a list of API endpoints supported by the Coordinator, see [Coordinator API](../operations/api-reference.html#coordinator).
+For a list of API endpoints supported by the Coordinator, see [Coordinator API](../operations/api-reference.md#coordinator).

### Overview

@@ -89,12 +89,12 @@ Once some segments are found, it issues a [compaction task](../ingestion/tasks.m
The maximum number of running compaction tasks is `min(sum of worker capacity * slotRatio, maxSlots)`.
Note that even though `min(sum of worker capacity * slotRatio, maxSlots)` = 0, at least one compaction task is always submitted
if the compaction is enabled for a dataSource.
-See [Compaction Configuration API](../operations/api-reference.html#compaction-configuration) and [Compaction Configuration](../configuration/index.html#compaction-dynamic-configuration) to enable the compaction.
+See [Compaction Configuration API](../operations/api-reference.md#compaction-configuration) and [Compaction Configuration](../configuration/index.md#compaction-dynamic-configuration) to enable the compaction.
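
As a worked example of the slot formula above (all numbers illustrative):

```
# Suppose 5 workers with capacity 2 each: sum of worker capacity = 10.
# With slotRatio = 0.1 and maxSlots = 5:
#   min(10 * 0.1, 5) = min(1, 5) = 1 compaction task slot per run.
# Even if the formula evaluates to 0, one task is still submitted
# as long as compaction is enabled for a dataSource.
```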

Compaction tasks might fail due to the following reasons.

- If the input segments of a compaction task are removed or overshadowed before it starts, that compaction task fails immediately.
-- If a task of a higher priority acquires a [time chunk lock](../ingestion/tasks.html#locking) for an interval overlapping with the interval of a compaction task, the compaction task fails.
+- If a task of a higher priority acquires a [time chunk lock](../ingestion/tasks.md#locking) for an interval overlapping with the interval of a compaction task, the compaction task fails.

Once a compaction task fails, the Coordinator simply checks the segments in the interval of the failed task again, and issues another compaction task in the next run.

@@ -127,7 +127,7 @@ If the coordinator has enough task slots for compaction, this policy will contin
`bar_2017-10-01T00:00:00.000Z_2017-11-01T00:00:00.000Z_VERSION` and `bar_2017-10-01T00:00:00.000Z_2017-11-01T00:00:00.000Z_VERSION_1`.
Finally, `foo_2017-09-01T00:00:00.000Z_2017-10-01T00:00:00.000Z_VERSION` will be picked up even though there is only one segment in the time chunk of `2017-09-01T00:00:00.000Z/2017-10-01T00:00:00.000Z`.

-The search start point can be changed by setting [skipOffsetFromLatest](../configuration/index.html#compaction-dynamic-configuration).
+The search start point can be changed by setting [skipOffsetFromLatest](../configuration/index.md#compaction-dynamic-configuration).
If this is set, this policy will ignore the segments falling into the time chunk of (the end time of the most recent segment - `skipOffsetFromLatest`).
This is to avoid conflicts between compaction tasks and realtime tasks.
Note that realtime tasks have a higher priority than compaction tasks by default. Realtime tasks will revoke the locks of compaction tasks if their intervals overlap, resulting in the termination of the compaction task.
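
For illustration, a minimal per-dataSource compaction configuration sketch carrying this offset as an ISO-8601 period (the structure and `dataSource` value are assumptions; the linked compaction configuration page is authoritative):

```json
{
  "dataSource": "wikipedia",
  "skipOffsetFromLatest": "P1D"
}
```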
Expand All @@ -138,7 +138,7 @@ Note that realtime tasks have a higher priority than compaction tasks by default

### The Coordinator console

-The Druid Coordinator exposes a web GUI for displaying cluster information and rule configuration. For more details, please see [coordinator console](../operations/management-uis.html#coordinator-consoles).
+The Druid Coordinator exposes a web GUI for displaying cluster information and rule configuration. For more details, please see [coordinator console](../operations/management-uis.md#coordinator-consoles).

### FAQ

4 changes: 2 additions & 2 deletions docs/design/historical.md
@@ -25,11 +25,11 @@ title: "Historical Process"

### Configuration

-For Apache Druid Historical Process Configuration, see [Historical Configuration](../configuration/index.html#historical).
+For Apache Druid Historical Process Configuration, see [Historical Configuration](../configuration/index.md#historical).

### HTTP endpoints

-For a list of API endpoints supported by the Historical, please see the [API reference](../operations/api-reference.html#historical).
+For a list of API endpoints supported by the Historical, please see the [API reference](../operations/api-reference.md#historical).

### Running

2 changes: 1 addition & 1 deletion docs/design/index.md
@@ -58,7 +58,7 @@ Druid servers fail, the system will automatically route around the damage until
is designed to run 24/7 with no need for planned downtimes for any reason, including configuration changes and software
updates.
6. **Cloud-native, fault-tolerant architecture that won't lose data.** Once Druid has ingested your data, a copy is
-stored safely in [deep storage](architecture.html#deep-storage) (typically cloud storage, HDFS, or a shared filesystem).
+stored safely in [deep storage](architecture.md#deep-storage) (typically cloud storage, HDFS, or a shared filesystem).
Your data can be recovered from deep storage even if every single Druid server fails. For more limited failures affecting
just a few Druid servers, replication ensures that queries are still possible while the system recovers.
7. **Indexes for quick filtering.** Druid uses [Roaring](https://roaringbitmap.org/) or
8 changes: 4 additions & 4 deletions docs/design/indexer.md
@@ -22,7 +22,7 @@ title: "Indexer Process"
~ under the License.
-->

-> The Indexer is an optional and <a href="../development/experimental.html">experimental</a> feature.
+> The Indexer is an optional and [experimental](../development/experimental.md) feature.
> Its memory management system is still under development and will be significantly enhanced in later releases.

The Apache Druid Indexer process is an alternative to the MiddleManager + Peon task execution system. Instead of forking a separate JVM process per-task, the Indexer runs tasks as separate threads within a single JVM process.
Expand All @@ -31,11 +31,11 @@ The Indexer is designed to be easier to configure and deploy compared to the Mid

### Configuration

-For Apache Druid Indexer Process Configuration, see [Indexer Configuration](../configuration/index.html#indexer).
+For Apache Druid Indexer Process Configuration, see [Indexer Configuration](../configuration/index.md#indexer).

### HTTP endpoints

-The Indexer process shares the same HTTP endpoints as the [MiddleManager](../operations/api-reference.html#middlemanager).
+The Indexer process shares the same HTTP endpoints as the [MiddleManager](../operations/api-reference.md#middlemanager).

### Running

Expand All @@ -51,7 +51,7 @@ The following resources are shared across all tasks running inside an Indexer pr

The query processing threads and buffers are shared across all tasks. The Indexer will serve queries from a single endpoint shared by all tasks.

-If [query caching](../configuration/index.html#indexer-caching) is enabled, the query cache is also shared across all tasks.
+If [query caching](../configuration/index.md#indexer-caching) is enabled, the query cache is also shared across all tasks.

#### Server HTTP threads

2 changes: 1 addition & 1 deletion docs/design/indexing-service.md
@@ -30,7 +30,7 @@ Indexing [tasks](../ingestion/tasks.md) create (and sometimes destroy) Druid [se
The indexing service is composed of three main components: a [Peon](../design/peons.md) component that can run a single task, a [Middle Manager](../design/middlemanager.md) component that manages Peons, and an [Overlord](../design/overlord.md) component that manages task distribution to MiddleManagers.
Overlords and MiddleManagers may run on the same process or across multiple processes while MiddleManagers and Peons always run on the same process.

-Tasks are managed using API endpoints on the Overlord service. Please see [Overlord Task API](../operations/api-reference.html#tasks) for more information.
+Tasks are managed using API endpoints on the Overlord service. Please see [Overlord Task API](../operations/api-reference.md#tasks) for more information.

![Indexing Service](../assets/indexing_service.png "Indexing Service")

4 changes: 2 additions & 2 deletions docs/design/middlemanager.md
@@ -25,11 +25,11 @@ title: "MiddleManager Process"

### Configuration

-For Apache Druid MiddleManager Process Configuration, see [Indexing Service Configuration](../configuration/index.html#middlemanager-and-peons).
+For Apache Druid MiddleManager Process Configuration, see [Indexing Service Configuration](../configuration/index.md#middlemanager-and-peons).

### HTTP endpoints

-For a list of API endpoints supported by the MiddleManager, please see the [API reference](../operations/api-reference.html#middlemanager).
+For a list of API endpoints supported by the MiddleManager, please see the [API reference](../operations/api-reference.md#middlemanager).

### Overview

6 changes: 3 additions & 3 deletions docs/design/overlord.md
@@ -25,11 +25,11 @@ title: "Overlord Process"

### Configuration

-For Apache Druid Overlord Process Configuration, see [Overlord Configuration](../configuration/index.html#overlord).
+For Apache Druid Overlord Process Configuration, see [Overlord Configuration](../configuration/index.md#overlord).

### HTTP endpoints

-For a list of API endpoints supported by the Overlord, please see the [API reference](../operations/api-reference.html#overlord).
+For a list of API endpoints supported by the Overlord, please see the [API reference](../operations/api-reference.md#overlord).

### Overview

Expand All @@ -40,7 +40,7 @@ This mode is recommended if you intend to use the indexing service as the single

### Overlord console

-The Overlord provides a UI for managing tasks and workers. For more details, please see [overlord console](../operations/management-uis.html#overlord-console).
+The Overlord provides a UI for managing tasks and workers. For more details, please see [overlord console](../operations/management-uis.md#overlord-console).

### Blacklisted workers

4 changes: 2 additions & 2 deletions docs/design/peons.md
@@ -25,11 +25,11 @@ title: "Peons"

### Configuration

-For Apache Druid Peon Configuration, see [Peon Query Configuration](../configuration/index.html#peon-query-configuration) and [Additional Peon Configuration](../configuration/index.html#additional-peon-configuration).
+For Apache Druid Peon Configuration, see [Peon Query Configuration](../configuration/index.md#peon-query-configuration) and [Additional Peon Configuration](../configuration/index.md#additional-peon-configuration).

### HTTP endpoints

-For a list of API endpoints supported by the Peon, please see the [Peon API reference](../operations/api-reference.html#peon).
+For a list of API endpoints supported by the Peon, please see the [Peon API reference](../operations/api-reference.md#peon).

Peons run a single task in a single JVM. MiddleManager is responsible for creating Peons for running tasks.
Peons should rarely (if ever for testing purposes) be run on their own.
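
If one is ever launched by hand (rare, per the note above), a sketch of the invocation through the same CLI entry point — the classpath and file paths here are hypothetical, and the `internal peon` subcommand is an assumption worth verifying against your Druid version:

```sh
# Hypothetical paths: task spec in, task status out.
java -cp conf/druid/_common:lib/* org.apache.druid.cli.Main \
  internal peon /tmp/task.json /tmp/status.json
```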
2 changes: 1 addition & 1 deletion docs/design/processes.md
@@ -134,7 +134,7 @@ In clusters with very high segment counts, it can make sense to separate the Coo

The Coordinator and Overlord processes can be run as a single combined process by setting the `druid.coordinator.asOverlord.enabled` property.

-Please see [Coordinator Configuration: Operation](../configuration/index.html#coordinator-operation) for details.
+Please see [Coordinator Configuration: Operation](../configuration/index.md#coordinator-operation) for details.
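
A minimal sketch of the combined-process setup in the Coordinator's runtime.properties (the service-name value is an assumption; consult the linked configuration page):

```properties
# Run Overlord duties inside the Coordinator process.
druid.coordinator.asOverlord.enabled=true
# Assumed service name under which the embedded Overlord announces itself.
druid.coordinator.asOverlord.overlordService=druid/overlord
```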

### Historicals and MiddleManagers

4 changes: 2 additions & 2 deletions docs/design/router.md
@@ -34,11 +34,11 @@ In addition to query routing, the Router also runs the [Druid Console](../operat

### Configuration

-For Apache Druid Router Process Configuration, see [Router Configuration](../configuration/index.html#router).
+For Apache Druid Router Process Configuration, see [Router Configuration](../configuration/index.md#router).

### HTTP endpoints

-For a list of API endpoints supported by the Router, see [Router API](../operations/api-reference.html#router).
+For a list of API endpoints supported by the Router, see [Router API](../operations/api-reference.md#router).

### Running

2 changes: 1 addition & 1 deletion docs/development/extensions-contrib/materialized-view.md
@@ -73,7 +73,7 @@ A sample derivativeDataSource supervisor spec is shown below:
|baseDataSource |The name of base dataSource. This dataSource data should be already stored inside Druid, and the dataSource will be used as input data.|yes|
|dimensionsSpec |Specifies the dimensions of the data. These dimensions must be the subset of baseDataSource's dimensions.|yes|
|metricsSpec |A list of aggregators. These metrics must be the subset of baseDataSource's metrics. See [aggregations](../../querying/aggregations.md).|yes|
-|tuningConfig |TuningConfig must be HadoopTuningConfig. See [Hadoop tuning config](../../ingestion/hadoop.html#tuningconfig).|yes|
+|tuningConfig |TuningConfig must be HadoopTuningConfig. See [Hadoop tuning config](../../ingestion/hadoop.md#tuningconfig).|yes|
|dataSource |The name of this derived dataSource. |no(default=baseDataSource-hashCode of supervisor)|
|hadoopDependencyCoordinates |A JSON array of Hadoop dependency coordinates that Druid will use, this property will override the default Hadoop coordinates. Once specified, Druid will look for those Hadoop dependencies from the location specified by druid.extensions.hadoopDependenciesDir |no|
|classpathPrefix |Classpath that will be prepended for the Peon process. |no|
4 changes: 2 additions & 2 deletions docs/development/extensions-contrib/moving-average-query.md
@@ -34,7 +34,7 @@ Moving Average encapsulates the [groupBy query](../../querying/groupbyquery.md)

It runs the query in two main phases:

-1. Runs an inner [groupBy](../../querying/groupbyquery.html) or [timeseries](../../querying/timeseriesquery.html) query to compute Aggregators (i.e. daily count of events).
+1. Runs an inner [groupBy](../../querying/groupbyquery.md) or [timeseries](../../querying/timeseriesquery.md) query to compute Aggregators (i.e. daily count of events).
2. Passes over aggregated results in Broker, in order to compute Averagers (i.e. moving 7 day average of the daily count).
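
A minimal sketch of a query shaped around these two phases — a daily count feeding a 7-bucket mean averager (the dataSource, interval, and output names are hypothetical):

```json
{
  "queryType": "movingAverage",
  "dataSource": "events",
  "granularity": "P1D",
  "intervals": ["2020-01-01/2020-02-01"],
  "aggregations": [
    { "type": "count", "name": "dailyCount" }
  ],
  "averagers": [
    { "type": "doubleMean", "name": "sevenDayAvg", "fieldName": "dailyCount", "buckets": 7 }
  ]
}
```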

#### Main enhancements provided by this extension:
@@ -70,7 +70,7 @@ There are currently no configuration properties specific to Moving Average.
|dimensions|A JSON list of [DimensionSpec](../../querying/dimensionspecs.md) (Notice that property is optional)|no|
|limitSpec|See [LimitSpec](../../querying/limitspec.md)|no|
|having|See [Having](../../querying/having.md)|no|
-|granularity|A period granularity; See [Period Granularities](../../querying/granularities.html#period-granularities)|yes|
+|granularity|A period granularity; See [Period Granularities](../../querying/granularities.md#period-granularities)|yes|
|filter|See [Filters](../../querying/filters.md)|no|
|aggregations|Aggregations forms the input to Averagers; See [Aggregations](../../querying/aggregations.md)|yes|
|postAggregations|Supports only aggregations as input; See [Post Aggregations](../../querying/post-aggregations.md)|no|