49 changes: 49 additions & 0 deletions docs/operations/python.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,49 @@
---
id: python
title: "Python Installation"
---

<!--
~ Licensed to the Apache Software Foundation (ASF) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. The ASF licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->

The Apache Druid startup script requires a Python2 or Python3 interpreter.
Since Python2 is deprecated, this document provides instructions for installing a Python3 interpreter.

## Python3 interpreter installation instructions

### Linux

#### Debian or Ubuntu
- `sudo apt update`
- `sudo apt install -y python3-pip`
#### RHEL
- `sudo yum install -y epel-release`
- `sudo yum install -y python3-pip`

### MacOS

#### Install with Homebrew
Refer to [Installing Python 3 on Mac OS X](https://docs.python-guide.org/starting/install3/osx/).

#### Install the official Python release
* Browse to the [Python Downloads Page](https://www.python.org/downloads/) and download the latest version (3.x.x)

Verify that Python3 is installed by running the `python3 --version` command.
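
A minimal sketch of that check as a reusable shell function. The function name `check_python3` is hypothetical and not part of the Druid distribution; it relies only on `command -v` and `python3 --version`.

```shell
# Hypothetical helper (not part of Druid): report whether python3 is on PATH.
check_python3() {
  if command -v python3 >/dev/null 2>&1; then
    # Print the version string reported by the interpreter.
    echo "found: $(python3 --version 2>&1)"
  else
    echo "python3 not found on PATH" >&2
    return 1
  fi
}
```

Because `check_python3` returns a non-zero status when no interpreter is found, it can gate a setup script before Druid is started.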


34 changes: 24 additions & 10 deletions docs/operations/single-server.md
Original file line number Diff line number Diff line change
Expand Up @@ -23,14 +23,17 @@ title: "Single server deployment"
-->


Druid includes a set of reference configurations and launch scripts for single-machine deployments:

- `nano-quickstart`
- `micro-quickstart`
- `small`
- `medium`
- `large`
- `xlarge`
Druid includes a set of reference configurations and launch scripts for single-machine deployments.
These configuration bundles are located in `conf/druid/single-server/`.

The `auto` configuration sizes runtime parameters based on available processors and memory. Other configurations include hard-coded runtime parameters for various server sizes. Most users should stick with `auto`. See [Druid auto start](#druid-auto-start) below.
- `auto` (run script: `bin/start-druid`)
- `nano-quickstart` (run script: `bin/start-nano-quickstart`)
- `micro-quickstart` (run script: `bin/start-micro-quickstart`)
- `small` (run script: `bin/start-single-server-small`)
- `medium` (run script: `bin/start-single-server-medium`)
- `large` (run script: `bin/start-single-server-large`)
- `xlarge` (run script: `bin/start-single-server-xlarge`)

The `micro-quickstart` is sized for small machines like laptops and is intended for quick evaluation use-cases.

Expand All @@ -44,6 +47,18 @@ The example configurations run the Druid Coordinator and Overlord together in a

While example configurations are provided for very large single machines, at higher scales we recommend running Druid in a [clustered deployment](../tutorials/cluster.md), for fault-tolerance and reduced resource contention.

## Druid auto start

Druid includes a launch script, `bin/start-druid`, that automatically sets various memory-related parameters based on available processors and memory.

`start-druid` is a generic launch script capable of starting any set of Druid services on a server.
It accepts optional arguments, such as a list of services, total memory, and a config directory, to override default JVM arguments and service-specific runtime properties.
Druid services will use all processors and up to 80% of the memory on the system.
For details about possible arguments, run `bin/start-druid --help`.

The launch scripts for the hard-coded configurations (e.g. `start-micro-quickstart`) are now deprecated.
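
As a rough illustration of the 80% figure, the sketch below computes a memory budget from a total given in MiB. The function name `memory_budget_mb` and the simple fraction-of-total formula are assumptions made for illustration; the actual sizing logic in `start-druid` may differ.

```shell
# Hypothetical helper: memory budget in MiB as 80% of the total, rounded down.
# The 80% figure comes from the text above; the formula itself is an assumption.
memory_budget_mb() {
  total_mb="$1"
  echo $(( total_mb * 80 / 100 ))
}

# A machine with 16384 MiB of RAM yields a budget of 13107 MiB.
memory_budget_mb 16384
```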


## Single server reference configurations

### Nano-Quickstart: 1 CPU, 4GiB RAM
Expand Down Expand Up @@ -74,5 +89,4 @@ While example configurations are provided for very large single machines, at hig
### X-Large: 64 CPU, 512GiB RAM (~i3.16xlarge)

- Launch command: `bin/start-xlarge`
- Configuration directory: `conf/druid/single-server/xlarge`

- Configuration directory: `conf/druid/single-server/xlarge`
5 changes: 4 additions & 1 deletion docs/tutorials/cluster.md
Original file line number Diff line number Diff line change
Expand Up @@ -130,7 +130,10 @@ The [basic cluster tuning guide](../operations/basic-cluster-tuning.md) has info

## Select OS

We recommend running your favorite Linux distribution. You will also need [Java 8 or 11](../operations/java.md).
We recommend running your favorite Linux distribution. You will also need:

* [Java 8 or 11](../operations/java.md).
* [Python2 or Python3](../operations/python.md)

> If needed, you can specify where to find Java using the environment variables
> `DRUID_JAVA_HOME` or `JAVA_HOME`. For more details run the `bin/verify-java` script.
Expand Down
43 changes: 22 additions & 21 deletions docs/tutorials/index.md
Original file line number Diff line number Diff line change
Expand Up @@ -22,8 +22,7 @@ title: "Quickstart (local)"
~ under the License.
-->


This quickstart gets you started with Apache Druid using the [`micro-quickstart`](../operations/single-server.md#micro-quickstart-4-cpu-16gib-ram) configuration, and introduces you to Druid ingestion and query features.
This quickstart gets you started with Apache Druid and introduces you to Druid ingestion and query features. For this tutorial, we recommend a machine with at least 6 GB of RAM.

In this quickstart, you'll do the following:
- install Druid
Expand All @@ -37,15 +36,16 @@ Druid supports a variety of ingestion options. Once you're done with this tutori

You can follow these steps on a relatively modest machine, such as a workstation or virtual server with 16 GiB of RAM.

Druid comes equipped with several [startup configuration profiles](../operations/single-server.md) for a
range of machine sizes. These range from `nano` (1 CPU, 4GiB RAM) to `x-large` (64 CPU, 512GiB RAM). For more
information, see [Single server deployment](../operations/single-server.md). For information on deploying Druid services
across clustered machines, see [Clustered deployment](./cluster.md).
Druid comes equipped with launch scripts that can be used to start all processes on a single server. Here, we will use [`auto`](../operations/single-server.md#druid-auto-start), which automatically sets various runtime properties based on available processors and memory.

In addition, Druid includes several [bundled non-automatic profiles](../operations/single-server.md) for a range of machine sizes. These range from nano (1 CPU, 4GiB RAM) to x-large (64 CPU, 512GiB RAM).
We won't use those here, but for more information, see [Single server deployment](../operations/single-server.md). For additional information on deploying Druid services across clustered machines, see [Clustered deployment](./cluster.md).

The software requirements for the installation machine are:

* Linux, Mac OS X, or other Unix-like OS. (Windows is not supported.)
* Java 8u92+ or Java 11.
* [Python2 or Python3](../operations/python.md)

> Druid relies on the environment variables `JAVA_HOME` or `DRUID_JAVA_HOME` to find Java on the machine. You can set
`DRUID_JAVA_HOME` if there is more than one instance of Java. To verify Java requirements for your environment, run the
Expand All @@ -72,38 +72,39 @@ The distribution directory contains `LICENSE` and `NOTICE` files and subdirector

## Start up Druid services

Start up Druid services using the `micro-quickstart` single-machine configuration.
Start up Druid services using the `auto` single-machine configuration.
This configuration includes default settings that are appropriate for this tutorial, such as loading the `druid-multi-stage-query` extension by default so that you can use the MSQ task engine.

You can view that setting and others in the configuration files in the `conf/druid/single-server/micro-quickstart/`.
You can view that setting and others in the configuration files in `conf/druid/auto`.

From the apache-druid-{{DRUIDVERSION}} package root, run the following command:

```bash
./bin/start-micro-quickstart
./bin/start-druid
```

This brings up instances of ZooKeeper and the Druid services:

```bash
$ ./bin/start-micro-quickstart
[Thu Sep 8 18:30:00 2022] Starting Apache Druid.
[Thu Sep 8 18:30:00 2022] Open http://localhost:8888/ in your browser to access the web console.
[Thu Sep 8 18:30:00 2022] Or, if you have enabled TLS, use https on port 9088.
[Thu Sep 8 18:30:00 2022] Running command[zk], logging to[/apache-druid-{{DRUIDVERSION}}/var/sv/zk.log]: bin/run-zk conf
[Thu Sep 8 18:30:00 2022] Running command[coordinator-overlord], logging to[/apache-druid-{{DRUIDVERSION}}/var/sv/coordinator-overlord.log]: bin/run-druid coordinator-overlord conf/druid/single-server/micro-quickstart
[Thu Sep 8 18:30:00 2022] Running command[broker], logging to[/apache-druid-{{DRUIDVERSION}}/var/sv/broker.log]: bin/run-druid broker conf/druid/single-server/micro-quickstart
[Thu Sep 8 18:30:00 2022] Running command[router], logging to[/apache-druid-{{DRUIDVERSION}}/var/sv/router.log]: bin/run-druid router conf/druid/single-server/micro-quickstart
[Thu Sep 8 18:30:00 2022] Running command[historical], logging to[/apache-druid-{{DRUIDVERSION}}/var/sv/historical.log]: bin/run-druid historical conf/druid/single-server/micro-quickstart
[Thu Sep 8 18:30:00 2022] Running command[middleManager], logging to[/apache-druid-{{DRUIDVERSION}}/var/sv/middleManager.log]: bin/run-druid middleManager conf/druid/single-server/micro-quickstart
$ ./bin/start-druid
[Tue Nov 29 16:31:06 2022] Starting Apache Druid.
[Tue Nov 29 16:31:06 2022] Open http://localhost:8888/ in your browser to access the web console.
[Tue Nov 29 16:31:06 2022] Or, if you have enabled TLS, use https on port 9088.
[Tue Nov 29 16:31:06 2022] Starting services with log directory [/apache-druid-{{DRUIDVERSION}}/log].
[Tue Nov 29 16:31:06 2022] Running command[zk]: bin/run-zk conf
[Tue Nov 29 16:31:06 2022] Running command[broker]: bin/run-druid broker /apache-druid-{{DRUIDVERSION}}/conf/druid/single-server/quickstart '-Xms1187m -Xmx1187m -XX:MaxDirectMemorySize=791m'
[Tue Nov 29 16:31:06 2022] Running command[router]: bin/run-druid router /apache-druid-{{DRUIDVERSION}}/conf/druid/single-server/quickstart '-Xms128m -Xmx128m'
[Tue Nov 29 16:31:06 2022] Running command[coordinator-overlord]: bin/run-druid coordinator-overlord /apache-druid-{{DRUIDVERSION}}/conf/druid/single-server/quickstart '-Xms1290m -Xmx1290m'
[Tue Nov 29 16:31:06 2022] Running command[historical]: bin/run-druid historical /apache-druid-{{DRUIDVERSION}}/conf/druid/single-server/quickstart '-Xms1376m -Xmx1376m -XX:MaxDirectMemorySize=2064m'
[Tue Nov 29 16:31:06 2022] Running command[middleManager]: bin/run-druid middleManager /apache-druid-{{DRUIDVERSION}}/conf/druid/single-server/quickstart '-Xms64m -Xmx64m' '-Ddruid.worker.capacity=2 -Ddruid.indexer.runner.javaOptsArray=["-server","-Duser.timezone=UTC","-Dfile.encoding=UTF-8","-XX:+ExitOnOutOfMemoryError","-Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager","-Xms256m","-Xmx256m","-XX:MaxDirectMemorySize=256m"]'
```

All persistent state, such as the cluster metadata store and segments for the services, is kept in the `var` directory under
the Druid root directory, apache-druid-{{DRUIDVERSION}}. Each service writes to a log file under `var/sv`.

At any time, you can revert Druid to its original, post-installation state by deleting the entire `var` directory. You may want to do this, for example, between Druid tutorials or after experimentation, to start with a fresh instance.

To stop Druid at any time, use CTRL+C in the terminal. This exits the `bin/start-micro-quickstart` script and terminates all Druid processes.
To stop Druid at any time, use CTRL+C in the terminal. This exits the `bin/start-druid` script and terminates all Druid processes.

## Open the web console

Expand Down Expand Up @@ -222,4 +223,4 @@ See the following topics for more information:
* [Tutorial: Load stream data from Apache Kafka](./tutorial-kafka.md) to load streaming data from a Kafka topic.
* [Extensions](../development/extensions.md) for details on Druid extensions.

Remember that after stopping Druid services, you can start clean next time by deleting the `var` directory from the Druid root directory and running the `bin/start-micro-quickstart` script again. You may want to do this before using other data ingestion tutorials, since they use the same Wikipedia datasource.
Remember that after stopping Druid services, you can start clean next time by deleting the `var` directory from the Druid root directory and running the `bin/start-druid` script again. You may want to do this before using other data ingestion tutorials, since they use the same Wikipedia datasource.
8 changes: 4 additions & 4 deletions docs/tutorials/tutorial-batch-hadoop.md
Original file line number Diff line number Diff line change
Expand Up @@ -28,7 +28,7 @@ This tutorial shows you how to load data files into Apache Druid using a remote

For this tutorial, we'll assume that you've already completed the previous
[batch ingestion tutorial](tutorial-batch.md) using Druid's native batch ingestion system and are using the
`micro-quickstart` single-machine configuration as described in the [quickstart](index.md).
`auto` single-machine configuration as described in the [quickstart](../operations/single-server.md#druid-auto-start).

## Install Docker

Expand Down Expand Up @@ -156,7 +156,7 @@ cp /tmp/shared/hadoop_xml/*.xml {PATH_TO_DRUID}/conf/druid/single-server/micro-q

### Update Druid segment and log storage

In your favorite text editor, open `conf/druid/single-server/micro-quickstart/_common/common.runtime.properties`, and make the following edits:
In your favorite text editor, open `conf/druid/auto/_common/common.runtime.properties`, and make the following edits:

#### Disable local deep storage and enable HDFS deep storage

Expand Down Expand Up @@ -196,7 +196,7 @@ druid.indexer.logs.directory=/druid/indexing-logs

Once the Hadoop .xml files have been copied to the Druid cluster and the segment/log storage configuration has been updated to use HDFS, the Druid cluster needs to be restarted for the new configurations to take effect.

If the cluster is still running, CTRL-C to terminate the `bin/start-micro-quickstart` script, and re-run it to bring the Druid services back up.
If the cluster is still running, press CTRL-C to terminate the `bin/start-druid` script, then re-run it to bring the Druid services back up.

## Load batch data

Expand All @@ -221,7 +221,7 @@ This tutorial is only meant to be used together with the [query tutorial](../tut

If you wish to go through any of the other tutorials, you will need to:
* Shut down the cluster and reset the cluster state by removing the contents of the `var` directory under the druid package.
* Revert the deep storage and task storage config back to local types in `conf/druid/single-server/micro-quickstart/_common/common.runtime.properties`
* Revert the deep storage and task storage config back to local types in `conf/druid/auto/_common/common.runtime.properties`
* Restart the cluster

This is necessary because the other ingestion tutorials will write to the same "wikipedia" datasource, and later tutorials expect the cluster to use local deep storage.
Expand Down
2 changes: 1 addition & 1 deletion docs/tutorials/tutorial-kafka.md
Original file line number Diff line number Diff line change
Expand Up @@ -30,7 +30,7 @@ The tutorial guides you through the steps to load sample nested clickstream data

## Prerequisites

Before you follow the steps in this tutorial, download Druid as described in the [quickstart](index.md) using the [micro-quickstart](../operations/single-server.md#micro-quickstart-4-cpu-16gib-ram) single-machine configuration and have it running on your local machine. You don't need to have loaded any data.
Before you follow the steps in this tutorial, download Druid as described in the [quickstart](index.md) using the [auto](../operations/single-server.md#druid-auto-start) single-machine configuration and have it running on your local machine. You don't need to have loaded any data.

## Download and start Kafka

Expand Down
46 changes: 42 additions & 4 deletions examples/bin/run-druid
Original file line number Diff line number Diff line change
Expand Up @@ -17,7 +17,7 @@
# specific language governing permissions and limitations
# under the License.

if [ "$#" -gt 2 ] || [ "$#" -eq 0 ]
if [ "$#" -gt 4 ] || [ "$#" -eq 0 ]
then
>&2 echo "usage: $0 <service> [conf-dir]"
exit 1
Expand Down Expand Up @@ -47,7 +47,45 @@ if [ ! -d "$LOG_DIR" ]; then mkdir -p $LOG_DIR; fi

echo "Running [$1], logging to [$LOG_DIR/$1.log] if no changes made to log4j2.xml"

if [ "$WHATAMI" = 'coordinator-overlord' ]
then
SERVER_NAME=coordinator
else
SERVER_NAME="$WHATAMI"
fi


if [ ! -f "$CONFDIR"/$WHATAMI/main.config ];
then
MAIN_CLASS="org.apache.druid.cli.Main server $SERVER_NAME"
else
MAIN_CLASS=`cat "$CONFDIR"/$WHATAMI/main.config | xargs`
fi

cd "$WHEREAMI/.."
exec "$WHEREAMI"/run-java -Ddruid.node.type=$1 "-Ddruid.log.path=$LOG_DIR" `cat "$CONFDIR"/"$WHATAMI"/jvm.config | xargs` \
-cp "$CONFDIR"/"$WHATAMI":"$CONFDIR"/_common:"$CONFDIR"/_common/hadoop-xml:"$CONFDIR"/../_common:"$CONFDIR"/../_common/hadoop-xml:"$WHEREAMI/../lib/*" \
`cat "$CONFDIR"/$WHATAMI/main.config | xargs`

CLASS_PATH="$CONFDIR"/"$WHATAMI":"$CONFDIR"/_common:"$CONFDIR"/_common/hadoop-xml:"$CONFDIR"/../_common:"$CONFDIR"/../_common/hadoop-xml:"$WHEREAMI/../lib/*"

if [ "$#" -eq 3 ] || [ "$#" -eq 4 ]
then
# args: <service> <conf_path> <jvm_args> or <service> <conf_path> <jvm_args> <mm_task_count mm_task_java_props>
JVMARGS=`cat "$CONFDIR/_common/common.jvm.config" | xargs`
JVMARGS+=' '
JVMARGS+=$3

if [ "$#" -eq 3 ]
then
# args: <service> <conf_path> <jvm_args>
exec "$WHEREAMI"/run-java -Ddruid.node.type=$1 "-Ddruid.log.path=$LOG_DIR" $JVMARGS \
-cp $CLASS_PATH $MAIN_CLASS
else
# args: <service> <conf_path> <jvm_args> <mm_task_count mm_task_java_props>
exec "$WHEREAMI"/run-java -Ddruid.node.type=$1 $4 "-Ddruid.log.path=$LOG_DIR" $JVMARGS \
-cp $CLASS_PATH $MAIN_CLASS
fi
else
# args: <service> <conf_path>
exec "$WHEREAMI"/run-java -Ddruid.node.type=$1 "-Ddruid.log.path=$LOG_DIR" \
`cat "$CONFDIR"/"$WHATAMI"/jvm.config | xargs` \
-cp $CLASS_PATH $MAIN_CLASS
fi
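
To summarize the branching above, here is a small standalone sketch of the two decisions `run-druid` makes: which server name goes into the default main class, and where the JVM arguments come from for a given argument count. The function names `server_name_for` and `jvm_args_source` are hypothetical; they only mirror the logic shown in the diff and are not part of the script.

```shell
# Hypothetical helpers mirroring the run-druid logic in the diff above.

# coordinator-overlord starts the coordinator server; other services map to themselves.
server_name_for() {
  if [ "$1" = 'coordinator-overlord' ]; then
    echo coordinator
  else
    echo "$1"
  fi
}

# Describe where JVM arguments come from, based on the number of arguments.
jvm_args_source() {
  case "$#" in
    1|2) echo "<conf-dir>/<service>/jvm.config" ;;
    3)   echo "_common/common.jvm.config plus <jvm_args>" ;;
    4)   echo "_common/common.jvm.config plus <jvm_args>, with task properties passed through" ;;
    *)   echo "usage error" >&2; return 1 ;;
  esac
}

server_name_for coordinator-overlord
jvm_args_source broker conf '-Xms1g -Xmx1g'
```

Note that the usage message printed by the script still lists only `<service> [conf-dir]`, although the argument check now accepts up to four arguments.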
35 changes: 35 additions & 0 deletions examples/bin/start-druid
Original file line number Diff line number Diff line change
@@ -0,0 +1,35 @@
#!/bin/bash -eu

# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

PWD="$(pwd)"
WHEREAMI="$(dirname "$0")"
WHEREAMI="$(cd "$WHEREAMI" && pwd)"

if [ -x "$(command -v python3)" ]
then
exec python3 "$WHEREAMI/start-druid-main.py" "$@"
elif [ -x "$(command -v python2)" ]
then
exec python2 "$WHEREAMI/start-druid-main.py" "$@"
elif [ -x "$(command -v python)" ]
then
exec python "$WHEREAMI/start-druid-main.py" "$@"
else
echo "python interpreter not found"
fi
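
The interpreter fallback above can also be written as a loop. This is a sketch of an alternative, not the script's actual code: the function name `find_interpreter` is hypothetical. Unlike the `else` branch above, which only prints a message (so the script ends with the exit status of `echo`), this version returns a non-zero status when nothing is found.

```shell
# Hypothetical helper: print the first candidate command found on PATH.
find_interpreter() {
  for candidate in "$@"; do
    if command -v "$candidate" >/dev/null 2>&1; then
      echo "$candidate"
      return 0
    fi
  done
  echo "python interpreter not found" >&2
  return 1
}

# Same preference order as start-druid: python3, then python2, then python.
PYTHON="$(find_interpreter python3 python2 python)" || PYTHON=""
echo "selected interpreter: ${PYTHON:-none}"
```

A caller could then `exec "$PYTHON" "$WHEREAMI/start-druid-main.py" "$@"` when `PYTHON` is non-empty, and exit with an error otherwise.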