@@ -61,7 +61,7 @@ public class LifecycleModule implements Module
    * is materialized and injected, meaning that objects are not actually instantiated in dependency order.
    * Registering with the LifecyceModule, on the other hand, will instantiate the objects after the normal object
    * graph has already been instantiated, meaning that objects will be created in dependency order and this will
-   * only actually instantiate something that wasn't actually dependend upon.
+   * only actually instantiate something that wasn't actually depended upon.
    *
    * @param clazz the class to instantiate
    * @return this, for chaining.
@@ -85,7 +85,7 @@ public static void register(Binder binder, Class<?> clazz)
    * is materialized and injected, meaning that objects are not actually instantiated in dependency order.
    * Registering with the LifecyceModule, on the other hand, will instantiate the objects after the normal object
    * graph has already been instantiated, meaning that objects will be created in dependency order and this will
-   * only actually instantiate something that wasn't actually dependend upon.
+   * only actually instantiate something that wasn't actually depended upon.
    *
    * @param clazz the class to instantiate
    * @param annotation The annotation class to register with Guice
@@ -110,7 +110,7 @@ public static void register(Binder binder, Class<?> clazz, Class<? extends Annot
    * is materialized and injected, meaning that objects are not actually instantiated in dependency order.
    * Registering with the LifecyceModule, on the other hand, will instantiate the objects after the normal object
    * graph has already been instantiated, meaning that objects will be created in dependency order and this will
-   * only actually instantiate something that wasn't actually dependend upon.
+   * only actually instantiate something that wasn't actually depended upon.
    *
    * @param key The key to use in finding the DruidNode instance
    */
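The javadoc's distinction is that eager singletons are created while the object graph is still being materialized, whereas lifecycle-registered objects are created afterward, on demand, and therefore in dependency order. A toy resolver can illustrate this ordering (a language-agnostic sketch in Python, not Guice; all names here are hypothetical):

```python
class Registry:
    """Toy DI container: deferred resolution yields dependency order."""

    def __init__(self):
        self._factories = {}   # name -> (dependency names, constructor)
        self._instances = {}
        self.creation_order = []

    def register(self, name, deps, ctor):
        # Registration records a recipe; nothing is instantiated yet.
        self._factories[name] = (deps, ctor)

    def get(self, name):
        # Resolving after the whole graph is registered instantiates
        # dependencies first, so constructors run in dependency order.
        if name not in self._instances:
            deps, ctor = self._factories[name]
            args = [self.get(d) for d in deps]
            self._instances[name] = ctor(*args)
            self.creation_order.append(name)
        return self._instances[name]


registry = Registry()
# "cache" depends on "db"; register in the "wrong" order on purpose.
registry.register("cache", ["db"], lambda db: {"backed_by": db})
registry.register("db", [], lambda: "db-connection")
registry.get("cache")
print(registry.creation_order)  # ['db', 'cache'] -- dependency first
```

The point mirrors the javadoc: an eager binding would run constructors during graph construction in whatever order bindings are visited, while deferred resolution guarantees dependency order.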
@@ -26,7 +26,6 @@

 /**
  */
-@SuppressWarnings("serial")
 public class ISE extends IllegalStateException implements SanitizableException
 {
   public ISE(String formatText, Object... arguments)
@@ -265,7 +265,6 @@ public void cleanup(Iterator<T> iterFromMake)
    * {@link MergeCombineAction} do a final merge combine of all the parallel computed results, again pushing
    * {@link ResultBatch} into a {@link BlockingQueue} with a {@link QueuePusher}.
    */
-  @SuppressWarnings("serial")
   private static class MergeCombinePartitioningAction<T> extends RecursiveAction
   {
     private final List<Sequence<T>> sequences;
@@ -502,7 +501,6 @@ private int computeNumTasks()
    * how many times a task has continued executing, and utilized to compute a cumulative moving average of task run time
    * per amount yielded in order to 'smooth' out the continual adjustment.
    */
-  @SuppressWarnings("serial")
   private static class MergeCombineAction<T> extends RecursiveAction
   {
     private final PriorityQueue<BatchedResultsCursor<T>> pQueue;
@@ -685,7 +683,6 @@ protected void compute()
    * majority of its time will be spent managed blocking until results are ready for each cursor, or will be incredibly
    * short lived if all inputs are already available.
    */
-  @SuppressWarnings("serial")
   private static class PrepareMergeCombineInputsAction<T> extends RecursiveAction
   {
     private final List<BatchedResultsCursor<T>> partition;
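The work these actions coordinate, a k-way merge of ordered result batches in which adjacent rows with equal keys are combined, can be sketched outside the fork-join machinery. This is a simplified single-threaded Python sketch, not the Druid implementation; the row shape and names are illustrative:

```python
import heapq


def merge_combine(sequences, key, combine):
    """K-way merge of already-ordered sequences, combining rows with equal keys.

    The real code partitions this work across a ForkJoinPool and streams
    batches through blocking queues; here the whole merge runs inline.
    """
    out = []
    for row in heapq.merge(*sequences, key=key):
        if out and key(out[-1]) == key(row):
            out[-1] = combine(out[-1], row)  # combine adjacent equal-key rows
        else:
            out.append(row)
    return out


# Two ordered (timestamp, count) sequences, as parallel partial results might be:
a = [(1, 10), (2, 5)]
b = [(1, 3), (3, 7)]
merged = merge_combine([a, b],
                       key=lambda r: r[0],
                       combine=lambda x, y: (x[0], x[1] + y[1]))
print(merged)  # [(1, 13), (2, 5), (3, 7)]
```

The priority-queue behavior described in the javadoc corresponds to `heapq.merge`, which repeatedly pops the smallest head element across all input cursors.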
48 changes: 29 additions & 19 deletions docs/operations/api-reference.md
@@ -128,39 +128,39 @@ Returns the serialized JSON of segments to load and drop for each Historical pro
#### Segment Loading by Datasource

Note that all _interval_ query parameters are ISO 8601 strings (e.g., 2016-06-27/2016-06-28).
Also note that these APIs only guarantee that the segments are available at the time of the call.
Segments can still become missing because of historical process failures or any other reasons afterward.

##### GET

* `/druid/coordinator/v1/datasources/{dataSourceName}/loadstatus?forceMetadataRefresh={boolean}&interval={myInterval}`

Returns the percentage of segments actually loaded in the cluster versus segments that should be loaded in the cluster for the given
datasource over the given interval (or last 2 weeks if interval is not given). `forceMetadataRefresh` is required to be set.
Setting `forceMetadataRefresh` to true will force the coordinator to poll latest segment metadata from the metadata store
(Note: `forceMetadataRefresh=true` refreshes Coordinator's metadata cache of all datasources. This can be a heavy operation in terms
of the load on the metadata store but can be necessary to make sure that we verify all the latest segments' load status)
Setting `forceMetadataRefresh` to false will use the metadata cached on the coordinator from the last force/periodic refresh.
If no used segments are found for the given inputs, this API returns `204 No Content`

* `/druid/coordinator/v1/datasources/{dataSourceName}/loadstatus?simple&forceMetadataRefresh={boolean}&interval={myInterval}`

Returns the number of segments left to load until segments that should be loaded in the cluster are available for the given datasource
over the given interval (or last 2 weeks if interval is not given). This does not include segment replication counts. `forceMetadataRefresh` is required to be set.
Setting `forceMetadataRefresh` to true will force the coordinator to poll latest segment metadata from the metadata store
(Note: `forceMetadataRefresh=true` refreshes Coordinator's metadata cache of all datasources. This can be a heavy operation in terms
of the load on the metadata store but can be necessary to make sure that we verify all the latest segments' load status)
Setting `forceMetadataRefresh` to false will use the metadata cached on the coordinator from the last force/periodic refresh.
If no used segments are found for the given inputs, this API returns `204 No Content`

* `/druid/coordinator/v1/datasources/{dataSourceName}/loadstatus?full&forceMetadataRefresh={boolean}&interval={myInterval}`

Returns the number of segments left to load in each tier until segments that should be loaded in the cluster are all available for the given datasource
over the given interval (or last 2 weeks if interval is not given). This includes segment replication counts. `forceMetadataRefresh` is required to be set.
Setting `forceMetadataRefresh` to true will force the coordinator to poll latest segment metadata from the metadata store
(Note: `forceMetadataRefresh=true` refreshes Coordinator's metadata cache of all datasources. This can be a heavy operation in terms
of the load on the metadata store but can be necessary to make sure that we verify all the latest segments' load status)
Setting `forceMetadataRefresh` to false will use the metadata cached on the coordinator from the last force/periodic refresh.
You can pass the optional query parameter `computeUsingClusterView` to factor in the available cluster services when calculating
the segments left to load. See [Coordinator Segment Loading](#coordinator-segment-loading) for details.
If no used segments are found for the given inputs, this API returns `204 No Content`
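A client of the loadstatus endpoints mostly needs to get the query string right: `simple`/`full` are bare flags, `forceMetadataRefresh` is mandatory, and the interval is an ISO 8601 string that must be URL-encoded. A minimal sketch of a URL builder (`loadstatus_url` is a hypothetical helper, not part of any Druid client library):

```python
from urllib.parse import quote, urlencode


def loadstatus_url(coordinator, datasource, force_metadata_refresh,
                   interval=None, mode=None):
    """Build a loadstatus URL for the endpoints documented above.

    mode: None for the percentage form, "simple" for segment counts,
    or "full" for per-tier counts.
    """
    params = {"forceMetadataRefresh": "true" if force_metadata_refresh else "false"}
    if interval is not None:
        params["interval"] = interval  # ISO 8601, e.g. 2016-06-27/2016-06-28
    query = urlencode(params)          # percent-encodes the '/' in the interval
    if mode is not None:
        query = f"{mode}&{query}"      # `simple`/`full` are bare flags
    return (f"{coordinator}/druid/coordinator/v1/datasources/"
            f"{quote(datasource)}/loadstatus?{query}")


print(loadstatus_url("http://localhost:8081", "wikipedia", False,
                     interval="2016-06-27/2016-06-28", mode="full"))
```

Remember that a `204 No Content` response means no used segments matched the inputs, so a client should not treat an empty body as an error.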
@@ -464,7 +464,7 @@ Update overlord dynamic worker configuration.

* `/druid/coordinator/v1/compaction/progress?dataSource={dataSource}`

Returns the total size of segments awaiting compaction for the given dataSource.
The specified dataSource must have [automatic compaction](../ingestion/automatic-compaction.md) enabled.

##### GET
@@ -490,7 +490,7 @@ The `latestStatus` object has the following keys:

* `/druid/coordinator/v1/compaction/status?dataSource={dataSource}`

Similar to the API `/druid/coordinator/v1/compaction/status` above, but filters the response to only return information for the given {dataSource}.
Note that the given {dataSource} must have auto-compaction enabled, either currently or previously.
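A client waiting for automatic compaction to catch up might poll the progress endpoint until the reported backlog reaches zero. A minimal sketch with the HTTP call injected as a callable, so the loop stands alone; the fetcher and its numeric return value are assumptions, not the documented response schema:

```python
def await_compaction(fetch_remaining_bytes, max_polls=10, on_wait=lambda: None):
    """Poll until the size of segments awaiting compaction reaches zero.

    fetch_remaining_bytes: a callable that hits
    /druid/coordinator/v1/compaction/progress?dataSource={dataSource}
    and returns the remaining bytes (assumed shape, for illustration).
    """
    for _ in range(max_polls):
        if fetch_remaining_bytes() == 0:
            return True
        on_wait()  # e.g. time.sleep(30) between polls
    return False


# Stubbed example: the backlog shrinks across three polls.
backlog = iter([500_000, 120_000, 0])
print(await_compaction(lambda: next(backlog)))  # True
```

Injecting the fetcher keeps the polling logic testable without a live Coordinator; a real client would wrap an HTTP GET plus JSON parsing in that callable.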

#### Automatic compaction configuration
@@ -935,11 +935,21 @@ Returns segment information lists including server locations for the given query

#### GET

* `/druid/v1/router/cluster`

Returns a list of the servers registered within the cluster. Similar to
`/druid/coordinator/v1/cluster`, but visible on the Router to allow discovery of Druid
servers if all you have is the Router endpoint.

> Note: Much of this information is available in a simpler, easier-to-use form through the Druid SQL
> [`INFORMATION_SCHEMA.TABLES`](../querying/sql-metadata-tables.md#tables-table),
> [`INFORMATION_SCHEMA.COLUMNS`](../querying/sql-metadata-tables.md#columns-table), and
> [`sys.segments`](../querying/sql-metadata-tables.md#segments-table) tables.

This API is primarily for debugging when setting up a cluster and things are broken
enough that SQL doesn't work: this API gives a direct view of the nodes registered
in ZooKeeper.

**Contributor:** Would the response format of this API be similar to, or exactly the same as, `/druid/coordinator/v1/cluster`? Ideally, if it's exactly the same, we should say that; if it's merely similar, we should outline the differences.

**Contributor (Author):** As the docs state, the preferred solution is to query the system tables. Yet, as I tinker with clusters, I find it very easy to screw things up so that the cluster is too broken for SQL. This API is meant to be a light layer on top of ZK to diagnose such issues without having to fire up the ZK client and come up with a way to decode node payloads. Reworded the docs to highlight this idea.

There are slight differences between the formats of the two endpoints. The Coordinator one appears to be tailored to the needs of the Druid Console (maybe?).

The Coordinator one is more heavily formatted to put services in some preferred order:

    {'coordinator': [{'service': 'druid/coordinator',
       'plaintextPort': 8081,
       'host': 'coordinator-one'},
      {'service': 'druid/coordinator',
       'plaintextPort': 8081,
       'host': 'coordinator-two'}],
     ...
     'broker': [{'service': 'druid/broker',
       'plaintextPort': 8082,
       'host': 'broker'}],
     'historical': [],

This one lists services alphabetically, including only those services actually running:

    {'broker': [{'service': 'druid/broker',
       'host': 'broker',
       'plaintextPort': 8082}],
     'coordinator': [{'service': 'druid/coordinator',
       'host': 'coordinator-one',
       'plaintextPort': 8081},
      {'service': 'druid/coordinator',
       'host': 'coordinator-two',
       'plaintextPort': 8081}],
    ...
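The two sample payloads in the discussion above differ only in service ordering, per-entry key ordering, and whether empty service lists appear. A small normalization function makes the "similar, not identical" point concrete; `normalize` is a hypothetical helper and the data is taken from the comment above:

```python
def normalize(cluster):
    """Canonicalize a cluster payload: drop empty service lists, sort entries."""
    return {service: sorted(tuple(sorted(entry.items())) for entry in entries)
            for service, entries in cluster.items() if entries}


# Coordinator view: preferred service order, includes empty tiers.
coordinator_view = {
    "coordinator": [
        {"service": "druid/coordinator", "plaintextPort": 8081, "host": "coordinator-one"},
        {"service": "druid/coordinator", "plaintextPort": 8081, "host": "coordinator-two"},
    ],
    "broker": [{"service": "druid/broker", "plaintextPort": 8082, "host": "broker"}],
    "historical": [],  # Coordinator reports empty tiers; the Router omits them
}
# Router view: alphabetical, only services actually running.
router_view = {
    "broker": [{"service": "druid/broker", "host": "broker", "plaintextPort": 8082}],
    "coordinator": [
        {"service": "druid/coordinator", "host": "coordinator-one", "plaintextPort": 8081},
        {"service": "druid/coordinator", "host": "coordinator-two", "plaintextPort": 8081},
    ],
}

print(normalize(coordinator_view) == normalize(router_view))  # True
```

After normalization the two endpoints describe the same cluster, which is exactly the "similar but not byte-identical" relationship the review thread settles on.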

* `/druid/v2/datasources`

Returns a list of queryable datasources.
@@ -1202,12 +1202,12 @@ public CompactionTask build()
   }

   /**
-   * Compcation Task Tuning Config.
+   * Compaction Task Tuning Config.
    *
    * An extension of ParallelIndexTuningConfig. As of now, all this TuningConfig
    * does is fail if the TuningConfig contains
    * `awaitSegmentAvailabilityTimeoutMillis` that is != 0 since it is not
-   * supported for Compcation Tasks.
+   * supported for Compaction Tasks.
    */
   public static class CompactionTuningConfig extends ParallelIndexTuningConfig
   {