Add `sys.supervisors` table to system tables by surekhasaharan · Pull Request #8547 · apache/druid

surekhasaharan · 2019-09-16T23:57:11Z

Fixes #7007
Proposal issue #8463

Description

This PR adds a `sys.supervisors` table to the set of system tables, so that supervisors can be queried via Druid SQL.
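For illustration only, here is a minimal sketch of how a client might wrap a query against the new table for Druid's SQL HTTP endpoint (`/druid/v2/sql/`, which takes a JSON body with a `query` field). The class name `SupervisorsQuerySketch` and its helper are hypothetical and not part of this PR; the column names `supervisor_id` and `detailed_state` are the ones mentioned in the review below, and the rest of the column set is not assumed here.

```java
final class SupervisorsQuerySketch
{
  // Column names come from the review discussion; other columns are deliberately not assumed.
  static final String SQL = "SELECT supervisor_id, detailed_state FROM sys.supervisors";

  // Wraps a SQL string in a {"query": "..."} JSON body.
  // Only double quotes and backslashes are escaped; control characters are not handled.
  static String toRequestBody(String sql)
  {
    StringBuilder escaped = new StringBuilder();
    for (char c : sql.toCharArray()) {
      if (c == '"' || c == '\\') {
        escaped.append('\\');
      }
      escaped.append(c);
    }
    return "{\"query\":\"" + escaped + "\"}";
  }

  public static void main(String[] args)
  {
    // The resulting body would be POSTed to /druid/v2/sql/ with Content-Type application/json.
    System.out.println(toRequestBody(SQL));
  }
}
```

In practice a JSON library would normally build the body; the manual escaping is only there to keep the sketch dependency-free.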

SupervisorResource changes

- Added new queryParam fullStatus to GET /druid/indexer/v1/supervisor
- Return a List<SupervisorStatus> instead of Map<String,Object> from specGetAll
- Added class SupervisorStatus

`SystemSchema` changes

Added SupervisorTable class to SystemSchema to contain supervisor specific code

This PR has:

- been self-reviewed.
- added documentation for new or modified features or behaviors.
- added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
- added comments explaining the "why" and the intent of the code wherever it would not be obvious to an unfamiliar reader.
- added unit tests or modified existing tests to cover new code paths.
- been tested in a test Druid cluster.

ccaominh · 2019-09-17T00:51:54Z

+|`state`|String|basic state of the supervisor. Available states:`UNHEALTHY_SUPERVISOR`, `UNHEALTHY_TASKS`, `PENDING`, `RUNNING`, `SUSPENDED`, `STOPPING`|
+|`detailedState`|String|supervisor specific state. (See documentation of specific supervisor for details)|
+|`healthy`|Boolean|true or false indicator of overall supervisor health|
+|`specString`|String|a json string of supervisor spec|


FYI, some of these docs changes will conflict with #8548 (e.g., "json" should be "JSON").

ok, thanks for heads-up, changed to uppercase here

Travis shows this spell check report:

To fix:

eg. -> e.g.,

kafka -> Kafka

kinesis -> Kinesis

supervisor_id and detailed_state -> add to file suppressions by adding entries after: https://github.com/apache/incubator-druid/blob/master/website/.spelling#L1307

ccaominh · 2019-09-17T00:58:55Z

+    return suspended;
+  }
+
+  @JsonPOJOBuilder


Cool! Didn't know you could have Jackson use a builder.

ccaominh · 2019-09-17T01:00:26Z

+  private final boolean healthy;
+  private final SupervisorSpec spec;
+  /**
+   * This is a stringified version of spec object


Maybe mention the format? Is it JSON?

ccaominh · 2019-09-17T01:05:17Z

+    try {
+      request = indexingServiceClient.makeRequest(
+          HttpMethod.GET,
+          StringUtils.format("/druid/indexer/v1/supervisor?fullStatus"),


Is there a benefit of using StringUtils.format() if there are no args to be formatted?

Not sure, maybe not, but I saw some places using this, and it's a habit by now.

I'd remove the StringUtils.format if there aren't any format args.

ok will remove from all the places in this class
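To make the point of this thread concrete, here is a small sketch using plain `String.format`. The local `format` helper is a stand-in and is assumed to behave like a locale-fixed `String.format`; it is not Druid's `StringUtils` itself. With no format args the call returns the literal unchanged, and it can fail outright if the literal ever contains a `%` (for example a URL-encoded character).

```java
import java.util.IllegalFormatException;
import java.util.Locale;

final class FormatWithoutArgs
{
  // Path taken from the diff hunk above.
  static final String PATH = "/druid/indexer/v1/supervisor?fullStatus";

  // Stand-in for a format helper; assumed to delegate to String.format with a fixed locale.
  static String format(String pattern, Object... args)
  {
    return String.format(Locale.ENGLISH, pattern, args);
  }

  // Returns true when formatting the pattern with no args throws a format exception.
  static boolean failsWithoutArgs(String pattern)
  {
    try {
      format(pattern);
      return false;
    }
    catch (IllegalFormatException e) {
      return true;
    }
  }

  public static void main(String[] args)
  {
    // No '%' in the literal: formatting is a no-op.
    System.out.println(format(PATH).equals(PATH));
    // A '%' in the literal is parsed as a format specifier and fails.
    System.out.println(failsWithoutArgs("/druid/indexer/v1/supervisor/my%20id"));
  }
}
```

So passing the literal directly is both simpler and slightly safer.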

ccaominh · 2019-09-17T01:08:08Z

                      manager.getSupervisorState(x);
-                  ImmutableMap.Builder<String, Object> theBuilder = ImmutableMap.builder();
-                  theBuilder.put("id", x);
+                  SupervisorStatus.Builder theBuilder = new SupervisorStatus.Builder();


Nice change to have it use a builder!

ccaominh · 2019-09-17T01:10:32Z

+                    Optional<SupervisorSpec> theSpec = manager.getSupervisorSpec(x);
+                    if (theSpec.isPresent()) {
+                      try {
+                        theBuilder.withSpecString(objectMapper.writeValueAsString(manager.getSupervisorSpec(x).get()));


It may be nice to have the builder accept SupervisorSpec and serialize/deserialize it to a string internally so that the formatting logic is all in one place.

i tried that, but the objectMapper is injected here, which is used to serialize the SupervisorSpec

Does it work if you add another constructor for SupervisorStatus.Builder that injects the ObjectMapper?

hmm, it seems odd to add ObjectMapper to Builder, and the builder already accepts a spec, in addition to specString. I could generate specString from spec in SupervisorStatus#getSpecString if I have the right ObjectMapper, but then I think the specString would appear in response to druid/indexer/v1/supervisor?full as well as druid/indexer/v1/supervisor?fullStatus, since it's not null anymore for the former call and it's not required/desired in former call.

Oh, since the SupervisorStatus DTO already has a SupervisorSpec field that gets serialized when the response is serialized, why do you need to have another field that's an explicitly serialized version (which is populated before the response is serialized)?

Yeah, I added the already-serialized specString to get the JSON payload in SystemSchema#SupervisorsTable via the JsonParserIterator. Otherwise deserialization errors out, since KafkaSupervisorSpec's attributes are not present in the broker; they are only available in the overlord.

@ccaominh added comment and javadoc about the need for explicitly serialized spec, let me know if it helps clarify things


ccaominh

LGTM 👍


gianm · 2019-09-30T21:28:17Z

+|Field|Type|Description|
+|---|---|---|
+|`id`|String|supervisor unique identifier|
+|`state`|String|basic state of the supervisor. Available states:`UNHEALTHY_SUPERVISOR`, `UNHEALTHY_TASKS`, `PENDING`, `RUNNING`, `SUSPENDED`, `STOPPING`|


Missing a space between "Available states:" and "`UNHEALTHY"

This should link to a page that defines what these states mean. I don't think one exists (I checked briefly) so for now I'd say link to the Kafka docs and say that users can look there for details.

gianm · 2019-09-30T22:20:08Z

+|---|---|---|
+|`id`|String|supervisor unique identifier|
+|`state`|String|basic state of the supervisor. Available states:`UNHEALTHY_SUPERVISOR`, `UNHEALTHY_TASKS`, `PENDING`, `RUNNING`, `SUSPENDED`, `STOPPING`|
+|`detailedState`|String|supervisor specific state. (See documentation of specific supervisor for details)|


There's only two right now, let's help the users out by linking to them directly. i.e. "See documentation of the specific supervisor for details, e.g. [Kafka] or [Kinesis]."

gianm · 2019-09-30T22:22:26Z

+
+|Column|Type|Notes|
+|------|-----|-----|
+|supervisor_id|STRING|supervisor task identifier|


The notes should be in sentence case: first letter capitalized. (Like the other tables on this page.)

gianm · 2019-09-30T22:22:49Z

 SELECT * FROM sys.tasks WHERE status='FAILED';
 ```

+#### SUPERVISORS table


All the comments on the api-reference page apply here as well.

ok, fixed here as well

gianm · 2019-09-30T22:23:09Z

+|`detailedState`|String|supervisor specific state. (See documentation of specific supervisor for details)|
+|`healthy`|Boolean|true or false indicator of overall supervisor health|
+|`specString`|String|a JSON string of supervisor spec|
+|`type`|String|type of supervisor task, e.g., `kafka` or `kinesis`|


This is just "type of supervisor" (supervisors aren't tasks)

removed task

gianm · 2019-09-30T22:53:07Z


 public class KinesisSupervisorSpec extends SeekableStreamSupervisorSpec
 {
+  private static final String TASK_TYPE = "kinesis";


SUPERVISOR_TYPE

gianm · 2019-09-30T22:54:17Z

+   * This API is only used for informational purposes in
+   * org.apache.druid.sql.calcite.schema.SystemSchema.SupervisorsTable
+   *
+   * @return supervisor task type


(Not a task.)

gianm · 2019-09-30T22:54:29Z

+import java.util.Objects;
+
+/**
+ * This class contains the attributes of a supervisor task which are returned by the API's in


(Not a task)

gianm · 2019-09-30T22:56:59Z

+    try {
+      request = indexingServiceClient.makeRequest(
+          HttpMethod.GET,
+          StringUtils.format("/druid/indexer/v1/supervisor?fullStatus"),


I'd remove the StringUtils.format if there aren't any format args.

gianm · 2019-09-30T23:01:46Z

+  private final SupervisorSpec spec;
+  /**
+   * This is a JSON representation of {@code spec}
+   * The explicit serialization is present here so that users of  {@code SupervisorStatus} which cannot


Does this work?

Won't the spec still be serialized into spec, and won't callers still try to deserialize the spec and get some kind of error?

(If it does work -- how does it work? I am perplexed.)

yeah it seems to work in my local cluster, the exception with using SupervisorSpec instead of string was

java.lang.RuntimeException: com.fasterxml.jackson.databind.JsonMappingException: Can not construct instance of org.apache.druid.indexing.kafka.supervisor.KafkaSupervisorSpec, problem: Guice configuration errors: 1) No implementation for org.apache.druid.indexing.overlord.TaskStorage was bound. while locating org.apache.druid.indexing.overlord.TaskStorage

the way it works is, i create SupervisorStatus with json string here, so SupervisorStatus#getSpecString returns the json payload, and SupervisorStatus#getSpec would return null in this case.
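A stripped-down sketch of the idea described in this thread, assuming nothing beyond what is said above: the status object can carry either a typed spec or a pre-serialized string, and for the system-table path only the string is set. `SupervisorStatusSketch`, `forSystemTable`, and the `toJson` function are hypothetical stand-ins (the real code uses the injected `ObjectMapper`), not the classes in this PR.

```java
import java.util.function.Function;

final class SupervisorStatusSketch
{
  private final String id;
  // Typed spec; stays null when the status is built for the system table.
  private final Object spec;
  // Spec already rendered as JSON text, so readers never need the spec's classes.
  private final String specString;

  private SupervisorStatusSketch(Builder builder)
  {
    this.id = builder.id;
    this.spec = builder.spec;
    this.specString = builder.specString;
  }

  String getId()
  {
    return id;
  }

  Object getSpec()
  {
    return spec;
  }

  String getSpecString()
  {
    return specString;
  }

  static final class Builder
  {
    private String id;
    private Object spec;
    private String specString;

    Builder withId(String id)
    {
      this.id = id;
      return this;
    }

    Builder withSpec(Object spec)
    {
      this.spec = spec;
      return this;
    }

    Builder withSpecString(String specString)
    {
      this.specString = specString;
      return this;
    }

    SupervisorStatusSketch build()
    {
      return new SupervisorStatusSketch(this);
    }
  }

  // Hypothetical helper: serialize eagerly where the spec classes are available,
  // and leave the typed spec unset so consumers only ever see a string.
  static SupervisorStatusSketch forSystemTable(String id, Object spec, Function<Object, String> toJson)
  {
    return new Builder().withId(id).withSpecString(toJson.apply(spec)).build();
  }

  public static void main(String[] args)
  {
    SupervisorStatusSketch status = forSystemTable("wiki", new Object(), s -> "{\"type\":\"kafka\"}");
    System.out.println(status.getSpec());
    System.out.println(status.getSpecString());
  }
}
```

The consumer side then treats `specString` as opaque text, which is why it does not need the Kafka or Kinesis spec classes or their Guice bindings.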


jon-wei · 2019-10-17T23:10:53Z

+|`healthy`|Boolean|true or false indicator of overall supervisor health|
+|`suspended`|Boolean|true or false indicator of whether the supervisor is in suspended state|
+
+* `/druid/indexer/v1/supervisor?fullStatus`


Could the fullStatus API be unified with /druid/indexer/v1/supervisor?full?

It seems a bit confusing to have two "full" parameters, and the information returned by fullStatus appears to contain everything that full would return.

/druid/indexer/v1/supervisor?fullStatus returns everything that "full" returns, and additionally adds:

- type
- source
- specString

These are added to fill the columns in the sys.supervisors table without deserializing the spec object (which cannot be done in SystemSchema because of dependency issues), and also to avoid changing the behavior of the existing API.

Can you add an explanation to the docs describing why the fullStatus param exists and when you would use one vs. the other?

Since fullStatus is used internally for sys tables, I'm now wondering whether we should document it at all, as I think the full query param might suffice for users. I'm not sure what our general stance is on these kinds of internal options. I think the only reason one would use fullStatus is to get type or source without digging into the spec object. I can either remove fullStatus from the docs or add that explanation. Thoughts?

IMO, it's ok to skip documentation for options meant for internal use. Maybe call it ?system rather than ?full to make it clearer in the code what the purpose is.

?system sounds good to me and remove the option from docs.

I think it's also better to skip the documentation. If it's intended for internal use and folks start using it because it's documented, it may be harder to change the behavior/API later if needed.

Got it, if it's only meant for internal use, then renaming fullStatus to system and leaving out the docs for that sounds good to me.

jon-wei

Had a minor comment and comment about docs, LGTM otherwise

jon-wei · 2019-10-18T01:52:34Z

+      Function<SupervisorStatus, Iterable<ResourceAction>> raGenerator = supervisor -> Collections.singletonList(
+          AuthorizationUtils.DATASOURCE_READ_RA_GENERATOR.apply(supervisor.getSource()));
+
+      final Iterable<SupervisorStatus> authorizedTasks = AuthorizationUtils.filterAuthorizedResources(


authorizedTasks -> authorizedSupervisors
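For readers unfamiliar with the pattern in this hunk, here is a simplified sketch of per-item authorization filtering: each item is mapped to the resource actions it requires, and it is kept only if every action is allowed. `AuthorizedFilterSketch`, `filterAuthorized`, and the string-based actions are hypothetical stand-ins; the real `AuthorizationUtils.filterAuthorizedResources` takes Druid-specific authentication and authorizer types that are not reproduced here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;

final class AuthorizedFilterSketch
{
  // Keeps an item only if every action generated for it passes the check.
  static <T, A> List<T> filterAuthorized(
      Iterable<T> items,
      Function<T, Iterable<A>> actionGenerator,
      Predicate<A> isAllowed
  )
  {
    List<T> authorized = new ArrayList<>();
    for (T item : items) {
      boolean allowed = true;
      for (A action : actionGenerator.apply(item)) {
        if (!isAllowed.test(action)) {
          allowed = false;
          break;
        }
      }
      if (allowed) {
        authorized.add(item);
      }
    }
    return authorized;
  }

  public static void main(String[] args)
  {
    // Each string stands for a supervisor's source datasource (illustrative values).
    List<String> sources = Arrays.asList("wikipedia", "secret");
    List<String> authorizedSupervisors = AuthorizedFilterSketch.<String, String>filterAuthorized(
        sources,
        source -> Collections.singletonList("READ:" + source),
        action -> action.equals("READ:wikipedia")
    );
    System.out.println(authorizedSupervisors);
  }
}
```

Naming the result `authorizedSupervisors`, as suggested above, keeps the variable consistent with what is actually being filtered.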


jon-wei

LGTM after CI

Surekha Saharan added 3 commits September 16, 2019 14:06

Add supervisors table to SystemSchema

21c1bd6

Add docs

51a2c77

fix checkstyle

a64fbd3

surekhasaharan added Area - SQL Area - Metadata labels Sep 16, 2019

ccaominh reviewed Sep 17, 2019


Surekha Saharan added 4 commits September 17, 2019 17:09

fix test

7e3531d

Merge branch 'master' of github.com:druid-io/druid into sys-superviso…

4631624

…r-table

fix CI

e0fe32f

Add comments

cb9be53

ccaominh approved these changes Sep 19, 2019


Surekha Saharan added 2 commits September 24, 2019 11:13

Fix javadoc teamcity error

34b4130

Merge branch 'master' of github.com:druid-io/druid into sys-superviso…

c5bf351

…r-table

gianm reviewed Sep 30, 2019


Surekha Saharan added 4 commits October 1, 2019 08:12

comments

9cea0c1

Merge branch 'master' of github.com:druid-io/druid into sys-superviso…

d5a9be9

…r-table

fix links in docs

e177120

fix links

aa1ac79

jon-wei reviewed Oct 17, 2019


jon-wei reviewed Oct 18, 2019


Surekha Saharan added 2 commits October 18, 2019 10:58

rename fullStatus query param to system and remove it from docs

3debca7

Merge branch 'master' of github.com:druid-io/druid into sys-superviso…

263dab0

…r-table

jon-wei approved these changes Oct 18, 2019


jon-wei merged commit 98f59dd into apache:master Oct 18, 2019

surekhasaharan deleted the sys-supervisor-table branch October 18, 2019 22:32

jon-wei added this to the 0.17.0 milestone Dec 17, 2019

jon-wei added the Release Notes label Dec 18, 2019

jon-wei mentioned this pull request Dec 28, 2019

0.17.0 release notes #9066

Closed
