Add a SQL sys.supervisor table #8463

@surekhasaharan

Motivation

It would be useful to add a sys.supervisors table to the system tables in Druid SQL. See #7007.

Proposed changes

Add a supervisors table with the following columns:

| col | type | desc |
|---|---|---|
| id | string | supervisor id |
| state | string | supervisor task status |
| detailed_state | string | detailed state |
| healthy | boolean | health status |
| spec | string | json payload |
| type | string | kafka/kinesis |
| source | string | topic or stream name |
| suspended | boolean | suspended or not |

The Overlord REST API that currently returns this information is GET /druid/indexer/v1/supervisor?full. The plan is to add another query parameter, fullStatus, so the API would be GET /druid/indexer/v1/supervisor?fullStatus.
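
As a rough illustration of the two request forms, the sketch below builds the request URI for either variant. The helper name SupervisorApiPaths and the base URL passed to it are assumptions made up for this example; only the path and the two query parameters come from the proposal.

```java
import java.net.URI;

// Hypothetical helper, not part of Druid: builds the supervisor listing URI.
final class SupervisorApiPaths {
    private static final String SUPERVISOR_PATH = "/druid/indexer/v1/supervisor";

    private SupervisorApiPaths() {}

    static URI supervisorList(String overlordBaseUrl, boolean fullStatus) {
        // "full" is the existing parameter; "fullStatus" is the proposed one.
        String query = fullStatus ? "fullStatus" : "full";
        return URI.create(overlordBaseUrl + SUPERVISOR_PATH + "?" + query);
    }
}
```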

Add a new class SupervisorStatus, similar to TaskStatusPlus, which would encapsulate all the attributes of the supervisors table (listed above). The GET API would return a List<SupervisorStatus>.
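
A minimal sketch of what such a class could look like, assuming one field per proposed column. The constructor shape and getter names are illustrative guesses; the real class would presumably also carry JSON serialization annotations, which are omitted here to keep the example self-contained.

```java
// Hypothetical sketch: one immutable field per proposed table column.
final class SupervisorStatus {
    private final String id;
    private final String state;
    private final String detailedState;
    private final boolean healthy;
    private final String spec;   // JSON payload, kept as a plain string here
    private final String type;   // e.g. kafka or kinesis
    private final String source; // topic or stream name
    private final boolean suspended;

    SupervisorStatus(String id, String state, String detailedState, boolean healthy,
                     String spec, String type, String source, boolean suspended) {
        this.id = id;
        this.state = state;
        this.detailedState = detailedState;
        this.healthy = healthy;
        this.spec = spec;
        this.type = type;
        this.source = source;
        this.suspended = suspended;
    }

    String getId() { return id; }
    String getState() { return state; }
    String getDetailedState() { return detailedState; }
    boolean isHealthy() { return healthy; }
    String getSpec() { return spec; }
    String getType() { return type; }
    String getSource() { return source; }
    boolean isSuspended() { return suspended; }
}
```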

The JSON response would contain 3 extra fields in addition to the current output, for example:

"type": "index_kafka",
"source": "wikipedia",
"suspended": false

The fields above are extracted from the SupervisorSpec so that they are easy to filter on without having to look at the spec JSON itself.

A new class SupervisorTable will be added in SystemSchema. It would call the Overlord API using DruidLeaderClient and read the results in a streaming fashion.

The SQL to query the supervisors table:
select * from sys.supervisors;

Rationale

Some other potential approaches considered:

The first approach considered was to use the existing API as is. That API currently returns a List<Map<String, ?>>, and the problem is that this format is too loose: without any object types, it is easy to make mistakes and introduce bugs. The new class SupervisorStatus would give the benefits of type safety and a nicer API. Another issue was getting attributes such as type and source out of the spec for the supervisors table.
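
To illustrate the kind of mistake the loosely typed format allows, here is a small sketch. The class name, the sample row, and the misspelled key are invented for this example; the point is only that a wrong key compiles and silently yields null, whereas a misspelled getter on a typed class would fail at compile time.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demonstration of the pitfall with Map<String, ?> rows.
final class LooseMapPitfall {
    private LooseMapPitfall() {}

    // Builds a row shaped like one entry of the untyped API response.
    static Map<String, Object> sampleRow() {
        Map<String, Object> row = new HashMap<>();
        row.put("id", "wikipedia");
        row.put("suspended", false);
        return row;
    }

    // Any string is accepted as a key, so typos are not caught by the compiler.
    static Object lookup(Map<String, ?> row, String key) {
        return row.get(key);
    }
}
```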

A second approach was to extract the existing ImmutableMap.Builder<String, Object> from SupervisorResource#specGetAll and give it proper data types. I attempted this with a SupervisorSpec type (so that it would be possible to extract the type, source, and suspended status later in SystemSchema), but it fails at the broker when a sys query is issued: KafkaSupervisorSpec needs objects such as TaskStorage, TaskMaster, and RowIngestionMetersFactory to be injected, and these are not bound by the broker, so this would not work.

Regarding the change in the API response, another option would be to add a new REST API and keep the existing one unchanged. However, since the change only adds a new query parameter, I am leaning towards using the existing API.

Operational impact

This change would be incompatible in the sense that it changes the JSON response format of the API.
However, this should not affect updates, because the change only adds the 3 new fields.

Test plan (optional)

Plan to add tests for the supervisors table and to test in a Druid cluster.

Future work (optional)

Add a task payload column, similar to spec, to the sys.tasks table, and possibly consolidate the TaskState / TaskStatus / TaskStatusPlus classes.
