[Proposal] HTTPAnnouncer + Remove Zookeeper Dependency #2312

@guobingkun


HTTP Announcer

Druid uses Announcer for internal announcements and CuratorServiceAnnouncer for external announcements. Both rely on Zookeeper. As a step toward removing the Zookeeper dependency, I am proposing an HTTPAnnouncer that announces information to the Coordinator instead of to Zookeeper. The Coordinator should still be able to function as it does today, using the information collected from HTTPAnnouncer.

An HTTPAnnouncer will be bound to a Druid node at a very early stage of its lifecycle. When the node's lifecycle starts, HTTPAnnouncer starts a background task that periodically sends a heartbeat message to the Coordinator. This heartbeat message serves as an indication that its associated Druid node exists in the cluster. If the Coordinator has not received a node's heartbeat messages for a long time, it treats the node as dead. The heartbeat message should contain only the lightweight metadata the Coordinator needs to identify the Druid node and, possibly, its type/capabilities.
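To make the liveness rule concrete, here is a minimal sketch of how the Coordinator side could track heartbeats. It is an illustration only, not existing Druid code: the names `Heartbeat`, `HeartbeatRegistry`, `timeout_seconds`, and the `node_id`/`node_type` fields are all hypothetical, and the HTTP transport is left out.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Heartbeat:
    """Lightweight heartbeat payload; field names are hypothetical."""
    node_id: str    # e.g. "host:port"
    node_type: str  # e.g. "historical", "realtime", "broker"


class HeartbeatRegistry:
    """Coordinator-side record of when each node was last heard from."""

    def __init__(self, timeout_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._timeout = timeout_seconds
        self._clock = clock  # injectable so the logic can be tested
        self._last_seen: Dict[str, float] = {}
        self._node_types: Dict[str, str] = {}

    def record(self, hb: Heartbeat) -> None:
        """Called whenever a heartbeat arrives."""
        self._last_seen[hb.node_id] = self._clock()
        self._node_types[hb.node_id] = hb.node_type

    def live_nodes(self) -> List[str]:
        """Nodes whose last heartbeat is within the timeout."""
        now = self._clock()
        return sorted(node for node, seen in self._last_seen.items()
                      if now - seen <= self._timeout)

    def expire_dead(self) -> List[str]:
        """Forget nodes that have been silent too long and return them."""
        now = self._clock()
        dead = sorted(node for node, seen in self._last_seen.items()
                      if now - seen > self._timeout)
        for node in dead:
            del self._last_seen[node]
            del self._node_types[node]
        return dead
```

How long "a long time" should be is left open here; `timeout_seconds` would presumably be some multiple of the heartbeat period so that a single lost message does not mark a node dead.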

Besides sending heartbeat messages in the background, HTTPAnnouncer can also be used to announce other kinds of metadata (e.g., segments being served, running tasks, etc.) to the Coordinator.

Remove Zookeeper dependency

Once HTTPAnnouncer is ready, the Coordinator should be able to build up a brief view of the entire Druid cluster. It doesn't need to store all the details about Druid nodes, but it should be able to ask a specific Druid node for information when it needs it.

Since there is no Zookeeper, there are some open questions worth discussing:

How does Coordinator maintain a segment view of what realtime/historical nodes are serving what segments?

Realtime and historical nodes will actively announce segments once they are actually serving them. The Coordinator will receive those announcements and update the segment view accordingly. Meanwhile, the Coordinator will also run a background task that periodically asks those historical/realtime nodes whether they are still serving the segments.
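A rough sketch of such a segment view, assuming announcements arrive as (node, segment) pairs and the periodic check returns the full list of segments a node reports. `SegmentView` and its method names are hypothetical and only illustrate the two update paths:

```python
from typing import Dict, Iterable, List, Set


class SegmentView:
    """Coordinator-side map of node -> segments it is believed to serve."""

    def __init__(self) -> None:
        self._by_node: Dict[str, Set[str]] = {}

    def announce(self, node_id: str, segment_id: str) -> None:
        """Push path: a node announces a segment it is now serving."""
        self._by_node.setdefault(node_id, set()).add(segment_id)

    def reconcile(self, node_id: str, reported: Iterable[str]) -> Set[str]:
        """Pull path: replace our view of a node with what it reports.

        Returns the segments we believed the node served but it no
        longer reports, so the caller can react to them.
        """
        reported_set = set(reported)
        stale = self._by_node.get(node_id, set()) - reported_set
        self._by_node[node_id] = reported_set
        return stale

    def servers_for(self, segment_id: str) -> List[str]:
        """All nodes currently believed to serve the given segment."""
        return sorted(node for node, segs in self._by_node.items()
                      if segment_id in segs)
```

The periodic check doubles as a repair mechanism: if an announcement is lost, the next reconcile for that node would bring the view back in line.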

How does Broker maintain a segment view of what realtime/historical nodes are serving what segments?

IMO the Broker should ask the Coordinator for that view at startup, and receive HTTP callbacks from the Coordinator when the view changes.
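The snapshot-then-callbacks flow could look roughly like the sketch below. In-process callables stand in for the HTTP callbacks, and `ViewPublisher`, `subscribe`, `apply`, and the `(action, node_id, segment_id)` change format are all assumptions made for illustration:

```python
from typing import Callable, Dict, List, Set, Tuple

# (action, node_id, segment_id); action is "add" or "remove" (hypothetical format)
Change = Tuple[str, str, str]


class ViewPublisher:
    """Coordinator side: owns the segment view and notifies subscribers."""

    def __init__(self) -> None:
        self._view: Dict[str, Set[str]] = {}
        self._callbacks: List[Callable[[Change], None]] = []

    def subscribe(self, callback: Callable[[Change], None]) -> Dict[str, Set[str]]:
        """Register a Broker callback and return a copy of the current view."""
        self._callbacks.append(callback)
        return {node: set(segs) for node, segs in self._view.items()}

    def apply(self, change: Change) -> None:
        """Update the view, then push the change to every subscriber."""
        action, node_id, segment_id = change
        segments = self._view.setdefault(node_id, set())
        if action == "add":
            segments.add(segment_id)
        elif action == "remove":
            segments.discard(segment_id)
        else:
            raise ValueError(f"unknown action: {action!r}")
        for callback in self._callbacks:
            callback(change)
```

Registering the callback and taking the snapshot in one step means a Broker would not miss a change that lands between the two. A real HTTP version would still need to handle failed or reordered callbacks, which this sketch ignores.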

How to assign a segment to a historical?

The Coordinator should be able to maintain a view of all the historical nodes just like it currently does with Zookeeper. LoadQueuePeon should still be used for loading/dropping segments, except that instead of creating zNodes on Zookeeper, it will send an HTTP request to the historical.
#2314 is a concrete proposal to answer this question.

How to do leader election?

There are various leader election algorithms available. Raft is one option, and it is already mentioned in the Druid roadmap.

How does Druid recover when the Coordinator leader crashes?

Since there is no Zookeeper, each Druid node should have a runtime property that tells it which nodes are the Coordinators, for example, druid.coordinator.hosts=host1:port1,host2:port2,host3:port3
If the leader crashes, heartbeat messages sent to the old leader will no longer get acknowledged. At that point a Druid node should realize the leader is gone and send heartbeat messages to another Coordinator listed in druid.coordinator.hosts until a new leader is elected.
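As a sketch of that failover behavior, assuming druid.coordinator.hosts is a comma-separated list of host:port entries and that a heartbeat attempt simply reports whether it was acknowledged. `parse_coordinator_hosts`, `CoordinatorSelector`, and the `send` callable are hypothetical names, and `send` stands in for the real HTTP call:

```python
from typing import Callable, List, Optional


def parse_coordinator_hosts(value: str) -> List[str]:
    """Split a comma-separated host:port list, rejecting malformed entries."""
    hosts = [entry.strip() for entry in value.split(",") if entry.strip()]
    for entry in hosts:
        host, sep, port = entry.rpartition(":")
        if not sep or not host or not port.isdigit():
            raise ValueError(f"expected host:port, got {entry!r}")
    return hosts


class CoordinatorSelector:
    """Node side: remembers which Coordinator last acknowledged a heartbeat."""

    def __init__(self, hosts: List[str], send: Callable[[str], bool]) -> None:
        if not hosts:
            raise ValueError("at least one Coordinator host is required")
        self._hosts = list(hosts)
        self._send = send  # returns True if the heartbeat was acknowledged
        self._current = 0

    def heartbeat(self) -> Optional[str]:
        """Try the current target first, then the others in order.

        Returns the host that acknowledged, or None if none did.
        """
        for offset in range(len(self._hosts)):
            index = (self._current + offset) % len(self._hosts)
            if self._send(self._hosts[index]):
                self._current = index
                return self._hosts[index]
        return None
```

This only covers how a node finds a Coordinator that responds. Whether a non-leader Coordinator acknowledges, rejects, or redirects heartbeats during an election is part of the open question.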

How does Overlord keep track of running tasks?

Middle Managers could announce tasks when they start, and the Overlord could periodically query Middle Managers for the latest task status.

All of these questions are open; feel free to share your thoughts.
