kubernetes based discovery druid extension to run Druid on K8S without Zookeeper #10544

Merged
himanshug merged 18 commits into apache:master from himanshug:k8s on Dec 15, 2020

@himanshug himanshug commented Oct 30, 2020

Fixes #9053

Description

Please read #9053 first, and possibly https://groups.google.com/g/druid-development/c/tWnwPyL0Vk4/m/2uLwqgQiAAAJ?pli=1 for more background on the HTTP-based segment and task management that was introduced earlier.

This patch has been tested by successfully running a small Druid test cluster on K8S without Zookeeper.

Most of the code introduced here goes in a new extension. At a high level, it introduces a new Druid Kubernetes extension that implements three Druid discovery and leader-election interfaces ("DruidNodeDiscoveryProvider", "DruidLeaderSelector", and "DruidNodeAnnouncer") and provides the necessary plumbing to use them when configured.
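
As a rough illustration of the announce/discover split behind those interfaces, here is a minimal in-memory sketch in Java. The names `SketchAnnouncer`, `SketchDiscovery`, and `InMemoryRegistry` are hypothetical and deliberately simplified; they are not the signatures of Druid's DruidNodeAnnouncer or DruidNodeDiscoveryProvider, and the map only stands in for state that the real extension would keep in Kubernetes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, simplified stand-in for an announcer; NOT the Druid interface.
interface SketchAnnouncer {
    void announce(String nodeId, String role);
    void unannounce(String nodeId);
}

// Hypothetical, simplified stand-in for a discovery provider; NOT the Druid interface.
interface SketchDiscovery {
    List<String> nodesWithRole(String role);
}

// In-memory registry; an assumption standing in for state held in Kubernetes.
class InMemoryRegistry implements SketchAnnouncer, SketchDiscovery {
    private final Map<String, String> roleByNode = new ConcurrentHashMap<>();

    @Override
    public void announce(String nodeId, String role) {
        roleByNode.put(nodeId, role);
    }

    @Override
    public void unannounce(String nodeId) {
        roleByNode.remove(nodeId);
    }

    @Override
    public List<String> nodesWithRole(String role) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, String> entry : roleByNode.entrySet()) {
            if (entry.getValue().equals(role)) {
                result.add(entry.getKey());
            }
        }
        Collections.sort(result); // stable, predictable order for callers
        return result;
    }
}

class DiscoverySketchDemo {
    public static void main(String[] args) {
        InMemoryRegistry registry = new InMemoryRegistry();
        registry.announce("broker-0", "broker");
        registry.announce("coordinator-0", "coordinator");
        System.out.println(registry.nodesWithRole("broker"));
    }
}
```

The point of the sketch is only that announcing and discovering are two views of the same shared state, which is what lets a different backing store replace Zookeeper.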

Internally, it uses https://github.com/kubernetes-client/java for all Kubernetes interactions and https://github.com/kubernetes-client/java/tree/master/extended/src/main/java/io/kubernetes/client/extended/leaderelection for leader election.
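
For readers unfamiliar with lease-style leader election, the general pattern can be sketched as a toy in-memory model in Java. This does not use the kubernetes-client library and is not the extension's code; `Lease` and `InMemoryLeaseLock` are hypothetical names, and the atomic reference stands in for a lock record that would live in the Kubernetes API server.

```java
import java.util.concurrent.atomic.AtomicReference;

// Toy lease record: who holds the lease and when it expires (hypothetical type).
final class Lease {
    final String holder;
    final long expiresAtMillis;

    Lease(String holder, long expiresAtMillis) {
        this.holder = holder;
        this.expiresAtMillis = expiresAtMillis;
    }
}

// In-memory stand-in for a shared lock record; an assumption for illustration only.
class InMemoryLeaseLock {
    private final AtomicReference<Lease> current = new AtomicReference<>();
    private final long leaseDurationMillis;

    InMemoryLeaseLock(long leaseDurationMillis) {
        this.leaseDurationMillis = leaseDurationMillis;
    }

    // Returns true if the candidate holds the lease after this call, either by
    // taking an absent or expired lease or by renewing its own.
    boolean tryAcquireOrRenew(String candidate, long nowMillis) {
        Lease seen = current.get();
        boolean free = seen == null || seen.expiresAtMillis <= nowMillis;
        boolean mine = seen != null && seen.holder.equals(candidate);
        if (!free && !mine) {
            return false; // someone else holds a live lease
        }
        Lease next = new Lease(candidate, nowMillis + leaseDurationMillis);
        // Compare-and-set so that two racing candidates cannot both win.
        return current.compareAndSet(seen, next);
    }

    // Returns the current leader, or null if the lease is absent or expired.
    String leader(long nowMillis) {
        Lease seen = current.get();
        return (seen == null || seen.expiresAtMillis <= nowMillis) ? null : seen.holder;
    }
}

class LeaseSketchDemo {
    public static void main(String[] args) {
        InMemoryLeaseLock lock = new InMemoryLeaseLock(1000);
        System.out.println(lock.tryAcquireOrRenew("coordinator-a", 0));
        System.out.println(lock.tryAcquireOrRenew("coordinator-b", 500));
        System.out.println(lock.leader(500));
    }
}
```

Time is passed in explicitly so the behavior is deterministic; a real elector would read a clock and retry on a schedule.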

See kubernetes.md for how to use it. It can support multiple Druid clusters running on the same K8S cluster [in the same namespace].
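
One plausible way to keep several clusters apart inside a single namespace is to scope every lookup by a per-cluster label. The Java sketch below assumes that approach; the label keys `example-druid-cluster` and `example-druid-role` and the class `ClusterScopedSelector` are hypothetical, not the keys or code the extension actually uses. Only the `key=value,key=value` equality selector syntax is standard Kubernetes.

```java
import java.util.HashMap;
import java.util.Map;

class ClusterScopedSelector {
    // Hypothetical label keys, chosen for illustration only.
    static final String CLUSTER_LABEL = "example-druid-cluster";
    static final String ROLE_LABEL = "example-druid-role";

    // Builds an equality-based label selector scoped to one cluster and one role.
    static String selectorFor(String clusterId, String role) {
        return CLUSTER_LABEL + "=" + clusterId + "," + ROLE_LABEL + "=" + role;
    }

    // Local check mirroring what the selector would match on the server side.
    static boolean matches(Map<String, String> podLabels, String clusterId, String role) {
        return clusterId.equals(podLabels.get(CLUSTER_LABEL))
            && role.equals(podLabels.get(ROLE_LABEL));
    }
}

class ClusterSelectorDemo {
    public static void main(String[] args) {
        Map<String, String> podLabels = new HashMap<>();
        podLabels.put(ClusterScopedSelector.CLUSTER_LABEL, "cluster-a");
        podLabels.put(ClusterScopedSelector.ROLE_LABEL, "broker");

        System.out.println(ClusterScopedSelector.selectorFor("cluster-a", "broker"));
        // A pod from cluster-a is invisible to a lookup scoped to cluster-b.
        System.out.println(ClusterScopedSelector.matches(podLabels, "cluster-b", "broker"));
    }
}
```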


This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever it would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests. See "Create a travis build that runs Druid on a K8S Cluster and runs the integration tests" (#10542).
  • been tested in a test Druid cluster.

Note: This patch also treats "Bouncy Castle License" as equivalent to "MIT License" for license-checking purposes in distribution/bin/check-licenses.py. See http://www.bouncycastle.org/licence.html

@himanshug himanshug removed the WIP label Nov 17, 2020
@himanshug himanshug changed the title from "[WIP] kubernetes based discovery druid extension to run Druid on K8S without Zookeeper" to "kubernetes based discovery druid extension to run Druid on K8S without Zookeeper" Nov 17, 2020
@himanshug
Contributor Author

This PR is no longer WIP. The build is passing, there is a good amount of test coverage, and I have also tested it successfully on small test clusters running on Kubernetes with no Zookeeper cluster at all. This PR is ready to be merged.

While I am sure caveats will pop up when this gets used in long-running large clusters, this is a good starting point to be released as an experimental feature in the next Druid release and noted as such in the docs introduced here.

Side note:
If someone has experimented with running k8s clusters in travis builds, please feel free to work on or comment on #10542, as that would set the stage for testing the code here on a real k8s cluster in the travis builds.

Member

@nishantmonu51 nishantmonu51 left a comment

LGTM, 👍
Minor nit: it would be great if we could also add @Nullable annotations.


public DiscoveryDruidNodeList(
    String resourceVersion,
    Map<String, DiscoveryDruidNode> druidNodes

nit: Add @Nullable annotations.

Contributor Author

added, thanks

@himanshug himanshug merged commit ac1882b into apache:master Dec 15, 2020
harinirajendran pushed a commit to harinirajendran/druid that referenced this pull request Dec 15, 2020
…t Zookeeper (apache#10544)

* honor zk enablement config in more places in druid code

* kubernetes based discovery module

* fix spotbugs check

* fix intellij checks error

* fix doc link to kubernetes.md from extension

* make spellchecker happy

* update license.yaml

* fix dependency check errors

* update extension coverage

* UTs for BaseNodeRoleWatcher

* fix forbidden-api check

* update k8s module coverage ignores

* add Bouncy Castle License being same as MIT License for license checking purposes

* further update licenses.yaml

* label/annotation pre-existence assumption

* address review comment
@jihoonson jihoonson added this to the 0.21.0 milestone Jan 4, 2021
JulianJaffePinterest pushed a commit to JulianJaffePinterest/druid that referenced this pull request Jan 22, 2021
…t Zookeeper (apache#10544)

@harinirajendran
Contributor

@himanshug @gianm Do we know of any big production clusters using this feature? How does the migration from zk to k8s-based discovery work without downtime?

Linked issue: [Proposal] Druid discovery extension for Kubernetes