Add ability for historicals to clone an existing historical #17863

Merged: kfaraz merged 26 commits into apache:master from adarshsanjeev:historical-cloning on Apr 9, 2025

adarshsanjeev commented Apr 2, 2025

Description

Add the ability for a Historical to "clone" another existing Historical, copying the segments it serves.
This is useful for cases such as rolling updates, where the goal is to launch a new Historical as a replacement for an existing one.

Sample config:

  "cloneServers": {"historicalA":"historicalB"}

With the above config, the Coordinator treats historicalA as an "unmanaged" node. This means that the node:

  • Is not normally considered for segment assignment.
  • Does not have its segments balanced.
  • Is not counted towards replica counts as configured by load rules.

The Coordinator manages historicalB as normal. However, any segment assignment or drop on historicalB is mirrored to historicalA.

If the mapping is removed, the coordinator once again manages historicalA during its duty cycles.
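The behavior described above can be illustrated with a small toy model. This is only a sketch and not Druid's actual implementation: the class name `CloneAwareCoordinator` and its methods are hypothetical, and raising an error on direct assignment to a clone target is a simplification of "not normally considered for segment assignment".

```python
class CloneAwareCoordinator:
    """Toy model of the mirroring described above; not Druid's actual code."""

    def __init__(self, servers, clone_servers):
        # clone_servers maps target -> source, as in the sample config.
        self.clone_servers = dict(clone_servers)
        self.segments = {server: set() for server in servers}

    def managed_servers(self):
        # Clone targets are "unmanaged": no direct assignment or balancing.
        return [s for s in self.segments if s not in self.clone_servers]

    def _targets_of(self, source):
        return [t for t, s in self.clone_servers.items() if s == source]

    def load(self, server, segment):
        if server in self.clone_servers:
            # Simplification: this sketch refuses direct assignment outright.
            raise ValueError(f"{server} is a clone target")
        self.segments[server].add(segment)
        for target in self._targets_of(server):
            self.segments[target].add(segment)  # mirror the assignment

    def drop(self, server, segment):
        if server in self.clone_servers:
            raise ValueError(f"{server} is a clone target")
        self.segments[server].discard(segment)
        for target in self._targets_of(server):
            self.segments[target].discard(segment)  # mirror the drop


coordinator = CloneAwareCoordinator(
    ["historicalA", "historicalB", "historicalC"],
    {"historicalA": "historicalB"},
)
coordinator.load("historicalB", "segment-1")
coordinator.load("historicalB", "segment-2")
coordinator.drop("historicalB", "segment-1")

print(coordinator.managed_servers())                # ['historicalB', 'historicalC']
print(sorted(coordinator.segments["historicalA"]))  # ['segment-2']
```

In this model, removing the `historicalA` entry from `clone_servers` makes `managed_servers()` include historicalA again, which corresponds to the Coordinator resuming normal management.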

Release notes

  • Added a new dynamic Coordinator configuration, cloneServers, containing a map from a target Historical server to the source Historical server that the target should clone. The target Historical does not participate in regular segment assignment or balancing. Instead, the Coordinator mirrors any segment assignment made to the source Historical onto the target Historical, so that the target becomes an exact copy of the source. Segments on the target Historical do not count towards replica counts either. If the source disappears, the target remains in the last known state of the source server until it is removed from cloneServers.
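As a rough illustration of the replica-count rule in this release note, the following sketch counts replicas while ignoring clone targets. The helper `count_replicas` and the in-memory dictionaries are hypothetical and exist only for this example; they are not part of Druid.

```python
def count_replicas(segments_by_server, clone_servers, segment):
    """Count servers holding `segment`, ignoring clone targets."""
    return sum(
        1
        for server, segments in segments_by_server.items()
        if server not in clone_servers and segment in segments
    )


# cloneServers maps target -> source.
clone_servers = {"historicalA": "historicalB"}
segments_by_server = {
    "historicalA": {"segment-1"},  # clone target, mirrors historicalB
    "historicalB": {"segment-1"},
    "historicalC": {"segment-1"},
}

# The copy on historicalA does not count towards the replica count.
print(count_replicas(segments_by_server, clone_servers, "segment-1"))  # 2

# If the source disappears, the target keeps its last known state.
del segments_by_server["historicalB"]
print(segments_by_server["historicalA"])  # {'segment-1'}
print(count_replicas(segments_by_server, clone_servers, "segment-1"))  # 1

# Once the mapping is removed, historicalA counts as a regular replica.
print(count_replicas(segments_by_server, {}, "segment-1"))  # 2
```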

Testing

  • Added Coordinator simulation tests.
  • Tested the changes on a local cluster with multiple Historicals and verified that segment changes to one are synced to the other.

This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • a release note entry in the PR description.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever it would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

kfaraz left a comment

Overall, looks good to me. Left some initial feedback.

Comment thread server/src/main/java/org/apache/druid/server/coordinator/stats/Stats.java Outdated
kfaraz marked this pull request as ready for review April 4, 2025 08:07

kfaraz left a comment

Thanks for the sim tests, @adarshsanjeev!
I have left some final comments.

kfaraz commented Apr 4, 2025

@adarshsanjeev, please update the title and description of the PR too.

adarshsanjeev changed the title from "[DRAFT] Historical cloning" to "Add ability for a historicals to clone an existing historical" Apr 7, 2025
adarshsanjeev changed the title from "Add ability for a historicals to clone an existing historical" to "Add ability for historicals to clone an existing historical" Apr 7, 2025

kfaraz commented Apr 7, 2025

Thanks for the detailed description, @adarshsanjeev!

kfaraz left a comment

@adarshsanjeev, I found a couple of things that I had originally missed. Please let me know what you think.

Comment thread docs/configuration/index.md Outdated
adarshsanjeev added the "Needs web console change" label (Backend API changes that would benefit from frontend support in the web console) Apr 7, 2025
Comment thread server/src/main/java/org/apache/druid/server/coordinator/DruidCluster.java Outdated
Comment thread server/src/main/java/org/apache/druid/server/coordinator/ServerHolder.java Outdated
Comment thread server/src/main/java/org/apache/druid/server/coordinator/stats/Stats.java Outdated

kfaraz left a comment

LGTM 🚀 🚀

kfaraz merged commit cefac47 into apache:master Apr 9, 2025
76 checks passed
capistrant added this to the 34.0.0 milestone Jul 22, 2025

Labels

Area - Documentation, Needs web console change (Backend API changes that would benefit from frontend support in the web console)

4 participants