
Conversation

@HoustonPutman
Contributor

https://issues.apache.org/jira/browse/SOLR-18080

Whenever the shard terms change such that the leader no longer has the highest term, a leader election should take place, and all non-up-to-date replicas should go into recovery.

lastRecoveryTerm = lastTermDoRecovery.get();
newTerm = terms.getTerm(coreNodeName);
if (lastRecoveryTerm < newTerm) {
  lastTermDoRecovery.set(newTerm);
Contributor

lastTermDoRecovery is set here, but it's possible that recovery is deferred below because of the leader election now. Is that right? The old logic set it and then actually did recovery regardless. Reading this, it seems possible that lastTermDoRecovery gets set to the new term while actually doing recovery is skipped further down. So if recovery is skipped, the term this was set to is incorrect based on the name?

Contributor Author

Yeah, lastTermDoRecovery might be a bad name, but after the leader election, recovery is guaranteed for these replicas at this term value. So while recovery is not explicitly being done here, we know that the leader election will do it. lastTermDoRecovery is still technically correct, just assuming the leader election succeeds.
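
For anyone reading along later, here is a rough, hypothetical sketch of the flow being described; onTermChange, leaderHasHighestTerm, and startRecovery are illustrative stand-ins rather than the actual Solr methods:

import java.util.concurrent.atomic.AtomicLong;

class RecoveryOnTermChangeSketch {
  private final AtomicLong lastTermDoRecovery = new AtomicLong(-1);

  void onTermChange(long newTerm, boolean leaderHasHighestTerm) {
    if (lastTermDoRecovery.get() < newTerm) {
      // Record the term we will recover at, even when recovery itself is deferred:
      // the leader election path is expected to put every non-up-to-date replica
      // into recovery at this same term.
      lastTermDoRecovery.set(newTerm);
      if (leaderHasHighestTerm) {
        startRecovery(newTerm); // previous behavior: recover immediately
      }
      // otherwise a leader election has been triggered and will drive the recovery
    }
  }

  private void startRecovery(long term) {
    // placeholder for the real recovery kick-off
  }
}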

return replicasNeedingRecovery.contains(key);
}

public ShardTerms setHighestTerms(Set<String> highestTermKeys) {
Contributor

The whole "term" algorithm makes some pretty strict assumptions about who can update term values, and on what conditions. From class javadocs on ZkShardTerms:

 * <p>Terms can only updated in two strict ways:
 *
 * <ul>
 *   <li>A replica sets its term equals to leader's term
 *   <li>The leader increase its term and some other replicas by 1
 * </ul>

This method seems to fit under the latter provision, which is good. But could we add Javadocs here to indicate that this method should only be called by current shard-leaders? Or if this is safe for non-leaders to call in certain situations, add javadocs to describe what those are and why.

Just trying to defend against the possibility of someone coming back through here in a month or two and thinking: "Hey this doesn't fit with the documented algorithm at all"
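
To make the quoted invariant concrete, here is a minimal plain-Java sketch of the two allowed updates; this is only an illustration of the documented rules, not the actual ZkShardTerms bookkeeping:

import java.util.HashMap;
import java.util.Map;
import java.util.Set;

class ShardTermsSketch {
  private final Map<String, Long> terms = new HashMap<>();

  // (1) A replica sets its term equal to the leader's term.
  void setTermEqualsToLeader(String replica, String leader) {
    terms.put(replica, terms.getOrDefault(leader, 0L));
  }

  // (2) The leader increases its own term by 1, along with every replica that is
  //     still in sync; the replicas needing recovery are left behind and thereby
  //     become "not up to date".
  void increaseTerms(String leader, Set<String> replicasNeedingRecovery) {
    long leaderTerm = terms.getOrDefault(leader, 0L);
    for (Map.Entry<String, Long> entry : terms.entrySet()) {
      if (!replicasNeedingRecovery.contains(entry.getKey())
          && entry.getValue().longValue() == leaderTerm) {
        entry.setValue(leaderTerm + 1);
      }
    }
  }
}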

Contributor Author

So this ultimately doesn't happen from the leader, and the leader's term is not guaranteed to be increased. As you can see in the new test class, a leader election can be triggered because of this new API.

I think the easiest thing to do here is to insist that the collection is in a read-only state when this API is called. I'm not sure yet how I'll do that, but it will definitely guard against any issues with missed updates or anything like that.
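
Purely to illustrate that idea (the actual enforcement mechanism is still open here, and both isCollectionReadOnly and applyHighestTerms below are hypothetical hooks, not existing Solr APIs):

import java.util.Set;

abstract class HighestTermsGuardSketch {

  public void ensureHighestTerms(Set<String> mostUpToDateCores) {
    // Refuse to rewrite shard terms unless the collection is read-only, so no
    // indexing can race with the externally-driven term change.
    if (!isCollectionReadOnly()) {
      throw new IllegalStateException(
          "Shard terms may only be reset while the collection is in read-only mode");
    }
    applyHighestTerms(mostUpToDateCores);
  }

  // Hypothetical hooks; how these would actually be implemented is exactly what
  // the comment above leaves open.
  protected abstract boolean isCollectionReadOnly();

  protected abstract void applyHighestTerms(Set<String> cores);
}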

mutate(terms -> terms.increaseTerms(leader, replicasNeedingRecovery));
}

public void ensureHighestTerms(Set<String> mostUpToDateCores) {
Contributor

Ditto, re: my previous comment on ShardTerms.setHighestTerms. We should add some Javadocs to make clear this is only called by leaders, or, if it's actually safe to call elsewhere, describe where and why.

<Pattern>
%maxLen{%-4r %-5p (%t) [%notEmpty{n:%X{node_name}}%notEmpty{ c:%X{collection}}%notEmpty{ s:%X{shard}}%notEmpty{ r:%X{replica}}%notEmpty{ x:%X{core}}%notEmpty{ t:%X{trace_id}}] %c{1.} %m%notEmpty{
-=>%ex{short}}}{10240}%n
+=>%ex}}{10240}%n
Contributor

[Q] If I understand this correctly, this is changing how test runs in 'core' log exceptions?

No problem with changing that, but we have one of these files in each module, and I imagine that if we're updating one of them we should update them all?

I know your time is limited for the next few weeks and you've taken a lot of pains to pare this PR down. With your agreement let's drop this change from the PR and I'll open another PR to make the change across the board for our test files and shepherd it through.

Contributor Author (@HoustonPutman, Jan 29, 2026)

Yeah, I add this when testing on all my PRs then remove it before committing. This got through unfortunately. But thanks for taking the lead on it, appreciate that! I'll remove it.

DocCollection docCollection = cluster.getZkStateReader().getCollection(COLLECTION);
JettySolrRunner jetty = cluster.getRandomJetty(random());

Slice shard1 = docCollection.getSlice(shard);
Contributor

[0] Should probably be "shard2", given the value of the string variable set on L64?

I don't really care about the name; just mentioning it in case it's a bug in your test logic.

state -> {
  Slice shardState = state.getSlice(shard);
  for (Replica r : recoveryReplicas) {
    if (shardState.getReplica(r.name).getState() != Replica.State.RECOVERING) {
Contributor

[Q] Since "recovering" is a transient state and the doc/index size here is very small, is it possible that the replica would go into recovery as expected and waitForState would just miss it based on when it polls?

Contributor Author

I believe waitForState uses a ZK watcher, so it should be safe and generally see all of the ZK updates... I haven't had it fail yet, and I've run it at least 100 times. But yeah, there is a concern there, so I might think about it more later. Ultimately, going into recovery is very necessary here, and it didn't always happen before this change.

We could probably just check that the shard terms all become equal in the end and that the docs are there (and that the replicas become active, of course). That would probably be good enough.

Contributor Author

OK, so I changed it to make sure that all of the given replicas go into recovery at some point, not necessarily at the same time. This should make it a bit more resilient on slower or otherwise unusual hardware.
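
Roughly, that check amounts to something like the sketch below; the waitForState overload taking a DocCollection predicate and the helper's name are assumptions here, not the exact test code:

import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;
import org.apache.solr.common.cloud.DocCollection;
import org.apache.solr.common.cloud.Replica;
import org.apache.solr.common.cloud.Slice;
import org.apache.solr.common.cloud.ZkStateReader;

class RecoveryAssertionSketch {
  // Succeeds once every expected replica has been observed in RECOVERING at least
  // once, not necessarily at the same moment; the usual "all replicas active"
  // wait would still follow this in the test.
  static void waitForEachReplicaToRecover(
      ZkStateReader zkStateReader, String collection, String shard, Set<String> replicaNames)
      throws Exception {
    Set<String> seenRecovering = ConcurrentHashMap.newKeySet();
    zkStateReader.waitForState(collection, 90, TimeUnit.SECONDS, (DocCollection state) -> {
      if (state == null) {
        return false;
      }
      Slice slice = state.getSlice(shard);
      for (String name : replicaNames) {
        Replica replica = slice.getReplica(name);
        if (replica != null && replica.getState() == Replica.State.RECOVERING) {
          seenRecovering.add(name);
        }
      }
      return seenRecovering.containsAll(replicaNames);
    });
  }
}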

