Skip to content

Conversation

@xichen01
Copy link
Contributor

@xichen01 xichen01 commented Mar 27, 2024

What changes were proposed in this pull request?

Make client priority read from IN_SERVICE status Datanode.

When Ozone client read Block form Datanode, the SCM will return all the Datanodes where the replicas are located, including the nodes in the IN_MAINTENANCE, etc state, and the order may be random, so the client may try to access the IN_MAINTENANCE state node first.
Sometimes, we set a Datanode to IN_MAINTENANCE, this may be due to some faults of the machine or wanting to restart a Datanode, in this scenario, that the Client does not read data form these IN_MAINTENANCE datanode is a good strategy.

PS:
Since OM has a cache of Datanode state (expiration time 6min), the client can't get the latest state of Datanode, so we need to enhance the cache evicting mechanism.

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-10593

How was this patch tested?

Unit Test.

@xichen01 xichen01 changed the title HDDS-10593 Make client priority read from IN_SERVICE status Datanode HDDS-10593. Make client priority read from IN_SERVICE status Datanode Mar 27, 2024
Copy link
Contributor

@ivandika3 ivandika3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the patch. LGTM +1.

Copy link
Contributor

@sumitagrawl sumitagrawl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@xichen01 Thanks for working over this, LGTM

@xichen01
Copy link
Contributor Author

xichen01 commented Apr 8, 2024

Thanks @ivandika3 and @sumitagrawl for the review.

@adoroszlai adoroszlai requested a review from sodonnel April 8, 2024 14:58
}
}

boolean allInService = datanodeList.stream()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change seems sensible to me. I tried to think if there was a nicer way to do this, and thought that the following was easier to follow:

    Iterator<DatanodeDetails> iter = datanodeList.iterator();
    List<DatanodeDetails> notInService = null;
    while(iter.hasNext()) {
      DatanodeDetails dn = iter.next();
      if (dn.getPersistedOpState() != HddsProtos.NodeOperationalState.IN_SERVICE) {
        iter.remove();
        if (notInService == null) {
          notInService = new ArrayList<>();
        }
        notInService.add(dn);
      }
    }
    if (notInService != null) {
      datanodeList.addAll(notInService);
    }

It could be shorter, but I tried to avoid allocating a list for notInService unless it was really needed, as it won't be needed most of the time.

Only change this if you feel it is better - I am not insisting it should be changed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your suggestion. This Implementation need to remove and readd the element, this may not bring in more benefits compare with the sort.

@adoroszlai adoroszlai merged commit 9b248a0 into apache:master Apr 9, 2024
@adoroszlai
Copy link
Contributor

Thanks @xichen01 for the patch, @ivandika3, @sodonnel, @sumitagrawl for the review.

Tejaskriya pushed a commit to Tejaskriya/ozone that referenced this pull request Apr 17, 2024
jojochuang pushed a commit to jojochuang/ozone that referenced this pull request May 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants