
Conversation

@szetszwo (Contributor) commented Mar 7, 2023

What changes were proposed in this pull request?

Similar to HDDS-8024, the client should retry other datanodes if it has failed to getBlock from a datanode.

What is the link to the Apache JIRA?

https://issues.apache.org/jira/browse/HDDS-8090

How was this patch tested?

This is intended to fix TestHSync.

@adoroszlai (Contributor):

Thanks @szetszwo for continuing work on this. Launched a 10x10 run, this time only with TestHSync, which failed 12 times.

Contributor:

private?

Contributor:

private?

Contributor:

My understanding is that the client will log the warning for the first and second failures, but on the third failure it will simply throw the exception. I think it would be clearer if the message were something like "Failed to get Block from " + d + ", will try another DataNode".
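
For illustration, here is a minimal sketch of the retry loop being discussed, with the suggested message; the getBlockTryEachDatanode method, the getBlock(d, blockID) helper, and the LOG field are assumptions for the example, not the exact code in this patch:

private BlockData getBlockTryEachDatanode(List<DatanodeDetails> datanodes,
    BlockID blockID) throws IOException {
  for (int i = 0; i < datanodes.size(); i++) {
    final DatanodeDetails d = datanodes.get(i);
    try {
      return getBlock(d, blockID);  // hypothetical per-datanode call
    } catch (IOException e) {
      if (i == datanodes.size() - 1) {
        throw e;  // last datanode: no warning logged, just propagate
      }
      LOG.warn("Failed to get Block from " + d
          + ", will try another DataNode", e);
    }
  }
  throw new IOException("Pipeline has no datanodes");
}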

@szetszwo (Contributor, Author):

@jojochuang, thanks a lot for reviewing this! I have just pushed a commit to address your comments and to add new tests.

Unfortunately, this cannot completely fix TestHSync. I will continue the work in https://issues.apache.org/jira/browse/HDDS-8146.

@szetszwo (Contributor, Author):

@adoroszlai, thanks for the ci.yml for running TestHSync! I will continue using it in https://issues.apache.org/jira/browse/HDDS-8146.

@adoroszlai (Contributor):

> thanks for the ci.yml for running TestHSync! Will continue using it in https://issues.apache.org/jira/browse/HDDS-8146

👍

Note: I think it's better to create a separate branch for repeated tests, to avoid the need for a revert or force-push at the end. If you tweak the fix, just merge the fix branch into the repeat branch.

Also, since CI runs in forks too, there is no need to create a PR until the fix is verified by repetitions.

@szetszwo szetszwo requested a review from jojochuang March 14, 2023 02:31
@jojochuang (Contributor) left a comment:

Thanks @szetszwo! Just a very minor comment, otherwise this is good to go.

public void testGetBlockRetryAllNodes() {
  final ArrayList<DatanodeDetails> allDNs = new ArrayList<>(dns);
  Assertions.assertTrue(allDNs.size() > 1);
  try (XceiverClientGrpc client = new XceiverClientGrpc(pipeline, conf) {
Contributor:

Unrelated, but it looks like the existing test cases don't close XceiverClientGrpc, potentially leaking resources.

Contributor (Author):

Sure, let me fix the existing problems.
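
As a sketch of that cleanup (not the exact patch), closing the client with try-with-resources would release the underlying gRPC channels even when an assertion fails; pipeline and conf stand in for the test fixtures:

try (XceiverClientGrpc client = new XceiverClientGrpc(pipeline, conf)) {
  client.connect();
  // ... exercise the client ...
}  // client.close() runs here, releasing the channels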

  return Collections.singletonList(validator);
}

public static List<CheckedBiFunction> getValidatorList(
Contributor:

IMO a getXXX() method typically implies an O(1) operation. It would be better to rename it to something like toValidatorList()?

Contributor (Author):

> ... a getXXX() method typically implies an O(1) operation. ...

This is not true for TreeMap.get(..), LinkedList.get(..), etc., but I am okay with doing the rename.
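
For reference, the rename under discussion would presumably look like the following sketch, reusing the raw CheckedBiFunction type from the snippet above:

public static List<CheckedBiFunction> toValidatorList(
    CheckedBiFunction validator) {
  return Collections.singletonList(validator);
}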

@jojochuang jojochuang merged commit 2db7dda into apache:master Mar 15, 2023
@szetszwo (Contributor, Author):

@jojochuang, thanks for reviewing and merging this!

@adoroszlai, thanks for the CI hints!

errose28 added a commit to errose28/ozone that referenced this pull request Mar 16, 2023
* master: (262 commits)
  HDDS-8153. Integrate ContainerBalancer with MoveManager (apache#4391)
  HDDS-8090. When getBlock from a datanode fails, retry other datanodes. (apache#4357)
  HDDS-8163 Use try-with-resources to ensure close rockdb connection in SstFilteringService (apache#4402)
  HDDS-8065. Provide GNU long options (apache#4394)
  HDDS-7930. [addendum] input stream does not refresh expired block token.
  HDDS-7930. input stream does not refresh expired block token. (apache#4378)
  HDDS-7740. [Snapshot] Implement SnapshotDeletingService (apache#4244)
  HDDS-8076. Use container cache in Key listing API. (apache#4346)
  HDDS-8091. [addendum] Generate list of config tags from ConfigTag enum - Hadoop 3.1 compatibility fix (apache#4374)
  HDDS-8144. TestDefaultCertificateClient#testTimeBeforeExpiryGracePeriod fails as we approach DST. (apache#4382)
  HDDS-8151. Support fine grained lifetime for root CA certificate (apache#4386)
  HDDS-8150. RpcClientTest and ConfigurationSourceTest not run due to naming convention (apache#4388)
  HDDS-8131. Add Configuration for OM Ratis Log Purge Tuning Parameters. (apache#4371)
  HDDS-8133. Create ozone sh key checksum command (apache#4375)
  HDDS-8142. Check if no entries in Block DB for a container on container delete (apache#4379)
  HDDS-8118. Fail container delete on non empty chunks dir (apache#4367)
  HDDS-8028. JNI for RocksDB SST Dump tool (apache#4315)
  HDDS-8129. ContainerStateMachine allows two different tasks with the same container id running in parallel. (apache#4370)
  HDDS-8119. Remove loosely related AutoCloseable from SendContainerOutputStream (apache#4368)
  close db connection (apache#4366)
  ...
@ivandika3 (Contributor):

@szetszwo Sorry for the random comment, but from my understanding we already have logic to retry other datanodes in XceiverClientGrpc#sendCommandWithRetry.

for (DatanodeDetails dn : datanodeList) {
  try {
    if (LOG.isDebugEnabled()) {
      LOG.debug("Executing command {} on datanode {}",
          processForDebug(request), dn);
    }
    // In case the command gets retried on a 2nd datanode,
    // sendCommandAsyncCall will create a new channel and async stub
    // in case these don't exist for the specific datanode.
    reply.addDatanode(dn);
    responseProto = sendCommandAsync(request, dn).getResponse().get();
    if (validators != null && !validators.isEmpty()) {
      for (Validator validator : validators) {
        validator.accept(request, responseProto);
      }
    }
...

The call trace is ContainerProtocolCalls#getBlock -> XceiverClientGrpc#sendCommand -> XceiverClientGrpc#sendCommandWithTraceIDAndRetry -> XceiverClientGrpc#sendCommandWithRetry

Does this mean that getBlock will be retried up to 27 times (3 from sendCommandWithRetry assuming RATIS/THREE, times 3 from tryEachDatanode, times 3 from BlockInputStream's retry policy)? This also applies to readChunk (#4336).

Please correct me if I'm mistaken. Thanks in advance.
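
Schematically, the multiplication described above comes from three nested retry layers; the counts below are the worst case assumed in the comment, not measured behavior:

// Worst case: 3 * 3 * 3 = 27 getBlock attempts for a single read.
for (int streamRetry = 0; streamRetry < 3; streamRetry++) {   // BlockInputStream retry policy
  for (int dnAttempt = 0; dnAttempt < 3; dnAttempt++) {       // tryEachDatanode over RATIS/THREE
    for (int sendRetry = 0; sendRetry < 3; sendRetry++) {     // datanode loop in sendCommandWithRetry
      // one getBlock RPC per innermost iteration
    }
  }
}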

@szetszwo (Contributor, Author):

@ivandika3, you are right that there are multiple levels of retries. It seems the original retries (the ones not added here) may not work as expected -- some tests were failing when a datanode could not return the chunk.
