HDDS-4473. Reduce number of sortDatanodes RPC calls #1610
I noticed this code before, in that we have a batch call to get the pipelines, but then for a key with multiple blocks, we must make a call per block to sort the datanodes. Am I correct in thinking this is mostly a problem with large keys which have many blocks? Do you think it would make sense to add the sort flag to the |
I think so, and also for
Actually that's a great idea, better than the third (and not yet implemented) improvement I proposed, which would still require 2 batch requests. I think we have to pay attention to compatibility, though. Client would need to check if SCM supports this new flag, so that it can get sorted pipeline from both new and old SCM. |
cc: @avijayanhwx for compat check through upgrades. |
This change looks good to me and I am +1 to commit it. However, I would like to highlight one issue that may make this change less effective.

SCMClientProtocolServer holds the code where the refreshPipelinesBatch call lands. For open containers the current pipeline is returned, but for a closed container, a new "pipeline" is created. This is not a real Ratis pipeline; it's just a list of nodes. If you follow the logic to SimplePipelineProvider, the pipeline ID is a random ID. This means that even if several blocks have the same set of DNs, they will still be distinct pipelines, and hence the caching will not work. The caching will only be effective for open containers.

There are two places this logic is used: listStatus and lookupKey / getOzoneFileStatus. I guess the place where this could be a win is listStatus, but if most keys are part of closed containers (which would be the usual case), the sorted pipeline cache introduced here may cause more overhead than it saves.

I also wonder: why does listStatus need to return the pipeline, and optionally sort the pipeline, for all keys, all the time? I guess we can have two uses. First, simply listing the directory and displaying the contents; locations are usually not needed here. Second, an application lists the folders with the intention of reading each key; in this second case the locations are needed. |
Thanks @sodonnel for digging the details on pipeline refresh, and pointing out pipeline ID being random. In this case I think we should use the set of node IDs, instead of the pipeline ID, as cache key. |
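The suggestion above can be sketched roughly as follows. This is a minimal illustration, not the code in KeyManagerImpl: the class name SortedNodesCache, the use of plain strings as node IDs, and the sorter function standing in for the sortDatanodes RPC are all assumptions made for the example. The point is only that keying the cache by the set of node IDs lets two "pipelines" with different random IDs but identical datanodes share one cache entry.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

/** Hypothetical cache for illustration; not the actual Ozone implementation. */
class SortedNodesCache {
  // Key: the set of datanode IDs rather than the (random) pipeline ID.
  private final Map<Set<String>, List<String>> cache = new HashMap<>();
  // Counts how often the stand-in for the sortDatanodes RPC is invoked.
  private int rpcCalls = 0;

  /**
   * Returns the sorted node list for the given nodes, calling the sorter
   * (an assumed stand-in for the sortDatanodes RPC) only on a cache miss.
   */
  List<String> sortedNodes(List<String> nodeIds,
      Function<List<String>, List<String>> sorter) {
    Set<String> key = new HashSet<>(nodeIds);
    List<String> sorted = cache.get(key);
    if (sorted == null) {
      rpcCalls++;
      sorted = sorter.apply(new ArrayList<>(nodeIds));
      cache.put(key, sorted);
    }
    return sorted;
  }

  int rpcCalls() {
    return rpcCalls;
  }
}
```

With this keying, blocks in closed containers that happen to live on the same three datanodes trigger a single sort, regardless of how many distinct pipeline IDs were generated for them.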
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java
Good idea - I think that will work much better. I checked the latest version and it looks good to me. I just have one small comment you can consider if you think it makes sense. |
sodonnel
left a comment
LGTM +1.
Thanks @sodonnel for the reviews. |
* master: (40 commits) HDDS-4473. Reduce number of sortDatanodes RPC calls (apache#1610) HDDS-4485. [DOC] add the authentication rules of the Ozone Ranger. (apache#1603) HDDS-4528. Upgrade slf4j to 1.7.30 (apache#1639) HDDS-4424. Update README with information how to report security issues (apache#1548) HDDS-4484. Use RaftServerImpl isLeader instead of periodic leader update logic in OM and isLeaderReady for read/write requests (apache#1638) HDDS-4429. Create unit test for SimpleContainerDownloader. (apache#1551) HDDS-4461. Reuse compiled binaries in acceptance test (apache#1588) HDDS-4511: Avoiding StaleNodeHandler to take effect in TestDeleteWithSlowFollower. (apache#1625) HDDS-4510. SCM can avoid creating RetriableDatanodeEventWatcher for deletion command ACK (apache#1626) HDDS-3363. Intermittent failure in testContainerImportExport (apache#1618) HDDS-4370. Datanode deletion service can avoid storing deleted blocks. (apache#1620) HDDS-4512. Remove unused netty3 transitive dependency (apache#1627) HDDS-4481. With HA OM can send deletion blocks to SCM multiple times. (apache#1608) HDDS-4487. SCM can avoid using RETRIABLE_DATANODE_COMMAND for datanode deletion commands. (apache#1621) HDDS-4471. GrpcOutputStream length can overflow (apache#1617) HDDS-4308. Fix issue with quota update (apache#1489) HDDS-4392. [DOC] Add Recon architecture to docs (apache#1602) HDDS-4501. Reload OM State fail should terminate OM for any exceptions. (apache#1622) HDDS-4492. CLI flag --quota should default to 'spaceQuota' to preserve backward compatibility. (apache#1609) HDDS-3689. Add various profiles to MiniOzoneChaosCluster to run different modes. (apache#1420) ...
What changes were proposed in this pull request?
KeyManagerImpl#listStatus and the sortDatanodeInPipeline helper method sort datanodes using an individual RPC call for each key location info.
Improvements in this change:
Possible improvement left for later: send a single sortDatanodes request for all datanodes in all relevant pipelines, then create the per-pipeline lists locally.
https://issues.apache.org/jira/browse/HDDS-4473
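The improvement left for later (one sortDatanodes request for all datanodes, with per-pipeline lists built locally) might look something like the sketch below. It is not part of this patch. The class BatchSortSketch, the string node IDs, and the sorter function standing in for the RPC are hypothetical, and the approach assumes the order returned for the combined node list is also a valid order for each subset, which may not hold if the server breaks ties randomly.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

/** Hypothetical sketch of the deferred improvement; not in this patch. */
class BatchSortSketch {
  /**
   * Sorts the union of all pipelines' nodes with a single call to sorter
   * (an assumed stand-in for one sortDatanodes RPC), then derives each
   * pipeline's sorted list locally by filtering the global order.
   */
  static List<List<String>> sortAll(List<List<String>> pipelines,
      Function<List<String>, List<String>> sorter) {
    // Collect every distinct node across all pipelines.
    Set<String> union = new LinkedHashSet<>();
    for (List<String> pipeline : pipelines) {
      union.addAll(pipeline);
    }
    // The only remote call: one sort over all nodes.
    List<String> globalOrder = sorter.apply(new ArrayList<>(union));

    // Build the per-pipeline lists without further calls.
    List<List<String>> result = new ArrayList<>();
    for (List<String> pipeline : pipelines) {
      Set<String> members = new HashSet<>(pipeline);
      List<String> sorted = new ArrayList<>();
      for (String node : globalOrder) {
        if (members.contains(node)) {
          sorted.add(node);
        }
      }
      result.add(sorted);
    }
    return result;
  }
}
```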
How was this patch tested?
Added unit test for sortDatanodes and added check in existing test case for listStatus.