HDDS-13138. [Docs] Update Topology Awareness user doc. #8528
Change-Id: I66c09d82a3eb89edff3538c59fd53e131065cab8
Pull Request Overview
This PR updates the Topology Awareness user documentation for Apache Ozone by revising outdated content and incorporating new placement policies.
- Updated descriptions of topology-aware operations including both static and dynamic mapping.
- Introduced new sections detailing the additional placement policies for both RATIS and Erasure Coded (EC) containers.
- Revised configuration examples and best practice guidelines for network topology and container placements.
Comments suppressed due to low confidence (2)
hadoop-hdds/docs/content/feature/Topology.md:57
- Ensure that the updated topology mapping file path (/etc/ozone/topology.map) is consistent with deployment standards and accurately documented.
<value>/etc/ozone/topology.map</value>
hadoop-hdds/docs/content/feature/Topology.md:81
- Confirm that the updated script path for dynamic rack mapping (/etc/ozone/determine_rack.sh) aligns with current deployment configurations and file system conventions.
<value>/etc/ozone/determine_rack.sh</value>
Gemini produces the content, and Copilot reviews the content. We are really into a new era :)
Ozone's topology-aware placement strategies vary by container replication type and state:

* **RATIS Replicated Containers:** Ozone uses RAFT replication for Open containers (write), and an async replication for closed, immutable containers (cold data). As RAFT requires low-latency network, topology awareness placement is available only for closed containers. See the [page about Containers](concept/Containers.md) about more information related to Open vs Closed containers.
I believe this is not true, pipeline creation considers topology.
The default for hdds.scm.pipeline.choose.policy.impl is RandomPipelineChoosePolicy, and the same holds for EC: hdds.scm.ec.pipeline.choose.policy.impl.
/**
 * Random choose policy that randomly chooses pipeline.
 * That are we just randomly place containers without any considerations of
 * utilization.
 */
public class RandomPipelineChoosePolicy implements PipelineChoosePolicy {
RandomPipelineChoosePolicy decides which pipeline to use when multiple pipelines qualify.
What I mean is RAFT pipeline creation: when a RAFT pipeline is created, topology is considered.
See PipelinePlacementPolicy#chooseDatanodesInternal:
// Randomly picks nodes when all nodes are equal or factor is ONE.
// This happens when network topology is absent or
// all nodes are on the same rack.
if (checkAllNodesAreEqual(nodeManager.getClusterNetworkTopologyMap())) {
  return super.getResultSet(nodesRequired, healthyNodes);
} else {
  // Since topology and rack awareness are available, picks nodes
  // based on them.
  return this.getResultSetWithTopology(nodesRequired, healthyNodes,
      usedNodes);
}
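The branch quoted above can be illustrated with a minimal, self-contained sketch. It is not Ozone's PipelinePlacementPolicy: the `PlacementSketch` class, the `Node` type, the method names, and the "at least one node on a different rack" rule are all hypothetical simplifications chosen only to show the two paths (random choice when no usable topology exists, rack-aware choice otherwise).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

// Hypothetical, simplified stand-in for illustration; not Ozone code.
public class PlacementSketch {

  /** A datanode reduced to a name and a rack path such as "/rack1". */
  static final class Node {
    final String name;
    final String rack;

    Node(String name, String rack) {
      this.name = name;
      this.rack = rack;
    }
  }

  /** True when topology is absent or every node sits on the same rack. */
  static boolean allNodesAreEqual(List<Node> nodes) {
    Set<String> racks = new HashSet<>();
    for (Node n : nodes) {
      racks.add(n.rack);
    }
    return racks.size() <= 1;
  }

  /** Picks `required` nodes, spreading across racks when racks differ. */
  static List<Node> chooseNodes(List<Node> healthy, int required, Random rnd) {
    if (required < 1 || healthy.size() < required) {
      throw new IllegalArgumentException("not enough healthy nodes");
    }
    List<Node> pool = new ArrayList<>(healthy);
    Collections.shuffle(pool, rnd);

    // Random path: a single replica, or no usable topology information.
    if (required == 1 || allNodesAreEqual(pool)) {
      return new ArrayList<>(pool.subList(0, required));
    }

    // Topology path (assumed rule): anchor node, then one node on a
    // different rack, then fill the remaining slots from the pool.
    List<Node> chosen = new ArrayList<>();
    Node anchor = pool.get(0);
    chosen.add(anchor);
    for (Node n : pool) {
      if (!n.rack.equals(anchor.rack)) {
        chosen.add(n);
        break;
      }
    }
    for (Node n : pool) {
      if (chosen.size() == required) {
        break;
      }
      if (!chosen.contains(n)) {
        chosen.add(n);
      }
    }
    return chosen;
  }

  public static void main(String[] args) {
    List<Node> nodes = List.of(
        new Node("dn1", "/rack1"), new Node("dn2", "/rack1"),
        new Node("dn3", "/rack2"), new Node("dn4", "/rack2"));
    for (Node n : chooseNodes(nodes, 3, new Random())) {
      System.out.println(n.name + " on " + n.rack);
    }
  }
}
```

With nodes spread over two racks, the three chosen nodes always cover both racks in this sketch; if every node reports the same rack, the choice falls back to a plain random pick.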
Got it. Good to know.
Thanks @jojochuang, I have reviewed the RATIS part. For the EC part, it would be better to have someone familiar with the EC implementation take a look.
I'd like to add pipeline choosing policies such as the one in https://issues.apache.org/jira/browse/HDDS-9345. Edit: added them. It was super easy to add using GitHub Copilot Agent.
Change-Id: Ieee0d4dc2d7748d09dccef38966be89e133ef590
Change-Id: I91a0d3f4b379a535245a3acbeef7f274d944cf6c
Change-Id: Ib39a35bb4bf4abd564790620ec7fbe22df84b239
Change-Id: I854ff3efcfba508f4ecae632c57ca99d578dff3c
OK, I removed the EC section from the doc. I will commit just the Ratis-related changes and leave EC topology awareness to a subtask: HDDS-13575
Thanks @ChenSammi, I will raise another PR for the EC doc.
* master:
  HDDS-13553. Recon Staging DB for OM full db reprocess (apache#8917)
  HDDS-13138. [Docs] Update Topology Awareness user doc. (apache#8528)
  HDDS-11944. Usability improvements for Ozone tools. (apache#7597)
  HDDS-12197. Update documentation for all ozone debug tools (apache#8868)
What changes were proposed in this pull request?
HDDS-13138. [Docs] Update Topology Awareness user doc.
Please describe your PR in detail:
Generated-by: Google Gemini 2.5 Pro (Preview) Deep Research, with the prompt:
and then updated the doc manually.
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-13138
How was this patch tested?
User doc only.