@jojochuang
Contributor

What changes were proposed in this pull request?

HDDS-13138. [Docs] Update Topology Awareness user doc.

Please describe your PR in detail:

  • A couple of things to update in the doc: the default placement policy has changed.
  • Additional placement policies: SCMContainerPlacementCapacity
  • Erasure Coding placement policy: SCMCommonPlacementPolicy

Generated-by: Google Gemini 2.5 Pro (Preview) Deep Research, with the prompt:

I want to update the Ozone's Topology Awareness user doc: https://ozone.apache.org/docs/edge/feature/topology.html

First, look at the resources online and Ozone source code, and find out any stale content in the user doc. Find new placement policies not mentioned in the doc. Find out what they do and provide examples to illustrate how they work. Best practices.

Second, update the doc based on the new information incorporated.

Third, incorporate placement policy for Ozone Erasure Coding. Look at Ozone source code and design doc as source of truth.

and then updated the doc manually.

What is the link to the Apache JIRA?

https://issues.apache.org/jira/browse/HDDS-13138

How was this patch tested?

User doc only.

Change-Id: I66c09d82a3eb89edff3538c59fd53e131065cab8
@jojochuang jojochuang requested review from ChenSammi and Copilot May 30, 2025 05:13
Contributor

Copilot AI left a comment

Pull Request Overview

This PR updates the Topology Awareness user documentation for Apache Ozone by revising outdated content and incorporating new placement policies.

  • Updated descriptions of topology-aware operations including both static and dynamic mapping.
  • Introduced new sections detailing the additional placement policies for both RATIS and Erasure Coded (EC) containers.
  • Revised configuration examples and best practice guidelines for network topology and container placements.
Comments suppressed due to low confidence (2)

hadoop-hdds/docs/content/feature/Topology.md:57

  • Ensure that the updated topology mapping file path (/etc/ozone/topology.map) is consistent with deployment standards and accurately documented.
<value>/etc/ozone/topology.map</value>

hadoop-hdds/docs/content/feature/Topology.md:81

  • Confirm that the updated script path for dynamic rack mapping (/etc/ozone/determine_rack.sh) aligns with current deployment configurations and file system conventions.
<value>/etc/ozone/determine_rack.sh</value>

@jojochuang jojochuang added the documentation Improvements or additions to documentation label May 30, 2025
@ChenSammi
Contributor

Gemini produces the content, and Copilot reviews it. We are really in a new era :)


Ozone's topology-aware placement strategies vary by container replication type and state:

* **RATIS Replicated Containers:** Ozone uses RAFT replication for Open containers (write), and an async replication for closed, immutable containers (cold data). As RAFT requires low-latency network, topology awareness placement is available only for closed containers. See the [page about Containers](concept/Containers.md) about more information related to Open vs Closed containers.
Contributor

I believe this is not true: pipeline creation does consider topology.

Contributor Author

The default for hdds.scm.pipeline.choose.policy.impl is RandomPipelineChoosePolicy, and the same holds for EC: hdds.scm.ec.pipeline.choose.policy.impl.

/**
 * Random choose policy that randomly chooses pipeline.
 * That are we just randomly place containers without any considerations of
 * utilization.
 */
public class RandomPipelineChoosePolicy implements PipelineChoosePolicy {
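
For illustration only, a random choose policy of this kind amounts to a uniform pick from the pipelines that already qualify. The sketch below is not the Ozone class: `RandomChooseSketch`, `choose`, and the use of plain strings as pipeline IDs are hypothetical names chosen for this example.

```java
import java.util.List;
import java.util.Random;

/** Hypothetical sketch; not the actual RandomPipelineChoosePolicy. */
final class RandomChooseSketch {
  private RandomChooseSketch() { }

  /**
   * Returns one element picked uniformly at random from the candidates.
   * Neither utilization nor topology is consulted.
   */
  static <T> T choose(List<T> candidates, Random random) {
    if (candidates.isEmpty()) {
      throw new IllegalArgumentException("no candidate pipelines");
    }
    return candidates.get(random.nextInt(candidates.size()));
  }

  public static void main(String[] args) {
    // Strings stand in for pipeline IDs (an assumption for this sketch).
    List<String> pipelines = List.of("pipeline-1", "pipeline-2", "pipeline-3");
    System.out.println(choose(pipelines, new Random()));
  }
}
```

The pick happens among pipelines that already exist, so it says nothing about how the datanodes of each pipeline were placed.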

Contributor

RandomPipelineChoosePolicy only decides which pipeline to use when there are multiple qualified pipelines.

What I mean is RAFT pipeline creation: when a RAFT pipeline is created, topology is taken into account.

PipelinePlacementPolicy#chooseDatanodesInternal:

    // Randomly picks nodes when all nodes are equal or factor is ONE.
    // This happens when network topology is absent or
    // all nodes are on the same rack.
    if (checkAllNodesAreEqual(nodeManager.getClusterNetworkTopologyMap())) {
      return super.getResultSet(nodesRequired, healthyNodes);
    } else {
      // Since topology and rack awareness are available, picks nodes
      // based on them.
      return this.getResultSetWithTopology(nodesRequired, healthyNodes,
          usedNodes);
    }
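
The branch above can be mimicked in a small standalone sketch. Everything in it is an assumption made for illustration: `TopologyPlacementSketch`, `allNodesOnSameRack`, `chooseNodes`, and the node-to-rack map are hypothetical, and the spread-across-racks loop is a simplified stand-in that does not reproduce `getResultSetWithTopology`.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.Set;

/** Hypothetical sketch; not the actual PipelinePlacementPolicy. */
final class TopologyPlacementSketch {
  private TopologyPlacementSketch() { }

  /** True when every node maps to the same rack, so topology carries no signal. */
  static boolean allNodesOnSameRack(Map<String, String> nodeToRack) {
    return new HashSet<>(nodeToRack.values()).size() <= 1;
  }

  /** Picks nodesRequired distinct nodes, spreading across racks when possible. */
  static List<String> chooseNodes(Map<String, String> nodeToRack,
      int nodesRequired, Random random) {
    if (nodesRequired < 0 || nodesRequired > nodeToRack.size()) {
      throw new IllegalArgumentException("not enough nodes");
    }
    List<String> candidates = new ArrayList<>(nodeToRack.keySet());
    Collections.shuffle(candidates, random);
    List<String> chosen = new ArrayList<>();

    if (!allNodesOnSameRack(nodeToRack)) {
      // Topology is usable: first take at most one node per rack.
      Set<String> usedRacks = new HashSet<>();
      for (String node : candidates) {
        if (chosen.size() < nodesRequired
            && usedRacks.add(nodeToRack.get(node))) {
          chosen.add(node);
        }
      }
    }
    // Fill the remaining slots in shuffled order. With flat or absent
    // topology this loop alone makes the whole (random) selection.
    for (String node : candidates) {
      if (chosen.size() < nodesRequired && !chosen.contains(node)) {
        chosen.add(node);
      }
    }
    return chosen;
  }

  public static void main(String[] args) {
    Map<String, String> nodeToRack = new LinkedHashMap<>();
    nodeToRack.put("dn1", "/rack1");
    nodeToRack.put("dn2", "/rack1");
    nodeToRack.put("dn3", "/rack2");
    nodeToRack.put("dn4", "/rack2");
    System.out.println(chooseNodes(nodeToRack, 3, new Random()));
  }
}
```

With two racks and three nodes requested, the sketch always returns nodes from both racks. When every node reports the same rack, it degrades to a plain random pick, which matches the two cases in the snippet's comments.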

Contributor Author

Got it. Good to know.

@ChenSammi
Contributor

Thanks @jojochuang, I have reviewed the RATIS part. For the EC part, it would be better to have someone familiar with the EC implementation take a look.

@jojochuang
Contributor Author

jojochuang commented Jun 5, 2025

I'd like to add pipeline choosing policy such as https://issues.apache.org/jira/browse/HDDS-9345
-but maybe in another jira.-

Edit: added them. It was super easy to add using the GitHub Copilot Agent.

Change-Id: Ieee0d4dc2d7748d09dccef38966be89e133ef590
Change-Id: I91a0d3f4b379a535245a3acbeef7f274d944cf6c
Change-Id: Ib39a35bb4bf4abd564790620ec7fbe22df84b239
Change-Id: I84e6ff3130d8a133e7011729264c82ba99f46579
@jojochuang jojochuang marked this pull request as ready for review June 6, 2025 00:50
Change-Id: I55128582234d59f889b1a918cbba1f87d7d4a551
Change-Id: I854ff3efcfba508f4ecae632c57ca99d578dff3c
@jojochuang
Contributor Author

OK, I removed the EC section from the doc. I will commit just the Ratis-related changes and leave EC topology awareness to a subtask: HDDS-13575.

@jojochuang jojochuang merged commit cc2a42d into apache:master Aug 14, 2025
14 checks passed
@jojochuang
Contributor Author

Thanks @ChenSammi, I will raise another PR for the EC doc.

errose28 added a commit to errose28/ozone that referenced this pull request Aug 18, 2025
* master:
  HDDS-13553. Recon Staging DB for OM full db reprocess (apache#8917)
  HDDS-13138. [Docs] Update Topology Awareness user doc. (apache#8528)
  HDDS-11944. Usability improvements for Ozone tools. (apache#7597)
  HDDS-12197. Update documentation for all ozone debug tools (apache#8868)

Labels

AI-gen, documentation (Improvements or additions to documentation)
