@Gargi-jais11
Contributor

@Gargi-jais11 Gargi-jais11 commented Oct 3, 2025

What changes were proposed in this pull request?

Currently, the DiskBalancer start, stop, and update commands are sent only to datanodes that are IN_SERVICE and HEALTHY, but the user is not told this. This change improves the CLI output message to state it explicitly, as shown below:

bash-5.1$ ozone admin datanode diskbalancer start -t 0.0001 -a
Starting DiskBalancer on datanode(s) which are IN_SERVICE and HEALTHY. 

When a start, stop, or update command is sent to a specific datanode that is not IN_SERVICE and HEALTHY, the command should be rejected, consistent with the behavior when the command is sent to all datanodes.
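The eligibility rule described above can be sketched roughly as follows. This is only an illustration, not the code in this patch: the class, enum, and method names (`DiskBalancerEligibilitySketch`, `OpState`, `Health`, `NodeStatus`, `eligibleHosts`, `rejectIneligible`) are hypothetical, and the error text is simplified compared with the real CLI output.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DiskBalancerEligibilitySketch {

  // Hypothetical stand-ins for a datanode's operational and health states.
  enum OpState { IN_SERVICE, DECOMMISSIONING, IN_MAINTENANCE }
  enum Health { HEALTHY, STALE, DEAD }

  // Hypothetical pairing of the two states for one datanode.
  static final class NodeStatus {
    final OpState opState;
    final Health health;

    NodeStatus(OpState opState, Health health) {
      this.opState = opState;
      this.health = health;
    }

    // A node qualifies only when it is both IN_SERVICE and HEALTHY.
    boolean isEligible() {
      return opState == OpState.IN_SERVICE && health == Health.HEALTHY;
    }
  }

  // Request for all datanodes: keep only the eligible ones.
  static List<String> eligibleHosts(Map<String, NodeStatus> cluster) {
    List<String> hosts = new ArrayList<>();
    for (Map.Entry<String, NodeStatus> e : cluster.entrySet()) {
      if (e.getValue().isEligible()) {
        hosts.add(e.getKey());
      }
    }
    return hosts;
  }

  // Request for specific datanodes: return one error line per rejected host.
  static List<String> rejectIneligible(Map<String, NodeStatus> cluster,
                                       List<String> requested) {
    List<String> errors = new ArrayList<>();
    for (String host : requested) {
      NodeStatus s = cluster.get(host);
      if (s == null) {
        errors.add(host + ": unknown datanode");
      } else if (!s.isEligible()) {
        errors.add(host + ": not eligible for disk balancing. NodeStatus: "
            + s.opState + "-" + s.health);
      }
    }
    return errors;
  }

  public static void main(String[] args) {
    Map<String, NodeStatus> cluster = new LinkedHashMap<>();
    cluster.put("dn-1", new NodeStatus(OpState.IN_SERVICE, Health.HEALTHY));
    cluster.put("dn-5", new NodeStatus(OpState.DECOMMISSIONING, Health.HEALTHY));
    System.out.println(eligibleHosts(cluster));
    System.out.println(rejectIneligible(cluster, List.of("dn-5")));
  }
}
```

In this sketch, a request for all datanodes filters out ineligible nodes, while a request that names a specific datanode reports an error for each ineligible one, so both paths apply the same IN_SERVICE and HEALTHY check.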

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-13667

How was this patch tested?

Updated the existing integration test TestDiskBalancerDuringDecommissionAndMaintenance.
Also tested manually on a Docker cluster:

bash-5.1$ ozone admin datanode decommission -id scmservice ozone-ha-datanode-5
Started decommissioning datanode(s):
ozone-ha-datanode-5

bash-5.1$ ozone admin datanode diskbalancer status
Status result:
Datanode                            Status          Threshold(%)    BandwidthInMB   Threads      SuccessMove  FailureMove  BytesMoved(MB)  EstBytesToMove(MB) EstTimeLeft(min)
ozone-ha-datanode-3.ozone-ha_default STOPPED         10.0000         200             5            0            0            0               0               0              
ozone-ha-datanode-2.ozone-ha_default STOPPED         10.0000         200             5            0            0            0               0               0              
ozone-ha-datanode-4.ozone-ha_default STOPPED         10.0000         200             5            0            0            0               0               0              
ozone-ha-datanode-1.ozone-ha_default STOPPED         10.0000         200             5            0            0            0               0               0              

Note: Estimated time left is calculated based on the estimated bytes to move and the configured disk bandwidth.

bash-5.1$ ozone admin datanode diskbalancer start -b 200 -d ozone-ha-datanode-5
Error: ozone-ha-datanode-5.ozone-ha_default: Datanode is not in optimal state for disk balancing. NodeStatus: DECOMMISSIONING(no expiry)-HEALTHY
Some nodes could not start DiskBalancer.

bash-5.1$ ozone admin datanode decommission -id scmservice ozone-ha-datanode-3
Started decommissioning datanode(s):
ozone-ha-datanode-3

bash-5.1$ ozone admin datanode decommission -id scmservice ozone-ha-datanode-4 
Started decommissioning datanode(s):
ozone-ha-datanode-4

bash-5.1$ ozone admin datanode diskbalancer start -t 0.002 -a
Starting DiskBalancer on datanode(s) which are IN_SERVICE and HEALTHY.

@Gargi-jais11 Gargi-jais11 marked this pull request as ready for review October 3, 2025 09:01
Contributor

@sarvekshayr sarvekshayr left a comment

Thanks for the patch @Gargi-jais11.

@sarvekshayr
Contributor

Thanks for updating the patch. Please update the log message shown in the first output of the PR description.

Contributor

@sumitagrawl sumitagrawl left a comment

@Gargi-jais11 I have left a minor comment.

Contributor

@sumitagrawl sumitagrawl left a comment

LGTM

@ChenSammi
Contributor

Thanks @Gargi-jais11 for the contribution, and @sumitagrawl for the review.

@ChenSammi ChenSammi merged commit 2fa932c into apache:HDDS-5713 Oct 16, 2025
83 of 84 checks passed

4 participants