@Gargi-jais11
Contributor

@Gargi-jais11 Gargi-jais11 commented Mar 12, 2025

What changes were proposed in this pull request?

This change adds an estimate of the time remaining before disk usage becomes even. The value is only an estimate because other activities run at the same time: block deletion, new container creation, container replica deletion, and new data ingestion.

The value roughly indicates how much time is still needed for disk usage to become even.

The value should be included in the status report too.

The status report gives the estimated time remaining before disk usage becomes even, assuming the happy path.

What is the link to the Apache JIRA?

https://issues.apache.org/jira/browse/HDDS-12437

How was this patch tested?

A unit test for the function was added in testDiskBalancerService#testCalculateBytesToMove.

It was also tested manually on a local docker cluster.

bash-5.1$ ozone admin datanode diskbalancer status
Status result:
Datanode                            VolumeDensity             Status          Threshold(%)    BandwidthInMB   Threads      SuccessMove  FailureMove  EstBytesToMove(MB) EstTimeLeft(min)
ozone-datanode-2.ozone_default      0.001334742544789370      RUNNING         0.0002          10              5            5            0            816                2           
ozone-datanode-1.ozone_default      0.001486062001891789      RUNNING         0.0002          10              5            5            0            3185               6           

Note: Estimated time left is calculated based on the estimated bytes to move and the configured disk bandwidth.
bash-5.1$ ozone admin datanode diskbalancer status
Status result:
Datanode                            VolumeDensity             Status          Threshold(%)    BandwidthInMB   Threads      SuccessMove  FailureMove  EstBytesToMove(MB) EstTimeLeft(min)
ozone-datanode-2.ozone_default      0.001334742544789370      RUNNING         0.0002          10              5            8            2            683                2           
ozone-datanode-1.ozone_default      0.001486062001891789      RUNNING         0.0002          10              5            5            3            763                2           

Note: Estimated time left is calculated based on the estimated bytes to move and the configured disk bandwidth.
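
As a rough illustration of the note above: the sample rows are consistent with dividing EstBytesToMove(MB) by the bandwidth and rounding up to whole minutes. The sketch below assumes that BandwidthInMB means MB per second and that the result is rounded up. Both assumptions are inferred from the sample output, and the class and method names are hypothetical, not the code from this patch.

```java
public final class DiskBalancerEstimateSketch {

  private DiskBalancerEstimateSketch() {
  }

  /** Integer division that rounds up; both arguments must be positive. */
  private static long ceilDiv(long a, long b) {
    return (a + b - 1) / b;
  }

  /**
   * Hypothetical helper: estimated minutes left, given the estimated
   * bytes to move (in MB) and the configured bandwidth (assumed MB/s).
   */
  static long estimateMinutesLeft(long bytesToMoveMB, long bandwidthMBPerSec) {
    if (bandwidthMBPerSec <= 0) {
      throw new IllegalArgumentException("bandwidth must be positive");
    }
    if (bytesToMoveMB <= 0) {
      return 0; // nothing left to move
    }
    long seconds = ceilDiv(bytesToMoveMB, bandwidthMBPerSec);
    return ceilDiv(seconds, 60);
  }

  public static void main(String[] args) {
    // Values taken from the first status output above (bandwidth = 10).
    System.out.println(estimateMinutesLeft(816, 10));  // 2
    System.out.println(estimateMinutesLeft(3185, 10)); // 6
  }
}
```

With these assumptions, the second status output (683 MB and 763 MB at bandwidth 10) also yields 2 minutes for each datanode.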

Gargi Jaiswal added 5 commits March 11, 2025 15:22
…fore disk usage becomes even

# Conflicts:
#	hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/diskbalancer/DiskBalancerInfo.java
#	hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/diskbalancer/DiskBalancerService.java
#	hadoop-hdds/interface-client/src/main/proto/hdds.proto
#	hadoop-hdds/interface-server/src/main/proto/ScmServerDatanodeHeartbeatProtocol.proto
#	hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/DiskBalancerManager.java
#	hadoop-hdds/tools/src/main/java/org/apache/hadoop/hdds/scm/cli/datanode/DiskBalancerStatusSubcommand.java
@ChenSammi
Contributor

@Gargi-jais11, let's show EstBytesToMove(MB) together with the estimated time. Also add a note at the end of the command output explaining that the estimated time is calculated from the estimated bytes to move and the bandwidth.

@Gargi-jais11
Contributor Author

Sure, I will make the changes mentioned above.

@Gargi-jais11 Gargi-jais11 requested a review from ChenSammi March 13, 2025 12:24
getDiskBalancerService(containerSet, conf, keyValueHandler, null, 1);
svc.setShouldRun(true);
svc.setThreshold(10);
svc.setQueueSize(2);
Contributor

Why hard code the queue size to 2?

Contributor Author

If I don't hard-code the queue size, it is always 0 when the test checks the actual value of bytesToMove through calculateBytesToMove, because getTask is not called. When I tried calling getTask, containerChoosingPolicy threw a lot of errors. So I hard-coded the queue size just to verify calculateBytesToMove.

Contributor

You can pass the volumeSet and OzoneConfiguration to calculateBytesToMove() and make it public, so that you can directly call calculateBytesToMove from the unit test, and don't have to deal with the containerChoosingPolicy.

Contributor Author

I tried passing the volumeSet and OzoneConfiguration to calculateBytesToMove() without hard-coding the queue size to 2 (or to the volume count), but bytesToMove still came back as 0 because the queue size was 0. After refactoring the unit test as you suggested and keeping the queue size hard-coded to 2, all test cases pass.
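
For illustration only, here is a minimal sketch of the kind of function the suggestion above points toward: a calculation that takes the volume usage directly, so a unit test can call it without going through the task queue or containerChoosingPolicy. The formula is an assumption (bytes above the datanode's average utilization plus the threshold), since the actual calculation is not shown in this thread. VolumeUsage and BytesToMoveSketch are hypothetical names, not Ozone classes.

```java
import java.util.Arrays;
import java.util.List;

public final class BytesToMoveSketch {

  private BytesToMoveSketch() {
  }

  /** Hypothetical stand-in for one volume's usage; not an Ozone class. */
  static final class VolumeUsage {
    final long capacityBytes;
    final long usedBytes;

    VolumeUsage(long capacityBytes, long usedBytes) {
      this.capacityBytes = capacityBytes;
      this.usedBytes = usedBytes;
    }
  }

  /**
   * Assumed formula: sum the bytes by which each volume exceeds
   * (average utilization + threshold). Not taken from the patch.
   */
  static long calculateBytesToMove(List<VolumeUsage> volumes,
      double thresholdPercent) {
    long totalUsed = 0;
    long totalCapacity = 0;
    for (VolumeUsage v : volumes) {
      totalUsed += v.usedBytes;
      totalCapacity += v.capacityBytes;
    }
    if (totalCapacity <= 0) {
      return 0; // no usable volumes, nothing to balance
    }
    double upperLimit =
        (double) totalUsed / totalCapacity + thresholdPercent / 100.0;

    long bytesToMove = 0;
    for (VolumeUsage v : volumes) {
      if (v.capacityBytes <= 0) {
        continue; // skip volumes without capacity
      }
      double utilization = (double) v.usedBytes / v.capacityBytes;
      if (utilization > upperLimit) {
        bytesToMove += v.usedBytes - (long) (upperLimit * v.capacityBytes);
      }
    }
    return bytesToMove;
  }

  public static void main(String[] args) {
    List<VolumeUsage> volumes = Arrays.asList(
        new VolumeUsage(100, 100), new VolumeUsage(100, 0));
    System.out.println(calculateBytesToMove(volumes, 25.0));
  }
}
```

Because the function depends only on its arguments, a test can build a small list of volumes and assert on the result directly, without setting a queue size.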

@ChenSammi
Contributor

Please also update the console output.

@Gargi-jais11
Contributor Author

Done.

@ChenSammi ChenSammi marked this pull request as ready for review March 21, 2025 10:41
@ChenSammi
Contributor

Thanks @Gargi-jais11 .

@ChenSammi ChenSammi merged commit 2f2d730 into apache:HDDS-5713 Mar 24, 2025
43 checks passed