@Gargi-jais11 Gargi-jais11 commented May 27, 2025

What changes were proposed in this pull request?

This JIRA tracks the addition of a micro-benchmark test to evaluate the performance and efficiency of container selection logic in DiskBalancer.

Test Scenario: Test the container choosing efficiency.
SetUp:
Set up a cluster with one DN with 20 volumes and a total of 1 million containers.
Then run the diskbalancer command to test the container choosing efficiency.
Expected Behaviour:
Containers should be effectively chosen for balancing. Empty containers, open containers, containers under replication, and containers that are already being balanced must not be chosen.
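
The expected behaviour above can be read as a simple eligibility filter. The following is a minimal sketch of that filter, not the DefaultContainerChoosingPolicy implementation: `Candidate`, `State`, `isEligible` and `eligibleIds` are hypothetical stand-ins introduced only for illustration, and "under replication" is modelled as a plain boolean flag.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

final class EligibilitySketch {
  // Hypothetical stand-in for a container state; not the Ozone proto enum.
  enum State { OPEN, CLOSED }

  // Hypothetical stand-in for container metadata; not an Ozone class.
  static final class Candidate {
    final long id;
    final State state;
    final long usedBytes;
    final boolean beingReplicated;

    Candidate(long id, State state, long usedBytes, boolean beingReplicated) {
      this.id = id;
      this.state = state;
      this.usedBytes = usedBytes;
      this.beingReplicated = beingReplicated;
    }
  }

  // A container qualifies only if it is closed, non-empty, not being
  // replicated, and not already in the in-progress (being balanced) set.
  static boolean isEligible(Candidate c, Set<Long> inProgress) {
    return c.state == State.CLOSED
        && c.usedBytes > 0
        && !c.beingReplicated
        && !inProgress.contains(c.id);
  }

  static List<Long> eligibleIds(List<Candidate> all, Set<Long> inProgress) {
    return all.stream()
        .filter(c -> isEligible(c, inProgress))
        .map(c -> c.id)
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    Set<Long> inProgress = new HashSet<>(Arrays.asList(3L));
    List<Candidate> all = Arrays.asList(
        new Candidate(1L, State.CLOSED, 10L, false),  // eligible
        new Candidate(2L, State.OPEN, 10L, false),    // open
        new Candidate(3L, State.CLOSED, 10L, false),  // already being balanced
        new Candidate(4L, State.CLOSED, 0L, false),   // empty
        new Candidate(5L, State.CLOSED, 10L, true));  // being replicated
    System.out.println(eligibleIds(all, inProgress)); // prints [1]
  }
}
```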

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-13055

How was this patch tested?

Added a performance test with fewer containers.

Gargi-jais11 commented May 27, 2025

package org.apache.hadoop.ozone.scm.node;

import static org.apache.hadoop.ozone.container.common.impl.ContainerImplTestUtils.newContainerSet;
import static org.junit.jupiter.api.Assertions.assertTrue;
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.when;

import java.io.IOException;
import java.nio.file.Path;
import java.time.Duration;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;
import org.apache.hadoop.hdds.conf.OzoneConfiguration;
import org.apache.hadoop.hdds.fs.MockSpaceUsageCheckFactory;
import org.apache.hadoop.hdds.fs.MockSpaceUsageSource;
import org.apache.hadoop.hdds.fs.SpaceUsageCheckFactory;
import org.apache.hadoop.hdds.fs.SpaceUsagePersistence;
import org.apache.hadoop.hdds.fs.SpaceUsageSource;
import org.apache.hadoop.hdds.protocol.datanode.proto.ContainerProtos.ContainerDataProto;
import org.apache.hadoop.ozone.container.ContainerTestHelper;
import org.apache.hadoop.ozone.container.common.impl.ContainerData;
import org.apache.hadoop.ozone.container.common.impl.ContainerLayoutVersion;
import org.apache.hadoop.ozone.container.common.impl.ContainerSet;
import org.apache.hadoop.ozone.container.common.volume.HddsVolume;
import org.apache.hadoop.ozone.container.diskbalancer.policy.ContainerChoosingPolicy;
import org.apache.hadoop.ozone.container.diskbalancer.policy.DefaultContainerChoosingPolicy;
import org.apache.hadoop.ozone.container.keyvalue.KeyValueContainer;
import org.apache.hadoop.ozone.container.keyvalue.KeyValueContainerData;
import org.apache.hadoop.ozone.container.ozoneimpl.ContainerController;
import org.apache.hadoop.ozone.container.ozoneimpl.OzoneContainer;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.io.TempDir;

/**
 * This class tests the performance of the ContainerChoosingPolicy.
 */
public class TestContainerChoosingPolicyPerformance {

  private static final int NUM_VOLUMES = 20;
  private static final int NUM_CONTAINERS = 1000000; // 1 million containers
  private static final int NUM_THREADS = 100;
  private static final int NUM_ITERATIONS = 10000;

  private static final OzoneConfiguration CONF = new OzoneConfiguration();

  @TempDir
  private Path baseDir;

  private List<HddsVolume> volumes;
  private ContainerSet containerSet;
  private OzoneContainer ozoneContainer;
  private ContainerChoosingPolicy containerChoosingPolicy;
  private ExecutorService executor;
  private ContainerController containerController;

  // Simulate containers currently being balanced (in progress)
  private Set<Long> inProgressContainerIDs = ConcurrentHashMap.newKeySet();

  @BeforeEach
  public void setup() throws Exception {
    containerSet = newContainerSet();
    createVolumes(); // Create 20 volumes
    createContainers(); // Create 1 million containers with 1000 open, rest closed

    ozoneContainer = mock(OzoneContainer.class);
    containerController = new ContainerController(containerSet, null);
    when(ozoneContainer.getController()).thenReturn(containerController);

    containerChoosingPolicy = new DefaultContainerChoosingPolicy();
    executor = Executors.newFixedThreadPool(Math.min(NUM_THREADS, Runtime.getRuntime().availableProcessors()));
  }

  // @AfterEach
  public void cleanUp() {
    volumes.forEach(HddsVolume::shutdown);

    // Shutdown executor service
    if (executor != null && !executor.isShutdown()) {
      executor.shutdownNow();
    }

    // Clear in-progress container IDs
    inProgressContainerIDs.clear();

    // Clear ContainerSet
    containerSet = null;
  }

  @Test
  public void testConcurrentVolumeChoosing() throws Exception {
    for (int i = 0; i < 3; i++) {
      setup();
      testPolicyPerformance("ContainerChoosingPolicy", containerChoosingPolicy);
      cleanUp();
    }
  }

  /*
   * containerChosenCount: number of calls in which the policy returned a container.
   * containerNotChosenCount: number of calls in which the policy returned null.
   * failureCount: number of calls that threw an exception during container choice.
   */
  private void testPolicyPerformance(String policyName, ContainerChoosingPolicy policy) throws Exception {
    CountDownLatch latch = new CountDownLatch(NUM_THREADS);
    AtomicInteger containerChosenCount = new AtomicInteger(0);
    AtomicInteger containerNotChosenCount = new AtomicInteger(0);
    AtomicInteger failureCount = new AtomicInteger(0);
    AtomicLong totalTimeNanos = new AtomicLong(0);

    Random rand = new Random();

    for (int i = 0; i < NUM_THREADS; i++) {
      executor.submit(() -> {
        try {
          long threadStart = System.nanoTime();
          int containerChosen = 0;
          int containerNotChosen = 0;
          int failures = 0;

          for (int j = 0; j < NUM_ITERATIONS; j++) {
            try {
              // Choose a random volume
              HddsVolume volume = volumes.get(rand.nextInt(NUM_VOLUMES));
              ContainerData c = policy.chooseContainer(ozoneContainer, volume, inProgressContainerIDs);
              if (c == null) {
                containerNotChosen++;
              } else {
                containerChosen++;
                inProgressContainerIDs.add(c.getContainerID());
              }
            } catch (Exception e) {
              failures++;
            }
          }

          long threadEnd = System.nanoTime();
          totalTimeNanos.addAndGet(threadEnd - threadStart);
          containerChosenCount.addAndGet(containerChosen);
          containerNotChosenCount.addAndGet(containerNotChosen);
          failureCount.addAndGet(failures);
        } finally {
          latch.countDown();
        }
      });
    }

    // Wait max 50 minutes for test completion
    assertTrue(latch.await(50, TimeUnit.MINUTES), "Test timed out");

    // totalTimeNanos is the sum of per-thread elapsed times, not wall-clock time.
    long totalOperations = (long) NUM_THREADS * NUM_ITERATIONS;
    double avgTimePerOp = (double) totalTimeNanos.get() / totalOperations;
    double opsPerSec = totalOperations / (totalTimeNanos.get() / 1_000_000_000.0);

    System.out.println("Performance results for " + policyName);
    System.out.println("Total operations: " + totalOperations);
    System.out.println("Container Chosen operations: " + containerChosenCount.get());
    System.out.println("Container Not Chosen operations: " + containerNotChosenCount.get());
    System.out.println("Failed operations: " + failureCount.get());
    System.out.println("Total time (ms): " + totalTimeNanos.get() / 1_000_000);
    System.out.println("Average time per operation (ns): " + avgTimePerOp);
    System.out.println("Operations per second: " + opsPerSec);
  }

  public void createVolumes() throws IOException {
    // Create volumes with mocked space usage
    volumes = new ArrayList<>();
    for (int i = 0; i < NUM_VOLUMES; i++) {
      String volumePath = baseDir.resolve("disk" + i).toString();
      SpaceUsageSource source = MockSpaceUsageSource.fixed(1000000000, 1000000000 - i * 50000);
      SpaceUsageCheckFactory factory = MockSpaceUsageCheckFactory.of(
          source, Duration.ZERO, SpaceUsagePersistence.None.INSTANCE);
      HddsVolume volume = new HddsVolume.Builder(volumePath)
          .conf(CONF)
          .usageCheckFactory(factory)
          .build();
      volumes.add(volume);
    }
  }

  public void createContainers() {
    List<Long> closedContainerIDs = new ArrayList<>();
    Random random = new Random();

    for (int i = 0; i < NUM_CONTAINERS; i++) {
      boolean isOpen = i < 1000; // First 1000 containers are open
      int volumeIndex = i % NUM_VOLUMES; // Distribute containers across volumes
      HddsVolume volume = volumes.get(volumeIndex);

      KeyValueContainerData containerData = new KeyValueContainerData(
          i, ContainerLayoutVersion.FILE_PER_BLOCK, ContainerTestHelper.CONTAINER_MAX_SIZE,
          UUID.randomUUID().toString(), UUID.randomUUID().toString());

      containerData.setState(isOpen ? ContainerDataProto.State.OPEN : ContainerDataProto.State.CLOSED);
      containerData.setVolume(volume);

      KeyValueContainer container = new KeyValueContainer(containerData, CONF);

      try {
        containerSet.addContainer(container); // Add container to ContainerSet
      } catch (Exception e) {
        throw new RuntimeException("Failed to add container to ContainerSet", e);
      }

      // Collect IDs of closed containers
      if (!isOpen) {
        closedContainerIDs.add((long) i);
      }
    }

    // Randomly select 1000 closed containers to be in-progress
    Collections.shuffle(closedContainerIDs, random);
    inProgressContainerIDs.addAll(closedContainerIDs.subList(0, 1000));
  }
}

Gargi-jais11 commented May 27, 2025

The code above is the micro-benchmark performance test for the container choosing policy in DiskBalancer.

Gargi-jais11 commented May 27, 2025

NUM_VOLUMES = 20;

NUM_CONTAINERS = 100000;
NUM_THREADS = 100;
NUM_ITERATIONS = 10000;


I. Performance results for ContainerChoosingPolicy
	Total operations: 1000000
	Container Chosen operations: 111131
	Container Not Chosen operations: 888869
	Failed operations: 0
	Total time (ms): 9327312
	Average time per operation (ns): 9327312.056165
	Operations per second: 107.21202356889496
	
II. Performance results for ContainerChoosingPolicy
	Total operations: 1000000
	Container Chosen operations: 108265
	Container Not Chosen operations: 891735
	Failed operations: 0
	Total time (ms): 8158006
	Average time per operation (ns): 8158006.623745
	Operations per second: 122.57896397011525
	
III. Performance results for ContainerChoosingPolicy
	Total operations: 1000000
	Container Chosen operations: 110885
	Container Not Chosen operations: 889115
	Failed operations: 0
	Total time (ms): 9459847
	Average time per operation (ns): 9459847.270908
	Operations per second: 105.70995190115954

Overall, approximately 157 minutes for 100,000 (1 lakh) containers across 1M concurrent operations.
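
For reference, the derived figures in these results follow from the arithmetic in testPolicyPerformance, where the total time is the sum of the per-thread elapsed times. A small sketch of that arithmetic, using the run I numbers; the class and method names here are illustrative only, and the total in nanoseconds is reconstructed from the reported average multiplied by the operation count:

```java
final class ThroughputSketch {
  // Average latency: cumulative thread time divided by the number of calls.
  static double avgNanosPerOp(long totalNanos, long operations) {
    return (double) totalNanos / operations;
  }

  // Throughput as the benchmark reports it: operations per second of
  // cumulative thread time (not per second of wall-clock time).
  static double opsPerSecond(long totalNanos, long operations) {
    return operations / (totalNanos / 1_000_000_000.0);
  }

  public static void main(String[] args) {
    long operations = 100L * 10_000;          // NUM_THREADS * NUM_ITERATIONS
    long totalNanos = 9_327_312_056_165L;     // run I: 9327312.056165 ns/op * 1,000,000 ops
    System.out.println("Average time per operation (ns): "
        + avgNanosPerOp(totalNanos, operations));
    System.out.println("Operations per second: "
        + opsPerSecond(totalNanos, operations));
    System.out.println("Cumulative thread time (min): "
        + totalNanos / 1_000_000_000.0 / 60.0);
  }
}
```

Because the per-thread times are added together, the reported total grows with the number of threads that run concurrently and should not be read as elapsed wall-clock time.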



@Gargi-jais11 Gargi-jais11 marked this pull request as ready for review May 27, 2025 09:31
ChenSammi commented May 28, 2025

@Gargi-jais11, please commit TestContainerChoosingPolicyPerformance with a smaller NUM_CONTAINERS.

About TestContainerChoosingPolicyPerformance:

  1. Don't mock the containerController.getContainers(volume) call. We are trying to measure how much time is spent in chooseContainer(), and mocking containerController.getContainers(volume) would make the measurement unrealistic.
  2. inProgressContainerIDs.add is called in the wrong place.
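
One way to keep the measurement focused on the call under test is to time only that call and do any bookkeeping outside the timed region. The helper below is a generic sketch under that assumption; `Timed` and `time` are hypothetical names, not part of Ozone or of this patch, and the lambda in `main` merely stands in for a call such as chooseContainer().

```java
import java.util.function.Supplier;

final class TimedCallSketch {
  // Result of one timed call: the returned value and the elapsed nanoseconds.
  static final class Timed<T> {
    final T value;
    final long nanos;

    Timed(T value, long nanos) {
      this.value = value;
      this.nanos = nanos;
    }
  }

  // Times exactly one invocation of the supplied call and nothing else.
  static <T> Timed<T> time(Supplier<T> call) {
    long start = System.nanoTime();
    T value = call.get();
    long elapsed = System.nanoTime() - start;
    return new Timed<>(value, elapsed);
  }

  public static void main(String[] args) {
    // Placeholder workload standing in for the real call under test.
    Timed<Integer> result = time(() -> 21 * 2);
    // Bookkeeping (counters, in-progress sets) would go here, untimed.
    System.out.println("value=" + result.value + " nanos=" + result.nanos);
  }
}
```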

@jojochuang jojochuang requested review from ChenSammi and xichen01 June 2, 2025 16:53
@ChenSammi ChenSammi changed the title from "HDDS-13055. [DiskBalancer] Add micro benchmark to test container choosing efficiency" to "HDDS-13055. [DiskBalancer] Optimize DefaultContainerChoosingPolicy performance" Jun 5, 2025
@ChenSammi
Contributor

Without optimization

Created 100000 containers in 2066 ms
Performance results for ContainerChoosingPolicy
Total volumes: 20
Total containers: 100000
Total threads: 10
Total operations: 100000
Container Chosen operations: 100000
Container Not Chosen operations: 0
Failed operations: 0
Total time (ms): 1163929
Average time per operation (ns): 1.1639291495E7
Operations per second: 85.91588245981977

With optimization

Created 100000 containers in 2142 ms
Performance results for ContainerChoosingPolicy
Total volumes: 20
Total containers: 100000
Total threads: 10
Total operations: 100000
Container Chosen operations: 99983
Container Not Chosen operations: 17
Failed operations: 0
Total time (ms): 1593
Average time per operation (ns): 15932.3675
Operations per second: 62765.31093072012

Created 100000 containers in 2235 ms
Performance results for ContainerChoosingPolicy
Total volumes: 20
Total containers: 100000
Total threads: 10
Total operations: 1000000
Container Chosen operations: 999803
Container Not Chosen operations: 197
Failed operations: 0
Total time (ms): 4607
Average time per operation (ns): 4607.252501
Operations per second: 217049.0980867558

Created 500000 containers in 9831 ms
Performance results for ContainerChoosingPolicy
Total volumes: 20
Total containers: 500000
Total threads: 10
Total operations: 1000000
Container Chosen operations: 999962
Container Not Chosen operations: 38
Failed operations: 0
Total time (ms): 5890
Average time per operation (ns): 5890.142792
Operations per second: 169775.1710464815

Created 1000000 containers in 19750 ms
Performance results for ContainerChoosingPolicy
Total volumes: 20
Total containers: 1000000
Total threads: 10
Total operations: 1000000
Container Chosen operations: 999983
Container Not Chosen operations: 17
Failed operations: 0
Total time (ms): 7550
Average time per operation (ns): 7550.438456
Operations per second: 132442.6396993335

Created 2000000 containers in 40490 ms
Performance results for ContainerChoosingPolicy
Total volumes: 20
Total containers: 2000000
Total threads: 10
Total operations: 1000000
Container Chosen operations: 999996
Container Not Chosen operations: 4
Failed operations: 0
Total time (ms): 6536
Average time per operation (ns): 6536.22117
Operations per second: 152993.59889928572
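
To put the two comparable runs side by side (100000 containers, 10 threads, 100000 operations, without and with optimization), the improvement can be expressed as a ratio of the reported throughputs. A minimal sketch of that calculation; the class and method names are illustrative only:

```java
final class SpeedupSketch {
  // Speedup as the ratio of optimized throughput to baseline throughput.
  static double speedup(double baselineOpsPerSec, double optimizedOpsPerSec) {
    return optimizedOpsPerSec / baselineOpsPerSec;
  }

  public static void main(String[] args) {
    double baseline = 85.91588245981977;    // "Without optimization" run
    double optimized = 62765.31093072012;   // first "With optimization" run
    System.out.printf("Speedup: %.1fx%n", speedup(baseline, optimized));
  }
}
```

The same ratio falls out of the average time per operation (1.1639291495E7 ns versus 15932.3675 ns), since both figures derive from the same totals.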

@ChenSammi
Contributor

Hi @symious, could you take a look?

@ChenSammi ChenSammi requested a review from symious June 6, 2025 03:22
Contributor

@symious symious left a comment

Thanks for the improvement, LGTM.

@ChenSammi
Contributor

Thanks @Gargi-jais11 for the contribution and @symious for the review.

@ChenSammi ChenSammi merged commit 232d133 into apache:HDDS-5713 Jun 13, 2025
42 checks passed