@adoroszlai
Contributor

What changes were proposed in this pull request?

Cluster startup/shutdown is the most time-consuming part of several test classes: each class below runs only a handful of test cases, yet all but one take more than 30 seconds.

[INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 43.108 s - in org.apache.hadoop.fs.ozone.TestOzoneFsHAURLs
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 35.312 s - in org.apache.hadoop.hdds.scm.TestAllocateContainer
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 36.292 s - in org.apache.hadoop.hdds.scm.TestContainerReportWithKeys
[INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 36.5 s - in org.apache.hadoop.hdds.scm.TestContainerSmallFile
[INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 36.365 s - in org.apache.hadoop.hdds.scm.TestGetCommittedBlockLengthAndPutKey
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 35.138 s - in org.apache.hadoop.hdds.scm.TestSCMNodeManagerMXBean
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 35.297 s - in org.apache.hadoop.hdds.scm.pipeline.TestPipelineManagerMXBean
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 23.23 s - in org.apache.hadoop.ozone.TestCpuMetrics
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 35.475 s - in org.apache.hadoop.ozone.TestGetClusterTreeInformation
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 35.577 s - in org.apache.hadoop.ozone.container.metrics.TestDatanodeQueueMetrics
[INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 42.81 s - in org.apache.hadoop.ozone.freon.TestDNRPCLoadGenerator
[INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 37.423 s - in org.apache.hadoop.ozone.om.TestObjectStore
[INFO] Tests run: 9, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 39.007 s - in org.apache.hadoop.ozone.om.TestObjectStoreWithFSO
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 38.304 s - in org.apache.hadoop.ozone.om.TestOmBlockVersioning
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 35.132 s - in org.apache.hadoop.ozone.om.TestOzoneManagerRestInterface
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 38.587 s - in org.apache.hadoop.ozone.shell.TestScmAdminHA

Safe integration tests (those that do not stop or restart components) can be run on the same MiniOzoneCluster to save time.

This PR creates two such test groups, one for HA and another for non-HA. Test implementations are kept in existing separate classes, but they no longer manage the lifecycle of the cluster.
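The lifecycle split described above can be illustrated with a small, dependency-free sketch. It is not the code from this PR: `FakeCluster`, `SafeTest` and `runGroup` are hypothetical names introduced only for illustration, and the real change relies on the test framework rather than a hand-written runner. The point is only that the group starts the cluster once, hands it to each test, and stops it once.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of sharing one cluster across "safe" test classes (all names are hypothetical). */
class SharedClusterSketch {

  /** Stand-in for MiniOzoneCluster; it only counts lifecycle calls. */
  static final class FakeCluster {
    int starts;
    int stops;

    void start() { starts++; }
    void stop() { stops++; }
    boolean isRunning() { return starts > stops; }
  }

  /** A safe test only uses the cluster; it never starts or stops it. */
  interface SafeTest {
    String name();
    void run(FakeCluster cluster);
  }

  /** Starts the cluster once, runs every test against it, then stops it once. */
  static List<String> runGroup(FakeCluster cluster, List<SafeTest> tests) {
    List<String> passed = new ArrayList<>();
    cluster.start();
    try {
      for (SafeTest test : tests) {
        test.run(cluster);
        passed.add(test.name());
      }
    } finally {
      cluster.stop();
    }
    return passed;
  }

  /** Builds a trivial test that only checks the shared cluster is up. */
  static SafeTest test(String name) {
    return new SafeTest() {
      @Override public String name() { return name; }

      @Override public void run(FakeCluster cluster) {
        if (!cluster.isRunning()) {
          throw new IllegalStateException(name + ": cluster is not running");
        }
      }
    };
  }

  public static void main(String[] args) {
    FakeCluster cluster = new FakeCluster();
    List<SafeTest> group =
        List.of(test("AllocateContainer"), test("ObjectStore"), test("CpuMetrics"));
    List<String> passed = runGroup(cluster, group);
    System.out.println(passed + " starts=" + cluster.starts + " stops=" + cluster.stops);
  }
}
```

However many tests are in the group, the lifecycle counters end at one start and one stop, which is where the time saving comes from.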

Further tests can be updated in follow-ups; I wanted to keep this change relatively small.

https://issues.apache.org/jira/browse/HDDS-12183

How was this patch tested?

Total time for these grouped tests is less than 2 minutes:

2025-02-02T19:34:25.6207523Z [INFO] Running org.apache.ozone.test.TestOzoneIntegrationHA
2025-02-02T19:35:08.0532396Z [INFO] Running org.apache.ozone.test.HATests$ScmAdminHA
2025-02-02T19:35:08.3292991Z [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.264 s - in org.apache.ozone.test.HATests$ScmAdminHA
2025-02-02T19:35:08.3294514Z [INFO] Running org.apache.ozone.test.HATests$DatanodeQueueMetrics
2025-02-02T19:35:09.7697990Z [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.429 s - in org.apache.ozone.test.HATests$DatanodeQueueMetrics
2025-02-02T19:35:09.7788304Z [INFO] Running org.apache.ozone.test.HATests$GetClusterTreeInformation
2025-02-02T19:35:11.8574699Z [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.061 s - in org.apache.ozone.test.HATests$GetClusterTreeInformation
2025-02-02T19:35:11.8650174Z [INFO] Running org.apache.ozone.test.HATests$OzoneFsHAURLs
2025-02-02T19:35:13.6359721Z [INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.757 s - in org.apache.ozone.test.HATests$OzoneFsHAURLs
2025-02-02T19:35:22.9608544Z [INFO] Tests run: 0, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 57.336 s - in org.apache.ozone.test.TestOzoneIntegrationHA
2025-02-02T19:35:24.3609647Z [INFO] Running org.apache.ozone.test.TestOzoneIntegrationNonHA
2025-02-02T19:35:56.2101096Z [INFO] Running org.apache.ozone.test.NonHATests$OzoneManagerRestInterface
2025-02-02T19:35:56.4804421Z [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.265 s - in org.apache.ozone.test.NonHATests$OzoneManagerRestInterface
2025-02-02T19:35:56.4808719Z [INFO] Running org.apache.ozone.test.NonHATests$OmBlockVersioning
2025-02-02T19:35:57.7629080Z [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.276 s - in org.apache.ozone.test.NonHATests$OmBlockVersioning
2025-02-02T19:35:57.7656265Z [INFO] Running org.apache.ozone.test.NonHATests$ObjectStoreWithFSO
2025-02-02T19:35:59.6805203Z [INFO] Tests run: 9, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.912 s - in org.apache.ozone.test.NonHATests$ObjectStoreWithFSO
2025-02-02T19:35:59.6824980Z [INFO] Running org.apache.ozone.test.NonHATests$ObjectStore
2025-02-02T19:35:59.7759688Z [INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.087 s - in org.apache.ozone.test.NonHATests$ObjectStore
2025-02-02T19:35:59.7780535Z [INFO] Running org.apache.ozone.test.NonHATests$DNRPCLoadGenerator
2025-02-02T19:36:04.4931112Z [INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.715 s - in org.apache.ozone.test.NonHATests$DNRPCLoadGenerator
2025-02-02T19:36:04.4932291Z [INFO] Running org.apache.ozone.test.NonHATests$CpuMetrics
2025-02-02T19:36:05.0518019Z [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.553 s - in org.apache.ozone.test.NonHATests$CpuMetrics
2025-02-02T19:36:05.0524175Z [INFO] Running org.apache.ozone.test.NonHATests$PipelineManagerMXBean
2025-02-02T19:36:05.0637681Z [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.007 s - in org.apache.ozone.test.NonHATests$PipelineManagerMXBean
2025-02-02T19:36:05.0643743Z [INFO] Running org.apache.ozone.test.NonHATests$SCMNodeManagerMXBean
2025-02-02T19:36:05.0823699Z [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.013 s - in org.apache.ozone.test.NonHATests$SCMNodeManagerMXBean
2025-02-02T19:36:05.0830009Z [INFO] Running org.apache.ozone.test.NonHATests$GetCommittedBlockLengthAndPutKey
2025-02-02T19:36:05.2385383Z [INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.143 s - in org.apache.ozone.test.NonHATests$GetCommittedBlockLengthAndPutKey
2025-02-02T19:36:05.2386091Z [INFO] Running org.apache.ozone.test.NonHATests$ContainerSmallFile
2025-02-02T19:36:05.3736364Z [INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.135 s - in org.apache.ozone.test.NonHATests$ContainerSmallFile
2025-02-02T19:36:05.3742900Z [INFO] Running org.apache.ozone.test.NonHATests$ContainerReportWithKeys
2025-02-02T19:36:05.4369321Z [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.061 s - in org.apache.ozone.test.NonHATests$ContainerReportWithKeys
2025-02-02T19:36:05.4370413Z [INFO] Running org.apache.ozone.test.NonHATests$AllocateContainer
2025-02-02T19:36:05.4556415Z [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.007 s - in org.apache.ozone.test.NonHATests$AllocateContainer
2025-02-02T19:36:12.5130242Z [INFO] Tests run: 0, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 48.148 s - in org.apache.ozone.test.TestOzoneIntegrationNonHA

https://github.com/adoroszlai/ozone/actions/runs/13101544695
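For a rough sense of the saving, the "Time elapsed" values quoted in this description can be added up. The sketch below copies the sixteen per-class times from the first log and the two group totals from the second, in milliseconds. It assumes the two logs are comparable, which may not hold exactly if they come from different runs or machines, so the result is an estimate only.

```java
/** Sums the "Time elapsed" values quoted in this description (integer milliseconds). */
class ElapsedTimeComparison {

  // Per-class times before the change, copied in order from the first log.
  static final int[] BEFORE_MS = {
      43108, 35312, 36292, 36500, 36365, 35138, 35297, 23230,
      35475, 35577, 42810, 37423, 39007, 38304, 35132, 38587
  };

  // Totals reported for TestOzoneIntegrationHA and TestOzoneIntegrationNonHA.
  static final int[] AFTER_MS = {57336, 48148};

  static int sum(int[] values) {
    int total = 0;
    for (int v : values) {
      total += v;
    }
    return total;
  }

  public static void main(String[] args) {
    int before = sum(BEFORE_MS);
    int after = sum(AFTER_MS);
    System.out.println("before (ms): " + before);
    System.out.println("after  (ms): " + after);
    System.out.println("saved  (ms): " + (before - after));
  }
}
```

The sums come to 583,557 ms before and 105,484 ms after, so about 9.7 minutes of test time drops to under 2 minutes for these sixteen classes.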

@adoroszlai adoroszlai added the test label Feb 2, 2025
@adoroszlai adoroszlai self-assigned this Feb 2, 2025
@adoroszlai adoroszlai marked this pull request as draft February 3, 2025 07:55
@adoroszlai adoroszlai marked this pull request as ready for review February 3, 2025 09:10
Contributor

@sodonnel sodonnel left a comment

LGTM - this will be a nice improvement in test runtimes.

@adoroszlai adoroszlai merged commit 260434f into apache:master Feb 4, 2025
41 checks passed
@adoroszlai adoroszlai deleted the HDDS-12183 branch February 4, 2025 21:48
@adoroszlai
Contributor Author

Thanks @sodonnel for the review.

errose28 added a commit to errose28/ozone that referenced this pull request Feb 5, 2025
* master: (168 commits)
  HDDS-12112. Fix interval used for Chunk Read/Write Dashboard (apache#7724)
  HDDS-12212. Fix grammar in decommissioning and observability documentation (apache#7815)
  HDDS-12195. Implement skip() in OzoneFSInputStream (apache#7801)
  HDDS-12200. Fix grammar in OM HA, EC and Snapshot doc (apache#7806)
  HDDS-12202. OpsCreate and OpsAppend metrics not incremented (apache#7811)
  HDDS-12203. Initialize block length before skip (apache#7809)
  HDDS-12183. Reuse cluster across safe test classes (apache#7793)
  HDDS-11714. resetDeletedBlockRetryCount with --all may fail and can cause long db lock in large cluster. (apache#7665)
  HDDS-12186. (addendum) Avoid array allocation for table iterator (apache#7799)
  HDDS-12186. Avoid array allocation for table iterator. (apache#7797)
  HDDS-11508. Decouple delete batch limits from Ratis request size for DirectoryDeletingService. (apache#7365)
  HDDS-12073. Don't show Source Bucket and Volume if null in DU metadata (apache#7760)
  HDDS-12142. Save logs from build check (apache#7782)
  HDDS-12163. Reduce number of individual getCapacity/getAvailable/getUsedSpace calls (apache#7790)
  HDDS-12176. Trivial dependency cleanup.(apache#7787)
  HDDS-12181. Bump jline to 3.29.0 (apache#7789)
  HDDS-12165. Refactor VolumeInfoMetrics to use getCurrentUsage (apache#7784)
  HDDS-12085. Add manual refresh button for DU page (apache#7780)
  HDDS-12132. Parameterize testUpdateTransactionInfoTable for SCM (apache#7768)
  HDDS-11277. Remove dependency on hadoop-hdfs in Ozone client (apache#7781)
  ...

Conflicts:
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/statemachine/DatanodeConfiguration.java
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/KeyValueHandler.java
hadoop-hdds/container-service/src/test/java/org/apache/hadoop/ozone/container/keyvalue/TestKeyValueHandler.java
hadoop-ozone/dist/src/main/smoketest/admincli/container.robot
hadoop-ozone/tools/src/main/java/org/apache/hadoop/ozone/freon/ClosedContainerReplicator.java
errose28 added a commit to errose28/ozone that referenced this pull request Feb 6, 2025
errose28 added a commit to errose28/ozone that referenced this pull request Feb 7, 2025
…ee-improvements

* HDDS-10239-container-reconciliation: (168 commits)
  HDDS-12112. Fix interval used for Chunk Read/Write Dashboard (apache#7724)
  HDDS-12212. Fix grammar in decommissioning and observability documentation (apache#7815)
  HDDS-12195. Implement skip() in OzoneFSInputStream (apache#7801)
  HDDS-12200. Fix grammar in OM HA, EC and Snapshot doc (apache#7806)
  HDDS-12202. OpsCreate and OpsAppend metrics not incremented (apache#7811)
  HDDS-12203. Initialize block length before skip (apache#7809)
  HDDS-12183. Reuse cluster across safe test classes (apache#7793)
  HDDS-11714. resetDeletedBlockRetryCount with --all may fail and can cause long db lock in large cluster. (apache#7665)
  HDDS-12186. (addendum) Avoid array allocation for table iterator (apache#7799)
  HDDS-12186. Avoid array allocation for table iterator. (apache#7797)
  HDDS-11508. Decouple delete batch limits from Ratis request size for DirectoryDeletingService. (apache#7365)
  HDDS-12073. Don't show Source Bucket and Volume if null in DU metadata (apache#7760)
  HDDS-12142. Save logs from build check (apache#7782)
  HDDS-12163. Reduce number of individual getCapacity/getAvailable/getUsedSpace calls (apache#7790)
  HDDS-12176. Trivial dependency cleanup.(apache#7787)
  HDDS-12181. Bump jline to 3.29.0 (apache#7789)
  HDDS-12165. Refactor VolumeInfoMetrics to use getCurrentUsage (apache#7784)
  HDDS-12085. Add manual refresh button for DU page (apache#7780)
  HDDS-12132. Parameterize testUpdateTransactionInfoTable for SCM (apache#7768)
  HDDS-11277. Remove dependency on hadoop-hdfs in Ozone client (apache#7781)
  ...
nandakumar131 pushed a commit to nandakumar131/ozone that referenced this pull request Feb 10, 2025