@errose28 (Contributor) commented Jun 15, 2022

What changes were proposed in this pull request?

After HDDS-6760, SCM finalization intermittently timed out during the acceptance test. Finalization happened before datanode leader election, so containers were put in the CLOSING state, but the replication manager did not start running to close them until 5 minutes after safemode exit. The pipeline scrubber also had a long wait time and pipeline timeout, which prevented it from force closing these pipelines within the duration of the test.

This Jira modifies the acceptance test configurations to speed up moving containers to a CLOSED state so finalization can proceed. It also increases the test timeout.
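To illustrate the timing argument above, here is a rough toy model, not Ozone code. It assumes containers leave CLOSING either when the replication manager starts or when the pipeline scrubber's first run past the pipeline timeout force closes the pipeline, and that all clocks start at safemode exit. Every name and every value other than the 5 minute replication manager delay is hypothetical and does not correspond to a real Ozone configuration key.

```python
import math
from dataclasses import dataclass


@dataclass
class TimingConfig:
    """Hypothetical timing knobs in seconds; these are NOT real Ozone config keys."""
    replication_manager_delay: float  # wait after safemode exit before the replication manager runs
    scrubber_interval: float          # how often the pipeline scrubber runs (must be > 0)
    pipeline_timeout: float           # pipeline age at which the scrubber force closes it
    test_timeout: float               # how long the test waits for finalization


def earliest_close_time(cfg: TimingConfig) -> float:
    """Earliest time after safemode exit at which CLOSING containers can be closed."""
    # Path 1: the replication manager starts and closes the containers.
    via_replication_manager = cfg.replication_manager_delay
    # Path 2: the first scrubber run at or after the pipeline timeout force closes the pipeline.
    runs_needed = max(1, math.ceil(cfg.pipeline_timeout / cfg.scrubber_interval))
    via_scrubber = runs_needed * cfg.scrubber_interval
    return min(via_replication_manager, via_scrubber)


def finalization_can_finish(cfg: TimingConfig) -> bool:
    """True if containers can be closed before the test gives up."""
    return earliest_close_time(cfg) <= cfg.test_timeout


# Illustrative values only: 300 s matches the 5 minute delay described above,
# everything else is made up for the sketch.
slow = TimingConfig(replication_manager_delay=300, scrubber_interval=300,
                    pipeline_timeout=600, test_timeout=180)
fast = TimingConfig(replication_manager_delay=10, scrubber_interval=30,
                    pipeline_timeout=60, test_timeout=300)

print("slow config can finish:", finalization_can_finish(slow))
print("fast config can finish:", finalization_can_finish(fast))
```

Under this model, either shortening the delays or lengthening the test timeout can turn a failing configuration into a passing one, which is why the change does both.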

What is the link to the Apache JIRA?

HDDS-6898

How was this patch tested?

50 successful runs on my fork: https://github.com/errose28/hadoop-ozone/runs/6905140700?check_suite_focus=true

@errose28 marked this pull request as ready for review June 15, 2022 20:52
@errose28 requested a review from adoroszlai June 15, 2022 20:53
@kerneltime (Contributor)

LGTM

@adoroszlai (Contributor) left a comment

Thanks @errose28 for the fix.

@adoroszlai merged commit 8665dc2 into apache:master Jun 16, 2022
errose28 added a commit to errose28/ozone that referenced this pull request Jun 23, 2022
* master: (34 commits)
  HDDS-6868 Add S3Auth information to thread local (apache#3527)
  HDDS-6877. Keep replication port unchanged when restarting datanode in MiniOzoneCluster (apache#3510)
  HDDS-6907. OFS should create buckets with FILE_SYSTEM_OPTIMIZED layout. (apache#3528)
  HDDS-6875. Migrate parameterized tests in hdds-common to JUnit5 (apache#3513)
  HDDS-6924. OBJECT_STORE isn't flat namespaced (apache#3533)
  HDDS-6899. [EC] Remove warnings and errors from console during online reconstruction of data. (apache#3522)
  HDDS-6695. Enable SCM Ratis by default for new clusters only (apache#3499)
  HDDS-4123. Integrate OM Open Key Cleanup Service Into Existing Code (apache#3319)
  HDDS-6882. Correct exit code for invalid arguments passed to command-line tools. (apache#3517)
  HDDS-6890. EC: Fix potential wrong replica read with over-replicated container. (apache#3523)
  HDDS-6902. Duplicate mockito-core entries in pom.xml (apache#3525)
  HDDS-6752. Migrate tests with rules in hdds-server-scm to JUnit5 (apache#3442)
  HDDS-6806. EC: Implement the EC Reconstruction coordinator. (apache#3504)
  HDDS-6829. Limit the no of inflight replication tasks in SCM. (apache#3482)
  HDDS-6898. [SCM HA finalization] Modify acceptance test configuration to speed up test finalization (apache#3521)
  HDDS-6577. Configurations to reserve HDDS volume space. (apache#3484)
  HDDS-6870 Clean up isTenantAdmin to use UGI (apache#3503)
  HDDS-6872. TestAuthorizationV4QueryParser should pass offline (apache#3506)
  HDDS-6840. Add MetaData volume information to the SCM and OM - UI (apache#3488)
  HDDS-6697. EC: ReplicationManager - create class to detect EC container health issues (apache#3512)
  ...