@bharatviswa504
Contributor

What changes were proposed in this pull request?

Wait indefinitely, retrying until the CA list is obtained.
Also make the wait between retries configurable.

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-5246

How was this patch tested?

Tested with docker-compose ozonesecure-ha (with a few changes to remove WAITFOR scm3.org):

docker-compose up scm1.org
docker-compose up om1
docker-compose up datanode1
om1_1        | 2021-05-19 09:02:16,082 [main] INFO utils.RetriableTask: Execution of task getCAList failed, will be retried in 10000 ms
om1_1        | 2021-05-19 09:02:26,102 [main] INFO utils.HAUtils: Expected CA list size 4, where as received CA List size 2.
om1_1        | 2021-05-19 09:02:26,103 [main] INFO utils.RetriableTask: Execution of task getCAList failed, will be retried in 10000 ms
om1_1        | 2021-05-19 09:02:36,147 [main] INFO utils.HAUtils: Expected CA list size 4, where as received CA List size 2.
om1_1        | 2021-05-19 09:02:36,147 [main] INFO utils.RetriableTask: Execution of task getCAList failed, will be retried in 10000 ms
om1_1        | 2021-05-19 09:02:46,198 [main] INFO utils.HAUtils: Expected CA list size 4, where as received CA List size 2.
om1_1        | 2021-05-19 09:02:46,198 [main] INFO utils.RetriableTask: Execution of task getCAList failed, will be retried in 10000 ms
om1_1        | 2021-05-19 09:02:56,222 [main] INFO utils.HAUtils: Expected CA list size 4, where as received CA List size 2.
om1_1        | 2021-05-19 09:02:56,222 [main] INFO utils.RetriableTask: Execution of task getCAList failed, will be retried in 10000 ms
om1_1        | 2021-05-19 09:03:06,245 [main] INFO utils.HAUtils: Expected CA list size 4, where as received CA List size 2.
om1_1        | 2021-05-19 09:03:06,245 [main] INFO utils.RetriableTask: Execution of task getCAList failed, will be retried in 10000 ms
om1_1        | 2021-05-19 09:03:16,263 [main] INFO utils.HAUtils: Expected CA list size 4, where as received CA List size 2.
om1_1        | 2021-05-19 09:03:16,263 [main] INFO utils.RetriableTask: Execution of task getCAList failed, will be retried in 10000 ms
om1_1        | 2021-05-19 09:03:26,296 [main] INFO utils.HAUtils: Expected CA list size 4, where as received CA List size 2.
om1_1        | 2021-05-19 09:03:26,296 [main] INFO utils.RetriableTask: Execution of task getCAList failed, will be retried in 10000 ms
om1_1        | 2021-05-19 09:03:36,333 [main] INFO utils.HAUtils: Expected CA list size 4, where as received CA List size 3.
om1_1        | 2021-05-19 09:03:36,333 [main] INFO utils.RetriableTask: Execution of task getCAList failed, will be retried in 10000 ms
om1_1        | 2021-05-19 09:03:46,376 [main] INFO utils.HAUtils: Expected CA list size 4, where as received CA List size 3.
om1_1        | 2021-05-19 09:03:46,376 [main] INFO utils.RetriableTask: Execution of task getCAList failed, will be retried in 10000 ms
datanode1_1  | 2021-05-19 09:02:50,155 [main] INFO utils.RetriableTask: Execution of task getCAList failed, will be retried in 10000 ms
datanode1_1  | 2021-05-19 09:03:00,179 [main] INFO utils.HAUtils: Expected CA list size 4, where as received CA List size 2.
datanode1_1  | 2021-05-19 09:03:00,179 [main] INFO utils.RetriableTask: Execution of task getCAList failed, will be retried in 10000 ms
datanode1_1  | 2021-05-19 09:03:10,204 [main] INFO utils.HAUtils: Expected CA list size 4, where as received CA List size 2.
datanode1_1  | 2021-05-19 09:03:10,204 [main] INFO utils.RetriableTask: Execution of task getCAList failed, will be retried in 10000 ms
datanode1_1  | 2021-05-19 09:03:20,225 [main] INFO utils.HAUtils: Expected CA list size 4, where as received CA List size 2.
datanode1_1  | 2021-05-19 09:03:20,225 [main] INFO utils.RetriableTask: Execution of task getCAList failed, will be retried in 10000 ms
datanode1_1  | 2021-05-19 09:03:30,266 [main] INFO utils.HAUtils: Expected CA list size 4, where as received CA List size 2.
datanode1_1  | 2021-05-19 09:03:30,266 [main] INFO utils.RetriableTask: Execution of task getCAList failed, will be retried in 10000 ms
datanode1_1  | 2021-05-19 09:03:40,293 [main] INFO utils.HAUtils: Expected CA list size 4, where as received CA List size 3.
datanode1_1  | 2021-05-19 09:03:40,293 [main] INFO utils.RetriableTask: Execution of task getCAList failed, will be retried in 10000 ms
datanode1_1  | 2021-05-19 09:03:50,323 [main] INFO utils.HAUtils: Expected CA list size 4, where as received CA List size 3.
datanode1_1  | 2021-05-19 09:03:50,323 [main] INFO utils.RetriableTask: Execution of task getCAList failed, will be retried in 10000 ms
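
The log pattern above (a size check that fails, followed by a fixed 10000 ms wait and another attempt) can be sketched roughly as follows. This is a minimal illustration, not the code from this patch: the class `CaListRetrySketch`, the methods `retryForever` and `checkSize`, and the simulated CA source are all hypothetical names introduced here, and the real logic lives in `HAUtils` and `RetriableTask`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;

// Hypothetical sketch: retry a task forever with a fixed wait between attempts.
public class CaListRetrySketch {

  /** Runs the task until it succeeds, sleeping waitMillis after each failure. */
  static <T> T retryForever(String name, Callable<T> task, long waitMillis)
      throws InterruptedException {
    while (true) {
      try {
        return task.call();
      } catch (InterruptedException e) {
        throw e; // do not swallow interruption, let the caller stop the wait
      } catch (Exception e) {
        System.out.printf(
            "Execution of task %s failed, will be retried in %d ms%n",
            name, waitMillis);
        Thread.sleep(waitMillis);
      }
    }
  }

  /** Fails while fewer certificates than expected have been received. */
  static List<String> checkSize(List<String> caList, int expected) {
    if (caList.size() < expected) {
      throw new IllegalStateException("Expected CA list size " + expected
          + ", received CA list size " + caList.size());
    }
    return caList;
  }

  public static void main(String[] args) throws Exception {
    // Simulated source (assumption): one more CA becomes visible per attempt.
    List<String> source = new ArrayList<>(List.of("ca-1", "ca-2"));
    int expected = 4;
    List<String> result = retryForever("getCAList", () -> {
      List<String> snapshot = new ArrayList<>(source);
      if (source.size() < expected) {
        source.add("ca-" + (source.size() + 1));
      }
      return checkSize(snapshot, expected);
    }, 10L); // short wait for the demo; the log above shows 10000 ms
    System.out.println(result);
  }
}
```

The loop has no attempt limit on purpose, which matches the "wait forever" behaviour described in this PR; only an interrupt ends the wait early.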

Contributor

@bshashikant bshashikant left a comment

looks good

Contributor

@adoroszlai adoroszlai left a comment

LGTM, but I'd like to suggest some trivial code improvements.

bharatviswa504 and others added 8 commits May 19, 2021 16:04
…ls/HAUtils.java

Co-authored-by: Doroszlai, Attila <6454655+adoroszlai@users.noreply.github.com>
…ls/HAUtils.java

Co-authored-by: Doroszlai, Attila <6454655+adoroszlai@users.noreply.github.com>
…ls/HAUtils.java

Co-authored-by: Doroszlai, Attila <6454655+adoroszlai@users.noreply.github.com>
…ls/HAUtils.java

Co-authored-by: Doroszlai, Attila <6454655+adoroszlai@users.noreply.github.com>
Co-authored-by: Doroszlai, Attila <6454655+adoroszlai@users.noreply.github.com>
…ls/HAUtils.java

Co-authored-by: Doroszlai, Attila <6454655+adoroszlai@users.noreply.github.com>
…ls/HAUtils.java

Co-authored-by: Doroszlai, Attila <6454655+adoroszlai@users.noreply.github.com>
@bharatviswa504
Contributor Author

Thank you @adoroszlai for the thorough review.
Addressed review comments.

@bharatviswa504 bharatviswa504 requested a review from adoroszlai May 19, 2021 11:32
Contributor

@adoroszlai adoroszlai left a comment

Thanks @bharatviswa504 for updating the patch.



<property>
<name>ozone.scm.ca.list.retry.wait.duration</name>
Contributor

NIT: Can we rename ozone.scm.ca.list.retry.wait.duration to ozone.scm.ca.list.retry.interval to be consistent with other retry settings?

Contributor Author

Done
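
For reference, an override of the renamed setting in `ozone-site.xml` might look like the fragment below. Only the property name comes from this thread; the value `10s` is an assumption chosen to match the 10000 ms interval in the test logs, and the description text is illustrative wording rather than the text from the patch.

```xml
<property>
  <name>ozone.scm.ca.list.retry.interval</name>
  <!-- Assumed value: matches the 10000 ms wait seen in the test logs -->
  <value>10s</value>
  <description>
    Illustrative description: wait between attempts to fetch the CA list
    from SCM during startup.
  </description>
</property>
```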

Contributor

@xiaoyuyao xiaoyuyao left a comment

LGTM, I just have one minor suggestion added inline...

@bharatviswa504 bharatviswa504 requested a review from xiaoyuyao May 19, 2021 16:26
Contributor

@xiaoyuyao xiaoyuyao left a comment

+1 pending CI.

@mukul1987 mukul1987 merged commit 08375d7 into apache:master May 20, 2021
bharatviswa504 added a commit to bharatviswa504/hadoop-ozone that referenced this pull request Jul 25, 2021
…DN startup (apache#2266)

(cherry picked from commit 08375d7)
Change-Id: I2aa46fc70ec37059946da1c69806bc09bbf4f0ae