HDDS-2621. Enable OM HA acceptance tests. #265
bharatviswa504
left a comment
+1 pending CI.
|
The OM HA test is failing in this patch: In GitHub Actions based runs the |
|
@hanishakoneru I tried to fix this, found the following two changes necessary: The test still doesn't pass completely, latest run failed with |
|
/pending acceptance tests are failing |
Marking this issue as un-mergeable as requested.
Please use /ready comment when it's resolved.
acceptance tests are failing
9934eb0 to c656e33
|
/ready |
Blocking review request is removed.
|
/pending |
Marking this issue as un-mergeable as requested.
Please use /ready comment when it's resolved.
/pending
|
Thanks @adoroszlai and @elek. |
adoroszlai
left a comment
Thanks @hanishakoneru for the update. I also confirmed that the test passes locally. I re-triggered the CI check, as the acceptance test previously timed out (at a much earlier point than the OM HA test, so unrelated to this change).
One question: why did you set /pending again?
|
Thanks @adoroszlai for verifying. I set it to "pending" so that no one merges this change by mistake, as there is 1 approval already :) |
I'm seeing something similar on some of the PR runs, but not all, even though this PR has not been merged yet, like #520 I mentioned. Can you confirm, @hanishakoneru, whether this is the same issue? 2019-11-16T12:01:24.9714368Z Test Multiple Failovers | FAIL | |
|
@xiaoyuyao, this is the same issue, but I am surprised that the OM-HA tests are running when they are disabled. Can you please point to another PR where you observed this? I don't see HA tests in #520. |
4b79844 to 81d08c3
598a764 to ca4b385
|
Thank you all for the reviews and suggestions. I will merge this shortly. |
|
The test failed on master after the merge: https://github.com/apache/hadoop-ozone/actions/runs/67591808 |
And no OM logs are available: https://issues.apache.org/jira/browse/HDDS-3311 |
|
Two more failures:
I hate to be the bad guy (again), and I am very sorry, but I have to revert/reopen it, as it has failed in 3 of the last 6 master builds. |
|
OK. Instead of reverting, I just excluded the tests from the PR / daily build. We can create another pull request and start to test it (#ce6ad30a3) |
|
I will create a separate GitHub action to make it easier to run only one acceptance test multiple times... |
|
If I didn't make any mistake, the om-ha tests (and only them) will be executed twice per hour here: https://github.com/elek/hadoop-ozone/actions?query=workflow%3Aacceptance-single |
|
Thanks @elek for disabling this test. These OM HA failures are very elusive. I tested multiple times before merging and they didn't fail any of those times. |
|
It seems to be flaky. I got a lot of runs, and 7 of the last 25 failed. (Yes, this rate is hard to detect, because you might get even 3 green builds in a row without noticing the problem.) Examples: https://github.com/elek/hadoop-ozone/runs/554176250 |
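To illustrate why a handful of green runs says little here, the following is a minimal sketch of the arithmetic behind the 7-out-of-25 figure. It assumes that runs fail independently at a constant rate, which is a simplification and may not hold for these tests; the function name `prob_all_green` is made up for this example.

```python
def prob_all_green(failures: int, runs: int, consecutive: int) -> float:
    """Chance that `consecutive` runs in a row all pass.

    Assumption (not from the thread): every run fails independently
    with the observed rate failures / runs.
    """
    if runs <= 0 or not 0 <= failures <= runs:
        raise ValueError("need runs > 0 and 0 <= failures <= runs")
    if consecutive < 0:
        raise ValueError("consecutive must be non-negative")
    pass_rate = 1 - failures / runs
    return pass_rate ** consecutive


if __name__ == "__main__":
    # Figures quoted above: 7 failures in the last 25 runs, 3 green builds.
    chance = prob_all_green(failures=7, runs=25, consecutive=3)
    print(f"observed failure rate: {7 / 25:.0%}")
    print(f"chance of 3 green runs in a row: {chance:.1%}")
```

Under that assumption, three consecutive green builds would still happen in roughly 37% of cases, so a few local passes before merging would not be strong evidence that the test is stable.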
… log level to warn Merge in SDPOZONE/component-ozone from SDPOZN-1962 to sdp-ozone-1.4 * commit '96678f24a2637e75639571088130d5b2e955b8bb': SDPOZN-1962. Move notLeaderException message log level to warn
What changes were proposed in this pull request?
OM HA robot tests were disabled in HDDS-2533 as they were failing intermittently. HDDS-2454 fixes some issues in the HA tests. This Jira was created to re-enable the HA acceptance tests.
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-2621
How was this patch tested?
CI acceptance test suite.