Bug 2072202: Check for reachability of API and API-Int URLs later in bootkube #6611
@sadasu: This pull request references Bugzilla bug 2072202, which is invalid:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/bugzilla refresh
@sadasu: This pull request references Bugzilla bug 2072202, which is invalid:
a732f41 to 060a84a
@sadasu: No Bugzilla bug is referenced in the title of this pull request.
/jira refresh
@sadasu: No Jira bug is referenced in the title of this pull request.
@sadasu: This pull request references Bugzilla bug 2072202, which is invalid:
/test e2e-aws-ovn
barbacbd left a comment
I see some redundant checks, such as if [[ ! -z "${API_INT_SERVER_URL}" ]] ; then, where the function called inside makes the same check. The outer check doesn't appear necessary.
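To illustrate the redundancy described in this review comment, here is a minimal sketch. The helper name check_url and its messages are hypothetical and are not taken from the PR; only the variable API_INT_SERVER_URL and the outer test appear in the comment above.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not from the PR): it performs its own
# empty-string check before doing any work.
check_url() {
    local url="$1"
    if [[ -z "${url}" ]]; then
        echo "skipped: no URL provided"
        return 0
    fi
    echo "checking ${url}"
}

API_INT_SERVER_URL=""

# Redundant form: the outer test repeats the check made inside check_url.
if [[ ! -z "${API_INT_SERVER_URL}" ]]; then
    check_url "${API_INT_SERVER_URL}"
fi

# Without the outer guard, the function handles the empty case itself.
check_url "${API_INT_SERVER_URL}"
```

Because the helper already returns early on an empty value, dropping the outer guard changes nothing about whether the check runs. It only moves the decision to one place.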
@sadasu The BZ linked in this PR was reported against OCP-4.9. Do you intend to backport this change? If so, we'll need to create a bug in Jira.
/bugzilla refresh
@r4f4: This pull request references Bugzilla bug 2072202, which is invalid:
/bugzilla refresh
@r4f4: This pull request references Bugzilla bug 2072202, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker. 3 validation(s) were run on this bug
No GitHub users were found matching the public email listed for the QA contact in Bugzilla (yunjiang@redhat.com), skipping review request.
/retest-required
060a84a to f9ad0c2
f9ad0c2 to 5832239
/retest-required
/retest-required
patrickdillon left a comment
Can you add some context about the motivation for this PR and how this fixes the issue?
I assume the problem is we're trying to reduce false negatives. Pretty much every failed install I look at says the API server is down, but the install has progressed to a point where the API server must have been up.
I think this is because the API server may become periodically unavailable throughout the bootstrap process. So perhaps we want to relax our requirements in this check and only resolve the API URL rather than checking for success? WDYT?
@sadasu: This pull request references Bugzilla bug 2072202, which is valid. 3 validation(s) were run on this bug
No GitHub users were found matching the public email listed for the QA contact in Bugzilla (yunjiang@redhat.com), skipping review request.
Hopefully, #6611 (comment) provides the necessary context. Yes, we can limit ourselves to checking whether the URLs are resolvable. This is an attempt to provide some additional diagnostics.
Oh yeah, sorry I missed that comment. So do you want to do:
Or just do this first and then continue to step down if needed? Personally I think we should just do the resolution check and we can do more later if needed, but I do not feel strongly.
I would like to step down to just resolving the URL later, if these changes don't serve us.
/retest-required
/approve
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: patrickdillon
/retest-required
/skip
/override ci/prow/e2e-openstack
@patrickdillon: Overrode contexts on behalf of patrickdillon: ci/prow/e2e-openstack
/test ci/prow/e2e-aws-ovn
@sadasu: The specified target(s) for
The following commands are available to trigger optional jobs:
Use
/test e2e-aws-ovn
@sadasu: The following tests failed, say
@sadasu: All pull requests linked via external trackers have merged: Bugzilla bug 2072202 has been moved to the MODIFIED state.
The checks on the API and Internal API server URLs were performed together within the bootkube service right after MCO was started. This resulted in false negatives because the API server became periodically unavailable during the bootstrap process.
To improve these checks, this PR splits the check into 2 parts:
As a result of this split, the checks now happen in 2 stages, "resolve-api(-int)-url" and "check-api(-int)-url", so the success or failure of each stage can be reported individually. Also, although a stage may report failure, it will not cause the bootstrap process to stop. The output from these stages is available in the analyse output to diagnose any issues.
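The two-stage behaviour described above can be sketched roughly as follows. This is not the PR's actual code: run_stage, resolve_url, check_url, and the localhost targets are hypothetical, getent and curl are stand-ins for whatever the real service uses, and the four stage names are an assumed expansion of "resolve-api(-int)-url" and "check-api(-int)-url".

```shell
#!/usr/bin/env bash

# Hypothetical stage runner: runs a command, reports the outcome under the
# stage name, and always returns 0 so a failed stage never stops the caller.
run_stage() {
    local stage="$1"
    shift
    if "$@"; then
        echo "${stage}: success"
    else
        echo "${stage}: failure"
    fi
    return 0
}

# Hypothetical resolution check: succeeds if the host name resolves.
resolve_url() {
    local host="$1"
    getent hosts "${host}" >/dev/null 2>&1
}

# Hypothetical reachability check: succeeds if the server returns any
# response within 5 seconds (certificate verification is skipped).
check_url() {
    local url="$1"
    curl --silent --insecure --max-time 5 --output /dev/null "${url}"
}

# Placeholder targets for illustration only; a real bootstrap host would
# use the cluster's API and API-Int host names instead.
API_HOST="localhost"
API_INT_HOST="localhost"

run_stage "resolve-api-url"     resolve_url "${API_HOST}"
run_stage "resolve-api-int-url" resolve_url "${API_INT_HOST}"
run_stage "check-api-url"       check_url "https://${API_HOST}:6443/"
run_stage "check-api-int-url"   check_url "https://${API_INT_HOST}:6443/"

echo "bootstrap continues regardless of stage results"
```

Keeping resolution and reachability in separate stages means a DNS problem and a temporarily unavailable API server show up as different failures, which is the diagnostic benefit the description is after.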