@wking
Member

@wking wking commented Dec 10, 2018

installer bits of #2321, CC @crawford.

@openshift-ci-robot openshift-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Dec 10, 2018
@crawford
Contributor

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Dec 10, 2018
@openshift-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: crawford, wking

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Today I saw [1]:

  error: watch closed before Until timeout
  error openshift-ingress/deploy/router-default did not come up
  sleep: invalid option -- '4'
  Try 'sleep --help' for more information.

I suspect that the 'rollout status' request took long enough that the
fresh 'date' call generated a time larger than wait_expiry_time.  This
commit rerolls the logic last touched by 7991fd3 (Fix how we wait on
router rollout as the new cluster ingress operator, 2018-10-23, openshift#2004).
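A minimal sketch of the suspected failure mode, assuming the old logic derived a sleep duration by subtracting a fresh 'date +%s' from the expiry time. Only wait_expiry_time is named above; the remaining_seconds helper and the sample timestamps are hypothetical:

```shell
#!/bin/sh
# Hypothetical reconstruction, not copied from the template.
remaining_seconds() {
  # $1: expiry time, $2: current time, both in seconds since the epoch.
  echo "$(($1 - $2))"
}

# Suppose the rollout request returned 4 seconds after the deadline.
wait_expiry_time=1000
now=1004
remaining="$(remaining_seconds "${wait_expiry_time}" "${now}")"
echo "remaining: ${remaining}"   # prints "remaining: -4"

# Running 'sleep "${remaining}"' at this point would hand sleep the
# argument -4, which it parses as an option rather than a duration.
```

That would be consistent with the "invalid option -- '4'" line in the log, though the exact expression in the old template may have differed.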

Now we pick a total wait time (10 minutes), regardless of how many
times we need to reconnect the watcher.  With this commit, each
watcher will try to wait for the full remaining period.  So the first
watcher tries to wait for 10 minutes.  And if the first times out
after 2 minutes, the second watcher will try to wait for 8 minutes.

And the cool-off sleep is no longer parameterized, which removes the
chance of flaking like I saw today.
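The approach described above can be sketched roughly as follows. This is an illustration under assumptions, not the template itself: wait_with_deadline and its arguments are hypothetical names, and the cool-off is passed in here only so the sketch can be exercised quickly (the commit uses a fixed value).

```shell
#!/bin/sh
# Sketch: one overall deadline; each attempt gets whatever time is left.
# wait_with_deadline is a hypothetical helper, not part of the template.
wait_with_deadline() {
  local total="$1" cool_off="$2"
  shift 2
  local now target remaining
  now="$(date +%s)"
  target="$((now + total))"
  while [ "${now}" -lt "${target}" ]; do
    remaining="$((target - now))"
    # The wrapped command receives the remaining seconds as its last argument.
    if "$@" "${remaining}"; then
      return 0
    fi
    # The cool-off is a plain constant, never computed from the clock,
    # so it cannot go negative.
    sleep "${cool_off}"
    now="$(date +%s)"
  done
  return 1
}
```

With a 600-second total, a watcher that gives up after 2 minutes leaves roughly 8 minutes (minus the cool-off) for the next attempt. The wrapped command could be a small function that forwards its argument to something like 'oc --request-timeout="${1}s" rollout status ...'.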

[1]: https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_installer/688/pull-ci-openshift-installer-master-e2e-aws/1971/build-log.txt
@openshift-merge-robot openshift-merge-robot merged commit df71a57 into openshift:master Dec 10, 2018
@openshift-ci-robot
Contributor

@wking: Updated the following 3 configmaps:

  • prow-job-cluster-launch-installer-src configmap using the following files:
    • key cluster-launch-installer-src.yaml using file ci-operator/templates/openshift/installer/cluster-launch-installer-src.yaml
  • prow-job-cluster-launch-installer-e2e configmap using the following files:
    • key cluster-launch-installer-e2e.yaml using file ci-operator/templates/openshift/installer/cluster-launch-installer-e2e.yaml
  • prow-job-cluster-launch-installer-libvirt-e2e configmap using the following files:
    • key cluster-launch-installer-libvirt-e2e.yaml using file ci-operator/templates/openshift/installer/cluster-launch-installer-libvirt-e2e.yaml
Details

In response to this:

installer bits of #2321, CC @crawford.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@smarterclayton
Contributor

This has an error on rollout

wking added a commit to wking/openshift-release that referenced this pull request Dec 10, 2018
Fixing a typo from ac206e7 (ci-operator/templates/openshift: Refactor
router-rollout wait (again), 2018-11-05, openshift#2342).
NOW="$(date +%s)"
while [[ "${NOW}" -lt "${TARGET}" ]]; do
  REMAINING="$((TARGET - NOW))"
  if oc --request-timeout="${REMAINING}s" oc rollout status "${ROUTER_DEPLOYMENT}" -n "${ROUTER_NAMESPACE}" -w; then
Contributor

Drop the second oc

wking added a commit to wking/openshift-release that referenced this pull request Dec 10, 2018
Catch up with ac206e7 (ci-operator/templates/openshift: Refactor
router-rollout wait (again), 2018-11-05, openshift#2342) and ff16a01
(ci-operator/templates/openshift: Refactor router-rollout 'oc oc',
2018-12-09, openshift#2343).