Bug 1899175: data/rhcos.json: Update boot images for RHEL 8.3 #4414

Merged
openshift-merge-robot merged 2 commits into openshift:master from travier:rhel83
Dec 7, 2020

@travier
Member

travier commented Nov 24, 2020

amd64: 47.83.202012030221-0
s390x: 47.83.202012030410-0
ppc64le: 47.83.202012030110-0

@lucab

lucab commented Nov 25, 2020

/retest

@travier
Member Author

travier commented Nov 26, 2020

/retest

@travier
Member Author

travier commented Nov 26, 2020

/retest

@travier
Member Author

travier commented Nov 26, 2020

/retest

@lucab

lucab commented Nov 27, 2020

/retest

@travier
Member Author

travier commented Nov 27, 2020

/retest

@travier
Member Author

travier commented Nov 27, 2020

/retest

@miabbott
Member

miabbott commented Nov 30, 2020

e2e-crc looks like it has been red for the most recent 40+ runs, so it's unlikely this PR is affecting the job status.

e2e-metal-ipi has had some success recently, so it looks like there may be some impact being felt from this PR. The bootstrap node appears to be successfully created, but the masters are not able to finish creation. The logs are a bit sparse around this section of the test, so it is not obvious why the masters are failing to be created.

@sdodson
Member

sdodson commented Nov 30, 2020

/retitle Bug 1899175: data/rhcos.json: Update boot images for RHEL 8.3

openshift-ci-robot changed the title from "data/rhcos.json: Update boot images for RHEL 8.3" to "Bug 1899175: data/rhcos.json: Update boot images for RHEL 8.3" Nov 30, 2020
@openshift-ci-robot
Contributor

@travier: This pull request references Bugzilla bug 1899175, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.7.0) matches configured target release for branch (4.7.0)
  • bug is in the state NEW, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)
Details

In response to this:

Bug 1899175: data/rhcos.json: Update boot images for RHEL 8.3

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

openshift-ci-robot added the bugzilla/severity-high and bugzilla/valid-bug labels Nov 30, 2020
@sdodson
Member

sdodson commented Nov 30, 2020

/test e2e-gcp
/test e2e-azure
/test e2e-metal
/test e2e-metal-ipi
/test e2e-vsphere

@sdodson
Member

sdodson commented Nov 30, 2020

Gateway timeouts pushing the release image to the CI registry across the board. Seems like a CI infra problem; will try again later.

@sdodson
Member

sdodson commented Dec 1, 2020

/test e2e-gcp
/test e2e-azure
/test e2e-metal
/test e2e-metal-ipi
/test e2e-vsphere

@travier
Member Author

travier commented Dec 1, 2020

/retest

@sdodson
Member

sdodson commented Dec 1, 2020

@shardy @stbenjam PTAL at the metal-ipi failures, are these previously known or believed to be related to this change?

@sdodson
Member

sdodson commented Dec 1, 2020

@jcpowermac Are the vsphere tests in line with your expectations for where vsphere is right now, or are you concerned about this boot image bump moving forward?

@jcpowermac
Contributor

> @jcpowermac Are the vsphere tests in line with your expectations for where vsphere is right now, or are you concerned about this boot image bump moving forward?

@sdodson vSphere failures are caused by Docker rate limiting. I would go ahead and move forward.

@stbenjam
Member

stbenjam commented Dec 1, 2020

The metal-ipi failures are possibly related; we're otherwise green on 4.7.

I've kicked off a local install to examine the bootstrap (we didn't collect an install bundle because of #3927).

@stbenjam
Member

stbenjam commented Dec 1, 2020

From the bootstrap:

Dec 01 21:30:55 localhost startironic.sh[11449]: + podman wait -i 1000 ipa-downloader
Dec 01 21:30:55 localhost startironic.sh[11449]: Error: invalid argument "1000" for "-i, --interval" flag: …n 1000
Dec 01 21:30:55 localhost systemd[1]: ironic.service: Main process exited, code=exited, status=125/n/a

This RHCOS build shipped with the broken podman that no longer accepts a podman wait interval without units, which breaks backwards compatibility. I had filed a BZ for this when we saw it in RHEL 8.3: https://bugzilla.redhat.com/show_bug.cgi?id=1897282

This very unfortunate bug won't be fixed until 8.4.

#4377 would fix it, but it is not compatible with some very old podmans, which probably isn't an issue for us if we stay on 8.3. Maybe you could try to include that commit here? I don't think it's working with 8.2-based RHCOS.
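
To illustrate the kind of change being discussed, here is a minimal sketch of adding an explicit unit to a bare `podman wait` interval. It is not the code from #4377: the helper names `with_explicit_unit` and `podman_wait_args` are hypothetical, and it assumes the bare `1000` in the log above was meant as milliseconds.

```python
import re
from typing import List

# Hypothetical helper, not taken from the installer or from #4377.
# Assumption: a bare integer interval was intended as milliseconds.
_BARE_INTEGER = re.compile(r"\d+")


def with_explicit_unit(interval: str, default_unit: str = "ms") -> str:
    """Append a unit to a bare integer interval; leave other values untouched."""
    value = interval.strip()
    if _BARE_INTEGER.fullmatch(value):
        return value + default_unit
    return value


def podman_wait_args(container: str, interval: str) -> List[str]:
    """Build the argument list for `podman wait` with an explicit interval unit."""
    return ["podman", "wait", "-i", with_explicit_unit(interval), container]


if __name__ == "__main__":
    print(" ".join(podman_wait_args("ipa-downloader", "1000")))
```

With the values from the failing log line, this builds `podman wait -i 1000ms ipa-downloader`, which states the unit explicitly instead of relying on a default.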

@stbenjam
Member

stbenjam commented Dec 2, 2020

/retest

@stbenjam
Member

stbenjam commented Dec 2, 2020

Something seems messed up with networking:

error: error creating buildah builder: Error writing blob: error storing blob to file "/var/tmp/storage520951614/1": read tcp 10.129.59.52:53030->172.217.193.128:443: read: connection reset by peer

@stbenjam
Member

stbenjam commented Dec 2, 2020

/retest

@stbenjam
Member

stbenjam commented Dec 2, 2020

Looks like the oc we use disappeared off mirror.openshift.com sometime around this afternoon. We (and 35 other jobs in the release repo) use the oc utils from https://mirror.openshift.com/pub/openshift-v4/clients/oc/4.4/linux/oc.tar.gz. That's just to bootstrap extracting it from the release payload.

I posted openshift/release#13999 to grab the client from somewhere else for now.

@travier
Member Author

travier commented Dec 2, 2020

Waiting on openshift/release#14012

@stbenjam
Member

stbenjam commented Dec 2, 2020

openshift/release#14012 landed

/test e2e-metal-ipi
/test e2e-metal-ipi-ovn-ipv6

@travier
Member Author

travier commented Dec 3, 2020

We will need a new image bump for other fixes, so I'll place this one on hold until it is updated.
/hold

openshift-ci-robot added the do-not-merge/hold label Dec 3, 2020
openshift-ci-robot removed the lgtm label Dec 3, 2020
@travier
Member Author

travier commented Dec 3, 2020

/unhold

openshift-ci-robot removed the do-not-merge/hold label Dec 3, 2020
@openshift-ci-robot
Contributor

@travier: This pull request references Bugzilla bug 1899175, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.7.0) matches configured target release for branch (4.7.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

@sdodson
Member

sdodson commented Dec 3, 2020

/test e2e-gcp
/test e2e-azure
/test e2e-vsphere

@miabbott
Member

miabbott commented Dec 3, 2020

e2e-aws hitting capacity issues:

Error creating VPC: VpcLimitExceeded: The maximum number of VPCs has been reached.

@miabbott
Member

miabbott commented Dec 3, 2020

e2e-gcp has been red for the last 3-4 days. Looks like it has been hitting a combination of:

@travier
Member Author

travier commented Dec 4, 2020

/retest

@stbenjam
Member

stbenjam commented Dec 4, 2020

Why did the commit from #4377 get pulled out of this PR? I don't think this should be merged without it. If you'll do it separately you'll end up getting some release payloads rejected while 4377 tries to get through CI.

@travier
Member Author

travier commented Dec 4, 2020

> Why did the commit from #4377 get pulled out of this PR? I don't think this should be merged without it. If you'll do it separately you'll end up getting some release payloads rejected while 4377 tries to get through CI.

Sorry, I pushed a new version and I had not realized that you had added this commit to this PR. Will push it again.

@stbenjam
Member

stbenjam commented Dec 4, 2020

> > Why did the commit from #4377 get pulled out of this PR? I don't think this should be merged without it. If you'll do it separately you'll end up getting some release payloads rejected while 4377 tries to get through CI.
>
> Sorry, I pushed a new version and I had not realized that you had added this commit to this PR. Will push it again.

Oh wasn't me, maybe Scott pushed to your branch :) But should get included here, thanks!

stbenjam and others added 2 commits December 4, 2020 17:18
A [bug in some versions of podman](https://bugzilla.redhat.com/show_bug.cgi?id=1897282)
means that units end up being required for the podman wait command, even
though it was documented as optional. This is fixed, but the podman
requiring units shipped in RHEL 8.3 and recent Fedoras, and won't be
updated until 8.4.  This proactively adds units to baremetal's
podman-wait commands in case this broken podman ends up in RHCOS or
FCOS. It doesn't hurt to be explicit anyway.
  - amd64:   47.83.202012030221-0
  - s390x:   47.83.202012030410-0
  - ppc64le: 47.83.202012030110-0
@sdodson
Member

sdodson commented Dec 5, 2020

/retest
/test e2e-gcp
/test e2e-azure
/test e2e-vsphere

@miabbott
Member

miabbott commented Dec 5, 2020

/retest

@sdodson
Member

sdodson commented Dec 6, 2020

/test e2e-metal
e2e-aws finished the installer, so I'm fine overriding that. I just don't know that we've seen a successful metal UPI install.

@travier
Member Author

travier commented Dec 7, 2020

/retest

@sdodson
Member

sdodson commented Dec 7, 2020

/lgtm

openshift-ci-robot added the lgtm label Dec 7, 2020
@sdodson
Member

sdodson commented Dec 7, 2020

All of the platforms have completed the install phase, so there is no reason to block this.

@openshift-merge-robot
Contributor

@travier: The following tests failed, say /retest to rerun all failed tests:

| Test name | Commit | Details | Rerun command |
| --- | --- | --- | --- |
| ci/prow/e2e-metal-ipi | 8bb5d7e8f4ffb98ee24f1d8d21f518611c025e36 | link | /test e2e-metal-ipi |
| ci/prow/e2e-ovirt | 4d62f9c | link | /test e2e-ovirt |
| ci/prow/e2e-gcp | 4d62f9c | link | /test e2e-gcp |
| ci/prow/e2e-crc | 4d62f9c | link | /test e2e-crc |
| ci/prow/e2e-libvirt | 4d62f9c | link | /test e2e-libvirt |

openshift-merge-robot merged commit 1611f0f into openshift:master Dec 7, 2020
@openshift-ci-robot
Contributor

@travier: All pull requests linked via external trackers have merged:

Bugzilla bug 1899175 has been moved to the MODIFIED state.

Labels

  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • bugzilla/severity-high: Referenced Bugzilla bug's severity is high for the branch this PR is targeting.
  • bugzilla/valid-bug: Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting.
  • lgtm: Indicates that a PR is ready to be merged.
