Bug 1889855: additional AWS regions for RHCOS AMIs#4286

Merged
openshift-merge-robot merged 1 commit into openshift:release-4.6 from miabbott:bz1889855_new_regions
Nov 5, 2020

@miabbott
Member

There are RFEs requesting support for additional AWS regions:
ap-east-1, af-south-1, and eu-south-1

https://issues.redhat.com/browse/RFE-903
https://issues.redhat.com/browse/RFE-1267

The existing 4.6 RHCOS AMI was manually copied to these regions by the
ART team.

@openshift-ci-robot added the bugzilla/severity-medium label (Referenced Bugzilla bug's severity is medium for the branch this PR is targeting.) on Oct 20, 2020
@openshift-ci-robot
Contributor

@miabbott: This pull request references Bugzilla bug 1889855, which is invalid:

  • expected the bug to target the "4.6.0" release, but it targets "4.6.z" instead
  • expected dependent Bugzilla bug 1889852 to be in one of the following states: VERIFIED, RELEASE_PENDING, CLOSED (ERRATA), but it is ASSIGNED instead

Comment /bugzilla refresh to re-evaluate validity if changes to the Bugzilla bug are made, or edit the title of this pull request to link to a different bug.

Details

In response to this:

Bug 1889855: additional AWS regions for RHCOS AMIs

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci-robot added the bugzilla/invalid-bug label (Indicates that a referenced Bugzilla bug is invalid for the branch this PR is targeting.) on Oct 20, 2020
@miabbott
Member Author

@jupierce did the copying for us

cc: @katherinedube

@miabbott
Member Author

cc: @ashcrow @cgwalters

@miabbott
Member Author

/retest

@cgwalters
Member

Hmm, one concern with manual copying is that in prior versions of OpenShift the installer used the underlying snapshot which also required public access; not doing that would be a trap because this PR would sail right through CI (because we don't actually test in those regions AFAIK).

I think in 4.6 though the installer just uses the AMI directly. But it is probably worth validating this manually at least.

@miabbott
Member Author

> I think in 4.6 though the installer just uses the AMI directly. But it is probably worth validating this manually at least.

I'll build this PR locally and try to install to the new regions.

@miabbott
Member Author

Similar to what was reported in #4288 (comment), it looks like we need additional changes to support these new regions.

2020/10/21 14:31:17 Executing pod "verify-codegen"
+ go generate ./pkg/types/installconfig.go
+ go generate ./pkg/rhcos/ami.go
2020/10/21 14:33:00 srcPath:  /go/src/github.com/openshift/installer/data/data/rhcos-amd64.json
2020/10/21 14:33:00 dstPath:  /go/src/github.com/openshift/installer/pkg/rhcos/ami_regions.go
+ set +x
diff --git a/pkg/rhcos/ami_regions.go b/pkg/rhcos/ami_regions.go
index aebc16f..25c1008 100644
--- a/pkg/rhcos/ami_regions.go
+++ b/pkg/rhcos/ami_regions.go
@@ -4,6 +4,8 @@ package rhcos
 
 // AMIRegoins is a list of regions where the RHEL CoreOS is published.
 var AMIRegions = []string{
+	"af-south-1",
+	"ap-east-1",
 	"ap-northeast-1",
 	"ap-northeast-2",
 	"ap-south-1",
@@ -12,6 +14,7 @@ var AMIRegions = []string{
 	"ca-central-1",
 	"eu-central-1",
 	"eu-north-1",
+	"eu-south-1",
 	"eu-west-1",
 	"eu-west-2",
 	"eu-west-3",
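
The verify-codegen output above shows pkg/rhcos/ami_regions.go being regenerated from data/data/rhcos-amd64.json. As a rough illustration of that kind of step, here is a minimal sketch that derives a sorted region list from image metadata. It assumes the metadata has a top-level "amis" object keyed by region name; that layout, the placeholder AMI IDs, and the helper name regionsFromMetadata are assumptions made for this example, not the installer's actual generator.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// metadata models only the part of the JSON this sketch needs.
// The "amis" layout is an assumption made for illustration.
type metadata struct {
	AMIs map[string]struct {
		HVM string `json:"hvm"`
	} `json:"amis"`
}

// regionsFromMetadata is a hypothetical helper: it returns the region
// names found under "amis", sorted so the generated list is stable.
func regionsFromMetadata(raw []byte) ([]string, error) {
	var m metadata
	if err := json.Unmarshal(raw, &m); err != nil {
		return nil, err
	}
	regions := make([]string, 0, len(m.AMIs))
	for region := range m.AMIs {
		regions = append(regions, region)
	}
	sort.Strings(regions)
	return regions, nil
}

func main() {
	// Placeholder AMI IDs, not real images.
	raw := []byte(`{"amis": {
		"eu-south-1": {"hvm": "ami-placeholder-1"},
		"af-south-1": {"hvm": "ami-placeholder-2"},
		"ap-east-1":  {"hvm": "ami-placeholder-3"}
	}}`)
	regions, err := regionsFromMetadata(raw)
	if err != nil {
		panic(err)
	}
	// Print the entries in the same shape as the generated Go slice.
	for _, r := range regions {
		fmt.Printf("\t%q,\n", r)
	}
}
```

Sorting the keys is what keeps a regenerated file from differing between runs, since Go map iteration order is not fixed.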

@abhinavdahiya @sdodson Are you folks comfortable extending installer support to these regions for 4.6? @katherinedube requested these regions be supported in 4.6

@miabbott force-pushed the bz1889855_new_regions branch from 5c65fb6 to 2b1295d on October 21, 2020 19:33
@miabbott
Member Author

Pushed an update with the regions added to pkg/rhcos/ami_regions.go
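
For readers wondering how a region list like this might be consulted, here is a minimal sketch of a membership check against a sorted list. The abbreviated list, the helper name isKnownRegion, and the made-up region "xx-nowhere-1" are assumptions for this example only; this is not the installer's validation code.

```go
package main

import (
	"fmt"
	"sort"
)

// amiRegions is an abbreviated, sorted stand-in for the generated
// AMIRegions list; it is not the full list from the installer.
var amiRegions = []string{
	"af-south-1",
	"ap-east-1",
	"ap-northeast-1",
	"eu-north-1",
	"eu-south-1",
	"eu-west-1",
}

// isKnownRegion is a hypothetical helper that reports whether region
// appears in amiRegions. It relies on the list being sorted.
func isKnownRegion(region string) bool {
	i := sort.SearchStrings(amiRegions, region)
	return i < len(amiRegions) && amiRegions[i] == region
}

func main() {
	// "xx-nowhere-1" is a made-up name used to show the negative case.
	for _, r := range []string{"af-south-1", "eu-south-1", "xx-nowhere-1"} {
		fmt.Printf("%s known: %t\n", r, isKnownRegion(r))
	}
}
```

A binary search only works because the list is kept sorted; a linear scan or a set would do just as well for a list this short.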

@miabbott
Member Author

Not sure what is going on with e2e-aws; looks like the container is failing to build when trying to get the certs:

STEP 25: RUN curl -L -O -k https://vcsa-ci.vmware.devcluster.openshift.com/certs/download.zip &&     unzip download.zip &&     cp certs/lin/* /etc/pki/ca-trust/source/anchors &&     update-ca-trust extract
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed

  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100  6050  100  6050    0     0   7962      0 --:--:-- --:--:-- --:--:--  7971
Archive:  download.zip
  inflating: certs/lin/742f5a15.0    
  inflating: certs/mac/742f5a15.0    
  inflating: certs/win/742f5a15.0.crt  
  inflating: certs/lin/742f5a15.r1   
  inflating: certs/mac/742f5a15.r1   
  inflating: certs/win/742f5a15.r1.crl  
cp: target '/etc/pki/ca-trust/source/anchors' is not a directory
subprocess exited with status 1
subprocess exited with status 1
error: build error: error building at STEP "RUN curl -L -O -k https://vcsa-ci.vmware.devcluster.openshift.com/certs/download.zip &&     unzip download.zip &&     cp certs/lin/* /etc/pki/ca-trust/source/anchors &&     update-ca-trust extract": exit status 1

Actually, it is affecting all the jobs that failed 🤔

/retest

@miabbott
Member Author

The images job is failing (along with the rest of them) like so:

STEP 25: RUN curl -L -O -k https://vcsa-ci.vmware.devcluster.openshift.com/certs/download.zip &&     unzip download.zip &&     cp certs/lin/* /etc/pki/ca-trust/source/anchors &&     update-ca-trust extract
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed

  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100  6044  100  6044    0     0   9201      0 --:--:-- --:--:-- --:--:--  9213
Archive:  download.zip
  inflating: certs/lin/742f5a15.0    
  inflating: certs/mac/742f5a15.0    
  inflating: certs/win/742f5a15.0.crt  
  inflating: certs/lin/742f5a15.r1   
  inflating: certs/mac/742f5a15.r1   
  inflating: certs/win/742f5a15.r1.crl  
cp: target '/etc/pki/ca-trust/source/anchors' is not a directory
subprocess exited with status 1
subprocess exited with status 1
error: build error: error building at STEP "RUN curl -L -O -k https://vcsa-ci.vmware.devcluster.openshift.com/certs/download.zip &&     unzip download.zip &&     cp certs/lin/* /etc/pki/ca-trust/source/anchors &&     update-ca-trust extract": exit status 1
2020/10/22 17:13:10 Build libvirt-installer succeeded after 15m55s
2020/10/22 17:13:10 Tagging libvirt-installer into stable
2020/10/22 17:16:25 Build installer-artifacts succeeded after 10m43s
2020/10/22 17:16:25 Tagging installer-artifacts into stable
2020/10/22 17:16:25 No custom metadata found and prow metadata already exists. Not updating the metadata.
2020/10/22 17:16:25 Ran for 24m27s
error: some steps failed:
  * could not run steps: step upi-installer failed: could not wait for build: the build upi-installer failed after 15m37s with reason DockerBuildFailed: Dockerfile build strategy has failed.

I've filed https://bugzilla.redhat.com/show_bug.cgi?id=1891167 for this because I can't figure out why this PR would cause the jobs to fail like this.

@miabbott
Member Author

The root of the problem is that, in 4.6, /etc/pki/ca-trust/source is mounted in from the host: https://github.com/openshift/builder/blob/master/imagecontent/etc/containers/mounts.conf

The Dockerfile was adjusted accordingly in #4284

/retest

@miabbott
Member Author

/retest

1 similar comment

@miabbott
Member Author

Finally got CI looking decent. e2e-aws finally passed.

e2e-openstack - Red for the majority of the job history; this PR doesn't affect OpenStack, though.

e2e-azure-upi - Another job that has been red in recent job history; it looks like it is hitting the "VM didn't provision in time" issue. This PR does not affect Azure.

e2e-crc - Yet another job that is red in recent history; the latest failures look like extracting the installer is failing. But this PR doesn't affect CRC either.

e2e-gcp-upi-xpn - The final job that is red in recent history; the latest failure looks like the API server is not fully up. This PR doesn't affect GCP.

Can I get a /lgtm on this? There are some customers wanting these regions supported as part of 4.6.z sooner rather than later.

@miabbott
Member Author

/bugzilla refresh

@openshift-ci-robot
Contributor

@miabbott: An error was encountered adding this pull request to the external tracker bugs for bug 1889855 on the Bugzilla server at https://bugzilla.redhat.com:

JSONRPC error 1004: The combination of ext_type_* fields matched more than one External Tracker.
Please contact an administrator to resolve this issue, then request a bug refresh with /bugzilla refresh.


@staebler
Contributor

/lgtm

@openshift-ci-robot added the lgtm label (Indicates that a PR is ready to be merged.) on Oct 27, 2020
@miabbott
Member Author

Let's see if BZ integration is fixed...

/bugzilla refresh

@openshift-ci-robot added the bugzilla/valid-bug label (Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting.) and removed the bugzilla/invalid-bug label (Indicates that a referenced Bugzilla bug is invalid for the branch this PR is targeting.) on Oct 28, 2020
@openshift-ci-robot
Contributor

@miabbott: This pull request references Bugzilla bug 1889855, which is valid. The bug has been updated to refer to the pull request using the external bug tracker.

6 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.6.z) matches configured target release for branch (4.6.z)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)
  • dependent bug Bugzilla bug 1889852 is in the state VERIFIED, which is one of the valid states (VERIFIED, RELEASE_PENDING, CLOSED (ERRATA))
  • dependent Bugzilla bug 1889852 targets the "4.7.0" release, which is one of the valid target releases: 4.7.0
  • bug has dependents

@staebler
Contributor

/approve

@openshift-bot
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

10 similar comments

@sdodson
Member

sdodson commented Nov 2, 2020

/hold
@jhixson74 @jstuever can you please look at the azure and gcp tests respectively, thanks

@openshift-ci-robot added the do-not-merge/hold label (Indicates that a PR should not merge because someone has issued a /hold command.) on Nov 2, 2020
@ashcrow
Member

ashcrow commented Nov 2, 2020

@sdodson Since this is specifically a change to AWS only do we need to have azure and gcp jobs pass for this to merge?

@sdodson
Member

sdodson commented Nov 2, 2020

> @sdodson Since this is specifically a change to AWS only do we need to have azure and gcp jobs pass for this to merge?

That's fine, but it won't merge until the window opens later this week anyway, so let's see what John and Jeremiah say about those two tests.

@sdodson
Member

sdodson commented Nov 3, 2020

/skip

@sdodson
Member

sdodson commented Nov 3, 2020

/retest

@ashcrow
Member

ashcrow commented Nov 3, 2020

All ✔️!

@miabbott
Member Author

miabbott commented Nov 5, 2020

@sdodson can we drop the /hold? This looks primed to be merged.

@sdodson
Member

sdodson commented Nov 5, 2020

/hold cancel

@openshift-ci-robot removed the do-not-merge/hold label (Indicates that a PR should not merge because someone has issued a /hold command.) on Nov 5, 2020
@sdodson added the cherry-pick-approved label (Indicates a cherry-pick PR into a release branch has been approved by the release branch manager.) on Nov 5, 2020
@openshift-bot
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

2 similar comments

@sdodson
Member

sdodson commented Nov 5, 2020

/override ci/prow/e2e-azure-upi
/override ci/prow/e2e-metal-ipi
irrelevant tests

@openshift-ci-robot
Contributor

@sdodson: Overrode contexts on behalf of sdodson: ci/prow/e2e-azure-upi, ci/prow/e2e-metal-ipi


@openshift-merge-robot merged commit 6e02d04 into openshift:release-4.6 on Nov 5, 2020
@openshift-ci-robot
Contributor

@miabbott: All pull requests linked via external trackers have merged:

Bugzilla bug 1889855 has been moved to the MODIFIED state.


@openshift-merge-robot
Contributor

@miabbott: The following tests failed, say /retest to rerun all failed tests:

| Test name | Commit | Details | Rerun command |
| --- | --- | --- | --- |
| ci/prow/e2e-gcp-upi-xpn | 2b1295d | link | /test e2e-gcp-upi-xpn |
| ci/prow/e2e-metal-ipi | 2b1295d | link | /test e2e-metal-ipi |
| ci/prow/e2e-azure-upi | 2b1295d | link | /test e2e-azure-upi |


Labels

approved: Indicates a PR has been approved by an approver from all required OWNERS files.
bugzilla/severity-medium: Referenced Bugzilla bug's severity is medium for the branch this PR is targeting.
bugzilla/valid-bug: Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting.
cherry-pick-approved: Indicates a cherry-pick PR into a release branch has been approved by the release branch manager.
lgtm: Indicates that a PR is ready to be merged.


8 participants