aws-ec2: add m6i as preferred instance type#5327

Merged
openshift-merge-robot merged 1 commit into openshift:master from mtulio:aws-ec2-m6i-preferred
Oct 27, 2021

Conversation

@mtulio mtulio commented Oct 26, 2021

AWS recently launched the m6i instance type, a newer generation of m5 (the default for AWS IPI). The m6i offers up to 15% better price/performance than comparable fifth-generation instances. The new instances are powered by the latest-generation Intel Xeon Scalable processors (code-named Ice Lake) with an all-core turbo frequency of 3.5 GHz.

Compared to M5 instances using an Intel processor, this new instance type provides:

  • Up to 15% improvement in compute price/performance.
  • Up to 20% higher memory bandwidth.
  • Up to 40 Gbps for Amazon Elastic Block Store (EBS) and 50 Gbps for networking.
  • Always-on memory encryption.

The m6i instance type is available in 8 of 21 regions, so clusters running in those regions can take advantage of it.

                count(m5.xlarge)  count(m6i.xlarge)  diff(m5_m6i)
region                                                           
af-south-1                     3                  0             3
ap-east-1                      3                  0             3
ap-northeast-1                 3                  2             1
ap-northeast-2                 4                  0             4
ap-northeast-3                 3                  0             3
ap-south-1                     3                  0             3
ap-southeast-1                 3                  3             0
ap-southeast-2                 3                  0             3
ca-central-1                   3                  0             3
eu-central-1                   3                  2             1
eu-north-1                     3                  0             3
eu-south-1                     3                  0             3
eu-west-1                      3                  3             0
eu-west-2                      3                  0             3
eu-west-3                      3                  0             3
me-south-1                     3                  0             3
sa-east-1                      3                  0             3
us-east-1                      5                  5             0
us-east-2                      3                  3             0
us-west-1                      2                  2             0
us-west-2                      6                  4             2
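
The split behind the "8 of 21" figure can be reproduced from the table above. The following is a minimal Go sketch, assuming each count is the number of availability zones in the region that offer the instance type (the table does not state this explicitly). The names `zoneCounts`, `offerings`, and `classify` are illustrative and are not part of the installer.

```go
package main

import (
	"fmt"
	"sort"
)

// zoneCounts holds, for one region, how many zones offer each instance type.
// Assumption: the table's counts are per-zone offerings.
type zoneCounts struct {
	m5  int
	m6i int
}

// offerings is copied from the table in the PR description.
var offerings = map[string]zoneCounts{
	"af-south-1": {3, 0}, "ap-east-1": {3, 0}, "ap-northeast-1": {3, 2},
	"ap-northeast-2": {4, 0}, "ap-northeast-3": {3, 0}, "ap-south-1": {3, 0},
	"ap-southeast-1": {3, 3}, "ap-southeast-2": {3, 0}, "ca-central-1": {3, 0},
	"eu-central-1": {3, 2}, "eu-north-1": {3, 0}, "eu-south-1": {3, 0},
	"eu-west-1": {3, 3}, "eu-west-2": {3, 0}, "eu-west-3": {3, 0},
	"me-south-1": {3, 0}, "sa-east-1": {3, 0}, "us-east-1": {5, 5},
	"us-east-2": {3, 3}, "us-west-1": {2, 2}, "us-west-2": {6, 4},
}

// classify splits regions into those where m6i.xlarge is offered in as many
// zones as m5.xlarge (full), in fewer zones (partial), or in no zone (none).
func classify(data map[string]zoneCounts) (full, partial, none []string) {
	for region, c := range data {
		switch {
		case c.m6i == 0:
			none = append(none, region)
		case c.m6i < c.m5:
			partial = append(partial, region)
		default:
			full = append(full, region)
		}
	}
	// Map iteration order is random, so sort for stable output.
	sort.Strings(full)
	sort.Strings(partial)
	sort.Strings(none)
	return full, partial, none
}

func main() {
	full, partial, none := classify(offerings)
	fmt.Printf("m6i.xlarge offered in %d of %d regions\n",
		len(full)+len(partial), len(offerings))
	fmt.Println("same zone count as m5:", full)
	fmt.Println("fewer zones than m5:  ", partial)
	fmt.Println("not offered:          ", none)
}
```

The "fewer zones" group is the one that needs attention when choosing a default, because a cluster spread across all zones of such a region cannot use m6i.xlarge everywhere.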

One point of attention is the regions where the type is not offered in every zone: ap-northeast-1, eu-central-1, and us-west-2. AFAIK this should not be a problem, because the algorithm that chooses the preferred instance type should pick the type that is more widely available in the region.
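
To make the selection rule described here concrete, below is a minimal Go sketch. It is not the installer's implementation: `preferredInstanceType`, its parameters, and the zone names in `main` are hypothetical. The assumed rule is to return the first type in preference order that is offered in every requested zone, and otherwise to fall back to the type that covers the most zones.

```go
package main

import "fmt"

// preferredInstanceType is a hypothetical helper, not the installer's code.
// prefs lists instance types in descending order of preference, zonesByType
// maps each type to the zones that offer it, and zones are the zones the
// cluster will use.
func preferredInstanceType(prefs []string, zonesByType map[string][]string, zones []string) (string, error) {
	best, bestCovered := "", 0
	for _, t := range prefs {
		offered := make(map[string]bool, len(zonesByType[t]))
		for _, z := range zonesByType[t] {
			offered[z] = true
		}
		covered := 0
		for _, z := range zones {
			if offered[z] {
				covered++
			}
		}
		if covered == len(zones) {
			return t, nil // first preference that covers every zone wins
		}
		// Strict ">" keeps the earlier (more preferred) type on ties.
		if covered > bestCovered {
			best, bestCovered = t, covered
		}
	}
	if best == "" {
		return "", fmt.Errorf("none of %v is offered in zones %v", prefs, zones)
	}
	return best, nil
}

func main() {
	prefs := []string{"m6i.xlarge", "m5.xlarge"}
	zones := []string{"zone-a", "zone-b", "zone-c"} // illustrative zone names

	// m6i.xlarge is missing from one zone, so the sketch falls back to m5.xlarge.
	partial := map[string][]string{
		"m6i.xlarge": {"zone-a", "zone-b"},
		"m5.xlarge":  {"zone-a", "zone-b", "zone-c"},
	}
	fmt.Println(preferredInstanceType(prefs, partial, zones))

	// m6i.xlarge is offered in every zone, so it is chosen.
	full := map[string][]string{
		"m6i.xlarge": {"zone-a", "zone-b", "zone-c"},
		"m5.xlarge":  {"zone-a", "zone-b", "zone-c"},
	}
	fmt.Println(preferredInstanceType(prefs, full, zones))
}
```

Under this assumed rule, a region such as us-west-2, where m6i.xlarge is offered in fewer zones than m5.xlarge, would keep m5.xlarge for a cluster spread across all zones, while us-east-1 would get m6i.xlarge.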

Changing the default AWS IPI EC2 instance type to m6i.xlarge, and the EBS volume type to gp3 on control planes, could be an "easy win" for performance and cost.

openshift-ci Bot commented Oct 26, 2021

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@openshift-ci openshift-ci Bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 26, 2021

mtulio commented Oct 26, 2021

The cluster installs with no issues [1], and MAPI has no restrictions on replacing nodes with the new type [2].

[1] provisioned by installer:
installer-m6i-130671361-1ea609f8-dcfb-4e92-bfd9-edee61b71c9f

[2] Machines with the new size (replaced in an existing cluster):

$ oc get machines -n openshift-machine-api
NAME                                PHASE     TYPE         REGION      ZONE         AGE
mrb-0x272-master-0                  Running   m6i.xlarge   us-east-1   us-east-1a   8h
mrb-0x272-master-1                  Running   m6i.xlarge   us-east-1   us-east-1b   8h
mrb-0x272-master-2                  Running   m6i.xlarge   us-east-1   us-east-1c   8h
mrb-0x272-worker-us-east-1a-xd8tj   Running   m5.xlarge    us-east-1   us-east-1a   8h
mrb-0x272-worker-us-east-1b-xvbb6   Running   m5.xlarge    us-east-1   us-east-1b   8h
mrb-0x272-worker-us-east-1c-8j2s4   Running   m5.xlarge    us-east-1   us-east-1c   8h

@staebler staebler left a comment

This seems reasonable to me. Let me discuss with my team to make sure that I am not missing some wrinkle on why we would not want to do this.

Comment thread pkg/types/aws/defaults/platform.go Outdated
@mtulio mtulio force-pushed the aws-ec2-m6i-preferred branch from 05d2549 to b918810 Compare October 26, 2021 14:34

@staebler staebler left a comment

Please remove the draft status when you are comfortable.

/lgtm

@openshift-ci openshift-ci Bot added the lgtm Indicates that a PR is ready to be merged. label Oct 26, 2021

openshift-ci Bot commented Oct 26, 2021

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: staebler

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 26, 2021

mtulio commented Oct 26, 2021

Cool, thanks for the review, @staebler.
I just opened the fix for the CI operator; PTAL at openshift/release#23031.

I am also converting to a PR. I can't see any issue to backport it to current releases (4.6+), what do you think?

@mtulio mtulio marked this pull request as ready for review October 26, 2021 20:17
@openshift-ci openshift-ci Bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 26, 2021
staebler commented

I am also converting to a PR. I can't see any issue to backport it to current releases (4.6+), what do you think?

No, I don't think this meets the bar for backporting. Using 5th-gen instances does not prevent a user from installing and using OpenShift. And the user has the option to set the instance type themselves if they want to take advantage of the 6th-gen instances.

openshift-bot commented

/retest-required

Please review the full test history for this PR and help us cut down flakes.

4 similar comments

openshift-ci Bot commented Oct 26, 2021

@mtulio: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name                      Commit   Details  Required  Rerun command
ci/prow/e2e-aws-single-node    b918810  link     false     /test e2e-aws-single-node
ci/prow/e2e-aws-workers-rhel8  b918810  link     false     /test e2e-aws-workers-rhel8
ci/prow/e2e-aws-workers-rhel7  b918810  link     false     /test e2e-aws-workers-rhel7

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

openshift-bot commented

/retest-required

Please review the full test history for this PR and help us cut down flakes.

3 similar comments

openshift-bot commented

/retest-required

Please review the full test history for this PR and help us cut down flakes.

3 similar comments

Labels

approved: Indicates a PR has been approved by an approver from all required OWNERS files.
lgtm: Indicates that a PR is ready to be merged.

4 participants