Bug 1774465: aws: pick instance types based on selected availability zones#3051
Conversation
|
@abhinavdahiya: This pull request references Bugzilla bug 1774465, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
de04fff to
8a2d4eb
Compare
|
/approve Do you think we need to provide this priority mapping information in product documentation? |
@sdodson not really, they are internal defaults when the user doesn't provide them. So i don't think these need to be documented in product docs... |
|
At some point, we'll need to or should default to m5. Do we have a timeline we'll swap or reason we won't to (or potentially a later one like m6/m7, etc.)? |
@cuppett some of the reasoning for not chaining the defaults to m5 completely are https://bugzilla.redhat.com/show_bug.cgi?id=1710981#c13 |
|
/approve |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: jhixson74, sdodson The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
/retest Please review the full test history for this PR and help us cut down flakes. |
There was a problem hiding this comment.
nit: would be nice to show the constraints to help folks pick a reasonable instance type. Something like:
missing := []string{}
for _, t := range types {
zoneDiff := reqZones.Difference(sets.NewString(found[t]...))
if zoneDiff.Len() == 0 {
return t, nil
}
missing = append(missing, fmt.Sprintf("%s not found in %v", t, strings.Join(zoneDiff.List(), ", ")))
}
return types[0], errors.Errorf("no instance type found for the zone constraint (%s)", strings.Join(missing, "; "))There was a problem hiding this comment.
gonna keep for future work.
There was a problem hiding this comment.
It's not clear to me why we want this to be a soft warning, instead of a fatal error. Can we make it a fatal error, because a failure to find a default instance type can be worked around by users setting explicit types in their install-config.yaml?
There was a problem hiding this comment.
If the user doesn't have access to run this command, or it's not supported in all the regions or we are rate limited, i don't want the install to fail as this stage, 4.4 feature complete
I think for now the current behavior of use the defaults when we can't dynamically verify it is good enough. we can revisit the requirement in next feature cycle.
8a2d4eb to
ad333dd
Compare
|
going to cancel the hold as nits where fixed. /hold cancel |
|
/lgtm |
|
/retest Please review the full test history for this PR and help us cut down flakes. |
Using new AWS API DescribeInstanceTypeOfferings [1] we can now check what instance types are available for specific region/AZs, which allows us to check which instance types are available in the selected AZs. Since we default to m4 for most regions and only override to m5 for certain, in cases where new AZs are added to older regions like sa-east-1, see BZ 1774465 [2], we end up picking m4 when m5 is present in all the chosen AZs. Now we can define a list in descreasing priority order for instance classes for a region and the installer can try to pick default based on what works for all the selected AZs. We still do not support heterogenous setup which different instance types in different regions, but this solves the case when newer m5 needs to be used in certain regions based on the AZ selection. The default order for instance classes stays m4 > m5. [1]: https://docs.aws.amazon.com/AWSEC2/latest/APIReference/API_DescribeInstanceTypeOfferings.html [2]: https://bugzilla.redhat.com/show_bug.cgi?id=1774465
ad333dd to
d45e881
Compare
|
/lgtm |
|
/retest Please review the full test history for this PR and help us cut down flakes. |
5 similar comments
|
/retest Please review the full test history for this PR and help us cut down flakes. |
|
/retest Please review the full test history for this PR and help us cut down flakes. |
|
/retest Please review the full test history for this PR and help us cut down flakes. |
|
/retest Please review the full test history for this PR and help us cut down flakes. |
|
/retest Please review the full test history for this PR and help us cut down flakes. |
|
/retest Please review the full test history for this PR and help us cut down flakes. |
3 similar comments
|
/retest Please review the full test history for this PR and help us cut down flakes. |
|
/retest Please review the full test history for this PR and help us cut down flakes. |
|
/retest Please review the full test history for this PR and help us cut down flakes. |
|
@abhinavdahiya: All pull requests linked via external trackers have merged. Bugzilla bug 1774465 has been moved to the MODIFIED state. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
|
@abhinavdahiya: The following tests failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
| @@ -24,8 +24,18 @@ func SetPlatformDefaults(p *aws.Platform) { | |||
| // region. We prefer m4 if available (more EBS volumes per node) but will use | |||
| // m5 in regions that don't have m4. | |||
| func InstanceClass(region string) string { | |||
There was a problem hiding this comment.
@abhinavdahiya With InstanceClasses function added, I am wondering why keeping InstanceClass function around.
The guts were removed by 805a108 (platformtests: drop aws as no longer required, 2020-03-11, openshift#3277), citing d45e881 (aws: pick instance types based on selected availability zones, 2020-02-03, openshift#3051). No need to keep the useless directory around now that it holds no code.
Using new AWS API DescribeInstanceTypeOfferings 1 we can now check what instance types are available for specific region/AZs, which allows us to check which instance types are available in the selected AZs.
Since we default to m4 for most regions and only override to m5 for certain, in cases where new AZs are added to older regions like sa-east-1, see BZ 1774465 2, we end up picking m4 when m5 is present in all the chosen AZs.
Now we can define a list in descreasing priority order for instance classes for a region and the installer can try to pick default based on what works for all the selected AZs. We still do not support heterogenous setup which different instance types in different regions, but this
solves the case when newer m5 needs to be used in certain regions based on the AZ selection.
The default order for instance classes stays m4 > m5.
/cc @wking @cuppett @sdodson