30 changes: 28 additions & 2 deletions pkg/asset/manifests/scheduler.go
@@ -5,9 +5,11 @@ import (

"github.com/ghodss/yaml"
"github.com/pkg/errors"
"github.com/sirupsen/logrus"

configv1 "github.com/openshift/api/config/v1"
"github.com/openshift/installer/pkg/asset"
"github.com/openshift/installer/pkg/asset/installconfig"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

@@ -30,7 +32,9 @@ func (*Scheduler) Name() string {
 // Dependencies returns all of the dependencies directly needed to generate
 // the asset.
 func (*Scheduler) Dependencies() []asset.Asset {
-    return []asset.Asset{}
+    return []asset.Asset{
+        &installconfig.InstallConfig{},
+    }
 }

// Generate generates the scheduler config and its CRD.
@@ -44,7 +48,29 @@ func (s *Scheduler) Generate(dependencies asset.Parents) error {
             Name: "cluster",
             // not namespaced
         },
-        Spec: configv1.SchedulerSpec{},
+        Spec: configv1.SchedulerSpec{
+            MastersSchedulable: false,
+        },
     }

+    installConfig := &installconfig.InstallConfig{}
+    dependencies.Get(installConfig)
+    computeReplicas := int64(0)
+    for _, pool := range installConfig.Config.Compute {
+        if pool.Replicas != nil {
+            computeReplicas += *pool.Replicas
+        }
+    }
+    if computeReplicas == 0 {
+        // A schedulable host is required for a successful install to complete.
+        // If the install config has 0 replicas for compute hosts, it's one of two cases:
+        // 1. An IPI deployment with no compute hosts. The deployment cannot succeed
+        //    without MastersSchedulable = true.
+        // 2. A UPI deployment. The deployment may add compute hosts, but to ensure
+        //    the highest probability of a successful deployment, we default to
+        //    schedulable masters.
Contributor
The trade-off here I guess is between:

  • How painful is it to configure for (2) as a post-install (i.e. we made an incorrect assumption, and the user needs to correct for that) versus
  • How much do we want to avoid an install-config option that allows the user to express their intent here - i.e. scheduler: Allow configuration of schedulable masters. #2024

I would imagine that in the case of (2) we might find users just leave masters as schedulable even after they add workers?

IMO it's important enough for us to understand the intent that we should add the install-config option
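
Purely as an illustration of that suggestion (not part of this PR, and not the installer's real API), one shape such an install-config knob could take is sketched below; the type layout, field name, and helper are all hypothetical:

package types

// Hypothetical sketch only: NOT the real installer InstallConfig type. It just shows where
// an explicit schedulable-masters knob could live and how the scheduler asset could read it.
type InstallConfig struct {
    // ... existing install-config fields elided ...

    // MastersSchedulable records the user's intent directly instead of having the
    // installer infer it from the number of compute replicas.
    MastersSchedulable *bool `json:"mastersSchedulable,omitempty"`
}

// mastersSchedulable returns the value the scheduler asset would write: explicit intent
// wins, otherwise fall back to the zero-compute-replicas heuristic added in this PR.
func mastersSchedulable(ic *InstallConfig, computeReplicas int64) bool {
    if ic.MastersSchedulable != nil {
        return *ic.MastersSchedulable
    }
    return computeReplicas == 0
}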

+        logrus.Warningf("Making control-plane schedulable by setting MastersSchedulable to true for Scheduler cluster settings")
+        config.Spec.MastersSchedulable = true
+    }

     configData, err := yaml.Marshal(config)
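
For reference, the following standalone sketch (not installer code) builds the same Scheduler object that Generate produces above and prints the marshaled manifest, making the effect of the MastersSchedulable fallback visible; the hard-coded APIVersion string is an assumption for the config.openshift.io/v1 group:

package main

import (
    "fmt"

    "github.com/ghodss/yaml"
    configv1 "github.com/openshift/api/config/v1"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func main() {
    // Mirror the object built in Generate, with the zero-compute fallback applied.
    config := &configv1.Scheduler{
        TypeMeta: metav1.TypeMeta{
            APIVersion: "config.openshift.io/v1", // assumed group/version string
            Kind:       "Scheduler",
        },
        ObjectMeta: metav1.ObjectMeta{
            Name: "cluster",
            // not namespaced
        },
        Spec: configv1.SchedulerSpec{
            MastersSchedulable: true,
        },
    }
    configData, err := yaml.Marshal(config)
    if err != nil {
        panic(err)
    }
    fmt.Print(string(configData))
}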
8 changes: 0 additions & 8 deletions pkg/types/validation/installconfig.go
@@ -8,7 +8,6 @@ import (

     dockerref "github.com/containers/image/docker/reference"
     "github.com/pkg/errors"
-    "github.com/sirupsen/logrus"
     "k8s.io/apimachinery/pkg/util/validation/field"

     "github.com/openshift/installer/pkg/types"
@@ -183,7 +182,6 @@ func validateControlPlane(platform *types.Platform, pool *types.MachinePool, fld
 func validateCompute(platform *types.Platform, pools []types.MachinePool, fldPath *field.Path) field.ErrorList {
     allErrs := field.ErrorList{}
     poolNames := map[string]bool{}
-    foundPositiveReplicas := false
     for i, p := range pools {
         poolFldPath := fldPath.Index(i)
         if p.Name != "worker" {
@@ -193,14 +191,8 @@ func validateCompute(platform *types.Platform, pools []types.MachinePool, fldPat
             allErrs = append(allErrs, field.Duplicate(poolFldPath.Child("name"), p.Name))
         }
         poolNames[p.Name] = true
-        if p.Replicas != nil && *p.Replicas > 0 {
-            foundPositiveReplicas = true
-        }
         allErrs = append(allErrs, ValidateMachinePool(platform, &p, poolFldPath)...)
     }
-    if !foundPositiveReplicas {
-        logrus.Warnf("There are no compute nodes specified. The cluster will not fully initialize without compute nodes.")
Member
Can we really remove this warning? Does this PR allow the ingress scheduler to be scheduled on control plane nodes (e.g. see here and kubernetes/kubernetes#65618)? Are we running CI on zero-compute clusters?

Contributor Author
Yes, when you set schedulableMasters to true, the worker role gets added to the master nodes, as well.

nothing in OpenShift CI for this yet
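
If it helps, the sketch below is one hypothetical way to check that behavior on a live cluster (not part of this PR): list the control-plane nodes and report whether each also carries the worker role label. It assumes a reasonably recent client-go and a kubeconfig in the default location.

package main

import (
    "context"
    "fmt"
    "path/filepath"

    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/client-go/kubernetes"
    "k8s.io/client-go/tools/clientcmd"
    "k8s.io/client-go/util/homedir"
)

func main() {
    kubeconfig := filepath.Join(homedir.HomeDir(), ".kube", "config")
    cfg, err := clientcmd.BuildConfigFromFlags("", kubeconfig)
    if err != nil {
        panic(err)
    }
    client, err := kubernetes.NewForConfig(cfg)
    if err != nil {
        panic(err)
    }
    // Select only control-plane nodes, then check for the worker role label on each.
    nodes, err := client.CoreV1().Nodes().List(context.TODO(), metav1.ListOptions{
        LabelSelector: "node-role.kubernetes.io/master",
    })
    if err != nil {
        panic(err)
    }
    for _, n := range nodes.Items {
        _, hasWorkerRole := n.Labels["node-role.kubernetes.io/worker"]
        fmt.Printf("%s worker-role=%v\n", n.Name, hasWorkerRole)
    }
}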

Contributor
On cloud platforms, adding the worker role to masters will allow ingress controllers to be scheduled on them, but the master nodes remain excluded from load-balancer target pools, so the feature is broken in practice.

As to whether Kube will support including masters in LB target pools, the jury is still out and there is still no KEP AFAICT.

kubernetes/kubernetes#65618
https://groups.google.com/d/msg/kubernetes-sig-network/erULVdDkLNw/hvCukZ3CBQAJ

Member
@russelb, can you restore the warning?

Member
Also, with a single compute node, we won't trigger this schedulable-control-plane logic, so won't ingress hang with "I can't get replicas up to two"?

Contributor Author
The warning wouldn't be appropriate for a 3-node bare metal cluster. I guess it could be restored and only emitted if the platform type is not baremetal or none, because of the load balancer issue mentioned by @ironcladlou?

-    }
     return allErrs
 }
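
A rough sketch of that last suggestion, in case it moves the discussion along: restore the warning, but skip it when the platform is baremetal or none, where schedulable masters can realistically carry ingress. The helper name is made up, and it assumes types.Platform exposes BareMetal and None pointers the way other platform checks in this package do.

package validation

import (
    "github.com/sirupsen/logrus"

    "github.com/openshift/installer/pkg/types"
)

// warnIfNoComputeReplicas is a hypothetical helper, not part of this PR. It re-emits the
// removed warning, but only on platforms where control-plane nodes sit behind cloud
// load-balancer target pools and therefore cannot serve ingress in practice.
func warnIfNoComputeReplicas(platform *types.Platform, computeReplicas int64) {
    if computeReplicas > 0 {
        return
    }
    if platform.BareMetal != nil || platform.None != nil {
        // Schedulable masters are a workable topology here; stay quiet.
        return
    }
    logrus.Warnf("There are no compute nodes specified. The cluster will not fully initialize without compute nodes.")
}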
