Refactor controller initialization (part 2)#14293
Refactor controller initialization (part 2)#14293openshift-bot merged 1 commit intoopenshift:masterfrom
Conversation
|
@enj @sttts attempt to fix the serviceaccount controllers... basically I created "pre" group for them and start them before we run kubernetes controllers. To make it work, I added @deads2k if we go this path, I think we should move the loop where we start the controllers (initFn) into controllers.go as we duplicate the code between pre-initialization and initialization ;-) |
| // controllerRoleBindings is a slice of roles used for controllers | ||
| controllerRoleBindings = []rbac.ClusterRoleBinding{} | ||
|
|
||
| ReadWrite = []string{"get", "list", "watch", "create", "update", "patch", "delete", "deletecollection"} |
There was a problem hiding this comment.
@deads2k copied from kube, i guess this will make better experience for people writing the rules? ;-)
There was a problem hiding this comment.
Aren't that accessible constants in kube? If not, maybe they should be public.
There was a problem hiding this comment.
they are but i was going to avoid importing the upstream bootstrap package just to get them.
There was a problem hiding this comment.
We need the same for auditing. So we probably make them available in k8s.io/apiserver/pkg/endpoints/request
There was a problem hiding this comment.
@sttts i will go with TODO for now and once we have stable location we can import
There was a problem hiding this comment.
Also, we don't use the ReadWrite or Read variables in controller policy.
|
|
||
| // serviceaccount-tokens-controller | ||
| addControllerRole(rbac.ClusterRole{ | ||
| ObjectMeta: metav1.ObjectMeta{Name: saRolePrefix + InfraServiceAccountTokensControllerServiceAccountName}, |
There was a problem hiding this comment.
@deads2k wonder if we want/need this... this controller will use the RootClientBuilder which means privileged client will be used.
pkg/cmd/server/origin/controller.go
Outdated
| "github.com/openshift/origin/pkg/cmd/server/origin/controller" | ||
| ) | ||
|
|
||
| func (c *MasterConfig) NewOpenShiftControllerPreInitializers() (map[string]controller.InitFunc, error) { |
There was a problem hiding this comment.
maybe deserve a better name?
There was a problem hiding this comment.
only the SA token controller needs to run first, right?
There was a problem hiding this comment.
yeah, not sure about the serviceaccount controller (and pullsecrets/etc.) they ran first in old world.
| options.ServiceAccounts = append(options.ServiceAccounts, sa) | ||
| } | ||
|
|
||
| //REBASE: add new args to NewServiceAccountsController |
There was a problem hiding this comment.
Not sure. What does a blame say? Last rebase or ages ago?
There was a problem hiding this comment.
@ncdc addded this in Nov/2016... if this refers to args options then maybe the comment is still valid.
There was a problem hiding this comment.
That was a comment that I was reacting to changes made between kube 1.4 and 1.5. It can be deleted.
|
|
||
| func (c *ServiceAccountTokensControllerOptions) RunController(ctx ControllerContext) (bool, error) { | ||
| if len(c.ServiceAccountConfig.PrivateKeyFile) == 0 { | ||
| return true, fmt.Errorf("skipped starting Service Account Token Manager, no private key specified") |
There was a problem hiding this comment.
@deads2k want confirm that returning true and error here is ok.
There was a problem hiding this comment.
A godoc wouldn't harm. It's not obvious what the bool means.
| oc.RunServiceAccountsController() | ||
| oc.RunServiceAccountTokensController(controllerManagerOptions) | ||
| // used by admission controllers | ||
| oc.RunServiceAccountPullSecretsControllers() |
| denyReason = err.Error() | ||
| } | ||
|
|
||
| glog.V(4).Infof("RBAC DENY: user %q groups %q cannot %v", attributes.GetUser().GetName(), attributes.GetUser().GetGroups(), denyReason) |
There was a problem hiding this comment.
ignore this pathetic attempt to imitate kube messages ;-)
|
|
||
| func (c *ServiceAccountControllerOptions) RunController(ctx ControllerContext) (bool, error) { | ||
| if len(c.ManagedNames) == 0 { | ||
| return true, fmt.Errorf("Skipped starting Service Account Manager, no managed names specified") |
pkg/cmd/server/start/start_master.go
Outdated
| utilwait "k8s.io/apimachinery/pkg/util/wait" | ||
| restclient "k8s.io/client-go/rest" | ||
| kctrlmgr "k8s.io/kubernetes/cmd/kube-controller-manager/app" | ||
| kubecontroller "k8s.io/kubernetes/cmd/kube-controller-manager/app" |
05512c3 to
c8ea7e1
Compare
| addControllerRole(rbac.ClusterRole{ | ||
| ObjectMeta: metav1.ObjectMeta{Name: saRolePrefix + InfraServiceAccountControllerServiceAccountName}, | ||
| Rules: []rbac.PolicyRule{ | ||
| rbac.NewRule(ReadWrite...).Groups(kapiGroup).Resources("serviceaccounts").RuleOrDie(), |
There was a problem hiding this comment.
We don't use these in the controller_policy upstream. Mutation is rarely uniform in the controllers and the powers are subtly different. Update can require knowledge of resourceversion, patch doesn't. Delete requires knowledge of names, deletecollection does not.
There was a problem hiding this comment.
@deads2k i knew you will not like it... i realized that those rules are too broad for controllers where we know what controllers can/cannot do.
pkg/cmd/server/origin/controller.go
Outdated
| serviceAccount := controller.ServiceAccountControllerOptions{ | ||
| ManagedNames: c.Options.ServiceAccountConfig.ManagedNames, | ||
| } | ||
| ret["serviceaccount"] = serviceAccount.RunController |
There was a problem hiding this comment.
this one ought not require pre-init
pkg/cmd/server/origin/controller.go
Outdated
| ret["serviceaccount"] = serviceAccount.RunController | ||
|
|
||
| serviceAccountPullSecrets := controller.ServiceAccountPullSecretsControllerOptions{} | ||
| ret["serviceaccount-pull-secrets"] = serviceAccountPullSecrets.RunController |
There was a problem hiding this comment.
this one doesn't need to be pre-init
|
|
||
| // RootClient should only be used for pre-initialization controllers | ||
| // (iow. serviceaccount-tokens controller) | ||
| RootClientBuilder *controller.SimpleControllerClientBuilder |
There was a problem hiding this comment.
I would try to wire this down separately into the init function when you create the struct the init function lives on. It's not a preferred means of creating the init function (makes it harder to read), but we really don't want this escaping
There was a problem hiding this comment.
i can wire this from MasterConfig I believe and pass it in the struct.
| projectcontroller "github.com/openshift/origin/pkg/project/controller" | ||
| ) | ||
|
|
||
| type OriginNamespaceControllerOptions struct{} |
There was a problem hiding this comment.
if we have nothing in here, don't have the struct. Just a bare function.
|
|
||
| func (c *ServiceAccountControllerOptions) RunController(ctx ControllerContext) (bool, error) { | ||
| if len(c.ManagedNames) == 0 { | ||
| return true, fmt.Errorf("skipped starting Service Account Manager, no managed names specified") |
There was a problem hiding this comment.
return false, nil. returning an error will fatal the process.
There was a problem hiding this comment.
so use glog?
You can if you like. The caller actually drops a message in the log saying its skipping it.
| ctx.DeprecatedOpenshiftInformers.KubernetesInformers().Core().V1().ServiceAccounts(), | ||
| ctx.DeprecatedOpenshiftInformers.KubernetesInformers().Core().V1().Namespaces(), | ||
| ctx.ClientBuilder.ClientOrDie(bootstrappolicy.InfraServiceAccountControllerServiceAccountName), | ||
| options).Run(1, ctx.Stop) |
There was a problem hiding this comment.
Running only one worker is a little weird.
| ) | ||
|
|
||
| type ServiceAccountControllerOptions struct { | ||
| ManagedNames []string |
There was a problem hiding this comment.
I like this. Much cleaner than what we had before.
|
|
||
| func (c *ServiceAccountTokensControllerOptions) RunController(ctx ControllerContext) (bool, error) { | ||
| if len(c.ServiceAccountConfig.PrivateKeyFile) == 0 { | ||
| return true, fmt.Errorf("skipped starting Service Account Token Manager, no private key specified") |
There was a problem hiding this comment.
return false, nil. The bool indicates that this controller was enabled, the error means we couldn't start the controller when we were supposed to. So returning false without an error means "skipped" and return an error means, "couldn't fullfill intent, crashloop."
There was a problem hiding this comment.
I see this as outstanding.
|
|
||
| privateKey, err := serviceaccount.ReadPrivateKey(c.ServiceAccountConfig.PrivateKeyFile) | ||
| if err != nil { | ||
| return true, fmt.Errorf("error reading signing key for Service Account Token Manager: %v", err) |
|
|
||
| type ServiceAccountTokensControllerOptions struct { | ||
| ServiceAccountConfig configapi.ServiceAccountConfig | ||
| ServiceServingCertConfig configapi.ServiceServingCert |
There was a problem hiding this comment.
Only the CA file contents please. []byte
There was a problem hiding this comment.
Only the CA file contents please. []byte
Reasoning: This should be a runtime optimized config, so content instead of file reference (which is easier for a user) and this controller only needs to know the non-secret information. If we ever separate out these controllers, it doesn't need the full config.
There was a problem hiding this comment.
I see this as still outstanding.
| return true, nil | ||
| } | ||
|
|
||
| type ServiceAccountPullSecretsControllerOptions struct { |
pkg/cmd/server/start/start_master.go
Outdated
| allowedOpenshiftControllers := sets.NewString( | ||
| // The serviceaccount-token controller has to be started before any other controller | ||
| // runs in order to be able to issue tokens. | ||
| "serviceaccount-tokens", |
There was a problem hiding this comment.
I think we can more or less hard code this one first, then start it directly. The others simply flow from it.
pkg/cmd/server/start/start_master.go
Outdated
| } | ||
| controllerInitializers := kctrlmgr.NewControllerInitializers() | ||
|
|
||
| oc.RunSecurityAllocationController() |
|
@mfojtik want to clean up what you have so far and merge it rather than growing bigger? |
|
@deads2k just cleaning up and addressing comments... yeah I would prefer to add this in iteration to prevent rebasing hell. |
|
@deads2k this was interrupted by #14321 (merge conflict) @pweil- @jupierce @smarterclayton i would really like to merge this otherwise this will be a rebase hell once the dcut opens. This is improving security of the controllers (well previously there was no security...). Also I would like QA to spend time on running on this to capture bugs associated with new controller privileges. |
|
flake is: #14208 [test] |
|
flake : #14353 |
|
[test] |
|
@smarterclayton seems like this is not broken after I rebased against the shared informers stuff... the serviceaccount-token controller is not creating tokens (seems like the informer is stuck or not started). |
|
Once the two controllers are started, we need to start the subset of Kube informers that control secrets and service accounts. See the old code, there's a massive comment right after the controller start |
@smarterclayton I saw that comment and I made sure we do start the kube informers right after the controller (serviceaccount-token) is started. That controller indeed starts and origin runs and I even see some tokens creates but there are no SA/secrets for new projects... |
| func RunServiceAccountPullSecretsController(ctx ControllerContext) (bool, error) { | ||
| kc := ctx.ClientBuilder.KubeInternalClientOrDie(bootstrappolicy.InfraServiceAccountPullSecretsControllerServiceAccountName) | ||
|
|
||
| serviceaccountcontrollers.NewDockercfgDeletedController( |
There was a problem hiding this comment.
@smarterclayton hrm... this has to be a goroutine now.... a cost I pay for rebasing this without coffe.
|
Evaluated for origin testextended up to f0d6e70 |
|
flake: #14122 [test] |
|
continuous-integration/openshift-jenkins/testextended FAILURE (https://ci.openshift.redhat.com/jenkins/job/test_pull_request_origin_extended/528/) (Base Commit: 375c727) (Extended Tests: core(builds)) |
|
[test] again |
|
dockerhub flake (pulling centos...) [test] |
|
Evaluated for origin test up to f0d6e70 |
|
continuous-integration/openshift-jenkins/test SUCCESS (https://ci.openshift.redhat.com/jenkins/job/test_pull_request_origin/1889/) (Base Commit: 1bf8a17) |
|
[merge] (lets see if the merge is open now ;-) |
|
re- [merge] |
|
Evaluated for origin merge up to f0d6e70 |
|
continuous-integration/openshift-jenkins/merge SUCCESS (https://ci.openshift.redhat.com/jenkins/job/merge_pull_request_origin/892/) (Base Commit: 90592c5) (Image: devenv-rhel7_6300) |
This is a follow up what started in #13996 and is tracked here: https://trello.com/c/ZtOfFpFz/907-5-use-kubernetes-controller-roles
In a nutshell: