
Register SDN informers synchronously #15354

Merged
liggitt merged 2 commits into openshift:release-3.6 from liggitt:sdn-register-order-3.6
Jul 21, 2017

Conversation

@liggitt
Contributor

@liggitt liggitt commented Jul 19, 2017

Pick of #15353 and a more contained version of #15364

The SDN controller was registering shared informer event handlers in a goroutine, so registration raced with informer start. If registration lost the race, the SDN event handlers would never receive namespace events.

@openshift-bot
Contributor

openshift-bot commented Jul 19, 2017

continuous-integration/openshift-jenkins/merge Waiting: You are in the build queue at position: 14

@openshift-bot
Contributor

Evaluated for origin merge up to 07c92dd

@openshift-bot
Contributor

[Test]ing while waiting on the merge queue

@liggitt
Contributor Author

liggitt commented Jul 20, 2017

continuous-integration/openshift-jenkins/test FAILURE (https://ci.openshift.redhat.com/jenkins/job/test_pull_request_origin/3303/) (Base Commit: 12575c5) (PR Branch Commit: 07c92dd)

[test]

@liggitt
Contributor Author

liggitt commented Jul 20, 2017

I'm seeing a consistent failure in the ansible install at the "Reconcile Cluster Roles and Cluster Role Bindings and Security Context Constraints." step:

Output to stderr:
The connection to the server ip-172-18-10-18.ec2.internal:8443 was refused - did you specify the right host or port?
(the same message repeats many more times)
Error from server (Forbidden): User "system:admin" cannot get clusterroles at the cluster scope

@liggitt
Contributor Author

liggitt commented Jul 20, 2017

The all-in-one master is dying with: failed to start SDN plugin controller: User "system:serviceaccount:openshift-infra:sdn-controller" cannot get clusternetworks at the cluster scope

@liggitt
Contributor Author

liggitt commented Jul 20, 2017

Looks like all the controller roles moved out of the block that auto-reconciles them on server start. That means the ansible reconciliation races the controllers, which kill the process if they hit permission errors for long enough.

@openshift openshift deleted a comment from deads2k Jul 20, 2017
@liggitt liggitt added this to the 3.6.0 milestone Jul 20, 2017
@openshift-bot
Contributor

Evaluated for origin test up to 04fdc79

@openshift-bot
Contributor

continuous-integration/openshift-jenkins/test SUCCESS (https://ci.openshift.redhat.com/jenkins/job/test_pull_request_origin/3338/) (Base Commit: 4c2392b) (PR Branch Commit: 04fdc79)

@liggitt
Contributor Author

liggitt commented Jul 20, 2017

@deads2k PTAL at the second commit for 3.6

@deads2k
Contributor

deads2k commented Jul 20, 2017

lgtm

@liggitt
Contributor Author

liggitt commented Jul 20, 2017

[merge]

@liggitt
Contributor Author

liggitt commented Jul 21, 2017

Tests are green on HEAD^, and this contains the fix for a flake other merge jobs are hitting, so merging.

@liggitt liggitt merged commit 787f4e2 into openshift:release-3.6 Jul 21, 2017