Skip to content

Leader remove runlocal#7

Merged
jmrodri merged 12 commits intooperator-framework:mainfrom
jmrodri:leader-remove-runlocal
Jul 24, 2020
Merged

Leader remove runlocal#7
jmrodri merged 12 commits intooperator-framework:mainfrom
jmrodri:leader-remove-runlocal

Conversation

@jmrodri
Copy link
Copy Markdown
Member

@jmrodri jmrodri commented Jul 22, 2020

No description provided.

@jmrodri jmrodri changed the title Leader remove runlocal WIP: Leader remove runlocal Jul 22, 2020
@openshift-ci-robot openshift-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jul 22, 2020
@coveralls
Copy link
Copy Markdown

coveralls commented Jul 22, 2020

Pull Request Test Coverage Report for Build 181321825

  • 23 of 36 (63.89%) changed or added relevant lines in 1 file are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage decreased (-25.9%) to 74.074%

Changes Missing Coverage Covered Lines Changed/Added Lines %
leader/leader.go 23 36 63.89%
Totals Coverage Status
Change from base Build 180379840: -25.9%
Covered Lines: 220
Relevant Lines: 297

💛 - Coveralls

Comment thread leader/leader.go Outdated
Comment thread leader/leader.go Outdated
@jmrodri jmrodri force-pushed the leader-remove-runlocal branch from 1b6db72 to 9eabf9d Compare July 23, 2020 20:30
@jmrodri jmrodri changed the title WIP: Leader remove runlocal Leader remove runlocal Jul 23, 2020
@openshift-ci-robot openshift-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jul 23, 2020
@jmrodri jmrodri force-pushed the leader-remove-runlocal branch from 9eabf9d to affe8aa Compare July 23, 2020 21:33
Comment thread leader/leader.go Outdated
Comment thread leader/leader_test.go Outdated
Copy link
Copy Markdown
Member

@joelanford joelanford left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more review round (sorry!)

And another thing that just popped into my head. It occurs to me that controller-runtime has some pre- and post- leader-election runnables as part of the manager. I wonder if it would be possible to shim this leader election implementation into controller-runtime. This point is not a PR a blocker, but it would be nice to handle in a follow-up if it is possible. I'll investigate and report back.

Comment thread leader/leader_test.go Outdated
Comment on lines +15 to +16
var _ = Describe("Leader", func() {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this Describe layer is unnecessary. Every test in this suite (i.e. package) is a leader test.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I remove it, it doesn't seem to work. I renamed it to Leader election to better describe it. I would say, it works so I'd rather not change this.

Comment thread leader/leader.go Outdated
// attempts to become the leader.
const maxBackoffInterval = time.Second * 16

type Option func(*config) error
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah crap. I totally messed this up. Now that I'm seeing this, I feel like it's a little user-hostile. If we don't export Config, users can't provide their own Option implementations.

Given that we only have one field to set, it might not seem important now, but if we add more fields in the future, we want to give users the flexibility to write their own Options.

Opening this up later would be a breaking change to the Option definition.

So given all of this, I think we want:

type Option func(*Config) error

type Config struct { ... }

// now that Config is exported, setDefaults would need 
// to be unexported since this only needs to be called
// by us after all options have run.
func (c *Config) setDefaults() error { ... }

Sorry for the back and forth!

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reverted the commit that unexported Config. I then added a new commit to unexport just setDefaults

@joelanford
Copy link
Copy Markdown
Member

Following up about controller-runtime manager integration.

It doesn't seem immediately do-able given controller-runtime's direct use of resourcelock.Interface, which seems to have a hardcoded expectation of using leases.

Here's where the manager starts the pre-leader-election runnables vs the post-leader-election runnables: https://github.com/kubernetes-sigs/controller-runtime/blob/ab02953d0bcb6467333fcc5809eb1df83a8d89c4/pkg/manager/internal.go#L465-L478.

The only way I can see this working would be for controller-runtime to give callers a hook into startLeaderElection to override the default. Perhaps it could abstract the leader election implementation to something that allows a LeaderCallbacks to be injected and implements Run(context.Context). So perhaps:

type LeaderElector interface {
    Run(context.Context)
}

And then a new manager Option field:

type Option struct {
	// NewLeaderElector overrides the default lease-based leader elector
	// and allows callers to use their own leader election implementation.
	NewLeaderElector func(leaderelection.LeaderCallbacks) LeaderElector
}

This would definitely require a controller-runtime proposal and thorough discussion, so not anything that could happen anytime really soon.

Comment thread leader/leader.go
// leader. Upon termination of that pod, the garbage collector will delete the
// ConfigMap, enabling a different pod to become the leader.
func Become(ctx context.Context, lockName string) error {
func Become(ctx context.Context, lockName string, opts ...Option) error {
Copy link
Copy Markdown
Contributor

@camilamacedo86 camilamacedo86 Jul 24, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We replaced the sdk impl for the controller-runtime one. So, I do not think that we need to keep it at all.

@joelanford why we will keep our implementation since all opers now are using the controller-runtime one? If yes, i need to revert the change made in operator-framework/operator-sdk#3514.

c/c @jmrodri

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@camilamacedo86 my understanding was that this is a leader for life and we kept it for someone that wanted to have leader for life. It will not be used in the SDK but could be used by an operator developer.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A lot of existing operators are using the leader-for-life implementation. They'll either need a way to continue using that, or will need an easy path to migrating to the lease-based election.

These days the lease-based election is easy to use and robust, which hasn't always been the case. But even still, each pattern has advantages and disadvantages.

Comment thread leader/leader_test.go

err := Become(context.TODO(), "leader-test", WithClient(client))
Expect(err).Should(BeNil())
})
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a test case for returning ErrNoNamespace from Become and using errors.Is(err, ErrNoNamespace)?

@jmrodri jmrodri force-pushed the leader-remove-runlocal branch from 769d3ca to dac301c Compare July 24, 2020 14:40
Copy link
Copy Markdown
Member

@joelanford joelanford left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Jul 24, 2020
@openshift-ci-robot
Copy link
Copy Markdown

New changes are detected. LGTM label has been removed.

@openshift-ci-robot openshift-ci-robot removed the lgtm Indicates that a PR is ready to be merged. label Jul 24, 2020
@jmrodri jmrodri merged commit dca3fec into operator-framework:main Jul 24, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants