Skip to content

Cleanup Bus/ClusterBus reconciler#420

Merged
knative-prow-robot merged 6 commits into
knative:masterfrom
scothis:bus-informers
Sep 11, 2018
Merged

Cleanup Bus/ClusterBus reconciler#420
knative-prow-robot merged 6 commits into
knative:masterfrom
scothis:bus-informers

Conversation

@scothis
Copy link
Copy Markdown
Contributor

@scothis scothis commented Sep 7, 2018

Proposed Changes

  • only start the needed informer for either Bus or ClusterBus, we don't
    need both
  • consolidate the createOrUpdateBus and createOrUpdateClusterBus methods
  • use BusReference to check if the updated bus is the reconciled bus

Refs #418

Release Note

NONE

/assign @greghaynes

- only start the needed informer for either Bus or ClusterBus, we don't
  need both
- consolidate the createOrUpdateBus and createOrUpdateClusterBus methods
- use BusReference to check if the updated bus is the reconciled bus

Refs knative#418
@knative-prow-robot knative-prow-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Sep 7, 2018
Copy link
Copy Markdown
Contributor

@neosab neosab left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since Reconciler struct now has a bus reference, we should remove the busRef parameter in itsRun function L293

@scothis
Copy link
Copy Markdown
Contributor Author

scothis commented Sep 7, 2018

@neosab good catch, updated

@neosab
Copy link
Copy Markdown
Contributor

neosab commented Sep 7, 2018

/lgtm

@knative-prow-robot
Copy link
Copy Markdown
Contributor

@neosab: changing LGTM is restricted to assignees, and only knative/eventing repo collaborators may be assigned issues.

Details

In response to this:

/lgtm

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Comment thread pkg/buses/reconciler.go
r.bus = clusterBus.DeepCopy()
if !equality.Semantic.DeepEqual(r.bus.GetSpec(), previousBus.GetSpec()) {
// stash the new bus on the reconciler while retaining the old bus
bus, r.bus = r.bus, bus
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

r.bus is no longer a DeepCopy of the bus we get back from the busesLister, but during create we make a deepcopy in this same scenario (https://github.com/knative/eventing/pull/420/files#diff-d2cf608fb7c41df730d64ac95f391e7aR313). I don't see why the DeepCopy is needed (am I missing something?) but I imagine these two codepaths should operate the same way.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The bus resource on the reconciler should never be mutated. The general pattern is to create a copy before mutating a resource. The DeepCopy here was an extra layer of protected. With the switch to use GenericBus there is no longer a consistent implementation of DeepCopy to call.

I tried to add DeepCopy() GenericBus to the GenericBus interface, but couldn't get the type checker to accept the generated DeepCopy methods on Bus and ClusterBus since they don't return GenericBus but a struct that implements GenericBus.

To make the behavior more consistent we could call DeepCopy() before passing the bus into createOrUpdateBus(), or drop the other (hopefully) redundant DeepCopy calls so that's is obvious a copy needs to be created before mutation. Thoughts?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, makes sense genericbus would cause issues there. Either way seems reasonable to me, as long as we end up with consistency.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now calling .DeepCopy() before passing the bus to createOrUpdateBus().

Copy link
Copy Markdown
Contributor

@neosab neosab left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I just meant to say this PR looks good to me 👍

@greghaynes
Copy link
Copy Markdown
Contributor

Minor question but lgtm

@scothis
Copy link
Copy Markdown
Contributor Author

scothis commented Sep 10, 2018

/check-cla
/assign @grantr

@greghaynes
Copy link
Copy Markdown
Contributor

/lgtm

@knative-prow-robot knative-prow-robot added the lgtm Indicates that a PR is ready to be merged. label Sep 10, 2018
Copy link
Copy Markdown
Contributor

@grantr grantr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good @scothis! I'd like some clarification about the bus stashing since at first glance it seems like it might not be thread-safe (but I haven't looked at it too carefully).

Comment thread pkg/buses/reconciler.go
Comment thread pkg/buses/reconciler.go
r.bus = clusterBus.DeepCopy()
if !equality.Semantic.DeepEqual(r.bus.GetSpec(), previousBus.GetSpec()) {
// stash the new bus on the reconciler while retaining the old bus
bus, r.bus = r.bus, bus
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this stash needed? How does it work when there are multiple worker threads?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The workqueue guarantees that the same resource will never be emitted concurrently, so that should handle any race concerns.

We capture the bus resource for a few reasons:

  • in order to resolve parameters for Channels and Subscriptions
  • to know if a Channel is backed by the particular bus instance
  • to record Events against

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same resource will not be emitted concurrently, but IIUC there is still the case of threadiness different resources being reconciled simultaneously.

  1. resource A enters worker thread A
  2. resource B enters worker thread B
  3. resource A and B race to set r.bus
  4. What is the value of r.bus?

Is this scenario possible?

Serving and the Feed controller have context.Context objects plumbed through the entire Reconcile loop now, and it's reasonable to expect this will be a general pattern in Knative controllers. We may be able to eliminate the race (if there is one) by stashing the bus resource in a context.Context. That also makes the single-call scope of the bus stash clear. I'm intentionally ignoring any opinions on the design of context.Context for now. 😁

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A reconciler is for a single bus resource. Resource A and B will never race to set r.bus because only A or B can be for the the bus that is being reconciled (otherwise there's a major bug in workqueue). When r.bus is updated, the kind, apiVersion, name and namespace of the previous and future value are identical. We set the value to pickup spec mutations.

#422 adds all provisioned Channels and/or Subscriptions (as needed) to the workqueue when the Bus params change. This will cause them to be reprocessed.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@scothis and I talked about this offline. Restating the above a bit: Reconciler will have all Bus update events in its workqueue, but discards update events for all Buses except one. That happens at https://github.com/knative/eventing/pull/420/files#diff-d2cf608fb7c41df730d64ac95f391e7aR542.

The Reconciler's workqueue has multiple object types in it. This is why the Reconciler needs threadiness even though it only ever reconciles a single Bus. This is not conventional. I don't know enough about the implementation to say whether the convention should apply, but I'd to see a comment in the code that acknowledges and briefly explains this break from convention.

So tl;dr my understanding is that this isn't a race condition even though it looks like one. I'd like to see a comment in the code acknowledging that as well.

@knative-prow-robot knative-prow-robot removed the lgtm Indicates that a PR is ready to be merged. label Sep 11, 2018
Copy link
Copy Markdown
Contributor

@grantr grantr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/approve
/hold

Ok, we're on the same page now. Thanks for explaining your reasoning @scothis.

These comments describe the contents of workqueue keys incorrectly:
https://github.com/knative/eventing/pull/420/files#diff-d2cf608fb7c41df730d64ac95f391e7aR381
https://github.com/knative/eventing/pull/420/files#diff-d2cf608fb7c41df730d64ac95f391e7aR393
https://github.com/knative/eventing/pull/420/files#diff-d2cf608fb7c41df730d64ac95f391e7aR417

These are issues that are only related to this PR in the sense that I was confused by them while reviewing, so if you want to address them in a separate PR that's fine (you can cancel the hold in that case). Or if you want to address them now I'll lgtm again.

@knative-prow-robot knative-prow-robot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. lgtm Indicates that a PR is ready to be merged. labels Sep 11, 2018
@knative-prow-robot
Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: grantr, scothis

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@knative-prow-robot knative-prow-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 11, 2018
@knative-prow-robot knative-prow-robot removed the lgtm Indicates that a PR is ready to be merged. label Sep 11, 2018
Copy link
Copy Markdown
Contributor

@grantr grantr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@knative-prow-robot knative-prow-robot added the lgtm Indicates that a PR is ready to be merged. label Sep 11, 2018
@grantr
Copy link
Copy Markdown
Contributor

grantr commented Sep 11, 2018

/hold cancel

@knative-prow-robot knative-prow-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 11, 2018
@knative-prow-robot knative-prow-robot merged commit 94e79eb into knative:master Sep 11, 2018
@scothis scothis deleted the bus-informers branch September 11, 2018 23:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants