[To be split] Make broker ingress/filter be able to use knative service #2425

Closed

yolocs wants to merge 11 commits into knative:master from yolocs:f/ksvc

Conversation

@yolocs (Contributor) commented Jan 22, 2020

Fixes #2301

The behavior would be: set the env var BROKER_RESOURCE_FLAVOR=knative to create the broker resources (ingress/filter) as Knative Services. If Serving is not installed, the controller will fail to start. Without this env var, the controller will keep using a plain Service + Deployment for ingress/filter.
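
As a minimal sketch of that toggle (only the env var name and the two flavors come from this description; the interface and type names below are assumptions, not the PR's actual code):

```go
package main

import (
	"fmt"
	"os"
)

// serviceReconciler abstracts "something that materializes the broker
// ingress/filter as a routable workload" (hypothetical interface).
type serviceReconciler interface {
	Flavor() string
}

type kubeReconciler struct{}    // plain Service + Deployment
type servingReconciler struct{} // Knative Service (ksvc)

func (kubeReconciler) Flavor() string    { return "kube" }
func (servingReconciler) Flavor() string { return "knative" }

// newServiceReconciler picks an implementation based on the env var named
// in the PR description: BROKER_RESOURCE_FLAVOR=knative selects ksvc,
// anything else falls back to Service + Deployment.
func newServiceReconciler() serviceReconciler {
	if os.Getenv("BROKER_RESOURCE_FLAVOR") == "knative" {
		return servingReconciler{}
	}
	return kubeReconciler{}
}

func main() {
	fmt.Println("broker resource flavor:", newServiceReconciler().Flavor())
}
```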

Most files are vendored serving code. Files in eventing that are meaningfully changed:

  • ./pkg/reconciler/service/*
  • ./pkg/reconciler/broker/*
  • ./pkg/reconciler/trigger/*

Proposed Changes

  • Added service reconcilers: one uses a k8s Service + Deployment, the other a Knative Service (ksvc)
  • Used an “emptyable” informer to prevent controller failure when the Serving API doesn’t exist (see the sketch after this list)
  • Updated tests
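
For reference, a rough sketch of the availability check such an “emptyable” informer could sit behind, assuming client-go's discovery API (the informer wrapper in the PR itself is more involved than this):

```go
package main

import (
	"fmt"

	"k8s.io/client-go/discovery"
	"k8s.io/client-go/rest"
)

// servingInstalled reports whether the Knative Serving API is available by
// asking the discovery endpoint for the serving.knative.dev group/version.
// When it is absent, the controller would wire up an informer that serves
// empty results instead of crashing at startup.
func servingInstalled(dc discovery.DiscoveryInterface) bool {
	_, err := dc.ServerResourcesForGroupVersion("serving.knative.dev/v1")
	return err == nil
}

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	dc := discovery.NewDiscoveryClientForConfigOrDie(cfg)
	fmt.Println("serving installed:", servingInstalled(dc))
}
```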

Release Note


@knative-prow-robot knative-prow-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jan 22, 2020
@googlebot googlebot added the cla: yes Indicates the PR's author has signed the CLA. label Jan 22, 2020
@knative-prow-robot knative-prow-robot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Jan 22, 2020
@knative-prow-robot (Contributor)

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: yolocs
To complete the pull request process, please assign grantr
You can assign the PR to them by writing /assign @grantr in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Comment thread cmd/broker/filter/main.go Outdated
Comment thread cmd/broker/ingress/main.go Outdated
@yolocs (Contributor, Author) commented Jan 22, 2020

/test pull-knative-eventing-unit-tests

@yolocs (Contributor, Author) commented Jan 22, 2020

/test pull-knative-eventing-integration-tests

@pmorie (Contributor) commented Jan 22, 2020

How do you suggest that we coordinate serving and eventing releases if this were to be merged?

@yolocs (Contributor, Author) commented Jan 22, 2020

> How do you suggest that we coordinate serving and eventing releases if this were to be merged?

Not sure what caveats there are. I'd imagine that the eventing release branch will depend on the same serving release. Thoughts?

@pmorie (Contributor) commented Jan 23, 2020

> Thoughts?

I don't think there's a good way to do it that doesn't complicate the release process and life for users, which is why I have been against adopting a dependency on serving in the past. Is there a way to accomplish this without taking the dependency?

@lionelvillard (Contributor)

The dependency is on the serving APIs. So there is no need to align the release. Eventing can always use the previous serving release.

@mattmoor (Member)

Given that we are already betting that folks are going to plug in alternate Brokers and Channels, let's lean into that. What Eventing needs upstream is a reference implementation, which validates the semantics for the purposes of establishing conformance. We should separate that broker implementation like we do the in-memory-channel, so that folks can opt out.

Honestly, having just done this for Ingress, nothing keeps you more honest about conformance than having to write the second thing; it really builds empathy for folks trying to integrate :)

I really dislike the idea of adding a moving part that toggles between serving/other based on what's installed (same goes for KEDA). If Serving is the right tool for a particular implementation of the abstraction, then that implementation should bet on it, but it should probably live out-of-tree.

To clarify: I don't think that all we should provide out-of-the-box is a toy, but maybe that's all we have in Eventing proper?

@yolocs (Contributor, Author) commented Jan 23, 2020 via email

@yolocs (Contributor, Author) commented Jan 23, 2020

@mattmoor

> Given that we are already betting that folks are going to plug in alternate Brokers and Channels, let's lean into that. What Eventing needs upstream is a reference implementation, which validates the semantics for the purposes of establishing conformance. We should separate that broker implementation like we do the in-memory-channel, so that folks can opt out.

For the time being, as we're defining the broker conformance, we also need to work on a reference broker implementation that's production ready. It might not eventually belong in the eventing core repo (like you did for KIngress). IMO at the moment we can stage the work in the eventing core repo and later (once the broker conformance matures) we can move it out.

@nachocano (Contributor)

> @mattmoor
>
> > Given that we are already betting that folks are going to plug in alternate Brokers and Channels, let's lean into that. What Eventing needs upstream is a reference implementation, which validates the semantics for the purposes of establishing conformance. We should separate that broker implementation like we do the in-memory-channel, so that folks can opt out.
>
> For the time being, as we're defining the broker conformance, we also need to work on a reference broker implementation that's production ready. It might not eventually belong in the eventing core repo (like you did for KIngress). IMO at the moment we can stage the work in the eventing core repo and later (once the broker conformance matures) we can move it out.

+1 to @yolocs's comment.

I think we shouldn't be blocked on the Broker conformance conversations. From past experience, those conversations take quite some time, and we might not even reach a consensus in 0.13, although I hope we do.
We know that our current broker is not scalable, so if we can provide users a more scalable version I think we should go for it. It might (and probably will) change in the near future, but that's alright; these big things always require a few iterations to get right.

Given that folks have different opinions about the serving dependency, I think this is an excellent time to leverage all the great performance work that folks in the community (e.g., @slinkydeveloper, @chizhg, @grantr, etc.) have been doing. We should base our decision on data. If you can show that this change improves throughput by some factor x, that events are not dropped due to the broker ingress pod being restarted when it is bombarded, etc., then I think it will be a no-brainer...
Users that care about performance can easily opt in. And once we start having more users, we will get their feedback on what they actually need, instead of us guessing...

@slinkydeveloper (Contributor) commented Jan 24, 2020

@nachocano I agree with the pragmatic vision of watching the data, but I'm skeptical that scaling up the broker gives a significant boost in performance. In my opinion we should work on several steps first:

  1. Make channels fast (and that's something we are progressing in cloudevents sdk-go first)
  2. Make channels scalable
  3. Make brokers scale in conjunction with channels

If we don't work on the first problem, then you're scaling something inefficient and the result is just more traffic.
If we don't work on the second problem, you scale the broker to accept N rps, but then your channel implementation can't scale, so you'll just cause the channel dispatcher to crash continuously.
And as long as we keep this architecture of brokers depending on channels, the scaling will never be linear, because we have resource contention, where N brokers access N channel dispatchers, scaled just as black boxes.

Anyway, I would be really glad to help test the proposed solutions to check how they perform against the current code.

@lionelvillard (Contributor)

@mattmoor

> I don't think that all we should provide out-of-the-box is a toy, but maybe that's all we have in Eventing proper?

Are you suggesting Eventing Core should only contain a reference implementation (aka a toy)? Are you suggesting an optimized and scalable broker implementation should live somewhere else (like eventing-contrib)?

@yolocs (Contributor, Author) commented Jan 27, 2020

> @nachocano I agree with the pragmatic vision of watching the data, but I'm skeptical that scaling up the broker gives a significant boost in performance. In my opinion we should work on several steps first:
>
>   1. Make channels fast (and that's something we are progressing in cloudevents sdk-go first)
>   2. Make channels scalable
>   3. Make brokers scale in conjunction with channels
>
> If we don't work on the first problem, then you're scaling something inefficient and the result is just more traffic.
> If we don't work on the second problem, you scale the broker to accept N rps, but then your channel implementation can't scale, so you'll just cause the channel dispatcher to crash continuously.
> And as long as we keep this architecture of brokers depending on channels, the scaling will never be linear, because we have resource contention, where N brokers access N channel dispatchers, scaled just as black boxes.
>
> Anyway, I would be really glad to help test the proposed solutions to check how they perform against the current code.

There is no denying we also need to scale channels, and it's exciting to see you're actively working on that (starting with cloudevents sdk-go). But the work to scale the current broker implementation doesn't conflict with scaling channels. @grac3gao is also working on performance tests to see how much we can gain from ksvc.

@yolocs (Contributor, Author) commented Jan 27, 2020

> Are you suggesting Eventing Core should only contain a reference implementation (aka a toy)? Are you suggesting an optimized and scalable broker implementation should live somewhere else (like eventing-contrib)?

I don't think the reference implementation should be a toy. To me, it's desirable to have a strong reference implementation that users can just grab and go. Introducing ksvc as a code dependency doesn't seem to be a strong enough argument to move the broker reference implementation out of the eventing core.

@yolocs (Contributor, Author) commented Jan 27, 2020

Pending #2451 for this PR to be ready.

@yolocs yolocs changed the title WIP: Make broker ingress/filter be able to use knative service Make broker ingress/filter be able to use knative service Jan 28, 2020
@knative-prow-robot knative-prow-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jan 28, 2020
@knative-metrics-robot

The following is the coverage report on the affected files.
Say /test pull-knative-eventing-go-coverage to re-run this coverage report

| File | Old Coverage | New Coverage | Delta |
| --- | --- | --- | --- |
| pkg/apis/eventing/v1alpha1/broker_lifecycle.go | 95.8% | 88.5% | -7.4 |
| pkg/reconciler/broker/broker.go | 83.8% | 80.2% | -3.7 |
| pkg/reconciler/broker/controller.go | 93.3% | 91.7% | -1.7 |
| pkg/reconciler/broker/resources/ingress.go | 0.0% | 100.0% | 100.0 |
| pkg/reconciler/service/kube/kube.go | Do not exist | 88.1% | |
| pkg/reconciler/service/serving/serving.go | Do not exist | 91.2% | |
| pkg/reconciler/trigger/controller.go | 100.0% | 92.6% | -7.4 |
| pkg/reconciler/trigger/trigger.go | 88.8% | 88.7% | -0.1 |

Comment thread Gopkg.toml
branch = "master"

[[override]]
name = "knative.dev/serving"
Contributor

/hold

@knative-prow-robot knative-prow-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 29, 2020
@matzew (Member) commented Jan 29, 2020

There are a lot of comments here; let me try to add some notes...

I agree with several folks that the "reference implementation" that comes with "CORE" should not be a toy. We should aim for a good and solid implementation.

Would it be that bad if the RI depended on the API of Knative Serving (not an exact version)? That way we could have a more serverless approach to the broker.

Alternative brokers could live in some other repo...

However, with alternative impls of the broker, I think we should perhaps consider what @n3wscott suggested to get rid of exposing the channel impl details...

@matzew (Member) commented Jan 29, 2020

Additional point...

> Would it be that bad if the RI depended on the API of Knative Serving (not an exact version)? That way we could have a more serverless approach to the broker.

So that would perhaps be another moving part? Perhaps we should get the current broker done as a solid foundation, and why not move the ksvc-based broker to the contrib repo, as an alternative impl?

Now, in order to easily plug in other impls, should the broker move to some generic API, with a default impl, like we have w/ the IMC in the "eventing core" repo?

And a KsvcBroker could be in the repo? IMO that would also validate the API of the Broker itself, and ensure its conformance?

@devguyio (Contributor)

I like @matzew's suggestion.

On one hand, I understand why users want scale-to-0 semantics, as you mentioned for your demo, but on the other I have the feeling that we're cramming in more moving parts without solidifying the foundations first.

IMHO it would have made sense if the current implementation were more time-tested and stable, with fewer "fundamental" problems than it has now (check out #2460). Then we'd come to the point where we know it needs autoscaling (to 0 and up), and we'd know what exactly it is improving.

@mattmoor (Member)

I want to pop up a level because I think we're debating one half of the dichotomy here.

We want ksvc because we are creating sole-tenancy Broker deployments today, which is appealing from the standpoint of isolation, but leads to this problem of how we scale the proliferation of these down to zero when they aren't in use (and up when they are). Serving is a powerful abstraction for this, and I expect that it will figure prominently in some implementations.

However, if the default Broker simply ran as a multi-tenant component, then the proliferation and scale questions get a lot easier, at the cost of having to deal with multi-tenancy. It would be relatively straightforward to replace the sole-tenancy deployments with ExternalName services that route through a multi-tenant deployment that dispatches on Host. I illustrated this here
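
As a minimal sketch of that multi-tenant idea (this is not the linked illustration; the hostname scheme and all names below are assumptions): each Broker's ExternalName Service would resolve to one shared deployment, which recovers the Broker identity from the Host header, since HTTP clients keep the original Service hostname there even though DNS resolves the CNAME to the shared backend.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// A hypothetical shared ingress for all Brokers. Per-Broker ExternalName
// Services alias this deployment, so the original Service hostname (e.g.
// "default-broker.my-ns.svc.cluster.local") arrives in the Host header
// and identifies the tenant.
func dispatch(w http.ResponseWriter, r *http.Request) {
	parts := strings.SplitN(r.Host, ".", 3)
	if len(parts) < 2 {
		http.Error(w, "unrecognized host", http.StatusNotFound)
		return
	}
	broker, ns := parts[0], parts[1]
	// A real implementation would look up the Broker and forward the event;
	// here we just report the routing decision.
	fmt.Fprintf(w, "event for broker %q in namespace %q\n", broker, ns)
}

func main() {
	http.ListenAndServe(":8080", http.HandlerFunc(dispatch))
}
```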

My $0.02: I'd rather see us pursue the latter in-tree, with an out-of-tree implementation that uses Serving (and KEDA) to achieve sole-tenancy and scale.

@lionelvillard (Contributor)

I agree with everything but I have some concerns about this:

> I'd rather see us pursue the latter in-tree, with an out-of-tree implementation

I'm worried about "out of tree, out of sight", and that each vendor is going to work on their own optimized implementation.

Any thoughts on that?

@matzew (Member) commented Jan 29, 2020

> I'm worried about "out of tree, out of sight", and that each vendor is going to work on their own optimized implementation.

I'd think that "out of tree" here means eventing-contrib, which would be a good choice.

@mattmoor (Member)

Yeah, I don't mean out-of-org, just out-of-tree. We want the core of eventing to avoid a serving dependency. FWIW, I just shared a (maybe) crazy idea in #networking for how we might optimize things 😎

@lionelvillard (Contributor)

BTW most components are already multi-tenant by using ExternalName services. AFAIK, only the broker is not.

@yolocs (Contributor, Author) commented Jan 29, 2020

I'd think that "out of tree" here means eventing-contrib, which would be a good choice.

@matzew To me, this is NOT another KsvcBroker. The essence of this broker is still the same: ingress + channel + filter. It provides a podspec and expects an addressable service. Having it in eventing-contrib means duplicating something like 99% of the broker code, and we'd then need to maintain two copies.

> IMHO it would have made sense if the current implementation were more time-tested and stable, with fewer "fundamental" problems than it has now (check out #2460). Then we'd come to the point where we know it needs autoscaling (to 0 and up), and we'd know what exactly it is improving.

@Abd4llA There is no denying there are a lot of things to improve (and scaling is one of them). To me, some of these "fundamental" problems may not be solvable within this broker implementation. But scaling is definitely one we can solve. Unless you think this change conflicts with those future improvements, I don't see why we shouldn't go for it.

> My $0.02: I'd rather see us pursue the latter in-tree, with an out-of-tree implementation that uses Serving (and KEDA) to achieve sole-tenancy and scale.

@mattmoor Like you said, what you suggested is only good for certain use cases (when isolation is not a need). I don't understand why we can't solve both use cases (isolation and non-isolation) with this change.

@lberk (Member) commented Jan 29, 2020

In the past I've heard knative components described as loosely coupled, so that we can use individual pieces of knative and aren't forced to use every component. Doesn't this PR landing in eventing core (vs eventing-contrib) dilute the decoupled characteristic of knative?

@yolocs (Contributor, Author) commented Jan 29, 2020

> In the past I've heard knative components described as loosely coupled, so that we can use individual pieces of knative and aren't forced to use every component. Doesn't this PR landing in eventing core (vs eventing-contrib) dilute the decoupled characteristic of knative?

Do different broker implementations (https://docs.google.com/document/d/1pBLkMptwbdEHUgjn1K5Obf7Xe-G9PFGChQcDCZE9vMQ/edit?ts=5dec0d81#) meet the "decoupled characteristic"? To me, the eventing API is the essence of decoupling interface from implementation.

@lionelvillard (Contributor)

@yolocs is it possible to implement the necessary hooks in eventing and to move the KService dependency to eventing-contrib? That would solve the code duplication problem.

@yolocs (Contributor, Author) commented Jan 29, 2020

> @yolocs is it possible to implement the necessary hooks in eventing and to move the KService dependency to eventing-contrib? That would solve the code duplication problem.

I can give it a try. Stay tuned.

@yolocs yolocs changed the title Make broker ingress/filter be able to use knative service [To be split] Make broker ingress/filter be able to use knative service Jan 29, 2020
@lionelvillard (Contributor)

@yolocs any progress on the split?

@yolocs (Contributor, Author) commented Feb 12, 2020

> @yolocs any progress on the split?

I'm waiting for the broker/trigger reconciler merge to be done first (it seems it's merged now). I'll also need to resolve conflicts in #2468 before I make changes to eventing-contrib.

@knative-prow-robot knative-prow-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 24, 2020
@knative-prow-robot (Contributor)

@yolocs: PR needs rebase.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@yolocs yolocs closed this Mar 4, 2020

Labels

  • cla: yes: Indicates the PR's author has signed the CLA.
  • do-not-merge/hold: Indicates that a PR should not merge because someone has issued a /hold command.
  • needs-rebase: Indicates a PR cannot be merged because it has merge conflicts with HEAD.
  • size/XXL: Denotes a PR that changes 1000+ lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Allow eventing to make use of knative serving