Skip to content

Channel spec: iteration 1#1420

Merged
knative-prow-robot merged 1 commit into
knative:masterfrom
matzew:channel_spec
Jul 8, 2019
Merged

Channel spec: iteration 1#1420
knative-prow-robot merged 1 commit into
knative:masterfrom
matzew:channel_spec

Conversation

@matzew
Copy link
Copy Markdown
Member

@matzew matzew commented Jun 18, 2019

Fixes #1213

Proposed Changes

  • adding 0.6.0 (historic) status
  • adding 0.7.0raw spec

Conformance Question

what do we want in the conformance test section ?

Let's mention it requires

  • controller
  • dispatcher
  • ServiceAccounts and ClusterRoles (for both)
  • presence of labels:
    • "channable:true" on the channelable-manipulator ClusterRole
    • "subscribable:true" on the CRD definition file
  • usage of CloudEvents (version 0.2?) for in/out

Some of this is already in the spec doc... but perhaps worth to explicitly add it to some conformance section?

I'd think we word these out, and have a clear, human-readable section. From there we can easily create some TCK / test-conformance-kit ?

Please have a look:
@lberk @aslom @alanconway @n3wscott @Harwayne @nachocano

@knative-prow-robot knative-prow-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 18, 2019
@googlebot googlebot added the cla: yes Indicates the PR's author has signed the CLA. label Jun 18, 2019
@knative-prow-robot knative-prow-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Jun 18, 2019

A channel logically receives events on its input domain and forwards them to its subscribers. Below is a specification for the generic parts of each _Channel_.

A typical channel consists of a _Controller_ and a _Dispatcher_ pod.
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nachocano what about the webhook, that we have in Kafka?

@matzew
Copy link
Copy Markdown
Member Author

matzew commented Jun 18, 2019

ping @Abd4llA

name: my-channel
```

Each _Channel Controller_ ensures the required tasks on the backing technology are applied. In this case a Kafka topic with the desired configuration is being created, backing all messages from the channel.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want to define the behaviour of the status field to indicate that the channel has created the backing resources and is ready to receive messages?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes please. There's this bug to address the fact that Channelable Duck is not currently very useful for deciphering the Status of the Channel CRDs
#1375


Different Channels handle failures in different ways:
* gcp-pubsub - Exponential backoff, up to five minutes. Will continue retrying forever.
* kafka - Immediate retry, no backoff. Will continue retrying forever.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this right? I don't see it retrying: https://github.com/knative/eventing/blob/master/contrib/kafka/pkg/dispatcher/dispatcher.go#L292

The error is always ignored.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be something we want to configure genrically as part of the "endpoint" configuration for a Channel or Source.
A related issue is reliability - for any form of reliable delivery then we will need to be able to configure QoS settings on endpoints.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should then retry and other QoS speced for 0.8+? And tested?

@vaikas
Copy link
Copy Markdown
Contributor

vaikas commented Jun 19, 2019

Thanks for putting this together. I focused mainly on the 0.7 and jotted down some thoughts / questions.


```
apiVersion: messaging.knative.dev/v1alpha1
kind: InMemoryChannel
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indent is off here.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
kind: InMemoryChannel
kind: InMemoryChannel

name: my-channel
```

Each _Channel Controller_ ensures the required tasks on the backing technology are applied. In this case a Kafka topic with the desired configuration is being created, backing all messages from the channel.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes please. There's this bug to address the fact that Channelable Duck is not currently very useful for deciphering the Status of the Channel CRDs
#1375


Each _Channel Controller_ ensures the required tasks on the backing technology are applied. In this case a Kafka topic with the desired configuration is being created, backing all messages from the channel.

#### Aggregated Channelable ClusterRole
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: maybe:
Aggregated Channelable Manipulator ClusterRole


#### Aggregated Channelable ClusterRole

Every CRD must create a corresponding ClusterRole, that will be aggregated into the `channelable-manipulator` ClusterRole. This ClusterRole must include permissions to create, read, patch, and update the CRD's custom objects. Below is an example for the `KafkaChannel`:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe read=> list, watch?
Also maybe:
... update the CRD's custom objects and their status.

Comment thread docs/spec/channel/version_070.md Outdated

Channels MAY NOT alter an event that goes through them. All CloudEvent attributes, including the data attribute, MUST be received at the subscriber identical to how they were received by the Channel.

Channels MUST attach a bearer token to all outgoing requests, likely in the form of a JWT. This bearer token MUST use an identity associated with the Channel, not the individual Subscription. No newline at end of file
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this jive with the security requirements that @mikehelmick outlined in:
#705
That's geared towards Eventing constructs, and just want to make sure that this is compatible with that approach.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would this be something that we rely on the (optional) mesh to handle for us instead?
Just curious who handles the auth and how the creds get distributed, etc.


TODO

### Data Plane
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we say something about (even if a TODO for the first draft) about metrics that a channel should expose (failed deliveries, malformed incoming events, etc.). as well as some sort of perf metrics?

@vaikas
Copy link
Copy Markdown
Contributor

vaikas commented Jun 19, 2019

Looks like there's a dangling link?
I0619 08:48:10.948] docs/spec/channel/version_060.md
I0619 08:48:10.948] ERROR ../broker/README.md
I0619 08:48:10.949] Stat docs/spec/broker/README.md: no such file or directory


### The ClusterChannelProvisioner

Describes an abstract configuration of a Source system which produces events or a Channel system that receives and delivers events.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should "Source" be "Importer" or is that change still pending?
Are we folding configuration of source and channel together? If so it seems like it might be time to identify some common "endpoint" configuration type that describes either the client or server end of some protocol connection, where the direction of event flow is not assumed to be client->server (it will be for HTTP, but not for other protocols)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's the old spec ... matching old terms ?


### The ClusterChannelProvisioner

Describes an abstract configuration of a Source system which produces events or a Channel system that receives and delivers events.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe clarify what the provisioner does - it's not just descriptive, it creates new channel instances does it not?


Different Channels handle failures in different ways:
* gcp-pubsub - Exponential backoff, up to five minutes. Will continue retrying forever.
* kafka - Immediate retry, no backoff. Will continue retrying forever.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be something we want to configure genrically as part of the "endpoint" configuration for a Channel or Source.
A related issue is reliability - for any form of reliable delivery then we will need to be able to configure QoS settings on endpoints.


##### Generic

If a Channel receives an event queueing request and is unable to parse a valid CloudEvent, then it MUST reject the request.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Last time I checked there was no validation done (at least for the InMemoryChannel)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be for 0.8?

Comment thread docs/spec/channel/version_060.md Outdated
Comment thread docs/spec/channel/version_060.md Outdated

```
apiVersion: messaging.knative.dev/v1alpha1
kind: InMemoryChannel
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
kind: InMemoryChannel
kind: InMemoryChannel

```
apiVersion: messaging.knative.dev/v1alpha1
kind: InMemoryChannel
metadata:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
metadata:
metadata:

apiVersion: messaging.knative.dev/v1alpha1
kind: InMemoryChannel
metadata:
name: my-channel
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
name: my-channel
name: my-channel

Comment thread docs/spec/channel/version_070.md Outdated

Every event queueing request to the Channel will come with a bearer token, likely a JWT. The bearer token MUST be validated before any other work is done on the request. The specifics of how and what to validate will be identical to Broker ingress verification, which is being [defined](https://github.com/knative/eventing/issues/705#issuecomment-496722527).

The Channel MUST pass through all tracing information as CloudEvents attributes. In particular, it MUST translate any incoming OpenTracing or B3 headers to the [Distributed Tracing Extension](https://github.com/cloudevents/spec/blob/v0.2/extensions/distributed-tracing.md). The Channel SHOULD sample and write traces to the location specified in [`config-tracing`](https://github.com/cloudevents/spec/blob/v0.2/extensions/distributed-tracing.md).
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

config-tracing link is incorrect, although I don't know the correct one for now.

Comment thread docs/spec/channel/version_070.md Outdated
Comment thread docs/spec/channel/version_070.md Outdated

If a Channel receives an event queueing request and is unable to parse a valid CloudEvent, then it MUST reject the request.

Every event queueing request to the Channel will come with a bearer token, likely a JWT. The bearer token MUST be validated before any other work is done on the request. The specifics of how and what to validate will be identical to Broker ingress verification, which is being [defined](https://github.com/knative/eventing/issues/705#issuecomment-496722527).
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is still a bit speculative. We'll have to think about how to craft correct policies in this area.

There is overlap in that the channel egress will need to be able to craft a new JWT with an audience of the subscriber.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we might want to mark that this will be defined for 0.8?


#### Output

Channels MUST output CloudEvents. The output MUST be via a binding specified in the [CloudEvents specification](https://github.com/cloudevents/spec/tree/v0.2#cloudevents-documents). Every Channel MUST support sending events via Structured Content Mode HTTP Transport Binding.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when will be update channel spec to cloudevents 0.3?

@aslom
Copy link
Copy Markdown
Contributor

aslom commented Jul 2, 2019

Should version_070.md be renamed to 071 or 080 as changes are made or we just keep channel/spec.md with list of changes?

@aslom
Copy link
Copy Markdown
Contributor

aslom commented Jul 2, 2019

Should there be some kind of migration guide form 0.6? Especially about default channels?

Comment thread docs/spec/channel/version_060.md Outdated
Comment thread docs/spec/channel/version_070.md Outdated
Comment thread docs/spec/channel/version_070.md Outdated

Channels MUST output CloudEvents. The output MUST be via a binding specified in the [CloudEvents specification](https://github.com/cloudevents/spec/tree/v0.2#cloudevents-documents). Every Channel MUST support sending events via Structured Content Mode HTTP Transport Binding.

Channels MAY NOT alter an event that goes through them. All CloudEvent attributes, including the data attribute, MUST be received at the subscriber identical to how they were received by the Channel.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Channels MAY NOT alter an event that goes through them. All CloudEvent attributes, including the data attribute, MUST be received at the subscriber identical to how they were received by the Channel.
Channels MAY NOT alter an event that goes through them. All CloudEvent attributes, including the data attribute, MUST be received at the subscriber identical to how they were received by the Channel.

Comment thread docs/spec/channel/version_070.md Outdated
@matzew
Copy link
Copy Markdown
Member Author

matzew commented Jul 3, 2019

/test pull-knative-eventing-build-tests

@matzew matzew changed the title WIP: Channel spec Channel spec: iteration 1 Jul 3, 2019
@knative-prow-robot knative-prow-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jul 3, 2019
@Harwayne
Copy link
Copy Markdown
Contributor

Harwayne commented Jul 8, 2019

/lgtm
/approve

Let's submit this as-is and continue improving it here, rather than it @matzew's personal repo. It is marked as in-progress, so should not be used by others for now.

@knative-prow-robot knative-prow-robot added the lgtm Indicates that a PR is ready to be merged. label Jul 8, 2019
@knative-prow-robot
Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Harwayne, matzew

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@knative-prow-robot knative-prow-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 8, 2019
@knative-prow-robot knative-prow-robot merged commit 4299512 into knative:master Jul 8, 2019
lberk pushed a commit to lberk/eventing that referenced this pull request Jul 15, 2019
matzew added a commit to matzew/eventing that referenced this pull request Sep 28, 2021
Signed-off-by: Matthias Wessendorf <mwessend@redhat.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. cla: yes Indicates the PR's author has signed the CLA. lgtm Indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Define a Spec for Channels, control and data plane