263 changes: 160 additions & 103 deletions specs/eventing/data-plane.md
Original file line number Diff line number Diff line change
@@ -1,125 +1,182 @@
# Knative Eventing Data Plane Contracts
# Knative Eventing Data Plane Contract

## Introduction

Developers using Knative Eventing need to know what is supported for delivery to
user-provided components that receive events. Knative Eventing defines contracts
for data plane components, which are listed here.

## Conformance

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD",
"SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be
interpreted as described in RFC2119.

## Data plane contract for Sinks
## Terminology

A **Sink** MUST be able to handle duplicate events.
This document discusses communication between two parties:

A **Sink** is an [_addressable_](./interfaces.md#addressable) resource that
takes responsibility for the event. A Sink could be a consumer of events, or
middleware. A Sink MUST be able to receive CloudEvents over HTTP and HTTPS.
- **Event Senders** initiate an HTTP POST to deliver a CloudEvent.
- **Event Recipients** receive an HTTP POST and accept (or reject) a CloudEvent.

A **Sink** MAY be a [_callable_](./interfaces.md#callable) resource that
represents an Addressable endpoint which receives an event as input and
optionally returns an event to forward downstream.
Additionally, these roles can be combined in different ways:

Almost every component in Knative Eventing may be a Sink providing
composability.
- **Event Processors** can be event senders, event recipients, or both.
- **Event Sources** are exclusively event senders, and never act as recipients.
- **Event Sinks** are exclusively event recipients, and never act as senders.

Every Sink MUST support HTTP Protocol Binding for CloudEvents
[version 1.0](https://github.com/cloudevents/spec/blob/v1.0/http-protocol-binding.md)
and
[version 0.3](https://github.com/cloudevents/spec/blob/v0.3/http-transport-binding.md)
with restrictions and extensions specified below.

### HTTP Support

This section adds restrictions on
[requirements in HTTP Protocol Binding for CloudEvents](https://github.com/cloudevents/spec/blob/v1.0/http-protocol-binding.md#12-relation-to-http).
## Introduction

Sinks MUST accept HTTP requests with the POST method and MAY support other HTTP
methods. If a method is not supported, the Sink MUST respond with HTTP status
code `405 Method Not Allowed`. Non-event requests (e.g. health checks) are not
constrained.
Late-binding event senders and recipients (composing applications using
configuration) only works when all event senders and recipients speak a common
protocol. In order to enable wide support for senders and recipients, Knative
Eventing extends the
[CloudEvents HTTP bindings](https://github.com/cloudevents/spec/blob/v1.0.1/http-protocol-binding.md)
with additional semantics for the following reasons:

The URL used by a Sink MUST correspond to a single, unique endpoint at any given
moment in time. This MAY be done via the host, path, query string, or any
combination of these. This mapping is handled exclusively by the
[Addressable control-plane](./interfaces.md#control-plane) exposed via the
`status.address.url`.
- Knative Eventing aims to enable at least once event processing; hence it
prefers duplicate delivery to discarded events. The CloudEvents spec does not
take a stance here.

If an HTTP request's URL does not correspond to an existing endpoint, then the
Sink MUST respond with `404 Not Found`.
- The CloudEvents HTTP bindings provide a relatively simple and efficient
network protocol which can easily be supported in a wide variety of
programming languages leveraging existing library investments in HTTP. The
CloudEvents project has already written these libraries for many popular
languages.

Every non-Callable Sink MUST respond with `202 Accepted` if the request is
accepted.
- Knative Eventing assumes a sender-driven (push) event delivery system. That
is, each recipient is actively responsible for an event until it is handled
(or affirmatively delivered to all following recipients).

If Sink is Callable it MAY respond with `200 OK` and a single event in the HTTP
response. A returned event is not required to be related to the received event.
The Callable should return a successful response if the event was processed
successfully. If there is no event to send back then Callable Sink MUST respond
with 2xx HTTP and with empty body.
- Knative Eventing aims to make [event sources](./overview.md#event-source) and
event-processing software easier to write; as such, it imposes higher
standards on system components like [brokers](./overview.md#broker) and
[channels](./overview.md#channel) than on edge components.

If a Sink receives a request and is unable to parse a valid CloudEvent, then it
MUST respond with `400 Bad Request`.
This contract defines a mechanism for a single event sender to reliably deliver
a single event to a single recipient. Building from this primitive, chains of
reliable event delivery and event-driven applications can be built.

### Content Modes Supported
## Background

A Sink MUST support `Binary Content Mode` and `Structured Content Mode` as
described in
[HTTP Message Mapping section of HTTP Protocol Binding for CloudEvents](https://github.com/cloudevents/spec/blob/master/http-protocol-binding.md#3-http-message-mapping)
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD",
"SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be
interpreted as described in
[RFC2119](https://datatracker.ietf.org/doc/html/rfc2119).

A Sink MAY support `Batched Content Mode` but that mode is not used in Knative
Eventing currently (that may change in future).
When not specified in this document, the
[CloudEvents HTTP bindings, version 1.0](https://github.com/cloudevents/spec/blob/v1.0.1/http-protocol-binding.md)
and [HTTP 1.1 protocol](https://tools.ietf.org/html/rfc7230) standards MUST be
followed (with the CloudEvents bindings taking precedence in the case of
conflict).

### Retries
The current version of this document does not describe protocol negotiation or
any delivery mechanism other than HTTP 1.1. Future versions might define
protocol negotiation to optimize delivery; compliant implementations SHOULD aim
to interoperate by ignoring unrecognized negotiation options (such as
[HTTP `Upgrade` headers](https://datatracker.ietf.org/doc/html/rfc7230#section-6.7)).

Sinks should expect retries and accept the possibility that duplicate events
may be delivered.
## Event Delivery

### Error handling
### Minimum supported protocol

If a Sink does not return an HTTP success status (200 or 202), the event may be
sent again. If the event cannot be delivered, some sources of events (such
as Knative sources, brokers, or channels) MAY support a
[dead letter sink or channel](https://github.com/knative/eventing/blob/main/docs/delivery/README.md) for events that cannot be
delivered.
All senders and recipients MUST support the CloudEvents 1.0 protocol and the
Contributor

New term "recipients"

Member Author

Hmm, I use recipient 17 times, and receiver 5 times. Switched receiver to recipient unless you prefer the other way around.

Contributor

either is fine, I'm more interested in consistency

[binary](https://github.com/cloudevents/spec/blob/v1.0.1/http-protocol-binding.md#31-binary-content-mode)
and
[structured](https://github.com/cloudevents/spec/blob/v1.0.1/http-protocol-binding.md#32-structured-content-mode)
content modes of the CloudEvents HTTP binding. Senders which do not advertise
the ability to accept [reply events](#derived-reply-events) MAY implement only
one content mode, as the recipient is not allowed to negotiate the content mode.

### HTTP Verbs

In the absence of specific delivery preferences, the sender MUST initiate
delivery of the event to the recipient using the HTTP POST verb, using either
the structured or binary encoding of the event (sender's choice). This delivery
MUST be performed using the CloudEvents HTTP Binding, version 1.x.
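
As a rough illustration of the two encodings a sender can choose between, the sketch below builds the headers and body for the same event in binary and in structured content mode. The helper names, the sample attribute values, and the JSON payload are assumptions made for this example only; header-value escaping rules from the CloudEvents HTTP binding are omitted for brevity. Either result would be sent with an HTTP POST.

```python
import json


def binary_request(attributes, data, data_content_type):
    """Binary content mode: context attributes travel as ce-* headers."""
    headers = {f"ce-{name}": str(value) for name, value in attributes.items()}
    # The datacontenttype attribute maps to the Content-Type header.
    headers["Content-Type"] = data_content_type
    return headers, data


def structured_request(attributes, data, data_content_type):
    """Structured content mode: attributes and data share one JSON envelope."""
    envelope = dict(attributes, datacontenttype=data_content_type, data=data)
    headers = {"Content-Type": "application/cloudevents+json; charset=utf-8"}
    return headers, json.dumps(envelope).encode("utf-8")


# Hypothetical event used only for illustration.
attributes = {
    "specversion": "1.0",
    "id": "1234-abcd",
    "source": "/example/sender",
    "type": "com.example.sample",
}
payload = {"message": "hello"}

binary_headers, binary_body = binary_request(
    attributes, json.dumps(payload).encode("utf-8"), "application/json"
)
structured_headers, structured_body = structured_request(
    attributes, payload, "application/json"
)
```

In binary mode the body carries only the event data, so a recipient that ignores the `ce-*` headers still sees an ordinary JSON POST; in structured mode the whole event is self-contained in the body.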

Senders MAY probe the recipient with an
[HTTP OPTIONS request](https://tools.ietf.org/html/rfc7231#section-4.3.7); if
implemented, the recipient MUST indicate support for the POST verb using the
[`Allow` header](https://tools.ietf.org/html/rfc7231#section-7.4.1). Senders
which receive an error when probing with HTTP OPTIONS SHOULD proceed using the
HTTP POST mechanism.
Comment on lines +86 to +91

Is this actually implemented by somebody? At least in our ref implementation, it's not implemented at all. Can you remove it now and maybe we add it later?

The reason I ask is that there is another CE spec, called Webhook, which I believe overlaps with this sentence knative/eventing#3092

Member Author

I'm not sure, I found it in one of the existing eventing docs. This seems to be compatible with https://github.com/cloudevents/spec/blob/master/http-webhook.md.

The CE WebHook specification provides the following for webhook authorization:

Any system that allows registration of and delivery of notifications to arbitrary HTTP endpoints ...

  • Sender sends an HTTP OPTIONS request with Webhook-Request-Origin (which should match the Origin header sent on each event delivery) and optionally Webhook-Request-Callback and Webhook-Request-Rate.
  • Responder replies with Allow: POST and WebHook-Allowed-Origin and preferably WebHook-Allowed-Rate (required if WebHook-Request-Rate is set). If the responder needs to do some manual setup and Webhook-Request-Callback is provided, the responder can respond without headers and then use the callback URL later to grant access.

Since this is "sender MAY" and "if implemented, recipient MUST respond with Allow: POST", I think that lines up. The "receive an error when probing ... SHOULD" allows for the webhook authorization behavior, but doesn't require it to be implemented.

None of this is implemented, neither supported in any way right now... So I prefer to remove it now and add it later, when we'll have discussed and implemented it.

Member

I agree. If the spec is now rewritten in a way mandating certain implementation work, that's not there, is IMO not flying well.

Member Author

Interesting; it appears that the current Brokers are not compliant with https://datatracker.ietf.org/doc/html/rfc7231#section-7.4.1

The actual set of allowed methods is defined by the origin server at
the time of each request. An origin server MUST generate an Allow
field in a 405 (Method Not Allowed) response and MAY do so in any
other response. An empty Allow field value indicates that the
resource allows no methods, which might occur in a 405 response if
the resource has been temporarily disabled by configuration.

curl -vv -X OPTIONS -H "Host: broker-ingress.knative-eventing.svc.cluster.local" http://127.0.0.1:8090/default/example-broker
*   Trying 127.0.0.1:8090...
* Connected to 127.0.0.1 (127.0.0.1) port 8090 (#0)
> OPTIONS /default/example-broker HTTP/1.1
> Host: broker-ingress.knative-eventing.svc.cluster.local
> User-Agent: curl/7.71.1
> Accept: */*
> 
* Mark bundle as not supporting multiuse
< HTTP/1.1 405 Method Not Allowed
< Date: Thu, 20 May 2021 18:29:47 GMT
< Content-Length: 0
< 

In any case, the current Broker implementation matches this clause: my sender sent an HTTP OPTIONS, and the recipient did not support HTTP OPTIONS, so I proceed with POST delivery.


### Event Acknowledgement and Delivery Retry

Event recipients MUST use the HTTP response code to indicate acceptance of an
event. The recipient SHOULD NOT return a response accepting the event until it
has handled the event (processed the event or stored it in stable storage). The
following response codes are explicitly defined; event recipients MAY also
respond with other response codes. A response code not in this table SHOULD be
treated as a retriable error.

| Response code | Meaning | Retry | Delivery completed | Error |
| ------------- | ------------------------------------------------- | ----- | ------------------ | ----- |
| `1xx` | (Unspecified) | No\* | No\* | Yes\* |
| `200` | [Accepted, event in reply](#derived-reply-events) | No | Yes | No |
| `202` | Event accepted | No | Yes | No |
| other `2xx` | (Unspecified) | No | Yes | No |
| `3xx` | (Unspecified) | No\* | No\* | Yes\* |
| `400` | Unparsable event | No | No | Yes |
| `404` | Endpoint does not exist | Yes | No | Yes |
| `409` | Conflict / Processing in progress | Yes | No | Yes |
| `429` | Too Many Requests / Overloaded | Yes | No | Yes |
Contributor

The spec should explicitly reference https://datatracker.ietf.org/doc/html/rfc7231#section-7.1.3 as stated here: knative-extensions/eventing-kafka#666 (comment)

The CloudEvent Webhook spec says:

... The sender MUST observe the value of the Retry-After header and refrain from sending further requests until the indicated time.

With a proper reference to 7231 we should be good, right?

| other `4xx` | Error | No | No | Yes |
| `5xx` | Error | Yes | No | Yes |

\* Unspecified `1xx`, `2xx`, and `3xx` response codes are **reserved for future
extension**. Event recipients SHOULD NOT send these response codes in this spec
version, but event senders MUST handle these response codes as errors or success
as appropriate and implement the described success or failure behavior.
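
One way a sender might encode the response-code table is sketched below. The `Outcome` type and `classify` function are hypothetical names introduced for this example; the mapping follows the table rows, and any status outside the `1xx` to `5xx` ranges is treated as a retriable error, as the text above suggests for codes not in the table.

```python
from typing import NamedTuple


class Outcome(NamedTuple):
    retry: bool      # should the sender re-attempt delivery?
    delivered: bool  # is delivery complete?
    error: bool      # should the attempt be reported as an error?


def classify(status: int) -> Outcome:
    """Map an HTTP status code to the sender behavior listed in the table."""
    if 200 <= status < 300:
        # 200, 202 and other 2xx: delivery completed.
        return Outcome(retry=False, delivered=True, error=False)
    if status in (404, 409, 429) or 500 <= status < 600:
        # Explicitly retriable 4xx codes and all 5xx codes.
        return Outcome(retry=True, delivered=False, error=True)
    if 100 <= status < 200 or 300 <= status < 500:
        # 1xx, 3xx, 400 and other 4xx: error, no retry.
        return Outcome(retry=False, delivered=False, error=True)
    # Not covered by the table: treat as a retriable error.
    return Outcome(retry=True, delivered=False, error=True)
```

For instance, `classify(429)` asks for a retry, while `classify(400)` reports an error without retrying.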

Recipients MUST accept duplicate delivery of events, but they are not REQUIRED
to detect that they are duplicates. If duplicate detection is implemented, then
as specified in the
[CloudEvents specification](https://github.com/cloudevents/spec/blob/v1.0.1/primer.md#id),
event recipients MUST use the
[`source` and `id` attributes](https://github.com/cloudevents/spec/blob/v1.0.1/spec.md#required-attributes)
to identify duplicate events. This specification does not describe state
requirements for recipients which need to detect duplicate events. In general,
senders MAY add or update other CloudEvent attributes on each delivery attempt;
see [observability](#observability) for an example case.
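
A recipient that chooses to detect duplicates could key on the `source` and `id` attributes, as in the sketch below. The bounded in-memory cache, the `SeenEvents` class name, and the capacity value are assumptions for illustration; the specification does not describe how much state a recipient needs to keep.

```python
from collections import OrderedDict


class SeenEvents:
    """Remembers recently seen (source, id) pairs, evicting the oldest."""

    def __init__(self, capacity=1024):
        self._seen = OrderedDict()
        self._capacity = capacity

    def first_delivery(self, event) -> bool:
        """Return True the first time an event is seen, False for duplicates."""
        # Only source and id identify an event; other attributes (such as
        # tracing extensions) may differ between delivery attempts.
        key = (event["source"], event["id"])
        if key in self._seen:
            self._seen.move_to_end(key)
            return False
        self._seen[key] = True
        if len(self._seen) > self._capacity:
            self._seen.popitem(last=False)  # drop the least recently seen key
        return True
```

Because the cache is bounded, a duplicate that arrives after its key has been evicted is processed again, which is consistent with the requirement that recipients accept duplicate delivery.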

Where possible, event senders SHOULD re-attempt delivery of events where the
Contributor

I think SHOULD may be too strong here. In serving we specifically stopped the networking layer from retrying on 5xx errors because it ended up spamming the ksvc when the ksvc was correctly returning a 503. Even with a MAY I'd still like to see some text that talks about how the sender needs to be selective about it to avoid DDOSing the sink.

Member Author

Is the mention of congestion control in the next sentence sufficient?

Contributor

That's a good thing to mention, but it still pushes people towards resending... just slower. I keep having flashbacks to wasting my time debugging a ksvc trying to figure out why I was seeing 3 times the errors in its logs... and it was because of the network retries. I think it's sufficient to talk about how it's ok to retry if the sender thinks a new result will happen, but not just because it failed with a 5xx

HTTP request returned a retryable status code. It is RECOMMENDED that event
senders implement some form of congestion control (such as exponential backoff)
and delivery throttling when managing retry timing. Congestion control MAY cause
event delivery to fail or MAY include not retrying failed delivery attempts.
This specification does not document any specific congestion control algorithm
or parameters. [Brokers](./overview.md#broker) and
[Channels](./overview.md#channel) MUST implement congestion control and MUST
implement retries.
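
The specification leaves the congestion control algorithm open. Purely as an illustration, the sketch below retries with capped exponential backoff and random jitter and gives up after a fixed number of attempts. The function name, the parameters, their default values, and the choice of jitter are all assumptions of this example, not part of the contract.

```python
import random
import time


def deliver_with_backoff(send, is_retryable, max_attempts=5,
                         base_delay=0.1, max_delay=5.0, sleep=time.sleep):
    """Call send() until it returns a non-retryable status or attempts run out.

    send() performs one delivery attempt and returns an HTTP status code;
    is_retryable(status) decides whether another attempt is worthwhile.
    """
    status = None
    for attempt in range(max_attempts):
        status = send()
        if not is_retryable(status):
            return status
        if attempt + 1 < max_attempts:
            # Double the delay ceiling on each attempt, up to max_delay, and
            # pick a random delay below it so senders do not retry in lockstep.
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, ceiling))
    return status  # last status seen; delivery has failed
```

Bounding `max_attempts` is one way to let congestion control cause delivery to fail rather than retrying indefinitely against an overloaded recipient.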

### Observability

CloudEvents received by a Sink MAY have the
[Distributed Tracing Extension Attribute](https://github.com/cloudevents/spec/blob/v1.0/extensions/distributed-tracing.md).

### Event reply contract

An event sender supporting event replies SHOULD include a `Prefer: reply` header
in delivery requests to indicate to the sink that event reply is supported. An
event sender MAY ignore an event reply in the delivery response if the
`Prefer: reply` header was not included in the delivery request.

For example, a Broker supporting event replies sends events with an additional
`Prefer: reply` header so that the sink connected to the Broker knows that
event replies will be accepted. A source, by contrast, sends events without the
header, in which case the sink may assume that any event reply will be dropped
without error or retry attempt. If a sink wishes to ensure that reply events
will be delivered, it can check for the `Prefer: reply` header in the delivery
request and respond with an error code if the header is not present.

### Data plane contract for Sources

See [Source Delivery specification](sources.md#source-event-delivery)
for details.

### Data plane contract for Channels

See [Channel Delivery specification](channel.md#data-plane) for details.

### Data plane contract for Brokers

See [Broker Delivery specification](broker.md)

## Changelog

- 2020-04-20: `0.13.x release`: initial version that documents common contract
for sinks, sources, channels and brokers.
Event senders MAY add or update CloudEvents attributes before sending to
implement observability features such as tracing; in particular, the
`traceparent` and `tracestate` distributed tracing attributes defined by
[W3C](https://www.w3.org/TR/trace-context/) and
[CloudEvents](https://github.com/cloudevents/spec/blob/v1.0/extensions/distributed-tracing.md)
MAY be modified in this way for each delivery attempt of the same event.

This specification does not mandate any particular logging or metrics
aggregation, nor a method of exposing observability information to users
configuring the resources. Platform administrators SHOULD expose event-delivery
telemetry to users through platform-specific interfaces, but such interfaces are
beyond the scope of this document.

### Derived (Reply) Events

In some applications, an event recipient MAY emit an event in reaction to a
received event. Senders MAY choose to support this pattern by accepting an
encoded CloudEvent in the HTTP response.

An event sender MAY document support for this pattern by including a
`Prefer: reply` header in the HTTP POST request. This header indicates to the
event recipient that the caller will accept a
[`200` response](#event-acknowledgement-and-delivery-retry) which includes a
CloudEvent encoded using the binary or structured formats.
[Brokers](./overview.md#broker) and [Channels](./overview.md#channel) MUST
indicate support for replies using the `Prefer: reply` header when sending to
the `spec.subscriber` address.

A recipient MAY reply to any HTTP POST with a `200` response to indicate that
the event was processed successfully, with or without a response payload. If the
recipient will _never_ provide a response payload, the `202` response code is
also acceptable. Responses with a `202` response code MUST NOT be processed as
reply events.

If a recipient chooses to reply to a sender with a `200` response code and a
reply event in the absence of a `Prefer: reply` header from the sender, the
sender SHOULD treat the event as accepted, and MAY log an error about the
unexpected payload. If a sender will process a reply event it MUST include the
`Prefer: reply` header on the POST request.
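
The sender side of the reply rules above might look like the following sketch. The function names and the returned tuple shape are hypothetical; the logic reflects the text: only a `200` response with a body is a candidate reply, a `202` body is never treated as a reply, and an unexpected reply is accepted but not processed.

```python
def delivery_headers(accepts_replies: bool) -> dict:
    """Headers a sender adds to the POST to advertise reply support."""
    return {"Prefer": "reply"} if accepts_replies else {}


def handle_response(status: int, body: bytes, sent_prefer_reply: bool):
    """Return (accepted, reply, warning) for a delivery response.

    reply is the encoded CloudEvent to process, or None.
    """
    if not 200 <= status < 300:
        return False, None, None
    if status == 200 and body:
        if sent_prefer_reply:
            return True, body, None
        # Event is still accepted; the payload is dropped and may be logged.
        return True, None, "unexpected reply payload ignored"
    # 202 and other 2xx responses are never processed as reply events.
    return True, None, None
```

Returning the warning instead of logging directly keeps the sketch free of any particular logging setup.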