Closed
2 changes: 1 addition & 1 deletion Makefile
@@ -7,6 +7,6 @@ verify:
@echo Running the spec phrase checker:
@tools/verify-specs.sh -v spec.md documented-extensions.md json-format.md \
http-transport-binding.md http-webhook.md mqtt-transport-binding.md \
- nats-transport-binding.md
+ nats-transport-binding.md kafka-transport-binding.md
@echo Running the doc phrase checker:
@tools/verify-docs.sh -v .
1 change: 1 addition & 0 deletions README.md
@@ -34,6 +34,7 @@ The following specifications are available:
| **NATS Transport Binding** | - | [master](https://github.com/cloudevents/spec/blob/master/nats-transport-binding.md) |
| **AMQP Event Format** | - | [master](https://github.com/cloudevents/spec/blob/master/amqp-format.md) |
| **AMQP Transport Binding** | - | [master](https://github.com/cloudevents/spec/blob/master/amqp-transport-binding.md) |
| **Kafka Transport Binding** | - | [master](https://github.com/cloudevents/spec/blob/master/kafka-transport-binding.md) |

There is also the
[CloudEvents documented extension attributes](https://github.com/cloudevents/spec/blob/master/documented-extensions.md)
277 changes: 277 additions & 0 deletions kafka-transport-binding.md
@@ -0,0 +1,277 @@
# Kafka Transport Binding for CloudEvents

## Abstract

The [Kafka][Kafka] Transport Binding for CloudEvents defines how events are
mapped to [Kafka messages][Kafka-Message-Format].

## Status of this document

This document is a working draft.

## Table of Contents

1. [Introduction](#1-introduction)
- 1.1. [Conformance](#11-conformance)
- 1.2. [Relation to Kafka](#12-relation-to-kafka)
- 1.3. [Content Modes](#13-content-modes)
- 1.4. [Event Formats](#14-event-formats)
- 1.5. [Security](#15-security)
2. [Use of CloudEvents Attributes](#2-use-of-cloudevents-attributes)
- 2.1. [contentType Attribute](#21-contenttype-attribute)
- 2.2. [data Attribute](#22-data-attribute)
3. [Kafka Message Mapping](#3-kafka-message-mapping)
- 3.1. [Binary Content Mode](#31-binary-content-mode)
- 3.2. [Structured Content Mode](#32-structured-content-mode)
4. [References](#4-references)

## 1. Introduction

[CloudEvents][CE] is a standardized and transport-neutral definition of the
structure and metadata description of events. This specification defines how
the elements defined in the CloudEvents specification are to be used in the
Kafka protocol as [Kafka messages][Kafka-Message-Format] (aka Kafka records).

### 1.1. Conformance

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD",
"SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be
interpreted as described in [RFC2119][RFC2119].

### 1.2. Relation to Kafka

This specification does not prescribe rules constraining transfer or settlement
of event messages with Kafka; it solely defines how CloudEvents are expressed
in the Kafka protocol as [Kafka messages][Kafka-Message-Format].

### 1.3. Content Modes

The specification defines two content modes for transferring events:
*structured* and *binary*.

The *binary* mode *only* applies to Kafka 0.11.0.0 and above, because Kafka
0.10.x.x and below lack support for message-level headers.

In the *structured* content mode, event metadata attributes and event data are
placed into the Kafka message value section
using an [event format](#14-event-formats).

In the *binary* content mode, the value of the event `data` attribute is placed
Collaborator

I wonder if we need an RFC2119 keyword in here, perhaps s/is placed/MUST be placed/ ??

into the Kafka message's value section as-is, with
the `kafka_contentType` header value declaring its media type; all other event
Contributor

Where does kafka_contentType come from? As far as I can work out, this is unique to this spec. While there isn't a standard header in Kafka for content type, it does appear that there are libraries out there that do put a content type header, and it appears to me that there is some consensus around contentType. Even without that, I don't think kafka_contentType makes sense, since of course it's Kafka. If this header is specific to CloudEvents, and you want to namespace it, then the prefix should be cloudevents, since a CloudEvents spec like this has no business defining a header in the kafka namespace.

Author

@longit644 Oct 19, 2018

Currently, there isn't a standard header in Kafka for the content type. So I used Spring's format for kafka_contentType, defined here: https://github.com/spring-projects/spring-kafka/blob/996f8649d562cb7a2b15af42ab0bea64b510d961/spring-kafka/src/main/java/org/springframework/kafka/support/KafkaHeaders.java#L32

attributes are mapped to the Kafka message's
Collaborator

maybe: s/are mapped/MUST be mapped/

[header section][Kafka-Message-Header].

### 1.4. Event Formats

Event formats, used with the *structured* content mode, define how an event is
expressed in a particular data format. All implementations of this
specification MUST support the [JSON event format][JSON-format].

### 1.5. Security

This specification does not introduce any new security features for Kafka, or
mandate specific existing features to be used.

## 2. Use of CloudEvents Attributes

This specification does not further define any of the [CloudEvents][CE] event
attributes.

### 2.1. contentType Attribute

The `contentType` attribute is assumed to contain a media-type expression
compliant with [RFC2046][RFC2046].

### 2.2. data Attribute

The `data` attribute is assumed to contain opaque application data that is
encoded as declared by the `contentType` attribute.

An application is free to hold the information in any in-memory representation
of its choosing, but when the value is transposed into Kafka as defined in this
specification, it is carried as a sequence of bytes, which is how core Kafka
makes data available.

For instance, if the declared `contentType` is
`application/json;charset=utf-8`, the expectation is that the `data` attribute
value is made available as [UTF-8][RFC3629] encoded JSON text.
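
As a minimal sketch of the instance above (not part of the specification text), the following Python shows one way a sender might turn an in-memory value into UTF-8 encoded JSON bytes. The function name `encode_json_data` is hypothetical, and the choice of `ensure_ascii=False` is an assumption made only so that the UTF-8 encoding is visible in the output.

```python
import json


def encode_json_data(data) -> bytes:
    """Serialize an in-memory value as UTF-8 encoded JSON text.

    Hypothetical helper: matches a declared contentType of
    application/json;charset=utf-8.
    """
    # ensure_ascii=False keeps non-ASCII characters as-is, so the
    # UTF-8 encoding step below is what produces the multi-byte output.
    return json.dumps(data, ensure_ascii=False).encode("utf-8")


payload = encode_json_data({"name": "Zoë"})
```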

## 3. Kafka Message Mapping

With Kafka 0.11.0.0 and above, the content mode is chosen by the sender of the
event. Protocol usage patterns that might allow solicitation of events using a
particular content mode might be defined by an application, but are not defined
here.
Collaborator

extra space before "here"


The receiver of the event can distinguish between the two content modes by
inspecting the `kafka_contentType` property of the Kafka message. If the value
of the `kafka_contentType` property is prefixed with the CloudEvents media type
`application/cloudevents`, indicating the use of a known [event
Contributor

If I'm not mistaken, this has been copied from the HTTP binding, which means we now have two different specs specifying this MIME type. That's not ideal, it should only be specified in one place. So I think this should be added to the main spec to say what its MIME type is, and then both this and the HTTP binding can reference that, plus the JSON spec should be the authoritative spec that says that it is the format for application/cloudevents+json, not this and the HTTP binding.

Author

Yes, I borrowed it from HTTP specs. I think it should be moved to main specs too.

format](#14-event-formats), the receiver uses *structured* mode, otherwise it
defaults to *binary* mode.

If a receiver finds a CloudEvents media type as per the above rule, but with an
event format that it cannot handle, for instance
`application/cloudevents+avro`, it MAY still treat the event as binary and
forward it to another party as-is.
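
The receiver-side rule above could be sketched as follows. This is an illustration under stated assumptions, not a normative algorithm: headers are assumed to arrive as `(name, bytes)` pairs with UTF-8 encoded values, the function name `detect_content_mode` is hypothetical, and the case-insensitive comparison is a design choice based on media types being case-insensitive.

```python
from typing import Iterable, Optional, Tuple

CONTENT_TYPE_HEADER = "kafka_contentType"
CLOUDEVENTS_MEDIA_TYPE_PREFIX = "application/cloudevents"


def detect_content_mode(headers: Iterable[Tuple[str, bytes]]) -> str:
    """Return "structured" or "binary" for a received Kafka message.

    Assumption: header values are UTF-8 encoded strings.
    """
    content_type: Optional[str] = None
    for name, value in headers:
        if name == CONTENT_TYPE_HEADER:
            content_type = value.decode("utf-8")
            break

    # A CloudEvents media type prefix signals structured mode;
    # anything else, including a missing header, falls back to binary.
    if content_type is not None and content_type.strip().lower().startswith(
        CLOUDEVENTS_MEDIA_TYPE_PREFIX
    ):
        return "structured"
    return "binary"
```

A receiver that detects *structured* mode but does not support the declared event format (for instance `application/cloudevents+avro`) would still need its own fallback, as the paragraph above allows.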

### 3.1. Binary Content Mode

The *binary* content mode accommodates any shape of event data, and allows for
efficient transfer without transcoding effort.

#### 3.1.1. Kafka Content Type

For the *binary* mode, the `kafka_contentType` property MUST be mapped directly
to the CloudEvents `contentType` attribute.

#### 3.1.2. Event Data Encoding

The [`data` attribute](#22-data-attribute) byte-sequence MUST be used as the
value of the Kafka message.

#### 3.1.3. Metadata Headers

All [CloudEvents][CE] attributes with the exception of `data` MUST be
individually mapped to and from the Header fields in the Kafka message.
Collaborator

Is there any rule needed w.r.t. the names used? Case of the names?


##### 3.1.3.1 User Property Names
Collaborator

What is the difference between "User Properties" and "Metadata Headers" ?


Cloud Event attributes are prefixed with "cloudEvents_" for use in the
Collaborator

extra space between "Cloud" and "Event"

[message-headers][Kafka-Message-Header] section, with the exception of
`contentType`, which MUST be mapped to `kafka_contentType`.

Examples:

* `eventTime` maps to `cloudEvents_eventTime`
* `eventID` maps to `cloudEvents_eventID`
* `cloudEventsVersion` maps to `cloudEvents_cloudEventsVersion`
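
The naming rule and the examples above can be sketched in Python as a pair of helper functions. The function names are hypothetical and exist only for illustration; the prefix and the `contentType` exception are taken from this section.

```python
from typing import Optional

HEADER_PREFIX = "cloudEvents_"
CONTENT_TYPE_HEADER = "kafka_contentType"


def attribute_to_header(name: str) -> str:
    """Map a CloudEvents attribute name to its Kafka header name."""
    if name == "contentType":
        # contentType is the one attribute that is not prefixed.
        return CONTENT_TYPE_HEADER
    return HEADER_PREFIX + name


def header_to_attribute(header: str) -> Optional[str]:
    """Map a Kafka header name back to a CloudEvents attribute name.

    Returns None for headers that do not carry a CloudEvents attribute.
    """
    if header == CONTENT_TYPE_HEADER:
        return "contentType"
    if header.startswith(HEADER_PREFIX):
        return header[len(HEADER_PREFIX):]
    return None
```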

##### 3.1.3.2 User Property Values

The value for each Kafka header is constructed from the respective
attribute's Kafka representation, compliant with the [Kafka message
format][Kafka-Message-Format] specification.


#### 3.1.4 Examples
Collaborator

nit but since there's just one example you can remove the 's' at the end


This example shows the *binary* mode mapping of an event into the
Kafka message. The CloudEvents `contentType` attribute is mapped to
the Kafka `kafka_contentType` header; all other CloudEvents attributes
are mapped to Kafka Header fields with prefix `cloudEvents_`.

Mind that `cloudEvents_` here does refer to the event `data`
Collaborator

I'm not sure I understand this line, what do you mean by "cloudEvents_ here" ?

Author

@longit644 Oct 19, 2018

cloudEvents_ is the prefix of CloudEvents context attributes, I want to mark these attributes as CloudEvents context attributes, except for contentType because it's a common attribute.

content carried in the payload.

``` text
------------------ Message -------------------

Topic Name: mytopic

------------------- key ----------------------

Key: mykey

------------------ headers -------------------

kafka_contentType: application/avro
cloudEvents_cloudEventsVersion: "0.1"
cloudEvents_eventType: "com.example.someevent"
cloudEvents_eventTime: "2018-04-05T03:56:24Z"
cloudEvents_eventID: "1234-1234-1234"
cloudEvents_source: "/mycontext/subcontext"
.... further attributes ...

------------------- value --------------------

... application data ...

-----------------------------------------------
```
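
A sender-side sketch of the *binary* mode mapping, in Python, might look like the following. It is an illustration only: the function name `to_binary_message` is hypothetical, header values are assumed to be plain UTF-8 encoded strings (the draft does not pin down the value encoding), and the `(name, bytes)` tuple shape for headers is an assumption about what a Kafka client would accept.

```python
from typing import Dict, List, Tuple

Header = Tuple[str, bytes]


def to_binary_message(
    attributes: Dict[str, str], data: bytes
) -> Tuple[List[Header], bytes]:
    """Build (headers, value) for a binary-mode Kafka message.

    Assumption: attribute values are strings, UTF-8 encoded into headers.
    """
    headers: List[Header] = []
    for name, value in attributes.items():
        if name == "data":
            continue  # data travels in the message value, not in a header
        if name == "contentType":
            header_name = "kafka_contentType"
        else:
            header_name = "cloudEvents_" + name
        headers.append((header_name, value.encode("utf-8")))
    # The data byte sequence is used as the message value as-is.
    return headers, data


headers, value = to_binary_message(
    {
        "contentType": "application/avro",
        "cloudEventsVersion": "0.1",
        "eventType": "com.example.someevent",
        "eventTime": "2018-04-05T03:56:24Z",
        "eventID": "1234-1234-1234",
        "source": "/mycontext/subcontext",
    },
    b"... application data ...",
)
```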

### 3.2. Structured Content Mode

The *structured* content mode keeps event metadata and data together in the
payload, allowing simple forwarding of the same event across multiple routing
hops, and across multiple transports.

#### 3.2.1. Kafka Content-Type

The Kafka `kafka_contentType` property field is set to the media type
Collaborator

maybe s/is set/MUST be set/

of an [event format](#14-event-formats).

Example for the [JSON format][JSON-format]:

``` text
kafka_contentType: application/cloudevents+json; charset=UTF-8
```

#### 3.2.2. Event Data Encoding

The chosen [event format](#14-event-formats) defines how all attributes,
including the `data` attribute, are represented.

The event metadata and data is then rendered in accordance with the event
Collaborator

s/is/are/ I think

Collaborator

should we be using RFC2119 keywords in here some place?

format specification and the resulting data becomes the value section of the
Kafka message.

#### 3.2.3. Metadata Headers

Implementations MAY include the same Kafka headers as defined for the
[binary mode](#313-metadata-headers).

#### 3.2.4 Examples
Collaborator

just one example


This example shows a JSON event format encoded event:

``` text
------------------ Message -------------------

Topic Name: mytopic

------------------- key ----------------------

Key: mykey

------------------ headers -------------------

kafka_contentType: application/cloudevents+json; charset=UTF-8

------------------- value --------------------

{
"cloudEventsVersion" : "0.1",
"eventType" : "com.example.someevent",

... further attributes omitted ...

"data" : {
... application data ...
}
}

-----------------------------------------------
```
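
For comparison, a sender-side sketch of the *structured* mode with the JSON event format could look like this in Python. The function name `to_structured_message` is hypothetical, the `(name, bytes)` header shape is an assumption about the Kafka client in use, and the event is assumed to already be a JSON-serializable dictionary laid out as the JSON event format requires.

```python
import json
from typing import Any, Dict, List, Tuple

Header = Tuple[str, bytes]

STRUCTURED_JSON_CONTENT_TYPE = "application/cloudevents+json; charset=UTF-8"


def to_structured_message(event: Dict[str, Any]) -> Tuple[List[Header], bytes]:
    """Build (headers, value) for a structured-mode Kafka message.

    Assumption: `event` holds all attributes, including `data`.
    """
    headers: List[Header] = [
        ("kafka_contentType", STRUCTURED_JSON_CONTENT_TYPE.encode("utf-8"))
    ]
    # Metadata and data are rendered together into the message value.
    value = json.dumps(event).encode("utf-8")
    return headers, value


headers, value = to_structured_message(
    {
        "cloudEventsVersion": "0.1",
        "eventType": "com.example.someevent",
        "data": {"example": "application data"},
    }
)
```

Because the whole event lives in the value, an intermediary can forward the message unchanged without inspecting any `cloudEvents_` headers.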

## 4. References

- [Kafka][Kafka] A distributed streaming platform
- [Kafka-Message-Format][Kafka-Message-Format] The Kafka message format
- [RFC2046][RFC2046] Multipurpose Internet Mail Extensions (MIME) Part Two:
Media Types
- [RFC2119][RFC2119] Key words for use in RFCs to Indicate Requirement Levels
- [RFC3629][RFC3629] UTF-8, a transformation format of ISO 10646
- [RFC7159][RFC7159] The JavaScript Object Notation (JSON) Data Interchange
Format

[CE]: ./spec.md
[JSON-format]: ./json-format.md
[Kafka]: https://kafka.apache.org
[Kafka-Message-Format]: https://kafka.apache.org/documentation/#messageformat
[Kafka-Message-Header]: https://kafka.apache.org/documentation/#recordheader
[JSON-Value]: https://tools.ietf.org/html/rfc7159#section-3
[RFC2046]: https://tools.ietf.org/html/rfc2046
[RFC2119]: https://tools.ietf.org/html/rfc2119
[RFC3629]: https://tools.ietf.org/html/rfc3629
[RFC7159]: https://tools.ietf.org/html/rfc7159