diff --git a/docs/streams/core-concepts.html b/docs/streams/core-concepts.html index 884b39898ed9b..d9a2851e2713f 100644 --- a/docs/streams/core-concepts.html +++ b/docs/streams/core-concepts.html @@ -309,7 +309,7 @@
- Besides the guarantee that each record will be processed exactly-once, another issue that many stream processing application will face is how to + Besides the guarantee that each record will be processed exactly-once, another issue that many stream processing applications will face is how to handle out-of-order data that may impact their business logic. In Kafka Streams, there are two causes that could potentially result in out-of-order data arrivals with respect to their timestamps:
@@ -328,13 +328,16 @@See the semantics overview at the bottom of this section for a detailed description.
@@ -2884,6 +2903,9 @@For each input record on the left side that does not have any match on the right side, the ValueJoiner will be called with ValueJoiner#apply(leftRecord.value, null);
this explains the row with timestamp=3 in the table below, which lists [A, null] in the LEFT JOIN column.
null.See the semantics overview at the bottom of this section for a detailed description.
@@ -3609,6 +3631,52 @@By default, tables in Kafka Streams use offset-based semantics. When multiple records arrive for the same key, the one with the largest record offset + is considered the latest record for the key, and is the record that appears in aggregation and join results computed on the table. This is true even + in the event of out-of-order data. The record with the + largest offset is considered to be the latest record for the key, even if this record does not have the largest timestamp.
+An alternative to offset-based semantics is timestamp-based semantics. With timestamp-based semantics, the record with the largest timestamp is + considered the latest record, even if there is another record with a larger offset (and smaller timestamp). If there is no out-of-order data (per key), + then offset-based semantics and timestamp-based semantics are equivalent; the difference only appears when there is out-of-order data.
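To make the distinction above concrete, here is a minimal plain-Java sketch. It does not use the Kafka Streams API: the `LatestRecordSketch` class, the `Rec` type, and both method names are hypothetical and exist only for illustration. Breaking timestamp ties by offset is likewise an assumption of this sketch, not something the text above specifies.

```java
import java.util.Comparator;
import java.util.List;

// Plain-Java illustration only; this is not the Kafka Streams API.
final class LatestRecordSketch {

    // Hypothetical stand-in for one record belonging to a single key.
    static final class Rec {
        final long offset;
        final long timestamp;
        final String value;

        Rec(long offset, long timestamp, String value) {
            this.offset = offset;
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    // Offset-based semantics: the record with the largest offset is the latest.
    static Rec latestByOffset(List<Rec> recordsForKey) {
        return recordsForKey.stream()
                .max(Comparator.comparingLong((Rec r) -> r.offset))
                .orElseThrow(IllegalStateException::new);
    }

    // Timestamp-based semantics: the record with the largest timestamp is the latest.
    // Ties are broken by offset (an assumption made for this sketch).
    static Rec latestByTimestamp(List<Rec> recordsForKey) {
        return recordsForKey.stream()
                .max(Comparator.comparingLong((Rec r) -> r.timestamp)
                        .thenComparingLong(r -> r.offset))
                .orElseThrow(IllegalStateException::new);
    }

    public static void main(String[] args) {
        // Out-of-order data: the record at offset 1 carries the smaller timestamp.
        List<Rec> outOfOrder = List.of(
                new Rec(0L, 20L, "larger-timestamp"),
                new Rec(1L, 10L, "larger-offset"));
        System.out.println(latestByOffset(outOfOrder).value);
        System.out.println(latestByTimestamp(outOfOrder).value);
    }
}
```

When the records for a key arrive in timestamp order, both methods select the same record, which matches the equivalence described above.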
+As of version 3.5, Kafka Streams supports timestamp-based semantics through the use of + versioned state stores. + When a table is materialized with a versioned state store, it is a versioned table, and its processors behave differently in the presence of + out-of-order data.
+count
+ and reduce operations as well, in addition to
+ aggregate operations.null
+ records downstream of the filter than when filtering an unversioned table. This is done in order to preserve a complete version history downstream,
+ in the event of out-of-order data.suppress operations are not allowed on versioned tables, as this would collapse the version history
+ and lead to undefined behavior.Once a table is materialized with a versioned store, downstream tables are also considered versioned until any of the following occurs:
+The results of certain processors should not be materialized with versioned stores, as these processors do not produce a complete older version history, + and therefore materialization as a versioned table would lead to unpredictable results:
+aggregate,
+ count and reduce operations.For more on versioned stores and how to start using them in your application, see here.
+Any streams and tables may be (continuously) written back to a Kafka topic. As we will describe in more detail below, the output data might be diff --git a/docs/streams/developer-guide/processor-api.html b/docs/streams/developer-guide/processor-api.html index 586e55f6d0b1a..ccb03ce7b50c1 100644 --- a/docs/streams/developer-guide/processor-api.html +++ b/docs/streams/developer-guide/processor-api.html @@ -437,7 +437,7 @@
Versioned stores do not support caching or interactive queries at this time. - Also, window stores may not be versioned.
+ Also, window stores and global tables may not be versioned. Upgrade note: Versioned state stores are opt-in only; no automatic upgrades from non-versioned to versioned stores will take place.Upgrades are supported from persistent, non-versioned key-value stores diff --git a/docs/streams/upgrade-guide.html b/docs/streams/upgrade-guide.html index b668ba5c150b6..16614db7f1538 100644 --- a/docs/streams/upgrade-guide.html +++ b/docs/streams/upgrade-guide.html @@ -139,14 +139,15 @@
A new state store type, versioned key-value stores, was introduced in - KIP-889. + KIP-889 and + KIP-914. Rather than storing a single record version (value and timestamp) per key, versioned state stores may store multiple record versions per key. This allows versioned state stores to support timestamped retrieval operations to return the latest record (per key) as of a specified timestamp. For more information, including how to upgrade from a non-versioned key-value store to a versioned store in an existing application, see the - Developer Guide section. + Developer Guide. Versioned key-value stores are opt-in only; existing applications will not be affected upon upgrading to 3.5 without explicit code changes.
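The "latest record (per key) as of a specified timestamp" lookup described above can be pictured with a small in-memory sketch. This is not the Kafka Streams versioned store implementation: the `VersionedLookupSketch` class and its `put` and `get` methods are hypothetical, and the sketch ignores details such as history retention and tombstones.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration only; this is not the Kafka Streams versioned store.
final class VersionedLookupSketch {

    // For each key, record versions ordered by timestamp.
    private final Map<String, TreeMap<Long, String>> versions = new HashMap<>();

    // Stores one record version (value and timestamp) for the key.
    void put(String key, String value, long timestamp) {
        versions.computeIfAbsent(key, k -> new TreeMap<>()).put(timestamp, value);
    }

    // Returns the value of the latest version whose timestamp is less than or
    // equal to asOfTimestamp, or null if no such version exists.
    String get(String key, long asOfTimestamp) {
        TreeMap<Long, String> history = versions.get(key);
        if (history == null) {
            return null;
        }
        Map.Entry<Long, String> entry = history.floorEntry(asOfTimestamp);
        return entry == null ? null : entry.getValue();
    }
}
```

Because each key keeps several versions rather than a single value, a record that arrives out of order (with an older timestamp) fills in history instead of overwriting the newest version.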
diff --git a/docs/upgrade.html b/docs/upgrade.html index bec4cc07c5ffa..58dd1a8e12a13 100644 --- a/docs/upgrade.html +++ b/docs/upgrade.html @@ -35,6 +35,7 @@