KAFKA-9707: Fix InsertField.Key should apply to keys of tombstone records #8280
* Fix typo that hardcoded .value() instead of abstract operatingValue
* Add test for Key transform that was previously not tested

Signed-off-by: Greg Harris <gregh@confluent.io>
ncliang
left a comment
Good catch, @gharris1727 ! It would be bad if we transformed the record but not its corresponding tombstone representing the delete.
cc @rhauch @kkonstantine for review
kkonstantine
left a comment
This behavior, of applying the SMT but leaving the tombstone records unaffected, seems to be intentional if you read the details of KAFKA-8523.
Tbh, while I was reading this PR I also found it counter-intuitive that someone would use this SMT to transform tombstones into non-tombstone records. Indeed, tombstone records are special in that they often represent deletion in several downstream systems as well as in Kafka.
In the light of the previous discussions, any reasons to consider keeping this fix @gharris1727 @ncliang ?
@kkonstantine Yes, by the literal description, "leaving the tombstone record unaffected" is the current behavior, and not what this PR would implement. However, I think that the (1) "leave the tombstone record unaffected" strategy is being used in contrast to the other strategies discussed in the previous issue, which were to (2) drop the tombstone, or (3) create a new empty map with only the inserted fields present. I think the decision to implement Strategy (1) among these three choices is correct, since (2) and (3) discard the meaning of the tombstone while (1) keeps that meaning in the output stream.

I think that decision was made without considering the behavior of InsertField.Key, which coincidentally had no unit tests. I only happened to notice this error because I copied this fix to another SMT which did have key tests, and those tests were broken by this behavior. I think the behavior that this PR implements is another option (4) "leave tombstone values unaffected, leave null keys unaffected, and otherwise transform keys".

With the current behavior, tombstones in the output stream would have different keys than non-tombstone values that had the same initial key. If the output stream were written back to a topic, it would not trigger compaction. If the system consuming the output stream assumed that the inserted field would exist in every record, tombstone events would violate that assumption. I think these behaviors are not correct semantically, and are bugs, not features.
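The key mismatch described above can be illustrated with a small, self-contained sketch. It does not use the Kafka Connect API: `Rec`, `insertIntoKeyOld`, and `insertIntoKeyNew` are hypothetical names invented for this illustration, and the inserted field `magic` is borrowed from the test in this PR. It only models the two null checks under discussion.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class TombstoneKeySketch {
    /** Hypothetical stand-in for a Connect record: key and value maps, either may be null. */
    static final class Rec {
        final Map<String, Object> key;
        final Map<String, Object> value;

        Rec(Map<String, Object> key, Map<String, Object> value) {
            this.key = key;
            this.value = value;
        }
    }

    /** Models the behavior before the fix: the key transform is skipped whenever the value is null. */
    static Rec insertIntoKeyOld(Rec r) {
        if (r.value == null) {
            return r; // hardcoded value check, so tombstone keys are never transformed
        }
        return new Rec(withMagic(r.key), r.value);
    }

    /** Models the behavior after the fix: skip only when the operated-on part (the key) is null. */
    static Rec insertIntoKeyNew(Rec r) {
        if (r.key == null) {
            return r;
        }
        return new Rec(withMagic(r.key), r.value);
    }

    /** Returns a copy of the map with the illustrative field added. */
    static Map<String, Object> withMagic(Map<String, Object> original) {
        Map<String, Object> copy = new HashMap<>(original);
        copy.put("magic", 42L);
        return copy;
    }

    public static void main(String[] args) {
        Map<String, Object> key = new HashMap<>();
        key.put("id", 7);
        Rec upsert = new Rec(key, Collections.<String, Object>singletonMap("name", "a"));
        Rec tombstone = new Rec(key, null);

        // Old check: the upsert and its tombstone end up with different keys.
        System.out.println(insertIntoKeyOld(upsert).key.equals(insertIntoKeyOld(tombstone).key)); // false
        // New check: both records carry the same transformed key.
        System.out.println(insertIntoKeyNew(upsert).key.equals(insertIntoKeyNew(tombstone).key)); // true
    }
}
```

In this model, only the second variant keeps a delete aligned with the record it is meant to delete, which is the property that compaction and key-based downstream consumers rely on.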
Oh, that's right. I missed that this referred only to the key of a record. It's interesting that this distinction was missed in the previous PR. I agree in that case. We should do the right thing and transform the key of a tombstone record, if that's what the configuration is aiming to do. Given that this is not a regression, I'll wait for 2.5.0 to be released before I merge and backport as far back as it makes sense.
retest this please
rhauch
left a comment
Nice work, @gharris1727. However, I think we should clean the previous fix up even more, and more fully test the changes.
private boolean isTombstoneRecord(R record) {
- return record.value() == null;
+ return operatingValue(record) == null;
Can we please clean up the logic? This method no longer returns true if it's just a tombstone record; it also returns true if the key is null for InsertField$Key.
I'd suggest removing this method altogether and just changing the point where this method is called to simply be:
if (operatingValue(record) == null) {
Plus one for this; confusion around the term tombstone has already caused plenty of miscommunication around this issue and we should be careful to use it correctly.
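A minimal sketch of what the suggested cleanup could look like, assuming a simplified model rather than the real Connect classes: `Rec` and `InsertMagic` are hypothetical names, and only `operatingValue` and the inline null check mirror identifiers from this thread. The point is that the guard is a plain null check on whichever half of the record the transform operates on, so no "tombstone" helper is needed.

```java
import java.util.HashMap;
import java.util.Map;

public class OperatingValueSketch {
    /** Hypothetical, simplified record: schemaless map key and value, either may be null. */
    static final class Rec {
        final Map<String, Object> key;
        final Map<String, Object> value;

        Rec(Map<String, Object> key, Map<String, Object> value) {
            this.key = key;
            this.value = value;
        }
    }

    /** Base transform: subclasses decide which half of the record is operated on. */
    abstract static class InsertMagic {
        abstract Map<String, Object> operatingValue(Rec r);

        abstract Rec newRecord(Rec r, Map<String, Object> updated);

        Rec apply(Rec r) {
            // Inline null check on the operated-on half; the original record passes through.
            if (operatingValue(r) == null) {
                return r;
            }
            Map<String, Object> updated = new HashMap<>(operatingValue(r));
            updated.put("magic", 42L);
            return newRecord(r, updated);
        }
    }

    /** Operates on the key and leaves the value (including a null value) untouched. */
    static final class Key extends InsertMagic {
        @Override
        Map<String, Object> operatingValue(Rec r) {
            return r.key;
        }

        @Override
        Rec newRecord(Rec r, Map<String, Object> updated) {
            return new Rec(updated, r.value);
        }
    }

    /** Operates on the value and leaves the key untouched. */
    static final class Value extends InsertMagic {
        @Override
        Map<String, Object> operatingValue(Rec r) {
            return r.value;
        }

        @Override
        Rec newRecord(Rec r, Map<String, Object> updated) {
            return new Rec(r.key, updated);
        }
    }

    public static void main(String[] args) {
        Map<String, Object> key = new HashMap<>();
        key.put("id", 7);
        Rec tombstone = new Rec(key, null);

        Rec keyed = new Key().apply(tombstone);
        System.out.println(keyed.key.get("magic"));                     // 42
        System.out.println(keyed.value == null);                        // true
        System.out.println(new Value().apply(tombstone) == tombstone);  // true
    }
}
```

With this shape, the Value variant still leaves tombstones untouched, while the Key variant transforms the tombstone's key and passes null-keyed records through unchanged.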
…ed record. Signed-off-by: Greg Harris <gregh@confluent.io>
final SourceRecord transformedRecord = xformKey.apply(record);

assertEquals(null, transformedRecord.key());
assertEquals(42L, ((Map<?, ?>) transformedRecord.value()).get("magic"));
Wouldn't it be sufficient to replace these two asserts with the following?
assertSame(record, transformedRecord);
This is a bit more correct, since we want to assert that in this case the transform returns the original record. WDYT?
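To make the difference concrete, here is a small sketch in plain Java with no JUnit dependency. `Rec`, `passThrough`, and `copy` are hypothetical names for illustration. A field-by-field equality check cannot tell a transform that returns the original record from one that returns an equal copy, whereas an identity check (which is what `assertSame` performs) can.

```java
import java.util.Objects;

public class AssertSameSketch {
    /** Hypothetical record with value-based equality. */
    static final class Rec {
        final String key;
        final String value;

        Rec(String key, String value) {
            this.key = key;
            this.value = value;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Rec)) {
                return false;
            }
            Rec other = (Rec) o;
            return Objects.equals(key, other.key) && Objects.equals(value, other.value);
        }

        @Override
        public int hashCode() {
            return Objects.hash(key, value);
        }
    }

    /** A transform that returns the original record untouched. */
    static Rec passThrough(Rec r) {
        return r;
    }

    /** A transform that returns an equal but distinct record. */
    static Rec copy(Rec r) {
        return new Rec(r.key, r.value);
    }

    public static void main(String[] args) {
        Rec record = new Rec(null, "payload");

        // Equality holds for both transforms, so it cannot distinguish them.
        System.out.println(record.equals(passThrough(record))); // true
        System.out.println(record.equals(copy(record)));        // true

        // Identity holds only when the very same object is returned.
        System.out.println(record == passThrough(record));      // true
        System.out.println(record == copy(record));             // false
    }
}
```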
Signed-off-by: Greg Harris <gregh@confluent.io>
Signed-off-by: Greg Harris <gregh@confluent.io>
rhauch
left a comment
LGTM. Thanks for finding and fixing this regression, @gharris1727.
retest this please
@gharris1727 please fix the checkstyle issue (see I can trigger a build right after.
Signed-off-by: Greg Harris <gregh@confluent.io>
retest this please
kkonstantine
left a comment
Latest builds failed abruptly on unrelated issues.
We had a green build and the changes are straightforward.
LGTM
Merging as is. Thanks @gharris1727 for catching this bug.
…ords (#8280)
* KAFKA-9707: Fix InsertField.Key not applying to tombstone events
* Fix typo that hardcoded .value() instead of abstract operatingValue
* Add test for Key transform that was previously not tested
Signed-off-by: Greg Harris <gregh@confluent.io>
* Add null value assertion to tombstone test
* Remove mis-named function and add test for passing-through a null-keyed record.
Signed-off-by: Greg Harris <gregh@confluent.io>
* Simplify unchanged record assertion
Signed-off-by: Greg Harris <gregh@confluent.io>
* Replace assertEquals with assertSame
Signed-off-by: Greg Harris <gregh@confluent.io>
* Fix checkstyleTest indent issue
Signed-off-by: Greg Harris <gregh@confluent.io>
The PR was merged to Unfortunately when I tried to
Signed-off-by: Greg Harris gregh@confluent.io