KAFKA-3209: KIP-66: more single message transforms#2374
KAFKA-3209: KIP-66: more single message transforms#2374shikhar wants to merge 23 commits intoapache:trunkfrom shikhar:more-smt
Conversation
|
Refer to this link for build results (access rights to CI server needed): |
|
Refer to this link for build results (access rights to CI server needed): |
|
Refer to this link for build results (access rights to CI server needed): |
|
Refer to this link for build results (access rights to CI server needed): |
|
Refer to this link for build results (access rights to CI server needed): |
| public abstract class InsertField<R extends ConnectRecord<R>> implements Transformation<R> { | ||
|
|
||
| public interface Keys { | ||
| public interface ConfigName { |
There was a problem hiding this comment.
Seems this can be private instead? Ditto for other classes. I thought it was possibly for tests but it doesn't seem to be used anywhere anyway.
| /** | ||
| * Set the schema name, version or both on the record's key schema. | ||
| */ | ||
| public static class Key<R extends ConnectRecord<R>> extends SetSchemaMetadata<R> { |
There was a problem hiding this comment.
Lack of multiple inheritance is really annoying in cases like this....
There was a problem hiding this comment.
Curious, how would multiple inheritance help?
There was a problem hiding this comment.
All these classes use the same pattern are defining the same overrides for operatingSchema and updatedRecord. If you put that into a single abstract class, then the definition of the Key and Value types would just be one line -- the class declaration that inherits from both SetSchemaMetadata and this new abstract class that provides the overrides.
It's obviously not a big deal here since the implementations are trivial, just kind of annoying code bloat.
| public R apply(R record) { | ||
| final Schema valueSchema = record.valueSchema(); | ||
| if (valueSchema == null) { | ||
| if (!(record.value() instanceof Map)) { |
There was a problem hiding this comment.
We end up doing this same set of checks in a bunch of places. Any way we could reduce the duplicated code?
There was a problem hiding this comment.
I'll factor out a couple of helper functions to call into requireMap and requireStructSchema
|
|
||
| Schema keySchema = valueToKeySchemaCache.get(valueSchema); | ||
| if (keySchema == null) { | ||
| final SchemaBuilder keySchemaBuilder = SchemaBuilder.struct(); |
There was a problem hiding this comment.
We can probably skip it for now, but a cache for this conversion might be a nice improvement eventually. I'd guess that with schemas you're probably mostly seeing the same conversion.
There was a problem hiding this comment.
There is a cache here though, valueToKeySchemaCache
There was a problem hiding this comment.
Clearly I was reading this without my 👓
|
Refer to this link for build results (access rights to CI server needed): |
|
Refer to this link for build results (access rights to CI server needed): |
|
Refer to this link for build results (access rights to CI server needed): |
|
@ewencp should be ready for review.
|
|
Refer to this link for build results (access rights to CI server needed): |
|
Refer to this link for build results (access rights to CI server needed): |
|
Refer to this link for build results (access rights to CI server needed): |
|
Refer to this link for build results (access rights to CI server needed): |
|
Refer to this link for build results (access rights to CI server needed): |
|
Refer to this link for build results (access rights to CI server needed): |
|
Refer to this link for build results (access rights to CI server needed): |
|
Refer to this link for build results (access rights to CI server needed): |
| final String topic = matcher.replaceFirst(replacement); | ||
| return record.newRecord(topic, record.kafkaPartition(), record.keySchema(), record.key(), record.valueSchema(), record.value(), record.timestamp()); | ||
| } | ||
| return record; |
There was a problem hiding this comment.
Wondering if it may be useful here to have an option to drop records (return null here) that don't match the regex against the record.topic().
There was a problem hiding this comment.
Hmm, not sure. I suspect I'd actually prefer an error in that case.
|
@ewencp addressed comments |
|
Refer to this link for build results (access rights to CI server needed): |
|
Refer to this link for build results (access rights to CI server needed): |
|
Refer to this link for build results (access rights to CI server needed): |
|
Refer to this link for build results (access rights to CI server needed): |
|
Refer to this link for build results (access rights to CI server needed): |
|
Refer to this link for build results (access rights to CI server needed): |
|
@ewencp sneaked in |
|
Refer to this link for build results (access rights to CI server needed): |
|
Refer to this link for build results (access rights to CI server needed): |
|
Refer to this link for build results (access rights to CI server needed): |
|
Refer to this link for build results (access rights to CI server needed): |
|
Refer to this link for build results (access rights to CI server needed): |
|
Refer to this link for build results (access rights to CI server needed): |
|
Refer to this link for build results (access rights to CI server needed): |
|
Refer to this link for build results (access rights to CI server needed): |
|
Refer to this link for build results (access rights to CI server needed): |
|
|
||
| import java.util.Map; | ||
|
|
||
| public class SchemaUtil { |
There was a problem hiding this comment.
Are transforms ending up in the javadocs? Just curious if we're making stuff like this appear as public APIs when we really just want this to be an internal utility.
Renames `HoistToStruct` SMT to `HoistField`. Adds the following SMTs: `ExtractField` `MaskField` `RegexRouter` `ReplaceField` `SetSchemaMetadata` `ValueToKey` Adds HTML doc generation and updates to `connect.html`. Author: Shikhar Bhushan <shikhar@confluent.io> Reviewers: Ewen Cheslack-Postava <ewen@confluent.io> Closes #2374 from shikhar/more-smt
Renames `HoistToStruct` SMT to `HoistField`. Adds the following SMTs: `ExtractField` `MaskField` `RegexRouter` `ReplaceField` `SetSchemaMetadata` `ValueToKey` Adds HTML doc generation and updates to `connect.html`. Author: Shikhar Bhushan <shikhar@confluent.io> Reviewers: Ewen Cheslack-Postava <ewen@confluent.io> Closes apache#2374 from shikhar/more-smt
Renames
HoistToStructSMT toHoistField.Adds the following SMTs:
ExtractFieldMaskFieldRegexRouterReplaceFieldSetSchemaMetadataValueToKeyAdds HTML doc generation and updates to
connect.html.