[Proposal] Add batching support in kafka-indexing-service #4373

@himanshug

Description

It would benefit some users if the kafka-indexing-service supported putting multiple Druid InputRow objects inside a single Kafka record.
This would let users batch rows while still using the Kafka sync producer, which sends only one Kafka record at a time.

I would imagine adding the following method to the InputRowParser interface.

  default List<InputRow> parseBatch(T input)
  {
    // By default, wrap the single parsed row so existing parsers keep working.
    return ImmutableList.of(parse(input));
  }

The current InputRow parse(T input) method would be deprecated, and all of Druid's code would be adjusted to call parseBatch(input) instead.
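To illustrate the proposed default method and an override, here is a minimal sketch. It uses String as a stand-in for InputRow so it is self-contained; BatchingParser and NewlineDelimitedParser are hypothetical names, not actual Druid classes.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Stand-in for Druid's InputRowParser, using String in place of InputRow,
// to show how the proposed parseBatch() default would behave.
interface BatchingParser<T> {
    String parse(T input);

    // Proposed default: single-row parsers keep working unchanged.
    default List<String> parseBatch(T input) {
        return Collections.singletonList(parse(input));
    }
}

// Hypothetical parser whose Kafka record packs newline-delimited rows;
// it overrides parseBatch() to emit several rows from one record.
class NewlineDelimitedParser implements BatchingParser<String> {
    @Override
    public String parse(String input) {
        return input; // a real parser would build an InputRow here
    }

    @Override
    public List<String> parseBatch(String input) {
        return Arrays.asList(input.split("\n"));
    }
}
```

A parser that never overrides parseBatch() behaves exactly as today, returning a one-element list, which is what makes the change backward compatible.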

The kafka-indexing-service would need to persist an <offset, row-number-in-record> pair instead of just the offset to support exactly-once ingestion. I believe this will end up being the biggest change.
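One way to picture the persisted pair is as a composite position that orders rows first by Kafka offset, then by the row's index inside the record; on restore, only rows strictly past the persisted position are replayed. This is a sketch under that assumption; RecordPosition is a hypothetical name, not part of Druid.

```java
// Hypothetical composite position: Kafka offset plus the row's index
// inside that record, so a restart can skip rows already ingested from
// a partially consumed batched record.
final class RecordPosition implements Comparable<RecordPosition> {
    final long offset;      // Kafka record offset within the partition
    final int rowInRecord;  // index of the InputRow inside that record

    RecordPosition(long offset, int rowInRecord) {
        this.offset = offset;
        this.rowInRecord = rowInRecord;
    }

    @Override
    public int compareTo(RecordPosition other) {
        int c = Long.compare(offset, other.offset);
        return c != 0 ? c : Integer.compare(rowInRecord, other.rowInRecord);
    }

    // A row is replayed only if it lies strictly past the persisted position.
    boolean isAfter(RecordPosition persisted) {
        return compareTo(persisted) > 0;
    }
}
```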

This approach would add batching support in general, not just for the kafka-indexing-service.
