Data loader (sampler component) - Kafka/Kinesis samplers#7566

Merged
clintropolis merged 3 commits into apache:master from dclim:kafka-kinesis-sampler-only
May 17, 2019

@dclim (Contributor) commented Apr 28, 2019

Implementation of the sampler component of #7502.
Depends on #7531.

Adds additional implementations to support sampling from Kafka and Kinesis.

@clintropolis (Member) left a comment

overall lgtm 👍

{
insertData(generateRecords(TOPIC));

KafkaSupervisorSpec supervisorSpec = new KafkaSupervisorSpec(DATA_SCHEMA, null, new KafkaSupervisorIOConfig(

nit: formatting looks off


replayAll();

KinesisSupervisorSpec supervisorSpec = new KinesisSupervisorSpec(DATA_SCHEMA, null, new KinesisSupervisorIOConfig(

nit: formatting

private void assignAndSeek() throws InterruptedException
{
final Set<StreamPartition<PartitionIdType>> partitions = recordSupplier
.getPartitionIds(ioConfig.getStream()).stream()

nit: .stream() should probably be on a new line
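
To illustrate the formatting this nit asks for, here is a minimal sketch with one chained call per line. It assumes a hypothetical helper, `ChainFormattingSketch.partitionNames`, that stands in for the partition lookup. It is not the Druid `RecordSupplier` API and does not reproduce the truncated statement above.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical class for illustration only; not part of Druid.
class ChainFormattingSketch
{
  // Builds a set of "<stream>-<id>" names, with each chained call on its own line.
  static Set<String> partitionNames(String stream, List<Integer> partitionIds)
  {
    return partitionIds
        .stream()
        .map(id -> stream + "-" + id)
        .collect(Collectors.toSet());
  }
}
```

Putting `.stream()` on its own line keeps every step of the chain at the same indentation, so a later diff that adds or removes a step touches only one line.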

private final RecordSupplier<PartitionIdType, SequenceOffsetType> recordSupplier;

private Iterator<OrderedPartitionableRecord<PartitionIdType, SequenceOffsetType>> recordIterator;
private Iterator<byte[]> interRecordIterator;

the starting bit of nextRowWithRaw might be a bit clearer if this variable was named something like recordBytesIterator or recordDataIterator?

@Override
public SamplerResponse sample()
{
return firehoseSampler.sample(new FirehoseFactory()

nit: formatting

@vogievetsky (Contributor) commented

Been testing and using this feature as a "user", seems to work really well. Check it out here: https://youtu.be/tAEp5BXVHYE

@clintropolis (Member) left a comment

lgtm 👍

We really need builders for those indexing config classes, but not this PR
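
As a rough illustration of the builder idea mentioned above, the sketch below shows how a config object with many constructor arguments could be assembled through named setters. The class `SamplerIOConfigSketch` and its three fields are assumptions made up for this example. They do not mirror the real `KafkaSupervisorIOConfig` or `KinesisSupervisorIOConfig` constructors.

```java
import java.util.Objects;

// Hypothetical config class for illustration only; not a Druid class.
final class SamplerIOConfigSketch
{
  private final String stream;
  private final int replicas;
  private final boolean useEarliestOffset;

  private SamplerIOConfigSketch(Builder builder)
  {
    // The stream name is the only required field in this sketch.
    this.stream = Objects.requireNonNull(builder.stream, "stream");
    this.replicas = builder.replicas;
    this.useEarliestOffset = builder.useEarliestOffset;
  }

  static Builder builder()
  {
    return new Builder();
  }

  String getStream()
  {
    return stream;
  }

  int getReplicas()
  {
    return replicas;
  }

  boolean isUseEarliestOffset()
  {
    return useEarliestOffset;
  }

  static final class Builder
  {
    private String stream;
    // Defaults chosen for the sketch; real defaults would come from the config class.
    private int replicas = 1;
    private boolean useEarliestOffset = false;

    Builder stream(String stream)
    {
      this.stream = stream;
      return this;
    }

    Builder replicas(int replicas)
    {
      this.replicas = replicas;
      return this;
    }

    Builder useEarliestOffset(boolean useEarliestOffset)
    {
      this.useEarliestOffset = useEarliestOffset;
      return this;
    }

    SamplerIOConfigSketch build()
    {
      return new SamplerIOConfigSketch(this);
    }
  }
}
```

A test could then write `SamplerIOConfigSketch.builder().stream("topic").useEarliestOffset(true).build()` instead of a long positional constructor call padded with `null` arguments, which is the kind of call site the formatting nits in this review keep running into.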

@clintropolis merged commit d384579 into apache:master on May 17, 2019
@dclim deleted the kafka-kinesis-sampler-only branch on May 17, 2019 05:32
clintropolis pushed a commit to clintropolis/druid that referenced this pull request May 17, 2019
* implement Kafka/Kinesis sampler

* add KafkaSamplerSpecTest and KinesisSamplerSpecTest

* code review changes
jihoonson pushed a commit to implydata/druid-public that referenced this pull request Jun 25, 2019
* implement Kafka/Kinesis sampler

* add KafkaSamplerSpecTest and KinesisSamplerSpecTest

* code review changes
jihoonson pushed a commit to implydata/druid-public that referenced this pull request Jun 26, 2019
* implement Kafka/Kinesis sampler

* add KafkaSamplerSpecTest and KinesisSamplerSpecTest

* code review changes
gianm pushed a commit to implydata/druid-public that referenced this pull request Jul 3, 2019
* implement Kafka/Kinesis sampler

* add KafkaSamplerSpecTest and KinesisSamplerSpecTest

* code review changes
4 participants