fix segment info in Kafka indexing service docs #5390

Merged
gianm merged 2 commits into apache:master from pjain1:kis_doc on Feb 15, 2018

Member

pjain1 commented Feb 14, 2018

Fixes #5384

pjain1 added this to the 0.12.0 milestone Feb 14, 2018
pjain1 requested review from dclim and gianm February 14, 2018 23:38
will not be sufficient to keep the number of segments at an optimal level. It is recommended that scheduled re-indexing
tasks be run to merge segments together into new segments of an ideal size (in the range of ~500-700 MB per segment).
Each Kafka Indexing Task puts events consumed from Kafka partitions assigned to it in a single segment for each segment
granular interval. Kafka Indexing Task also does incremental hand-offs which means that all the segments created by a
Contributor

I think this isn't strictly true,

Each Kafka Indexing Task puts events consumed from Kafka partitions assigned to it in a single segment for each segment granular interval.

Rather, it's going to be one segment at a time, but it might be more than one segment if maxRowsPerSegment is hit.
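
The point above can be sketched with a deliberately simplified model. This is not Druid's implementation: the function name and the row counts are hypothetical, and the model only assumes that a task fills one segment at a time for an interval and starts another once `maxRowsPerSegment` rows have been written.

```python
import math


def segments_for_interval(rows_in_interval: int, max_rows_per_segment: int) -> int:
    """Count the segments one task would write for one segment-granular interval.

    Simplified model (an assumption, not Druid code): the task fills one
    segment at a time and opens a new one each time the row limit is reached.
    """
    if max_rows_per_segment <= 0:
        raise ValueError("max_rows_per_segment must be positive")
    if rows_in_interval <= 0:
        return 0
    return math.ceil(rows_in_interval / max_rows_per_segment)


# Illustrative numbers only: below the limit gives one segment,
# above the limit gives more than one.
print(segments_for_interval(3_000_000, 5_000_000))
print(segments_for_interval(12_000_000, 5_000_000))
```

Under this model, "a single segment for each segment granular interval" holds only while the rows for that interval stay under the limit, which is why the sentence in the docs is not strictly true.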

This means that the task can run for longer durations of time without accumulating old segments locally on Middle Manager
nodes and it is encouraged to do so.

Kafka Indexing Service may still produce some small segments. Lets say the task duration is 4 hours, segment granulairty
Contributor

granularity (spelling)

Member Author

pjain1 commented Feb 15, 2018

@gianm does this look OK now?

Contributor

gianm left a comment

LGTM thanks @pjain1!

Contributor

gianm commented Feb 15, 2018

Going to merge this without CI completing since there is no way it will complain about a doc change.

gianm merged commit b9b3be6 into apache:master Feb 15, 2018
gianm pushed a commit to gianm/druid that referenced this pull request Feb 15, 2018
* fix segment info in Kafka indexing service docs

* review updates
Contributor

gianm commented Feb 15, 2018

Backport in #5393

pjain1 deleted the kis_doc branch February 15, 2018 18:07
jon-wei pushed a commit that referenced this pull request Feb 15, 2018
* fix segment info in Kafka indexing service docs

* review updates