Hi all,
We're using the Kafka Indexing Service to ingest data into Druid (version 0.10.0). Recently we've been seeing ingestion lag every hour - exactly on the hour, the end time of the data source stops advancing for 5-6 min, and 2-3 min after that it fully recovers.
This is most likely related to segment handoff somehow, however -
- We have plenty of worker capacity - most of the time we run 4 Kafka tasks x 2 data sources, and we have 4 servers with a capacity of 5 per server. So, even assuming we need x2 tasks during handoff, we certainly have enough slots.
- Looking at the performance metrics (load, CPU, memory) of the middle managers at the time of the lag - they don't seem to be working hard.
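To spell out the capacity arithmetic from the first point, here is a quick sketch. It assumes that "4 kafka tasks x 2 data sources" means 8 running tasks in total, and that handoff temporarily doubles the task count; the variable names are just for illustration and are not Druid settings.

```python
# Worker-slot arithmetic for the setup described above.
# Assumption: handoff briefly doubles the number of running Kafka tasks.
servers = 4                # middle manager servers
slots_per_server = 5       # worker capacity per server
tasks_per_datasource = 4   # Kafka indexing tasks per data source
datasources = 2

total_slots = servers * slots_per_server            # slots available
steady_tasks = tasks_per_datasource * datasources   # tasks in normal operation
handoff_tasks = steady_tasks * 2                    # assumed peak during handoff
spare_slots = total_slots - handoff_tasks           # headroom at the peak

print(f"slots={total_slots}, steady={steady_tasks}, "
      f"handoff peak={handoff_tasks}, spare={spare_slots}")
```

So even at the assumed peak there should still be free worker slots, which is why slot exhaustion doesn't look like the cause.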
Are there any parameters on the overlord/middle managers/kafka tasks that should be updated to solve this?
I'll happily provide any additional info that can help.
Thank you!
Eran