Description
Existing behavior
The broker sometimes sees high zk-latency because of GC pauses or heavy load on ZooKeeper. This can cause brokers to lose their zk-session, which leads to a cold restart. The cold restart in turn puts high pressure on ZooKeeper, because it loads a large number of topics concurrently and triggers many lookup requests.
The broker can already throttle concurrent lookup requests and topic loading, which helps reduce load on ZooKeeper. With PR #320, the broker will also be able to monitor zk-latency in real time.
Change
The broker should therefore adjust throttling dynamically when it sees zk-latency rise above a configured threshold (e.g. threshold = 0.75 * zkSessionTimeOut). This can reduce zk-load, help the broker keep its zk-session alive, and avoid a cold restart. We can implement a controller that monitors zk-stats and takes appropriate action (it can later be enhanced to consider additional variables).
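To make the idea concrete, here is a rough sketch of what such a controller could look like. Everything in it is hypothetical and not existing broker code: the class name `ZkLatencyThrottleController`, the `onLatencySample` method, the permit bounds, and the policy itself (halve the allowed concurrency when latency crosses the threshold, recover in fixed steps otherwise) are assumptions for illustration only. How the latency sample is obtained and how the permit count is applied to lookup and topic-loading throttles is left out.

```java
/**
 * Hypothetical sketch: adjusts the number of concurrent lookup/topic-load
 * permits based on observed zk-latency. Not part of the broker code base.
 */
final class ZkLatencyThrottleController {
    private final long latencyThresholdMs;
    private final int minPermits;
    private final int maxPermits;
    private final int recoveryStep;
    private int currentPermits;

    ZkLatencyThrottleController(long zkSessionTimeoutMs, double thresholdRatio,
                                int minPermits, int maxPermits, int recoveryStep) {
        if (minPermits < 1 || maxPermits < minPermits || recoveryStep < 1) {
            throw new IllegalArgumentException("invalid permit configuration");
        }
        // e.g. thresholdRatio = 0.75 gives threshold = 0.75 * zkSessionTimeOut
        this.latencyThresholdMs = (long) (zkSessionTimeoutMs * thresholdRatio);
        this.minPermits = minPermits;
        this.maxPermits = maxPermits;
        this.recoveryStep = recoveryStep;
        this.currentPermits = maxPermits; // start unthrottled
    }

    /**
     * Feed one zk-latency sample (assumed to come from the zk-stats monitor)
     * and return the permit count the broker should allow from now on.
     */
    synchronized int onLatencySample(long zkLatencyMs) {
        if (zkLatencyMs > latencyThresholdMs) {
            // Latency above threshold: back off quickly, but never below the floor.
            currentPermits = Math.max(minPermits, currentPermits / 2);
        } else {
            // Latency is healthy: recover gradually, up to the configured maximum.
            currentPermits = Math.min(maxPermits, currentPermits + recoveryStep);
        }
        return currentPermits;
    }

    long latencyThresholdMs() {
        return latencyThresholdMs;
    }

    synchronized int currentPermits() {
        return currentPermits;
    }
}
```

With a 30s session timeout and a 0.75 ratio, the threshold works out to 22.5s. Backing off fast and recovering slowly is just one possible policy; it is chosen here so that a loaded ZooKeeper gets relief quickly while the broker avoids oscillating.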
@merlimat @saandrews any thoughts?