fix huge number of watches in zk issue#6683
kaijianding wants to merge 1 commit into apache:master
Conversation
If a production Druid cluster has issues I don't think cluster operators will be happy to see this humour in the logs.
Announcer should be a javadoc link
The comment "which is different with Announcer" is incomplete. Please elaborate.
"which is different with Announcer" means NodeAnnouncer announces a single node on ZooKeeper and only watches that node, while Announcer watches all child paths, not only this node.
The name and the purpose of this field is not clear
Maybe it is clearer to call it pathsCreatedInThisAnnouncer? Is there anything specific about them being "parents"? They are parents of what?
This list is iterated only once. There is no win in using CopyOnWriteArrayList rather than a simple ArrayList protected by a lock.
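To illustrate the reviewer's point, here is a minimal sketch (class and method names are illustrative, not from the PR): a list that is appended from several threads but drained only once can be a plain ArrayList behind a lock, avoiding the array copy CopyOnWriteArrayList performs on every add.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: a list appended concurrently but iterated/drained only once,
// e.g. paths remembered before ZooKeeper connects. A plain ArrayList plus
// a lock suffices; CopyOnWriteArrayList would copy the array on every add.
class PendingPaths {
  private final Object lock = new Object();
  private final List<String> pending = new ArrayList<>(); // guarded by lock

  void add(String path) {
    synchronized (lock) {
      pending.add(path);
    }
  }

  // Called once, e.g. from start(): returns the queued paths and clears the queue.
  List<String> drain() {
    synchronized (lock) {
      List<String> copy = new ArrayList<>(pending);
      pending.clear();
      return copy;
    }
  }
}
```

CopyOnWriteArrayList pays off only when reads vastly outnumber writes; a single drain does not meet that bar.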
How is this different from toAnnounce?
toUpdate is added to when update() is called before ZooKeeper is connected; the path to update may or may not have been added to toAnnounce before.
Most of NodeAnnouncer's code logic follows the original Announcer.java.
Please annotate @GuardedBy("toAnnounce").
stop is the stop() method annotated with @LifecycleStop.
The curator.xxx.forPath(parentPath) call declares Exception in its method signature, so we have to catch Exception.
NodeCache.start() declares Exception in its method signature, so we have to catch Exception.
From #6647:

This seems strange & unnecessary, do you know why the tasks are creating watches for each other? They shouldn't be watching each other, I think, since they don't have a good reason to. Only the coordinators and brokers should be watching tasks, since they're the only ones that care about them (for lookups & segment discovery purposes). Does Announcer set up too many watches, for some reason?

@gianm I'm not familiar with the Curator code, but it seems that because we watch the parent path to get child events, Curator adds a watch for each child path.

Hi @kaijianding, I still don't understand why you have so many watchers. In Druid, only master nodes (brokers, the coordinator, and the overlord if you use the zk-based remoteTaskRunner) watch such announcements. For example,

@jihoonson Announcer.java is also used in realtime nodes to re-announce themselves when temporarily disconnected from zk, so a realtime node also watches

@kaijianding most of the answers you have given to my questions should be transformed into code comments, or the code should be altered in some other way. The objective is to approach the state where, if I read the code for the first time, none of my questions would arise.

@kaijianding thanks. Would you please fix the conflicts and check @leventov's last comment?

@jihoonson @leventov @gianm conflicts are resolved and code comments are updated, do you have any more comments?

@kaijianding this comment: #6683 (comment) is mostly not addressed. Please don't force push: https://github.com/apache/incubator-druid/blob/master/CONTRIBUTING.md#if-your-pull-request-shows-conflicts-with-master

@leventov I added comments to toAnnounce, toUpdate, and parentsIBuilt, and modified some other comments to aid understanding. Could you tell me which comments you think could be improved?
private final CuratorFramework curator;

// in case a path is added to `toAnnounce` in announce() before zk is connected,
Please use Javadoc /** ... */ comments for fields.
Is it right that this comment implies announce() may be called concurrently by somebody? How is that possible before start()? Could you please add concurrent control-flow documentation to the announce() and start() methods?
// in case a path is added to `toAnnounce` in announce() before zk is connected,
// should remember the path and do announce in start() later
private final List<Announceable> toAnnounce = new ArrayList<>();
// in case a path is added to `toUpdate` in update() before zk is connected,

// in case a path is added to `toAnnounce` in announce() before zk is connected,
// should remember the path and do announce in start() later
private final List<Announceable> toAnnounce = new ArrayList<>();
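The "remember now, announce in start() later" pattern described by these comments can be sketched as follows (a simplified illustration, not the PR's actual class; announceToZk stands in for the real Curator call):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of deferred announcement: announce() may be called before ZooKeeper
// is connected, so paths are queued and replayed once start() runs.
// All shared state is guarded by synchronizing on toAnnounce.
class DeferredAnnouncer {
  private final List<String> toAnnounce = new ArrayList<>(); // guarded by itself
  private boolean started = false;                           // guarded by toAnnounce
  private final List<String> announced = new ArrayList<>();  // guarded by toAnnounce

  void announce(String path) {
    synchronized (toAnnounce) {
      if (!started) {
        toAnnounce.add(path); // zk not ready yet: remember and replay later
        return;
      }
      announceToZk(path);
    }
  }

  void start() {
    synchronized (toAnnounce) {
      started = true;
      for (String path : toAnnounce) {
        announceToZk(path); // replay everything queued before start()
      }
      toAnnounce.clear();
    }
  }

  private void announceToZk(String path) {
    announced.add(path); // placeholder for the real Curator create/setData call
  }

  List<String> announcedPaths() {
    synchronized (toAnnounce) {
      return new ArrayList<>(announced);
    }
  }
}
```

The single monitor on toAnnounce is what makes the started check and the queue append atomic with respect to start().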
Maybe it's clearer to call it toAnnounceInStart?
Please annotate @GuardedBy("toAnnounce"). Could you create a separate lock object in a field called "lock"?
 * NodeAnnouncer announces a single node on Zookeeper and only watches this node,
 * while {@link Announcer} watches all child paths, not only this node
 */
public class NodeAnnouncer
NodeAnnouncer and Announcer share a lot of code. Is it possible to extract a common base class and either extend it as Announcer and NodeAnnouncer, or add a flag to that base class's state and use something like BaseAnnouncer(watchAll=true) as Announcer and BaseAnnouncer(watchAll=false) as NodeAnnouncer?
Could you please add concurrent control-flow documentation to this class? From what threads (thread pools, executors) may each method be called?
private final List<Announceable> toUpdate = new ArrayList<>();
private final ConcurrentMap<String, NodeCache> listeners = new ConcurrentHashMap<>();
private final ConcurrentMap<String, byte[]> announcedPaths = new ConcurrentHashMap<>();
// only who creates the parent path can drop the parent path, so should remember the created parents
"only who creates the parent path can drop the parent path" - "Who" here is a thread? A class? Is this rule imposed by the Curator framework?
Please annotate @GuardedBy("toAnnounce").
synchronized (toAnnounce) {
  try {
    if (!Arrays.equals(oldBytes, bytes)) {
      announcedPaths.put(path, bytes);
Looks like a race condition is possible between announcedPaths.get() and announcedPaths.put(), is that benign? If not, please rewrite the code using map.compute().
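The atomic rewrite the reviewer suggests could look like this (a sketch only; the field name mirrors the discussion but the surrounding class and method are hypothetical):

```java
import java.util.Arrays;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Sketch: replacing a get-then-put pair with a single compute() call, so the
// comparison against the old bytes and the update happen atomically under
// the map's internal per-key locking.
class AnnouncedData {
  private final ConcurrentMap<String, byte[]> announcedPaths = new ConcurrentHashMap<>();

  // Returns true if the stored bytes changed.
  boolean update(String path, byte[] bytes) {
    boolean[] changed = {false}; // one-element array so the lambda can write to it
    announcedPaths.compute(path, (p, oldBytes) -> {
      if (oldBytes != null && Arrays.equals(oldBytes, bytes)) {
        return oldBytes; // identical payload: keep the existing value
      }
      changed[0] = true;
      return bytes;
    });
    return changed[0];
  }

  byte[] get(String path) {
    return announcedPaths.get(path);
  }
}
```

With compute(), no other thread can slip a write in between reading oldBytes and storing the new value for the same key.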
boolean created = false;
synchronized (toAnnounce) {
  if (started) {
It seems that the started flag is already checked at the beginning of this method; why is it checked again here?
}
);

if (started) {
It seems that the started flag is already checked at the beginning of this method; why is it checked again here?
}
CloseQuietly.close(closer);

for (String announcementPath : announcedPaths.keySet()) {
Entries are removed via map.remove() while iterating over the map. I cannot find proof that this is safe; it would be better to remove them via iterator.remove().
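The idiom the reviewer asks for can be sketched on a plain map (class and method names are illustrative). ConcurrentHashMap's weakly consistent iterators do tolerate concurrent removal, but iterator.remove() is the removal mechanism the Iterator contract explicitly specifies for any Map implementation:

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Sketch: remove entries through the iterator while walking the map, instead
// of calling map.remove() mid-iteration. On a plain HashMap, map.remove()
// during iteration would throw ConcurrentModificationException on the next
// iterator step; iterator.remove() is specified-safe.
class Unannouncer {
  static int unannounceAll(Map<String, byte[]> announcedPaths) {
    int removed = 0;
    Iterator<Map.Entry<String, byte[]>> it = announcedPaths.entrySet().iterator();
    while (it.hasNext()) {
      Map.Entry<String, byte[]> entry = it.next();
      // ... delete the znode for entry.getKey() here ...
      it.remove(); // safe removal during iteration
      removed++;
    }
    return removed;
  }
}
```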
@kaijianding do you plan to continue this work?

@kaijianding would you please fix the conflicts and check @leventov's latest comments? BTW, I'm untagging the milestone since this issue is not necessarily a release blocker. Feel free to let me know if you think it should be.
This pull request has been marked as stale due to 60 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that's incorrect or this pull request should instead be reviewed, please simply write any comment. Even if closed, you can still revive the PR at any time or discuss it on the dev@druid.apache.org list. Thank you for your contributions.

This pull request/issue has been closed due to lack of activity. If you think that is incorrect, or the pull request requires review, you can revive the PR at any time.

This pull request/issue is no longer marked as stale.

This pull request has been marked as stale due to 60 days of inactivity. It will be closed in 4 weeks if no further activity occurs. If you think that's incorrect or this pull request should instead be reviewed, please simply write any comment. Even if closed, you can still revive the PR at any time or discuss it on the dev@druid.apache.org list. Thank you for your contributions.

This pull request/issue has been closed due to lack of activity. If you think that is incorrect, or the pull request requires review, you can revive the PR at any time.

Fixes #6647.
I guess the original Announcer.java has better performance for segment announcement, so I just kept it.
I guess the root cause of this issue is that each child path gets a watch on the ZooKeeper server even though we only add a listener at the parent path. When the path is like /druid/prod/announces/$host, the number of watches is n * n (each server watches every host, including itself). Adding the path for lookups, the number becomes n * n * 2.
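The watch-count estimate above can be checked with a few lines (assuming, as the description does, that every one of n servers watches the announcement path of every host, once for announcements and once again for lookups):

```java
// Sketch of the watch-count estimate from the PR description: with n servers
// each watching every host's path (including its own), announcements alone
// create n * n watches; doubling for the lookup path gives n * n * 2.
class WatchCount {
  static long totalWatches(long n) {
    long announcementWatches = n * n; // every server watches every host
    return announcementWatches * 2;   // plus the same again for lookups
  }
}
```

At 100 servers that is already 20,000 ZooKeeper watches, which is why the quadratic growth matters.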
I tested it in my production environment, and the number of watches was reduced as expected.