Optional long-polling based segment announcement via HTTP instead of Zookeeper #3902
himanshug merged 8 commits into apache:master
Conversation
a391840 to 8e2b1be
If I understand correctly, you update segments by calling httpClient.go(url) for each server. If the cluster is not busy, can a connection stay held in a waiting state the whole time?
bd2f8de to 373feff
@kaijianding thanks for taking a look. You are right about the connection usage; yes, the coordinator/broker would be using one connection from HttpClient per historical/realtime node all the time. Btw, I have also made a few updates in SegmentListerResource to make it async and not hold any jetty threads while in wait.
Related #2368
Removing the discovery of historical/realtime nodes from zookeeper relates to #2312 (as described in #2312 (comment)). I am looking into that and it will be done in a separate PR.
@himanshug according to your idea, segments are fetched from historical/realtime nodes in a background loop. How do you avoid holding stale segment info on the coordinator/broker side if the loop hasn't caught up when a query occurs?
@weijietong brokers/coordinators would not have stale segment info because they are running an infinite (no wait) loop to get the latest segment information from historical/realtime nodes. The historical/realtime nodes "hold" the request till there is new information to provide or the timeout provided in the request is reached.
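The hold-until-change-or-timeout contract described above can be sketched roughly as follows. These are hypothetical class and method names, not Druid's actual API, and a plain wait/notify stand-in for the async jetty handling: the server holds each poll until its counter advances past the client's counter or the timeout elapses, so the client never serves stale data for longer than one sync round-trip.

```java
// Sketch of the long-poll contract (hypothetical names, not Druid's API).
class SegmentLongPoll {
  private long counter = 0;
  private String latestSnapshot = "";

  // Server side: block until there is something newer than sinceCounter,
  // or return null on timeout so the client simply re-issues the request.
  public synchronized String poll(long sinceCounter, long timeoutMillis) {
    long deadline = System.currentTimeMillis() + timeoutMillis;
    while (counter <= sinceCounter) {
      long remaining = deadline - System.currentTimeMillis();
      if (remaining <= 0) {
        return null; // timeout: nothing new, client loops again
      }
      try {
        wait(remaining);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return null;
      }
    }
    return latestSnapshot;
  }

  // Called whenever a segment is announced/unannounced on this node.
  public synchronized void update(String snapshot) {
    latestSnapshot = snapshot;
    counter++;
    notifyAll(); // release any held poll() calls
  }

  public synchronized long counter() {
    return counter;
  }
}
```

The client passes back the counter from its last response on every request, which is what makes the loop "no wait": a response arrives either immediately (something changed) or at the timeout boundary.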
@himanshug a general comment - it would be good to put some comments on important classes, variables and methods, for example a high level design comment on the important classes. I have gone through the code at a high level and it looks good to me so far.
3962f92 to 824631a
@pjain1 added some docs for clarification
@himanshug, I know I said on the dev sync today that I thought this should remain undocumented, but I changed my mind. I'm still ok with it only being reviewed by people from your organization, but I think it should be documented as experimental. That way other folks could help try it out too. |
824631a to 2295cf4
It feels weird to me that this is taking a CuratorFramework. I think it would be better for this to depend on something closer to what it actually needs (i.e. something that allows for discovering nodes in the cluster). The implementation for that could remain essentially the same as what exists here, but I think it would be nice to not clutter this class with that code.
Refactored server notifications into a separate class.
"segment callbacks" reads to me as if it should be called on every single segment. The actual callback being called here seems like it shouldn't be called quite that often (maybe once?). Is something named weird or am I mistaken about what's happening?
Yes, it is for individual segments, and once at inventory initialization. This naming and behavior is retained from CuratorBasedServerInventoryView (called ServerInventoryView earlier).
Why does this block on the fetch?
Not needed anymore; it was needed in an older version of the code. Removed now.
Why are we guarding a final?
Added a comment in the code; this is to ensure that the segment state stored in DruidServer and the counter it manages stay consistent.
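As a rough illustration of guarding a final field (illustrative names, not the actual Druid code): the synchronization protects a compound invariant - the segment map and the counter must change together - not the final reference itself.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: a final lock field guards two pieces of state that must stay
// consistent with each other (hypothetical names, not Druid's classes).
class SegmentState {
  private final Object lock = new Object();
  private final Map<String, String> segments = new HashMap<>();
  private long counter = 0;

  public void addSegment(String id, String payload) {
    synchronized (lock) { // both mutations happen atomically
      segments.put(id, payload);
      counter++;
    }
  }

  public long counter() {
    synchronized (lock) {
      return counter;
    }
  }

  public int size() {
    synchronized (lock) {
      return segments.size();
    }
  }
}
```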
This might be pre-mature optimization, but I worry about how this data structure is being used. One option would be to view the list as a revolving buffer and essentially maintain an index of the oldest "valid" item. You can replace and increment that on add and start your search from there wrapping back around.
Added a CircularBuffer impl instead of using a list as a circular buffer.
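A minimal sketch of such a fixed-capacity circular buffer (illustrative only, not the actual CircularBuffer added in this PR): add() overwrites the oldest entry once the buffer is full, and reads are indexed from the oldest surviving element, which matches the "revolving buffer with an index of the oldest valid item" idea above.

```java
// Sketch of a fixed-capacity circular buffer (not Druid's actual class).
class CircularBuffer<E> {
  private final Object[] buffer;
  private int start = 0; // index of the oldest element
  private int size = 0;

  public CircularBuffer(int capacity) {
    buffer = new Object[capacity];
  }

  public void add(E item) {
    if (size < buffer.length) {
      buffer[(start + size) % buffer.length] = item;
      size++;
    } else {
      buffer[start] = item; // overwrite the oldest entry
      start = (start + 1) % buffer.length;
    }
  }

  @SuppressWarnings("unchecked")
  public E get(int index) { // index 0 = oldest surviving element
    if (index < 0 || index >= size) {
      throw new IndexOutOfBoundsException("index " + index + ", size " + size);
    }
    return (E) buffer[(start + index) % buffer.length];
  }

  public int size() {
    return size;
  }
}
```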
Please include the actual values that came in here, just the message as it exists won't really help much with debugging if it actually gets thrown.
This is an optimization, maybe not so important, but I think you could actually compute the index into the array with some maths and the counters.
Not really, and as you said, it's not so important.
Why are we removing the isAnnounced check?
updated BatchDataSegmentAnnouncer.announce(..) to always do the check so that it gets done in other places as well.
If you do what I wrote in a previous comment and separate the lookup management into a different implementation of DataSegmentAnnouncer, then have that concrete type injected here instead of the interface (if you mark it @Nullable Guice should be fine even if it's not bound because the http thing isn't being used)
This now directly uses BatchDataSegmentAnnouncer.
@cheddar I also introduced
@gianm I still think we should keep it undocumented for now and not confuse regular users with a feature which is just there to optimize things and hasn't been used at scale yet. We will try to document it in the next release, probably.
👍 |
…etty qos filters can be configured easily when needed
…segment list fetch has been succeeded from all servers
…ce themselves and not all peon processes
* Optional segment announcement via HTTP (apache#3902)
* BatchServerInventoryView is created twice (apache#3244)
This PR introduces the following configurations. I'm keeping those undocumented for now; we will use them on our big internal clusters first before asking users to use them.
Changes -
All nodes that serve segments, e.g. historicals and realtime indexing tasks, provide a "/druid-internal/v1/segments" HTTP endpoint that can incrementally provide the list of segments being served by that node. This uses async IO, so jetty threads are not held waiting. To power this endpoint, the following method is added to the DataSegmentAnnouncer interface.
On the coordinator/broker side, an HttpServerInventoryView class is introduced which is equivalent to BatchServerInventoryView but syncs the segment inventory using the mentioned HTTP endpoint. Server discovery is still done via zookeeper. ServerInventoryView is made an interface and its old contents are moved to AbstractCuratorServerInventoryView.
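The per-server sync loop run by such an inventory view could be sketched like this (hypothetical names and delta format, not Druid's actual protocol): each response carries a counter plus the segment adds/removes since the counter the client sent, and the client applies the delta and loops with the new counter.

```java
import java.util.HashSet;
import java.util.Set;

// Sketch of the counter-based incremental sync loop (hypothetical types,
// not Druid's actual classes or wire format).
class SegmentSyncer {
  public interface SegmentEndpoint { // stands in for the HTTP long-poll call
    Delta fetchSince(long counter, long timeoutMillis);
  }

  public static final class Delta {
    final long newCounter;
    final Set<String> added;
    final Set<String> removed;

    Delta(long newCounter, Set<String> added, Set<String> removed) {
      this.newCounter = newCounter;
      this.added = added;
      this.removed = removed;
    }
  }

  private final Set<String> segments = new HashSet<>();
  private long counter = 0;

  // One iteration of the infinite loop: apply whatever the server returned.
  public void syncOnce(SegmentEndpoint endpoint) {
    Delta d = endpoint.fetchSince(counter, 60_000);
    if (d == null) {
      return; // timeout: nothing changed, just loop again
    }
    segments.addAll(d.added);
    segments.removeAll(d.removed);
    counter = d.newCounter;
  }

  public Set<String> segments() {
    return segments;
  }
}
```

One such loop runs per historical/realtime node, which is why the coordinator/broker holds one HttpClient connection per node, as discussed in the conversation above.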