Conversation
jon-wei
left a comment
Had some minor comments, otherwise LGTM
| "Failed to replicate segment[%s] to server[%s] due to insufficient available space", | ||
| segment.getIdentifier(), | ||
| holder.getServer().getHost() | ||
| ) |
is the formatting off here?

No, this is Druid's format, if I didn't make a mistake. Maybe it would be better to use a separate string for the error message, like below?
```java
String message = String.format(...);
log.makeAlert(message)
   .addData("segmentSize", segment.getSize())
   .addData("availableSize", holder.getAvailableSize())
   .emit();
```

```java
static final String TYPE = "broadcastForever";

private final String colocateDataSource;
```
suggest renaming this here and elsewhere to colocatedDatasource
weijietong
left a comment
Is this WIP, or is replicator support not included?
| .addData("availableSize", holder.getAvailableSize()) | ||
| .emit(); | ||
| } else { | ||
| holder.getPeon().loadSegment( |
I don't see replicator support here; is it coming in a later commit?

You mean ReplicationThrottler?
ReplicationThrottler is used to throttle replication speed, and I think it's unnecessary for broadcast rules because users usually want broadcast to complete as soon as possible.
gianm
left a comment
Generally lgtm, just a few comments.
@jihoonson if you think the load rule API here is going to be stable as you continue working, please document it where other load rules are documented. Otherwise I'd say you can leave it undocumented until it's stable.
| @JsonProperty("colocatedDatasource") String colocatedDatasource | ||
| ) | ||
| { | ||
| this.colocatedDatasource = Objects.requireNonNull(colocatedDatasource); |
Most other Druid code uses Preconditions.checkNotNull but I guess there's no real need to be consistent there.
```java
// Find servers which holds the segments of co-located data source
final List<ServerHolder> targetServerHolders = params
    .getDruidCluster().getAllServers().stream()
    .filter(eachHolder -> eachHolder.getServer().getDataSource(getColocatedDatasource()) != null)
```
It seems to me like it'd be valuable to have colocatedDataSources be a list, and maybe if it's empty then broadcast to all servers. That'd give the user some more flexibility about where to broadcast their data sources.

Thanks. Sounds nice.
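As a rough illustration of the list-based behavior being suggested (a standalone sketch with plain collections standing in for `DruidCluster`/`ServerHolder`; the method name `targetServers` is made up, not Druid's API): a server qualifies when it serves any of the colocated data sources, and an empty list means broadcast to every server.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class ColocationFilter {
    // Returns the servers eligible for broadcast: all servers when the
    // colocated list is empty, otherwise only servers holding at least one
    // of the colocated data sources.
    static List<String> targetServers(List<String> colocatedDataSources,
                                      Map<String, Set<String>> dataSourcesByServer) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Set<String>> entry : dataSourcesByServer.entrySet()) {
            boolean matches = colocatedDataSources.isEmpty()
                || colocatedDataSources.stream().anyMatch(entry.getValue()::contains);
            if (matches) {
                result.add(entry.getKey());
            }
        }
        return result;
    }
}
```

A one-element list behaves exactly like the current single `colocatedDatasource` field, so the existing use case still works.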
```java
stats.addToGlobalStat(LoadRule.ASSIGNED_COUNT, 0);

for (ServerHolder holder : serverHolderList) {
  if (segment.getSize() > holder.getAvailableSize()) {
```
Some sites use the main alert text to collect and summarize similar alerts. So it'd be nicer for this description to be ("Failed to broadcast segment for[%s]", dataSource) with the segment id and specific server as additional data.
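To sketch the suggested shape (a standalone illustration only; Druid's actual alert builder `log.makeAlert(...)` is not reproduced here, and these helper names are invented): the main alert text varies only by data source so alert-collection systems can group similar failures, while the segment id and server travel as additional data.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class BroadcastAlert {
    // Main alert text keyed only by data source, so sites that collect and
    // summarize similar alerts can group these failures together.
    static String alertText(String dataSource) {
        return String.format("Failed to broadcast segment for[%s]", dataSource);
    }

    // The per-occurrence specifics go into additional data, not the text.
    static Map<String, Object> alertData(String segmentId, String serverHost) {
        Map<String, Object> data = new LinkedHashMap<>();
        data.put("segmentId", segmentId);
        data.put("serverHost", serverHost);
        return data;
    }
}
```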
```java
  return stats;
}

public abstract String getColocatedDatasource();
```
Please spell this as getColocatedDataSource (or dataSources) not datasource.
```java
}

@Override
public boolean equals(Object o)
```
Should have hashCode overridden too so it is consistent with equals.

Please override toString too, I think there's some code that prints rules.

Would you give me some examples where toString is used? It's not overridden for other rules as well.

Ah, nevermind, you're right.
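A minimal standalone sketch of the equals/hashCode pairing being asked for (the class and field names are illustrative, not the PR's actual code): both methods must be derived from the same set of fields so that equal objects always produce equal hash codes.

```java
import java.util.Objects;

// Illustrative rule-like class showing hashCode kept consistent with equals.
class ColocatedRuleSketch {
    private final String colocatedDataSource;

    ColocatedRuleSketch(String colocatedDataSource) {
        this.colocatedDataSource = Objects.requireNonNull(colocatedDataSource);
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (o == null || getClass() != o.getClass()) {
            return false;
        }
        return colocatedDataSource.equals(((ColocatedRuleSketch) o).colocatedDataSource);
    }

    @Override
    public int hashCode() {
        // Same field set as equals(), so equal objects always hash alike.
        return Objects.hash(colocatedDataSource);
    }
}
```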
@jihoonson I have read the code of this PR. BroadcastDistributionRule is a really good way to express data source distribution for the join target. But another thing needs to be taken care of; my solution is to rewrite the …
@gianm thanks. Addressed your comments. I'll document soon.

@gianm documented broadcast rules.

The Travis failure seems unrelated to this patch. @jon-wei, would you restart the build? I don't have permission.

I updated the patch to drop segments which are no longer co-located.
```java
final CoordinatorStats stats = new CoordinatorStats();

return stats.accumulate(assign(loadServerHolders, segment))
```
I'm wondering if the broadcast rules should label their segments a little differently, or at least be able to distinguish them from actual Druid segments.

It'll definitely make reading the logs easier.
I think broadcasted segments should be regarded the same as other segments. Every type of query can be run on broadcasted segments, but the broadcast information is required only by joins.
The information about which data sources are broadcasted is required by brokers and historicals (and realtimes in the future). Brokers decide which nodes a given query should be distributed to, and this is determined by selecting the nodes holding the segments of non-broadcasted tables. Historicals, as proposed in #4032, join broadcasted segments first and then join the result with non-broadcasted segments in parallel.
I'm currently thinking that brokers can figure out which tables are broadcasted by looking at BroadcastDistributionRules, and add this information to the QueryContext for historicals. If this works, broadcast segments can be treated as normal segments.
What do you think?

@jihoonson Okay, that makes sense to me. Can broadcast segments be created dynamically?

Maybe in the future, when we support broadcasting tables on the fly, but not yet.
I found two problems when this patch is applied.
The solution I'm currently considering is as follows.
Since BroadcastDistributionRules are just introduced here and not yet used anywhere, I think it would be OK to fix these problems in follow-up PRs.
👍
@jihoonson, on the two points you raised, it sounds like the issue is that (a) loading new segments, and (b) moving existing segments for balancing, could break broadcasting if a segment from the colocated datasource is moved onto a historical node that didn't already have any segments from that datasource. I guess in practice this is potentially an issue, but won't be a huge one, since odds are that a distributed datasource will have at least one segment on all historical nodes in a given tier anyway (unless it has fewer segments than there are historical nodes). To keep things (very) simple, what do you think about removing the "colocatedDataSources" concept, and instead broadcasting the broadcast datasources to every historical node in the given tier for the rule? I think in practice this achieves mostly the same thing (given the reason above) as having colocatedDataSources. It also avoids bugs due to this effect, since every historical in a tier will have the broadcast segments and there's no need to worry.
Or at least, making that the behavior if colocatedDataSources is empty, so users can leave it empty and not have to worry.

The docs should mention this caveat.
@gianm sounds good, but I'm concerned about usability. If we support broadcasting to tiers instead of broadcasting to data sources, users will need two steps to design their data distribution: they first need to check which tiers each data source to be joined is involved in, and then decide which data sources need to be broadcasted to which tiers. Broadcasting to data sources is simpler because users only need to decide which sources are broadcasted alongside which source.
@gianm if colocatedDataSources is null or empty, segments are broadcasted to all nodes.

Maybe we can support two different types of broadcast rules for 1) broadcasting to tiers and 2) colocating data sources.
The latter suggestion sounds good. I think in practice many users will choose to broadcast to tiers -- since if you're already very aware of your tier structure (as many users are, if they use tiers) then it should be pretty easy to choose the right tier for a broadcast. It also works around the issue you brought up, which is nice. The colocated data source option is a nice alternative that would make configuration simpler, though it should come with a caveat about the issue you brought up.
Ok, then I'll make another PR for broadcasting to tiers.

@gianm, I added a caveat.
```java
public Set<ServerHolder> getAllServers()
{
  final Set<ServerHolder> servers = new HashSet<>();
```
Is there a reason this is a Set? Set seems suspicious to me (vs. List) for two reasons.
- I think DruidCluster is meant to only have a single copy of each server anyway
- ServerHolder.equals includes a call to LoadQueuePeon.equals, which is based on reference equality, so it seems unreliable to use it to dedupe servers
I misunderstood; I thought the same server could be involved in different tiers, which it can't. I changed it to a list.

@gianm I found that ServerHolder.equals() calls ImmutableDruidServer.equals() and LoadQueuePeon.equals(), which are not implemented. Maybe this is not a critical problem for now because they are not called so far. Anyway, I'll fix it.
Btw, to check the equality of two ServerHolders, I think it would be enough to compare ImmutableDruidServer.getHost(), ImmutableDruidServer.getType(), and ImmutableDruidServer.getTier(). What do you think?
```
The interval of a segment will be compared against the specified period. The period is from some time in the past to the current time. The rule matches if the period contains the interval.

<div class="note caution">
broadcastToDataSources rules don't guarantee that segments of the data sources are always co-located.
```
This comment would be better if it explained why. Maybe adding "because segments for the colocated data sources are not loaded together atomically".
```json
{
  "type" : "broadcastToDataSourcesForever",
```
I think this should just be called broadcastForever, since the dataSources part is optional. And similar comment for the other kinds of broadcast rules.
```java
stats.addToGlobalStat(LoadRule.ASSIGNED_COUNT, 0);

for (ServerHolder holder : serverHolders) {
  if (segment.getSize() > holder.getAvailableSize()) {
```
I think that like CostBalancerStrategy, this should check if holder.isLoadingSegment(segment) and skip if so.

Thanks! I added it.
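A standalone model of the resulting assignment check (plain maps stand in for `ServerHolder` and the load queue peon; all names here are illustrative, not Druid's API): a server is skipped when the segment is already in its load queue, mirroring the `holder.isLoadingSegment(segment)` check, and also when it lacks available space.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class BroadcastAssignSketch {
    // Returns servers the segment can be assigned to: skip servers already
    // loading it (don't assign twice) and servers without enough free space.
    static List<String> eligibleServers(String segmentId,
                                        long segmentSize,
                                        Map<String, Long> availableBytes,
                                        Map<String, Set<String>> loadingSegments) {
        List<String> eligible = new ArrayList<>();
        for (Map.Entry<String, Long> entry : availableBytes.entrySet()) {
            String server = entry.getKey();
            if (loadingSegments.getOrDefault(server, Set.of()).contains(segmentId)) {
                continue; // already queued for loading; skip
            }
            if (segmentSize > entry.getValue()) {
                continue; // insufficient available space
            }
            eligible.add(server);
        }
        return eligible;
    }
}
```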
```java
if (colocatedDataSources.stream()
    .anyMatch(source -> eachHolder.getServer().getDataSource(source) != null)) {
  loadServerHolders.add(eachHolder);
} else if (eachHolder.isServingSegment(segment)) {
```
I think this should skip holders that already have the segment in their drop lists (peon.getSegmentsToDrop()), like DruidCoordinator.moveSegment and DruidCoordinatorCleanupUnneeded.

Thanks! I added it.
First patch for #4032.