Add parameter to loadstatus API to compute under-replication against cluster view #11056
…uster view This change adds a query parameter `computeUsingClusterView` to the loadstatus APIs. When specified, the coordinator computes under-replication for segments based on the number of services available within the cluster that the segment can be replicated on, instead of the replication count configured in the load rule. A default load rule is created in all clusters, specifying that all segments should be replicated 2 times. Because replicas are forced onto separate nodes in the cluster, the loadstatus API reports under-replicated segments when there is only 1 data server in the cluster. In this case, calling the loadstatus API without this new query parameter will always result in a response indicating under-replication of segments.
  }

  @Override
  public void updateUnderReplicatedWithClusterView(
Can you add an implementation for BroadcastDistributionRule as well? I think the implementation there could just call updateUnderReplicated and ignore the cluster parameter, since it's already looking at the available servers in the cluster.
    break;
  }

  if (computeUsingClusterView && (rule instanceof LoadRule)) {
The `(rule instanceof LoadRule)` check should be removed, since `BroadcastDistributionRule` can also load segments (it also returns true for `canLoadSegments()`, which is already checked before this).
Changed accordingly. Before though, for broadcast rule, it was falling through to else branch and effectively doing same thing. Thought it was weird to add implementation for Broadcast rule that didn't use the parameter, but I guess not a big deal.
Cool, thanks, I thought it would be cleaner to have the new method be supported by all segment-loading rules, vs. the behavior specific to just LoadRule
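For illustration, the shape agreed on in this thread might look roughly like the sketch below. It is not the code from this PR: `RuleSketch`, `LoadRuleSketch`, `BroadcastRuleSketch`, `LoadStatusSketch`, and their method signatures are hypothetical, simplified stand-ins, and capping the target at the number of eligible servers is an assumption about how the cluster view is applied. The point is only that the caller dispatches on the flag without an `instanceof` check, and the broadcast rule ignores the cluster parameter by delegating.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for a segment-loading rule; not Druid's real interface.
interface RuleSketch {
    void updateUnderReplicated(Map<String, Integer> counts, String tier, int loadedReplicas);

    void updateUnderReplicatedWithClusterView(
        Map<String, Integer> counts, String tier, int loadedReplicas, int eligibleServers);
}

final class LoadRuleSketch implements RuleSketch {
    private final int configuredReplicas;

    LoadRuleSketch(int configuredReplicas) {
        this.configuredReplicas = configuredReplicas;
    }

    @Override
    public void updateUnderReplicated(Map<String, Integer> counts, String tier, int loadedReplicas) {
        // Compare against the replica count configured in the rule.
        counts.merge(tier, Math.max(configuredReplicas - loadedReplicas, 0), Integer::sum);
    }

    @Override
    public void updateUnderReplicatedWithClusterView(
        Map<String, Integer> counts, String tier, int loadedReplicas, int eligibleServers) {
        // Assumption: the target is capped by the servers that can actually hold a replica.
        int target = Math.min(configuredReplicas, eligibleServers);
        counts.merge(tier, Math.max(target - loadedReplicas, 0), Integer::sum);
    }
}

final class BroadcastRuleSketch implements RuleSketch {
    private final int serversInCluster; // a broadcast rule already targets the available servers

    BroadcastRuleSketch(int serversInCluster) {
        this.serversInCluster = serversInCluster;
    }

    @Override
    public void updateUnderReplicated(Map<String, Integer> counts, String tier, int loadedReplicas) {
        counts.merge(tier, Math.max(serversInCluster - loadedReplicas, 0), Integer::sum);
    }

    @Override
    public void updateUnderReplicatedWithClusterView(
        Map<String, Integer> counts, String tier, int loadedReplicas, int eligibleServers) {
        // Ignore the cluster parameter and delegate to the existing computation.
        updateUnderReplicated(counts, tier, loadedReplicas);
    }
}

final class LoadStatusSketch {
    // Dispatch on the flag only; every segment-loading rule supports both methods.
    static Map<String, Integer> compute(
        RuleSketch rule, boolean computeUsingClusterView, int loadedReplicas, int eligibleServers) {
        Map<String, Integer> counts = new HashMap<>();
        if (computeUsingClusterView) {
            rule.updateUnderReplicatedWithClusterView(counts, "tier1", loadedReplicas, eligibleServers);
        } else {
            rule.updateUnderReplicated(counts, "tier1", loadedReplicas);
        }
        return counts;
    }
}
```

With this arrangement, a broadcast rule returns the same count whether or not the flag is set, which matches the earlier observation that it was previously falling through to the else branch.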
  {
    DataSourcesResource dataSourcesResource = new DataSourcesResource(inventoryView, segmentsMetadataManager, null, null, null, null);
-   Response response = dataSourcesResource.getDatasourceLoadstatus("datasource1", null, null, null, null);
+   Response response = dataSourcesResource.getDatasourceLoadstatus("datasource1", null, null, null, null, null);
Can you add tests that use the new parameter?
@techdocsmith can you take a look at the change to docs?
techdocsmith
left a comment
That last sentence was complicated. I am not sure if I parsed it out correctly, but maybe my suggestions can help lead to some clarification.
Co-authored-by: Charles Smith <38529548+techdocsmith@users.noreply.github.com>
Some edits to the new bit and found some typos in the previous bit. I am still not 100% sure about this bit:
Or the number of replicas for the segment in each tier where it is configured to be replicated equals the available nodes of a service type that are currently allowed to load the segment in the tier.
I feel like it needs to be simplified or broken down somehow.
Co-authored-by: Charles Smith <38529548+techdocsmith@users.noreply.github.com>
Description
This change adds a query parameter `computeUsingClusterView` to the loadstatus APIs. When
specified, the coordinator computes under-replication for segments based on the number of
services available within the cluster that the segment can be replicated on, instead of the
replication count configured in the load rule. This query parameter is only available when
also requesting full loadstatus.
A default load rule is created in all clusters, specifying that all segments should be
replicated 2 times. Because replicas are forced onto separate nodes in the cluster, the
loadstatus API reports under-replicated segments when there is only 1 data server in the
cluster. In this case, calling the loadstatus API without this new query parameter will always
result in a response indicating under-replication of segments.
… during the period of time before the coordinator run loop has executed at least once, as this
object is set during every instantiation of the coordinator run loop.
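As a rough illustration of the difference described above, the sketch below compares the two ways of counting under-replicated replicas for a single segment. The class and method names are hypothetical and this is not the coordinator's actual code. It assumes that, with the cluster view, the target replica count is the configured count capped by the number of servers able to load the segment.

```java
// Hypothetical helper for illustration only; not Druid's implementation.
final class UnderReplicationSketch {
    /**
     * @param configuredReplicas      replica count from the load rule (2 in the default rule)
     * @param eligibleServers         servers in the cluster that could hold a replica
     * @param loadedReplicas          replicas currently loaded
     * @param computeUsingClusterView whether to cap the target by eligibleServers (assumed semantics)
     * @return number of replicas still missing, never negative
     */
    static int underReplicated(
        int configuredReplicas, int eligibleServers, int loadedReplicas, boolean computeUsingClusterView) {
        int target = computeUsingClusterView
            ? Math.min(configuredReplicas, eligibleServers)
            : configuredReplicas;
        return Math.max(target - loadedReplicas, 0);
    }

    public static void main(String[] args) {
        // Default rule (2 replicas), one data server, one replica loaded.
        System.out.println(underReplicated(2, 1, 1, false)); // prints 1: reported as under-replicated
        System.out.println(underReplicated(2, 1, 1, true));  // prints 0: fully loaded for this cluster
    }
}
```

In the single-server case, the configured-count view can never reach zero, while the capped view reaches zero once the one possible replica is loaded.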
This PR has: