Skip to content

Fix Router race condition and use default broker service name for invalid priority#5050

Merged
himanshug merged 10 commits intoapache:masterfrom
metamx:router-race-condition-and-stuff
Nov 9, 2017
Merged

Fix Router race condition and use default broker service name for invalid priority#5050
himanshug merged 10 commits intoapache:masterfrom
metamx:router-race-condition-and-stuff

Conversation

@jisookim0513
Copy link
Copy Markdown
Contributor

This PR suggests two changes:

  1. I believe there's a race condition in TieredBrokerHostSelector, specifically in NodeHolder.pick() where multiple threads can access roundRobinIndex. For example, two threads can simultaneously access roundRobinIndex when roundRobinIndex == currNodes.size() - 1, but then one of the threads will get index out of bounds error when calling currNodes.get(roundRobinIndex++) because the other thread had increased the value of roundRobinIndex by 1 already. I have encountered this situation in production, for example:
 WARN [qtp658824366-145] o.e.j.s.ServletHandler - /druid/v2/ - []
java.lang.IndexOutOfBoundsException: index (x) must be less than size (x)
    at com.google.common.base.Preconditions.checkElementIndex(Preconditions.java:313) ~[guava-16.0.1.jar:?]
    at com.google.common.base.Preconditions.checkElementIndex(Preconditions.java:295) ~[guava-16.0.1.jar:?]
    at com.google.common.collect.RegularImmutableList.get(RegularImmutableList.java:65) ~[guava-16.0.1.jar:?]
    at io.druid.server.router.TieredBrokerHostSelector$NodesHolder.pick(TieredBrokerHostSelector.java:319) ~[druid-server-0.11.0-mmx19.jar:0.11.0-mmx19]
    at io.druid.server.router.TieredBrokerHostSelector.getDefaultLookup(TieredBrokerHostSelector.java:260) ~[druid-server-0.11.0-mmx19.jar:0.11.0-mmx19]
    at io.druid.server.router.QueryHostFinder.findDefaultServer(QueryHostFinder.java:60) ~[druid-server-0.11.0-mmx19.jar:0.11.0-mmx19]
    at io.druid.server.router.QueryHostFinder.getDefaultServer(QueryHostFinder.java:90) ~[druid-server-0.11.0-mmx19.jar:0.11.0-mmx19]
    at io.druid.server.AsyncQueryForwardingServlet.service(AsyncQueryForwardingServlet.java:180) ~[druid-server-0.11.0-mmx19.jar:0.11.0-mmx19]
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:790) ~[javax.servlet-api-3.1.0.jar:3.1.0]
  1. I think it might make more sense to use DefaultBrokerServiceName instead of the last value of tierToBrokerMap when the priority is out of range, as I understand that the default value is the one that's used whenever there's no specific broker service name for the given query. What is the reason behind using the last entry of tierToBrokerMap when the priority is out of range?

Copy link
Copy Markdown
Contributor

@jihoonson jihoonson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. Left a trivial comment. +1 after travis.

tierConfig.getTierToBrokerMap().values(),
tierConfig.getDefaultBrokerServiceName()
)
tierConfig.getDefaultBrokerServiceName()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logging here might be helpful.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TieredBrokerSelectorStrategy doesn't have a logger. Should I add a logger to this class?

Copy link
Copy Markdown
Contributor

@himanshug himanshug Nov 7, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is the log for queries which don't fall in min-max priority range configured ? that log might end up being too noisy. in many clusters, cluster operators and user teams are different and users might send such queries.

@@ -50,18 +49,10 @@ public Optional<String> getBrokerServiceName(TieredBrokerConfig tierConfig, Quer

if (priority < minPriority) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two branches of this if block are the same now.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

combined the two branches into one

)
tierConfig.getDefaultBrokerServiceName()
);
} else if (priority >= maxPriority) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if it should be >= or >

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, I guess it should be >?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm fine with `>'

{
ImmutableList<Server> currNodes = nodes;

int index = roundRobinIndex.getAndIncrement();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will produce negative results when index overflows, suggested to replace with a method which wraps around currNodes.size() explicitly, with use of compareAndSet(), for sanity

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or allow the overflow, but make sure the chosen index is positive (use absolute value).

@himanshug himanshug added this to the 0.11.0 milestone Nov 7, 2017
@himanshug
Copy link
Copy Markdown
Contributor

added 0.11.0 milestone due to race condition fix.

}

return currNodes.get(roundRobinIndex++);
return currNodes.get(index);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now this index could be incorrect, because it was set during a pervious call of this method.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added index %= currNodes.size() back to fix this, is there a better way to do it?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest to return to this version: ab61f83 and just use nextIndex instead of index below in the code, it's not a problem if indexing will starts from 1 instead of 0. Or the initial value in roundRobinIndex could be set -1.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I wasn't really sure about using nextIndex (for the name's sake) but I will revert the change to that and set the initial value of roundRobinIndex to -1.


while (true) {
int nextIndex = index + 1;
if (nextIndex < 0) nextIndex %= currNodes.size();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It won't help, the result of mod operation when the first argument is negative is zero or negative

int index = roundRobinIndex.get();
int nextIndex = index + 1;

while (true) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please extract this block as a method, it will allow to not repeat

int index = roundRobinIndex.get();
int nextIndex = index + 1;

block

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure

int nextIndex = index + 1;

while (true) {
if (nextIndex < 0 || nextIndex >= currNodes.size()) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nextIndex < 0 is always false

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, will remove that

@himanshug
Copy link
Copy Markdown
Contributor

👍

@gianm
Copy link
Copy Markdown
Contributor

gianm commented Nov 9, 2017

CI failed because http://www.us.apache.org/dist/zookeeper/zookeeper-3.4.6/zookeeper-3.4.6.tar.gz is no longer available. #5060 includes a fix.

@himanshug himanshug merged commit 1bf253f into apache:master Nov 9, 2017
@drcrallen drcrallen deleted the router-race-condition-and-stuff branch November 9, 2017 17:42
gianm pushed a commit to gianm/druid that referenced this pull request Nov 9, 2017
…alid priority (apache#5050)

* use default brokerServiceName when priority is not valid

* use AtomicInteger for NodesHolder.roundRobinIndex

* revert inspectionProfiles change

* adjust TieredBrokerHostSelectorTest

* combine if statements and ensure index does not become negative

* set next index with mod if overflows

* fix codestyle

* use nextIndex

* extract the while loop to a method
leventov pushed a commit that referenced this pull request Nov 10, 2017
…alid priority (#5050) (#5066)

* use default brokerServiceName when priority is not valid

* use AtomicInteger for NodesHolder.roundRobinIndex

* revert inspectionProfiles change

* adjust TieredBrokerHostSelectorTest

* combine if statements and ensure index does not become negative

* set next index with mod if overflows

* fix codestyle

* use nextIndex

* extract the while loop to a method
leventov pushed a commit to metamx/druid that referenced this pull request Nov 13, 2017
…alid priority (apache#5050) (apache#5066)

* use default brokerServiceName when priority is not valid

* use AtomicInteger for NodesHolder.roundRobinIndex

* revert inspectionProfiles change

* adjust TieredBrokerHostSelectorTest

* combine if statements and ensure index does not become negative

* set next index with mod if overflows

* fix codestyle

* use nextIndex

* extract the while loop to a method
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants