Skip to content

add scheme to AsyncQueryForwardingServlet#4688

Merged
jon-wei merged 2 commits intoapache:masterfrom
pjain1:tls_router
Aug 28, 2017
Merged

add scheme to AsyncQueryForwardingServlet#4688
jon-wei merged 2 commits intoapache:masterfrom
pjain1:tls_router

Conversation

@pjain1
Copy link
Copy Markdown
Member

@pjain1 pjain1 commented Aug 14, 2017

Missed in #4270

@pjain1 pjain1 added this to the 0.11.0 milestone Aug 14, 2017
@himanshug
Copy link
Copy Markdown
Contributor

👍

@pjain1 pjain1 closed this Aug 14, 2017
@pjain1 pjain1 reopened this Aug 14, 2017
@pjain1 pjain1 closed this Aug 15, 2017
@pjain1 pjain1 reopened this Aug 15, 2017
@jon-wei
Copy link
Copy Markdown
Contributor

jon-wei commented Aug 15, 2017

I tried running a query against the router with this patch and TLS only, I'm seeing an NPE:

2017-08-15T22:53:55,134 INFO [CoordinatorRuleManager-Exec--0] io.druid.server.router.CoordinatorRuleManager - Got [1] rules
2017-08-15T22:54:55,142 INFO [CoordinatorRuleManager-Exec--0] io.druid.server.router.CoordinatorRuleManager - Got [1] rules
2017-08-15T22:55:01,184 ERROR [AsyncQueryForwardingServlet-5fd8dd66-165] io.druid.server.AsyncQueryForwardingServlet - Exception handling request: {class=io.druid.server.AsyncQueryForwardingServlet, exceptionType=class java.lang.NullPointerException, exceptionMessage=null, exception=java.lang.NullPointerException, query=TimeBoundaryQuery{dataSource='wikiticker', querySegmentSpec=MultipleIntervalSegmentSpec{intervals=[0000-01-01T00:00:00.000Z/3000-01-01T00:00:00.000Z]}, duration=PT94670899200S, bound=, dimFilter=null}, peer=0:0:0:0:0:0:0:1}
java.lang.NullPointerException
	at org.eclipse.jetty.io.ssl.SslClientConnectionFactory.newConnection(SslClientConnectionFactory.java:59) ~[jetty-io-9.3.19.v20170502.jar:9.3.19.v20170502]
	at org.eclipse.jetty.client.AbstractHttpClientTransport$ClientSelectorManager.newConnection(AbstractHttpClientTransport.java:187) ~[jetty-client-9.3.19.v20170502.jar:9.3.19.v20170502]
	at org.eclipse.jetty.io.ManagedSelector.createEndPoint(ManagedSelector.java:422) ~[jetty-io-9.3.19.v20170502.jar:9.3.19.v20170502]
	at org.eclipse.jetty.io.ManagedSelector.access$1600(ManagedSelector.java:56) ~[jetty-io-9.3.19.v20170502.jar:9.3.19.v20170502]
	at org.eclipse.jetty.io.ManagedSelector$CreateEndPoint.run(ManagedSelector.java:605) [jetty-io-9.3.19.v20170502.jar:9.3.19.v20170502]
	at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303) [jetty-util-9.3.19.v20170502.jar:9.3.19.v20170502]
	at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148) [jetty-util-9.3.19.v20170502.jar:9.3.19.v20170502]
	at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.execute(ExecuteProduceConsume.java:100) [jetty-util-9.3.19.v20170502.jar:9.3.19.v20170502]
	at org.eclipse.jetty.io.ManagedSelector.run(ManagedSelector.java:147) [jetty-io-9.3.19.v20170502.jar:9.3.19.v20170502]
	at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671) [jetty-util-9.3.19.v20170502.jar:9.3.19.v20170502]
	at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589) [jetty-util-9.3.19.v20170502.jar:9.3.19.v20170502]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_144]

Making this change in JettyHttpClientModule appears to fix the NPE:

       final HttpClient httpClient;
       if (getSslContextBinding() != null) {
         final SslContextFactory sslContextFactory = new SslContextFactory();
         sslContextFactory.setSslContext(getSslContextBinding().getProvider().get());
         httpClient = new HttpClient(sslContextFactory);
       } else {
-        httpClient = new HttpClient();
+        httpClient = new HttpClient(new SslContextFactory());
       }

Copy link
Copy Markdown
Contributor

@jon-wei jon-wei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you check that router NPE issue on your side?

@jon-wei
Copy link
Copy Markdown
Contributor

jon-wei commented Aug 15, 2017

If I create the HttpClient with new SslContextFactory(), I see the following exception:

Caused by: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
	at sun.security.validator.PKIXValidator.doBuild(PKIXValidator.java:397) ~[?:1.8.0_144]
	at sun.security.validator.PKIXValidator.engineValidate(PKIXValidator.java:302) ~[?:1.8.0_144]
	at sun.security.validator.Validator.validate(Validator.java:260) ~[?:1.8.0_144]
	at sun.security.ssl.X509TrustManagerImpl.validate(X509TrustManagerImpl.java:324) ~[?:1.8.0_144]
	at sun.security.ssl.X509TrustManagerImpl.checkTrusted(X509TrustManagerImpl.java:281) ~[?:1.8.0_144]
	at sun.security.ssl.X509TrustManagerImpl.checkServerTrusted(X509TrustManagerImpl.java:136) ~[?:1.8.0_144]
	at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1501) ~[?:1.8.0_144]
	at sun.security.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:216) ~[?:1.8.0_144]
	at sun.security.ssl.Handshaker.processLoop(Handshaker.java:1026) ~[?:1.8.0_144]
	at sun.security.ssl.Handshaker$1.run(Handshaker.java:966) ~[?:1.8.0_144]
	at sun.security.ssl.Handshaker$1.run(Handshaker.java:963) ~[?:1.8.0_144]
	at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_144]
	at sun.security.ssl.Handshaker$DelegatedTask.run(Handshaker.java:1416) ~[?:1.8.0_144]
	at org.eclipse.jetty.io.ssl.SslConnection$DecryptedEndPoint.fill(SslConnection.java:676) ~[jetty-io-9.3.19.v20170502.jar:9.3.19.v20170502]
	... 16 more
Caused by: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
	at sun.security.provider.certpath.SunCertPathBuilder.build(SunCertPathBuilder.java:141) ~[?:1.8.0_144]
	at sun.security.provider.certpath.SunCertPathBuilder.engineBuild(SunCertPathBuilder.java:126) ~[?:1.8.0_144]
	at java.security.cert.CertPathBuilder.build(CertPathBuilder.java:280) ~[?:1.8.0_144]
	at sun.security.validator.PKIXValidator.doBuild(PKIXValidator.java:392) ~[?:1.8.0_144]
	at sun.security.validator.PKIXValidator.engineValidate(PKIXValidator.java:302) ~[?:1.8.0_144]
	at sun.security.validator.Validator.validate(Validator.java:260) ~[?:1.8.0_144]
	at sun.security.ssl.X509TrustManagerImpl.validate(X509TrustManagerImpl.java:324) ~[?:1.8.0_144]
	at sun.security.ssl.X509TrustManagerImpl.checkTrusted(X509TrustManagerImpl.java:281) ~[?:1.8.0_144]
	at sun.security.ssl.X509TrustManagerImpl.checkServerTrusted(X509TrustManagerImpl.java:136) ~[?:1.8.0_144]
	at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1501) ~[?:1.8.0_144]
	at sun.security.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:216) ~[?:1.8.0_144]
	at sun.security.ssl.Handshaker.processLoop(Handshaker.java:1026) ~[?:1.8.0_144]
	at sun.security.ssl.Handshaker$1.run(Handshaker.java:966) ~[?:1.8.0_144]
	at sun.security.ssl.Handshaker$1.run(Handshaker.java:963) ~[?:1.8.0_144]
	at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_144]
	at sun.security.ssl.Handshaker$DelegatedTask.run(Handshaker.java:1416) ~[?:1.8.0_144]
	at org.eclipse.jetty.io.ssl.SslConnection$DecryptedEndPoint.fill(SslConnection.java:676) ~[jetty-io-9.3.19.v20170502.jar:9.3.19.v20170502]
	... 16 more

Creating the SslContextFactory with a trust store path works:

        httpClient = new HttpClient(new SslContextFactory("/Users/jw/ssl_test/root.jks"));

@jon-wei jon-wei added the Bug label Aug 15, 2017
@pjain1
Copy link
Copy Markdown
Member Author

pjain1 commented Aug 17, 2017

@jon-wei If TLS is enabled then the users need to include an extension that provides a properly configured SSLContext object to be used by internal http-client for communication within Druid nodes. You can try including the simple-client-sslcontextextension and set appropriate configs for it like trust store which contains root certificates that can be used to validate server's certificate.

This was noted in the point 7 of the #4270 PR description. Also noted in the documentation here - https://github.com/druid-io/druid/pull/4270/files#diff-2ac62993fee64079aa2e42f45dd2311eR7

@jon-wei
Copy link
Copy Markdown
Contributor

jon-wei commented Aug 17, 2017

I do have the simple-client-sslcontext extension enabled with a configured trust store, I'm able to access task logs from the overlord console and run queries if I send them to a broker directly with TLS only, so I don't think that's the issue I was seeing (i.e., I think this is specific to the router)

Are you able to reproduce what I saw locally?

@pjain1
Copy link
Copy Markdown
Member Author

pjain1 commented Aug 17, 2017

ok..let me check

@pjain1
Copy link
Copy Markdown
Member Author

pjain1 commented Aug 21, 2017

@jon-wei there was indeed a problem, thanks for reporting. I fixed it, please try now. SslContext binding for Router annotation was missing in the extension.

@pjain1 pjain1 closed this Aug 21, 2017
@pjain1 pjain1 reopened this Aug 21, 2017
@pjain1 pjain1 closed this Aug 21, 2017
@pjain1 pjain1 reopened this Aug 21, 2017
@pjain1 pjain1 closed this Aug 22, 2017
@pjain1 pjain1 reopened this Aug 22, 2017
@pjain1
Copy link
Copy Markdown
Member Author

pjain1 commented Aug 28, 2017

rebased with master. @jon-wei can you please try it now

@himanshug
Copy link
Copy Markdown
Contributor

@jon-wei @pjain1 still LGTM for me

@jon-wei
Copy link
Copy Markdown
Contributor

jon-wei commented Aug 28, 2017

@pjain1 checking the patch now

@jon-wei
Copy link
Copy Markdown
Contributor

jon-wei commented Aug 28, 2017

Just verified that a router query works with this patch, LGTM

@jon-wei jon-wei merged commit 594a66f into apache:master Aug 28, 2017
@pjain1 pjain1 deleted the tls_router branch September 11, 2017 05:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants