ZOOKEEPER-4453: NettyServerCnxnFactory: allow to configure the early TLS connection drop feature #1799
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -522,7 +522,10 @@ private void receiveMessage(ByteBuf message) { | |
| } | ||
| ZooKeeperServer zks = this.zkServer; | ||
| if (zks == null || !zks.isRunning()) { | ||
| throw new IOException("ZK down"); | ||
| LOG.warn("Closing connection to {} because the server is not ready", | ||
| getRemoteSocketAddress()); | ||
| close(DisconnectReason.IO_EXCEPTION); | ||
|
Contributor
nit: I know we were throwing IOException before, which resulted in DisconnectReason.IO_EXCEPTION, but now that we do a proper close here, we could use a better DisconnectReason. I see e.g. DisconnectReason.SERVER_SHUTDOWN. Although I'm not sure when this part is triggered... I guess it can be triggered either during initialization or during shutdown. Maybe DisconnectReason.IO_EXCEPTION is good enough. Also: before this change we logged 'getRemoteSocketAddress()' in line 534, which might be handy to add to the warning above (in case someone is trying to figure out from the ZK log why a session terminated).
Contributor
We have the same "ZK down" exception in line 477. Maybe the logic should be changed there as well?
Contributor
Author
This is the same behaviour as before.
|
||
| return; | ||
| } | ||
| // checkRequestSize will throw IOException if request is rejected | ||
| zks.checkRequestSizeWhenReceivingMessage(len); | ||
|
|
||
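The change discussed in this hunk replaces a thrown exception with a logged, explicit close followed by an early return. A minimal sketch of that pattern is below. `SketchConnection` and its fields are hypothetical stand-ins written for illustration, and the nested `DisconnectReason` enum only mimics the names mentioned in the review; none of this is the real `NettyServerCnxn` code.

```java
import java.util.ArrayList;
import java.util.List;

final class CloseInsteadOfThrow {
    // Stand-in for ZooKeeper's DisconnectReason; only two values are modelled (assumption).
    enum DisconnectReason { IO_EXCEPTION, SERVER_SHUTDOWN }

    // Hypothetical, simplified connection object; not the real NettyServerCnxn.
    static final class SketchConnection {
        final String remoteAddress;
        final List<String> log = new ArrayList<>();
        DisconnectReason closedWith; // stays null while the connection is open

        SketchConnection(String remoteAddress) {
            this.remoteAddress = remoteAddress;
        }

        void close(DisconnectReason reason) {
            closedWith = reason;
        }

        // Returns true when the message may be processed, false when the
        // connection was closed because the server is not ready.
        boolean receiveMessage(boolean serverRunning, byte[] message) {
            if (!serverRunning) {
                // Log the peer address so a terminated session can be traced later.
                log.add("Closing connection to " + remoteAddress
                        + " because the server is not ready");
                close(DisconnectReason.IO_EXCEPTION);
                return false; // early return instead of throwing IOException("ZK down")
            }
            return true;
        }
    }

    public static void main(String[] args) {
        SketchConnection cnxn = new SketchConnection("127.0.0.1:2181");
        boolean accepted = cnxn.receiveMessage(false, new byte[0]);
        System.out.println(accepted + " " + cnxn.closedWith + " " + cnxn.log);
    }
}
```

Whether `IO_EXCEPTION` or something like `SERVER_SHUTDOWN` is the better reason remains the open question from the review; the sketch keeps `IO_EXCEPTION` only because that matches the behaviour before the change.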
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -85,6 +85,7 @@ public class NettyServerCnxnFactory extends ServerCnxnFactory { | |
| * Allow client-server sockets to accept both SSL and plaintext connections | ||
| */ | ||
| public static final String PORT_UNIFICATION_KEY = "zookeeper.client.portUnification"; | ||
| public static final String EARLY_DROP_SECURE_CONNECTION_HANDSHAKES = "zookeeper.netty.server.earlyDropSecureConnectionHandshakes"; | ||
| private final boolean shouldUsePortUnification; | ||
|
|
||
| /** | ||
|
|
@@ -227,11 +228,14 @@ public void channelActive(ChannelHandlerContext ctx) throws Exception { | |
|
|
||
| // Check the zkServer assigned to the cnxn is still running, | ||
| // close it before starting the heavy TLS handshake | ||
| if (!cnxn.isZKServerRunning()) { | ||
| LOG.warn("Zookeeper server is not running, close the connection before starting the TLS handshake"); | ||
| ServerMetrics.getMetrics().CNXN_CLOSED_WITHOUT_ZK_SERVER_RUNNING.add(1); | ||
| channel.close(); | ||
| return; | ||
| if (secure && !cnxn.isZKServerRunning()) { | ||
| boolean earlyDropSecureConnectionHandshakes = Boolean.getBoolean(EARLY_DROP_SECURE_CONNECTION_HANDSHAKES); | ||
|
symat marked this conversation as resolved.
|
||
| if (earlyDropSecureConnectionHandshakes) { | ||
| LOG.warn("Zookeeper server is not running, close the connection before starting the TLS handshake"); | ||
|
Contributor
nit: I know we had warning level before, but what do you think about INFO level instead? I like having logs here, I'm just not sure this is really something the user should worry about.
Contributor
Author
This is exactly the same thing we printed before in the "catch" block below, but without spamming the logs with a stacktrace and a meaningless message. I did it this way in order not to change the behaviour too much, while at least removing the stacktrace.
Contributor
Author
Changed to INFO.
||
| ServerMetrics.getMetrics().CNXN_CLOSED_WITHOUT_ZK_SERVER_RUNNING.add(1); | ||
| channel.close(); | ||
| return; | ||
| } | ||
| } | ||
|
|
||
| if (handshakeThrottlingEnabled) { | ||
|
|
@@ -510,6 +514,7 @@ private ServerBootstrap configureBootstrapAllocator(ServerBootstrap bootstrap) { | |
| x509Util = new ClientX509Util(); | ||
|
|
||
| boolean usePortUnification = Boolean.getBoolean(PORT_UNIFICATION_KEY); | ||
|
|
||
| LOG.info("{}={}", PORT_UNIFICATION_KEY, usePortUnification); | ||
| if (usePortUnification) { | ||
| try { | ||
|
|
||
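The guard added to `channelActive` reads the flag with `Boolean.getBoolean`, which returns `true` only when the named system property exists and equals `"true"` ignoring case, so the early drop is off unless it is explicitly enabled (for example with a `-D` JVM option). A small sketch of that decision follows; the `EarlyDropFlag` class and `shouldDropEarly` helper are hypothetical names used only for illustration, while the property key is the one introduced in the diff.

```java
final class EarlyDropFlag {
    // Property name taken from the diff above.
    static final String KEY = "zookeeper.netty.server.earlyDropSecureConnectionHandshakes";

    // Mirrors the shape of the guard: drop only on a secure listener, only while
    // the ZooKeeper server is not running, and only when the flag is enabled.
    static boolean shouldDropEarly(boolean secure, boolean zkServerRunning) {
        return secure && !zkServerRunning && Boolean.getBoolean(KEY);
    }

    public static void main(String[] args) {
        System.clearProperty(KEY);
        System.out.println(shouldDropEarly(true, false));  // property unset

        System.setProperty(KEY, "true");
        System.out.println(shouldDropEarly(true, false));  // secure, server down, flag on
        System.out.println(shouldDropEarly(false, false)); // plaintext listener

        System.clearProperty(KEY);
    }
}
```

Because the property is read on each call rather than cached in a field, a test can toggle it between cases, which is what the updated test helper in this PR relies on.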
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -29,6 +29,7 @@ | |
| import static org.junit.jupiter.api.Assertions.fail; | ||
| import static org.mockito.Mockito.doNothing; | ||
| import static org.mockito.Mockito.mock; | ||
| import static org.mockito.Mockito.times; | ||
| import static org.mockito.Mockito.when; | ||
| import io.netty.channel.Channel; | ||
| import io.netty.channel.ChannelFuture; | ||
|
|
@@ -180,42 +181,70 @@ public void testNonMTLSRemoteConn() throws Exception { | |
| when(zks.isRunning()).thenReturn(true); | ||
| ServerStats.Provider providerMock = mock(ServerStats.Provider.class); | ||
| when(zks.serverStats()).thenReturn(new ServerStats(providerMock)); | ||
| testNonMTLSRemoteConn(zks); | ||
| testNonMTLSRemoteConn(zks, false, false); | ||
| } | ||
|
|
||
| @Test | ||
| public void testNonMTLSRemoteConnZookKeeperServerNotReady() throws Exception { | ||
| testNonMTLSRemoteConn(null); | ||
| testNonMTLSRemoteConn(null, false, false); | ||
| } | ||
|
|
||
| @Test | ||
| public void testNonMTLSRemoteConnZookKeeperServerNotReadyEarlyDropEnabled() throws Exception { | ||
| testNonMTLSRemoteConn(null, false, true); | ||
| } | ||
|
|
||
| @Test | ||
| public void testMTLSRemoteConnZookKeeperServerNotReadyEarlyDropEnabled() throws Exception { | ||
| testNonMTLSRemoteConn(null, true, true); | ||
| } | ||
|
|
||
| @Test | ||
| public void testMTLSRemoteConnZookKeeperServerNotReadyEarlyDropDisabled() throws Exception { | ||
| testNonMTLSRemoteConn(null, true, true); | ||
| } | ||
|
|
||
| @SuppressWarnings("unchecked") | ||
| private void testNonMTLSRemoteConn(ZooKeeperServer zks) throws Exception { | ||
| Channel channel = mock(Channel.class); | ||
| ChannelId id = mock(ChannelId.class); | ||
| ChannelFuture success = mock(ChannelFuture.class); | ||
| ChannelHandlerContext context = mock(ChannelHandlerContext.class); | ||
| ChannelPipeline channelPipeline = mock(ChannelPipeline.class); | ||
|
|
||
| when(context.channel()).thenReturn(channel); | ||
| when(channel.pipeline()).thenReturn(channelPipeline); | ||
| when(success.channel()).thenReturn(channel); | ||
| when(channel.closeFuture()).thenReturn(success); | ||
|
|
||
| InetSocketAddress address = new InetSocketAddress(0); | ||
| when(channel.remoteAddress()).thenReturn(address); | ||
| when(channel.id()).thenReturn(id); | ||
| NettyServerCnxnFactory factory = new NettyServerCnxnFactory(); | ||
| factory.setZooKeeperServer(zks); | ||
| Attribute atr = mock(Attribute.class); | ||
| Mockito.doReturn(atr).when(channel).attr( | ||
| Mockito.any() | ||
| ); | ||
| doNothing().when(atr).set(Mockito.any()); | ||
| factory.channelHandler.channelActive(context); | ||
|
|
||
| if (zks != null) { | ||
| assertEquals(0, zks.serverStats().getNonMTLSLocalConnCount()); | ||
| assertEquals(1, zks.serverStats().getNonMTLSRemoteConnCount()); | ||
| private void testNonMTLSRemoteConn(ZooKeeperServer zks, boolean secure, boolean earlyDrop) throws Exception { | ||
| System.setProperty(NettyServerCnxnFactory.EARLY_DROP_SECURE_CONNECTION_HANDSHAKES, earlyDrop + ""); | ||
| try { | ||
| Channel channel = mock(Channel.class); | ||
| ChannelId id = mock(ChannelId.class); | ||
| ChannelFuture success = mock(ChannelFuture.class); | ||
| ChannelHandlerContext context = mock(ChannelHandlerContext.class); | ||
| ChannelPipeline channelPipeline = mock(ChannelPipeline.class); | ||
|
|
||
| when(context.channel()).thenReturn(channel); | ||
| when(channel.pipeline()).thenReturn(channelPipeline); | ||
| when(success.channel()).thenReturn(channel); | ||
| when(channel.closeFuture()).thenReturn(success); | ||
|
|
||
| InetSocketAddress address = new InetSocketAddress(0); | ||
| when(channel.remoteAddress()).thenReturn(address); | ||
| when(channel.id()).thenReturn(id); | ||
| NettyServerCnxnFactory factory = new NettyServerCnxnFactory(); | ||
| factory.setSecure(secure); | ||
| factory.setZooKeeperServer(zks); | ||
| Attribute atr = mock(Attribute.class); | ||
| Mockito.doReturn(atr).when(channel).attr( | ||
| Mockito.any() | ||
| ); | ||
| doNothing().when(atr).set(Mockito.any()); | ||
| factory.channelHandler.channelActive(context); | ||
|
|
||
| if (zks != null) { | ||
| assertEquals(0, zks.serverStats().getNonMTLSLocalConnCount()); | ||
| assertEquals(1, zks.serverStats().getNonMTLSRemoteConnCount()); | ||
| } else { | ||
| if (earlyDrop && secure) { | ||
| // the channel must have been forcibly closed | ||
| Mockito.verify(channel, times(1)).close(); | ||
| } else { | ||
| Mockito.verify(channel, times(0)).close(); | ||
| } | ||
| } | ||
| } finally { | ||
| System.clearProperty(NettyServerCnxnFactory.EARLY_DROP_SECURE_CONNECTION_HANDSHAKES); | ||
|
Contributor
nit: if you clear this in the finally block, then let's move the setProperty call into the try block (right now, if the test is killed just before the try block, the property won't be cleared; very, very unlikely, but still...). Alternatively, I would be OK with putting the clearProperty call in the afterEach() method, and then you don't need the try-finally block. Also, just double-checking: we don't run multiple test classes (or methods in the same class) in parallel in the same JVM, right? Let's make sure we avoid flaky execution.
|
||
| } | ||
| } | ||
|
|
||
|
|
||
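The review suggestion about pairing `setProperty` with the `finally` block can be sketched without any test framework. The `ScopedSystemProperty` class and `withProperty` helper below are hypothetical names, not part of ZooKeeper or JUnit; the sketch also restores a previous value instead of always clearing, which is a small extension beyond what the PR does.

```java
import java.util.function.BooleanSupplier;

final class ScopedSystemProperty {
    // Runs body with the system property set, then restores the prior state.
    static boolean withProperty(String key, String value, BooleanSupplier body) {
        String previous = System.getProperty(key);
        try {
            // Set inside the try block so the finally block always pairs with it.
            System.setProperty(key, value);
            return body.getAsBoolean();
        } finally {
            if (previous == null) {
                System.clearProperty(key);
            } else {
                System.setProperty(key, previous);
            }
        }
    }

    public static void main(String[] args) {
        String key = "zookeeper.netty.server.earlyDropSecureConnectionHandshakes";
        boolean seenInside = withProperty(key, "true", () -> Boolean.getBoolean(key));
        System.out.println(seenInside + " " + System.getProperty(key));
    }
}
```

System properties are global to the JVM, so this only isolates tests that run sequentially; tests running in parallel in the same JVM could still observe each other's value, which is the flakiness concern raised in the review.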
I like this to be false by default. However, AFAICT this behaviour was enabled by default in 3.7.0. Maybe we should mention it in the documentation (and also in the release docs of 3.7.1 and 3.8.0, if we don't forget).

That "feature" was enabled in 3.6. This is why Pulsar and Pravega users are not able to upgrade ZK.
I can update the docs and explain the story.
I would like this patch to land in 3.6, 3.7 and 3.8.

Updated.