Nodes won't die if ZooKeeper goes away. #2016

@KenjiTakahashi

Scenario:

  1. Everything is up and running.
  2. ZooKeeper goes away (for whatever reason).
  3. Nodes start spamming:
2015-11-27T01:44:08,767 INFO [Curator-Framework-0-SendThread(monitowl-dev:2181)] org.apache.zookeeper.ClientCnxn - Opening socket connection to server monitowl-dev/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
2015-11-27T01:44:08,767 WARN [Curator-Framework-0-SendThread(monitowl-dev:2181)] org.apache.zookeeper.ClientCnxn - Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[?:1.8.0_66]
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) ~[?:1.8.0_66]
    at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361) ~[zookeeper-3.4.6.jar:3.4.6-1569965]
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081) [zookeeper-3.4.6.jar:3.4.6-1569965]

(Which is "fine", I guess.)
  4. I do kill -TERM <node_pid>.
  5. Node says:

2015-11-27T01:44:09,446 INFO [Thread-42] com.metamx.common.lifecycle.Lifecycle - Running shutdown hook
2015-11-27 01:44:09,464 FATAL Unable to register shutdown hook because JVM is shutting down.

(Which seems to be normal, they always say that before dying.)
  6. But... it does not die. The process is still shown as running (in htop, etc.), with no further log output.
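
For anyone who wants to check step 6 without watching htop, here is a rough sketch of the same test in Python: send SIGTERM to a PID, then poll for a while to see whether the process actually goes away. The function name `terminate_and_wait` and the timeout values are made up for illustration and are not part of any node tooling; it assumes a POSIX system and a PID you are allowed to signal.

```python
import os
import signal
import time


def _is_alive(pid: int) -> bool:
    """Best-effort check whether a process with this PID still exists."""
    try:
        # If the process is our own child, reap it so a zombie is not
        # mistaken for a live process.
        reaped, _status = os.waitpid(pid, os.WNOHANG)
        if reaped == pid:
            return False
    except ChildProcessError:
        pass  # Not our child; fall through to the signal-0 probe.
    try:
        os.kill(pid, 0)  # Signal 0 delivers nothing, it only checks existence.
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # The process exists but belongs to another user.
    return True


def terminate_and_wait(pid: int, timeout: float = 10.0, poll: float = 0.1) -> bool:
    """Send SIGTERM to pid; return True if it exits within timeout seconds."""
    os.kill(pid, signal.SIGTERM)
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if not _is_alive(pid):
            return True
        time.sleep(poll)
    return not _is_alive(pid)
```

With the behaviour described above, `terminate_and_wait(node_pid)` would be expected to return `False` after ZooKeeper has gone away, and `True` for a node that was started while ZooKeeper was already down.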

This is on 0.8.2, and it happens for all kinds of nodes AFAICT.

Possibly worth noting: when I deliberately start a node while ZooKeeper is already down, it exits fine on SIGTERM.

I probably wouldn't even have noticed it, but this confuses our systemd configs quite a bit :-/.
