Skip to content

ZOOKEEPER-3706: ZooKeeper.close() would leak SendThread when the netw…#1235

Closed
yfxhust wants to merge 1 commit intoapache:masterfrom
yfxhust:ZOOKEEPER-3706
Closed

ZOOKEEPER-3706: ZooKeeper.close() would leak SendThread when the netw…#1235
yfxhust wants to merge 1 commit intoapache:masterfrom
yfxhust:ZOOKEEPER-3706

Conversation

@yfxhust
Copy link
Copy Markdown

@yfxhust yfxhust commented Jan 25, 2020

…ork is broken

  • add unit test to verify the bug
  • bypass the SendThread.startConnect() by throw RuntimeExcepth if state.isAlive is false

Author: Fangxi Yin yinfangxi@kuaishou.com

@maoling
Copy link
Copy Markdown
Member

maoling commented Jan 31, 2020

@yfxhust
Great issue description in JIRA. this issue is a little hidden. Do you observe it in the production?

Comment thread zookeeper-server/src/main/java/org/apache/zookeeper/ClientCnxn.java
Comment thread zookeeper-server/src/main/java/org/apache/zookeeper/ClientCnxn.java Outdated
@yfxhust
Copy link
Copy Markdown
Author

yfxhust commented Feb 3, 2020

@yfxhust
Great issue description in JIRA. this issue is a little hidden. Do you observe it in the production?

We observed occasional SendThread leak case in our production. But the root cause is not 100 percent sure. But from source code and some network clues, we believe that this issue is a potential cause.

Comment thread zookeeper-server/src/main/java/org/apache/zookeeper/ClientCnxn.java Outdated
Comment thread zookeeper-server/src/main/java/org/apache/zookeeper/ClientCnxn.java Outdated
@hanm
Copy link
Copy Markdown
Contributor

hanm commented Feb 3, 2020

nice work - left two comments. since the request timeout is a feature disabled by default, the impact of this bug is limited to those who enabled this feature only.

@yfxhust
Copy link
Copy Markdown
Author

yfxhust commented Feb 4, 2020

nice work - left two comments. since the request timeout is a feature disabled by default, the impact of this bug is limited to those who enabled this feature only.

Thanks for your comments

@hanm
Copy link
Copy Markdown
Contributor

hanm commented Feb 5, 2020

left two more comments - current approach should work but I am wondering if my proposal would sound better in term of more structurally solve the race condition around the state variable; also was a little bit concerning around RuntimeException we throw and the implications it brought (w.r.t resource leaks)

@yfxhust
Copy link
Copy Markdown
Author

yfxhust commented Feb 6, 2020

left two more comments - current approach should work but I am wondering if my proposal would sound better in term of more structurally solve the race condition around the state variable; also was a little bit concerning around RuntimeException we throw and the implications it brought (w.r.t resource leaks)

Thank you for your comments. I adapt my code with your suggestions. Please review the latest patch.

@yfxhust yfxhust force-pushed the ZOOKEEPER-3706 branch 3 times, most recently from f787f2b to 95accad Compare February 10, 2020 08:18
Comment thread zookeeper-server/src/main/java/org/apache/zookeeper/ClientCnxn.java Outdated
Comment thread zookeeper-server/src/main/java/org/apache/zookeeper/ClientCnxn.java
Comment thread zookeeper-server/src/main/java/org/apache/zookeeper/ClientCnxn.java
@yfxhust yfxhust force-pushed the ZOOKEEPER-3706 branch 2 times, most recently from d1e235b to 16500a6 Compare February 17, 2020 03:48
Copy link
Copy Markdown
Member

@maoling maoling left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hanm Do you have other concerns?

@hanm
Copy link
Copy Markdown
Contributor

hanm commented Feb 19, 2020

I just have two nits - one on the sync approach and the other on unit test.

@yfxhust
Copy link
Copy Markdown
Author

yfxhust commented Feb 19, 2020

I just have two nits - one on the sync approach and the other on unit test.

Thank you for your comments. Adapt my code with your suggestion. Please see the latest patch.

@yfxhust
Copy link
Copy Markdown
Author

yfxhust commented Feb 25, 2020

@hanm Any further comments ?

Copy link
Copy Markdown
Contributor

@hanm hanm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Any feedback from other committers? If not, I'll land this in next few days.

Copy link
Copy Markdown
Contributor

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome work.
I left a few comments

Comment thread zookeeper-server/src/main/java/org/apache/zookeeper/ClientCnxn.java Outdated
Comment thread zookeeper-server/src/main/java/org/apache/zookeeper/ClientCnxn.java
@yfxhust
Copy link
Copy Markdown
Author

yfxhust commented Mar 2, 2020

@eolivelli Thank you for your comments. I adapt the patch with your suggestions. Please review the latest patch.

…ork is broken

- add unit test to verify the bug
- wrap state modification in SendThread.changeZkState()
- SendThread.startConnect() could be interrupted by IOException caused by SendThread.changeZkState()

Author: Fangxi Yin <yinfangxi@kuaishou.com>
Copy link
Copy Markdown
Contributor

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@hanm please go ahead and merge if you are okay with the latest version

@asfgit asfgit closed this in d3e72c6 Mar 3, 2020
@hanm
Copy link
Copy Markdown
Contributor

hanm commented Mar 3, 2020

merged to master - thanks for driving this @yfxhust

stickyhipp pushed a commit to stickyhipp/zookeeper that referenced this pull request Aug 19, 2020
…ork is broken

- add unit test to verify the bug
- bypass the SendThread.startConnect() by throw RuntimeExcepth if state.isAlive is false

Author: Fangxi Yin <yinfangxikuaishou.com>

Author: yinfangxi <yinfangxi@kuaishou.com>

Reviewers: Michael Han <hanm@apache.org>, Enrico Olivelli <eolivelli@gmail.com>, maoling

Closes apache#1235 from yfxhust/ZOOKEEPER-3706
asfgit pushed a commit that referenced this pull request May 17, 2022
…ork is broken

- add unit test to verify the bug
- bypass the SendThread.startConnect() by throw RuntimeExcepth if state.isAlive is false

Author: Fangxi Yin <yinfangxikuaishou.com>

Author: yinfangxi <yinfangxi@kuaishou.com>

Reviewers: Michael Han <hanm@apache.org>, Enrico Olivelli <eolivelli@gmail.com>, maoling

Closes #1235 from yfxhust/ZOOKEEPER-3706
RokLenarcic pushed a commit to RokLenarcic/zookeeper that referenced this pull request Aug 31, 2022
…ork is broken

- add unit test to verify the bug
- bypass the SendThread.startConnect() by throw RuntimeExcepth if state.isAlive is false

Author: Fangxi Yin <yinfangxikuaishou.com>

Author: yinfangxi <yinfangxi@kuaishou.com>

Reviewers: Michael Han <hanm@apache.org>, Enrico Olivelli <eolivelli@gmail.com>, maoling

Closes apache#1235 from yfxhust/ZOOKEEPER-3706
RokLenarcic pushed a commit to RokLenarcic/zookeeper that referenced this pull request Aug 31, 2022
…ork is broken

- add unit test to verify the bug
- bypass the SendThread.startConnect() by throw RuntimeExcepth if state.isAlive is false

Author: Fangxi Yin <yinfangxikuaishou.com>

Author: yinfangxi <yinfangxi@kuaishou.com>

Reviewers: Michael Han <hanm@apache.org>, Enrico Olivelli <eolivelli@gmail.com>, maoling

Closes apache#1235 from yfxhust/ZOOKEEPER-3706
RokLenarcic pushed a commit to RokLenarcic/zookeeper that referenced this pull request Aug 31, 2022
…ork is broken

- add unit test to verify the bug
- bypass the SendThread.startConnect() by throw RuntimeExcepth if state.isAlive is false

Author: Fangxi Yin <yinfangxikuaishou.com>

Author: yinfangxi <yinfangxi@kuaishou.com>

Reviewers: Michael Han <hanm@apache.org>, Enrico Olivelli <eolivelli@gmail.com>, maoling

Closes apache#1235 from yfxhust/ZOOKEEPER-3706
RokLenarcic pushed a commit to RokLenarcic/zookeeper that referenced this pull request Sep 3, 2022
…ork is broken

- add unit test to verify the bug
- bypass the SendThread.startConnect() by throw RuntimeExcepth if state.isAlive is false

Author: Fangxi Yin <yinfangxikuaishou.com>

Author: yinfangxi <yinfangxi@kuaishou.com>

Reviewers: Michael Han <hanm@apache.org>, Enrico Olivelli <eolivelli@gmail.com>, maoling

Closes apache#1235 from yfxhust/ZOOKEEPER-3706
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants