KAFKA-7660: fix streams and Metrics memory leaks #5981

Merged: mjsax merged 2 commits into apache:1.1 from vvcephei:1.1-memory-leaks on Dec 6, 2018

@vvcephei
Contributor

Backport two memory-leak fixes (#5974 and #5953). See also the 2.1 backport (#5979) and the 2.0 backport (#5980).

It looks like we already had the parentSensors fix in 1.1 and lost it in 2.0. The change in
this PR just tidies it up a little for consistency with later branches.

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

@vvcephei
Contributor Author

@mjsax @guozhangwang, this is an even further backport of a couple of small memory-leak fixes from trunk.

I'm not sure if you want to keep them as separate commits in the backport as well...

A heap dump provided by Patrik Kleindl in https://issues.apache.org/jira/browse/KAFKA-7660 identifies the childrenSensors map in Metrics as keeping references to sensors alive after they have been removed.

This PR fixes it and adds a test to be sure.
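
For readers unfamiliar with this kind of leak, here is a minimal sketch of the pattern described above. It is not the actual Kafka `Metrics` code: the class `SensorRegistrySketch`, its method names, and the use of plain strings in place of sensor objects are all hypothetical, and unlike a real registry it does not remove child sensors recursively. Only the idea of a `childrenSensors` map that keeps referencing removed sensors is taken from the description.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified stand-in for a metrics registry; NOT Kafka's Metrics class.
public class SensorRegistrySketch {
    // Registered sensor name -> names of its parents (a "sensor" is just a name here).
    private final Map<String, List<String>> parentsOf = new HashMap<>();
    // Parent name -> names of its child sensors; this is the map that can leak.
    private final Map<String, List<String>> childrenSensors = new HashMap<>();

    public void addSensor(String name, String... parents) {
        parentsOf.put(name, new ArrayList<>(Arrays.asList(parents)));
        for (String parent : parents) {
            childrenSensors.computeIfAbsent(parent, k -> new ArrayList<>()).add(name);
        }
    }

    // Leaky removal: unregisters the sensor but leaves childrenSensors untouched,
    // so the parent's child list still references the removed sensor.
    public void removeSensorLeaky(String name) {
        parentsOf.remove(name);
    }

    // Removal with cleanup: also drops the sensor's own child list and
    // removes the sensor from each parent's child list.
    public void removeSensor(String name) {
        List<String> parents = parentsOf.remove(name);
        if (parents == null) {
            return; // unknown sensor, nothing to clean up
        }
        childrenSensors.remove(name);
        for (String parent : parents) {
            childrenSensors.getOrDefault(parent, Collections.emptyList()).remove(name);
        }
    }

    // Counts how many places in childrenSensors still refer to the given sensor.
    public int referencesTo(String name) {
        int count = childrenSensors.containsKey(name) ? 1 : 0;
        for (List<String> children : childrenSensors.values()) {
            if (children.contains(name)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        SensorRegistrySketch leaky = new SensorRegistrySketch();
        leaky.addSensor("parent");
        leaky.addSensor("child", "parent");
        leaky.removeSensorLeaky("child");
        System.out.println("leaky references: " + leaky.referencesTo("child"));

        SensorRegistrySketch fixed = new SensorRegistrySketch();
        fixed.addSensor("parent");
        fixed.addSensor("child", "parent");
        fixed.removeSensor("child");
        System.out.println("fixed references: " + fixed.referencesTo("child"));
    }
}
```

In the sketch, the leaky variant leaves one stale reference to the removed sensor in the map, while the variant with cleanup leaves none. A test along these lines is one way to guard against a regression.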

Reviewers: Jason Gustafson <jason@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
@vvcephei vvcephei changed the title 1.1 memory leaks KAFKA-7660: fix streams and Metrics memory leaks Nov 30, 2018
@vvcephei
Contributor Author

@ijuma The 1.1 branch builder also looks misconfigured:

JDK10 (same failure as #5980 (comment)):

00:00:59.333 * What went wrong:
00:00:59.333 Task 'findbugsMain' not found in root project 'kafka-pr-jdk10-scala2.12'.

JDK7:

00:00:40.343 /tmp/jenkins8716077406695918171.sh: line 4: /home/jenkins/tools/gradle/4.4/bin/gradle: No such file or directory

Any idea how we can get this fixed?

@ijuma
Member

ijuma commented Dec 2, 2018

@vvcephei I replied to the first issue in another PR (btw, it's not necessary to point out the same problem in lots of different PRs :)). For the second one, it looks like Gradle 4.4 was removed from the build machines. I updated the job to use Gradle 4.10.2 to bootstrap.

@ijuma
Member

ijuma commented Dec 2, 2018

retest this please

@ijuma
Member

ijuma commented Dec 2, 2018

retest this please

@vvcephei
Contributor Author

vvcephei commented Dec 3, 2018

The Java 7 failure looks unrelated:

kafka.admin.DeleteConsumerGroupTest.testConsumptionOnRecreatedTopicAfterTopicWideDeleteInZK

kafka.consumer.ConsumerTimeoutException
	at kafka.consumer.ConsumerIterator.makeNext(ConsumerIterator.scala:70)
	at kafka.consumer.ConsumerIterator.makeNext(ConsumerIterator.scala:34)
	at kafka.utils.IteratorTemplate.maybeComputeNext(IteratorTemplate.scala:64)
	at kafka.utils.IteratorTemplate.hasNext(IteratorTemplate.scala:56)
	at kafka.utils.IteratorTemplate.next(IteratorTemplate.scala:36)
	at kafka.consumer.ConsumerIterator.next(ConsumerIterator.scala:47)
	at kafka.admin.DeleteConsumerGroupTest$$anonfun$consumeEvents$1.apply(DeleteConsumerGroupTest.scala:218)
	at kafka.admin.DeleteConsumerGroupTest$$anonfun$consumeEvents$1.apply(DeleteConsumerGroupTest.scala:218)
	at scala.collection.immutable.Range.foreach(Range.scala:160)
	at kafka.admin.DeleteConsumerGroupTest.consumeEvents(DeleteConsumerGroupTest.scala:218)
	at kafka.admin.DeleteConsumerGroupTest.testConsumptionOnRecreatedTopicAfterTopicWideDeleteInZK(DeleteConsumerGroupTest.scala:182)

stdout:
[2018-12-02 20:32:18,952] ERROR ZKShutdownHandler is not registered, so ZooKeeper server won't take any action on ERROR or SHUTDOWN server state changes (org.apache.zookeeper.server.ZooKeeperServer:472)
[2018-12-02 20:32:19,150] WARN Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn:1162)
java.net.ConnectException: Connection refused
	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:744)
	at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
	at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1141)
...

Retest this, please.

@mjsax mjsax merged commit 0a21ee3 into apache:1.1 Dec 6, 2018
@vvcephei vvcephei deleted the 1.1-memory-leaks branch December 7, 2018 17:43