MINOR: Add thread dumps if broker node cannot be stopped #5373

Merged
hachikuji merged 2 commits into apache:trunk from wicknicks:thread-dump-broker-stop-fail-tests on Jul 20, 2018
@wicknicks
Contributor

In system tests, it is useful to have the thread dumps if a broker cannot be stopped using SIGTERM.

Signed-off-by: Arjun Satish <arjun@confluent.io>
@wicknicks
Contributor Author

@ijuma, @xvrl could you please look at this change? thanks!

@wicknicks
Contributor Author

@ijuma could we merge this to v1.1? thanks!

Contributor

@hachikuji left a comment

Thanks, just a couple comments


    try:
        wait_until(lambda: len(self.pids(node)) == 0, timeout_sec=60, err_msg="Kafka node failed to stop")
    except Exception:
Contributor

I'm wondering if we should log this exception in case thread_dump raises an unexpected error. We don't want to lose the original error.
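
One way to read this suggestion, as a minimal sketch rather than the code in this PR: log the timeout before attempting the dump, guard the dump itself, then re-raise. The names `stop_with_diagnostics`, `wait_for_stop`, and `thread_dump` below are hypothetical stand-ins (plain callables), not ducktape or kafkatest APIs, and the bare `raise` relies on Python 3 semantics.

```python
import logging

logger = logging.getLogger("stop_node_sketch")


def stop_with_diagnostics(wait_for_stop, thread_dump):
    """Wait for shutdown; on failure, log the error, attempt a dump, re-raise.

    Both arguments are hypothetical zero-argument callables standing in for
    the real wait and thread-dump steps.
    """
    try:
        wait_for_stop()
    except Exception:
        # Log first, so the original failure is recorded even if the dump fails.
        logger.exception("Kafka node failed to stop")
        try:
            thread_dump()
        except Exception:
            # The dump is only a debugging aid; record its failure and carry on.
            logger.warning("Could not dump threads", exc_info=True)
        # In Python 3 this re-raises the exception from wait_for_stop().
        raise
```

With this shape the caller still sees the stop failure, whether or not the dump succeeded.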

Comment thread tests/kafkatest/services/kafka/kafka.py Outdated

    def thread_dump(self, node):
        for pid in self.pids(node):
            node.account.signal(pid, signal.SIGQUIT, allow_fail=False)
Contributor

Maybe it's fine to allow failure since this is just for debugging?
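
A best-effort variant might look like the sketch below. It is an illustration under assumptions, not the change that was merged: `best_effort_thread_dump` and its `send_signal` parameter are hypothetical, with `send_signal` standing in for whatever delivers the signal (for local processes, `os.kill` has a compatible `(pid, sig)` signature). `signal.SIGQUIT` is only available on Unix-like systems.

```python
import logging
import signal

logger = logging.getLogger("thread_dump_sketch")


def best_effort_thread_dump(pids, send_signal):
    """Request a thread dump from each pid without ever raising.

    Returns the list of pids for which the signal could not be delivered.
    """
    failed = []
    for pid in pids:
        try:
            # A HotSpot JVM typically answers SIGQUIT by printing a thread
            # dump to its stdout and then continuing to run.
            send_signal(pid, signal.SIGQUIT)
        except Exception:
            # Debugging aid only: log the failure and move on to the next pid.
            logger.warning("Could not dump threads for pid %s", pid, exc_info=True)
            failed.append(pid)
    return failed
```

Swallowing the failure here keeps a missing or already-exited process from masking the stop failure that triggered the dump.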

Signed-off-by: Arjun Satish <arjun@confluent.io>
@wicknicks force-pushed the thread-dump-broker-stop-fail-tests branch from 3482913 to 9665796 on July 18, 2018 18:52
@wicknicks
Contributor Author

@hachikuji fixed those issues. thanks!

Member

@ijuma left a comment

LGTM

Contributor

@hachikuji left a comment

Thanks for the update, LGTM

@hachikuji merged commit 0b5fd99 into apache:trunk on Jul 20, 2018
hachikuji pushed a commit that referenced this pull request Jul 20, 2018
In system tests, it is useful to have the thread dumps if a broker cannot be stopped using SIGTERM.

Reviewers: Xavier Léauté <xl+github@xvrl.net>, Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>
hachikuji pushed a commit that referenced this pull request Jul 25, 2018
In system tests, it is useful to have the thread dumps if a broker cannot be stopped using SIGTERM.

Reviewers: Xavier Léauté <xl+github@xvrl.net>, Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>
allenxwang pushed a commit to allenxwang/kafka that referenced this pull request Aug 24, 2018
…:1.1.1-sync to 1.1-nflx

* commit '9611672e287c1a7933a78590e3f381da2ae7d136': (57 commits)
  MINOR: increase dev version from 1.1.1-SNAPSHOT to 1.1.2-SNAPSHOT (apache#5409)
  MINOR: Add thread dumps if broker node cannot be stopped (apache#5373)
  MINOR: update release.py
  MINOR: fix upgrade docs for Streams (apache#5392)
  MINOR: improve docs version numbers (apache#5372)
  Update version on the branch to 1.1.2-SNAPSHOT
  KAFKA-6292; Improve FileLogInputStream batch position checks to avoid type overflow (apache#4928)
  HOTFIX: Fix checkstyle errors in MetricsTest (apache#5345)
  KAFKA-7136: Avoid deadlocks in synchronized metrics reporters (apache#5341)
  MINOR: Close timing window in SimpleAclAuthorizer startup (apache#5318)
  MINOR: Use kill_java_processes when killing ConsoleConsumer in system tests (apache#5297)
  KAFKA-7104: More consistent leader's state in fetch response (apache#5305)
  Revert "MINOR: Avoid coarse lock in Pool#getAndMaybePut (apache#5258)"
  MINOR: Avoid coarse lock in Pool#getAndMaybePut (apache#5258)
  MINOR: bugfix streams total metrics (apache#5277)
  KAFKA-7082: Concurrent create topics may throw NodeExistsException (apache#5259)
  MINOR: Upgrade to Gradle 4.8.1
  KAFKA-7012: Don't process SSL channels without data to process (apache#5237)
  KAFKA-7058: Comparing schema default values using Objects#deepEquals()
  KAFKA-7047: Added SimpleHeaderConverter to plugin isolation whitelist
  ...
4 participants