KAFKA-14035: Fix NPE caused by missing null check in SnapshottableHashTable::mergeFrom()#12371

Merged
hachikuji merged 1 commit into apache:trunk from niket-goel:kafka-14035
Jul 1, 2022
@niket-goel
Contributor

Also adds a simple test case that fails without the check
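
For readers unfamiliar with this class of bug, here is a minimal sketch of a missing null check in a merge routine. It is not the Kafka source: `DeltaTier`, `record`, and the `delta` map are hypothetical names, and the only assumption carried over from the PR title is that a merge can encounter a table that was never allocated.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for one snapshot tier; NOT the Kafka class.
class DeltaTier {
    // Allocated lazily, so it stays null until the first change is recorded.
    Map<String, Integer> delta;

    void record(String key, int value) {
        if (delta == null) {
            delta = new HashMap<>();
        }
        delta.put(key, value);
    }

    // Fold the changes recorded in another tier into this one.
    void mergeFrom(DeltaTier other) {
        // Without this guard, iterating other.delta would throw a
        // NullPointerException whenever the other tier recorded nothing.
        if (other.delta == null) {
            return;
        }
        // The receiving side may also be unallocated, so guard it too.
        if (delta == null) {
            delta = new HashMap<>();
        }
        delta.putAll(other.delta);
    }
}
```

Merging an empty tier is then a no-op, which is the kind of case a regression test for such a fix would exercise.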

@niket-goel niket-goel changed the title from "KMETA-213: Fix NPE caused by missing null check in SnapshottableHashTable::mergeFrom()" to "KAFKA-14035: Fix NPE caused by missing null check in SnapshottableHashTable::mergeFrom()" on Jun 30, 2022
Contributor

@hachikuji hachikuji left a comment

Thanks, LGTM

@hachikuji hachikuji merged commit c19398e into apache:trunk Jul 1, 2022
hachikuji pushed a commit that referenced this pull request Jul 1, 2022
The NPE causes the kraft controller to be in an inconsistent state. 

Reviewers: Jason Gustafson <jason@confluent.io>
hachikuji pushed a commit that referenced this pull request Jul 1, 2022
The NPE causes the kraft controller to be in an inconsistent state. 

Reviewers: Jason Gustafson <jason@confluent.io>
hachikuji pushed a commit that referenced this pull request Jul 10, 2022
The NPE causes the kraft controller to be in an inconsistent state. 

Reviewers: Jason Gustafson <jason@confluent.io>
@iblislin
Member

iblislin commented Jan 4, 2023

Hi,

I still ran into this NPE on a three-node 3.3.1 cluster.

The actual line throwing the NPE is here: https://github.com/apache/kafka/blob/3.3.1/metadata/src/main/java/org/apache/kafka/timeline/SnapshottableHashTable.java#L125

But I don't know Java, so I have no idea what the root cause is.

The traceback:

kafka-k3-1  | [2023-01-04 06:23:20,276] ERROR Encountered fatal fault: exception while renouncing leadership (org.apache.kafka.server.fault.ProcessExitingFaultHandler)
kafka-k3-1  | java.lang.NullPointerException
kafka-k3-1  |   at org.apache.kafka.timeline.SnapshottableHashTable$HashTier.mergeFrom(SnapshottableHashTable.java:125)
kafka-k3-1  |   at org.apache.kafka.timeline.Snapshot.mergeFrom(Snapshot.java:68)
kafka-k3-1  |   at org.apache.kafka.timeline.SnapshotRegistry.deleteSnapshot(SnapshotRegistry.java:236)
kafka-k3-1  |   at org.apache.kafka.timeline.SnapshotRegistry$SnapshotIterator.remove(SnapshotRegistry.java:67)
kafka-k3-1  |   at org.apache.kafka.timeline.SnapshotRegistry.revertToSnapshot(SnapshotRegistry.java:214)
kafka-k3-1  |   at org.apache.kafka.controller.QuorumController.renounce(QuorumController.java:1232)
kafka-k3-1  |   at org.apache.kafka.controller.QuorumController.access$3300(QuorumController.java:150)
kafka-k3-1  |   at org.apache.kafka.controller.QuorumController$QuorumMetaLogListener.lambda$handleLeaderChange$3(QuorumController.java:1076)
kafka-k3-1  |   at org.apache.kafka.controller.QuorumController$QuorumMetaLogListener.lambda$appendRaftEvent$4(QuorumController.java:1101)
kafka-k3-1  |   at org.apache.kafka.controller.QuorumController$ControlEvent.run(QuorumController.java:496)

@iblislin
Member

iblislin commented Jan 4, 2023

There are related discussions on StackOverflow: https://stackoverflow.com/questions/74679400.

@hoangphuocbk

@iblislin, I had the same problem. Did your solution work?

@iblislin
Member

@hoangphuocbk yeah, that patch works fine for me. I have a KRaft cluster with that patch enabled, and it has been up for 45 days already.

@showuon
Member

showuon commented May 15, 2023

@iblislin @hoangphuocbk, this issue will be fixed in v3.4.1 and v3.5.0 via this patch: #13653. Thanks for reporting.
