KAFKA-14203 Don't make snapshots on broker after metadata errors #12596
Changes from all commits: cc18580, ad798ca, 0ac599f, e1dda63, 3883451
```diff
@@ -28,6 +28,8 @@ import org.apache.kafka.server.common.ApiMessageAndVersion
 import org.apache.kafka.server.fault.FaultHandler
 import org.apache.kafka.snapshot.SnapshotReader
 
+import java.util.concurrent.atomic.AtomicBoolean
+
 object BrokerMetadataListener {
   val MetadataBatchProcessingTimeUs = "MetadataBatchProcessingTimeUs"
```
```diff
@@ -41,8 +43,22 @@ class BrokerMetadataListener(
   val maxBytesBetweenSnapshots: Long,
   val snapshotter: Option[MetadataSnapshotter],
   brokerMetrics: BrokerServerMetrics,
-  metadataLoadingFaultHandler: FaultHandler
+  _metadataLoadingFaultHandler: FaultHandler
```
Member:
This is unrelated to this diff, but while reviewing this code I noticed that we don't handle or count errors when generating a

Author (Member):
The errors handled by this FaultHandler are non-fatal. AFAIK only the active controller will experience fatal metadata errors. For BrokerMetadataListener, we just log an ERROR and increment the metrics. We call this fault handler in three cases in BrokerMetadataListener:

Member:
Failures in this code seem to be fatal given the latest code: kafka/core/src/main/scala/kafka/server/metadata/BrokerMetadataListener.scala, lines 323 to 330 in cc18580.

Author (Member):
Ok, I see what you mean. It does stand to reason that an error when handling a record would lead to an error when building the MetadataImage itself. Seems like we could cover the

Contributor:
Yes, the apply errors should be handled by
```diff
 ) extends RaftClient.Listener[ApiMessageAndVersion] with KafkaMetricsGroup {
 
+  private val metadataFaultOccurred = new AtomicBoolean(false)
+  private val metadataLoadingFaultHandler: FaultHandler = new FaultHandler() {
+    override def handleFault(failureMessage: String, cause: Throwable): RuntimeException = {
+      // If the broker has any kind of error handling metadata records or publishing a new image
+      // we will disable taking new snapshots in order to preserve the local metadata log. Once we
+      // encounter a metadata processing error, the broker will be in an undetermined state.
+      if (metadataFaultOccurred.compareAndSet(false, true)) {
+        error("Disabling metadata snapshots until this broker is restarted.")
+      }
+      _metadataLoadingFaultHandler.handleFault(failureMessage, cause)
+    }
+  }
+
   private val logContext = new LogContext(s"[BrokerMetadataListener id=$brokerId] ")
   private val log = logContext.logger(classOf[BrokerMetadataListener])
   logIdent = logContext.logPrefix()
```
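The wrapper above combines a one-shot AtomicBoolean with delegation to the underlying handler. A minimal Java sketch of that pattern follows; the `FaultHandler` interface and class names here are hypothetical stand-ins for illustration, not the Kafka API:

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the fault-handler interface used in the diff.
interface FaultHandler {
    RuntimeException handleFault(String message, Throwable cause);
}

// A handler that flips a one-shot flag the first time any metadata fault is
// reported, then delegates to the wrapped handler. The flag is what the
// snapshot path later checks to decide whether snapshots are still allowed.
class SnapshotDisablingFaultHandler implements FaultHandler {
    private final AtomicBoolean faultOccurred = new AtomicBoolean(false);
    private final FaultHandler delegate;
    final AtomicInteger disableLogCount = new AtomicInteger(); // stands in for the ERROR log line

    SnapshotDisablingFaultHandler(FaultHandler delegate) {
        this.delegate = delegate;
    }

    @Override
    public RuntimeException handleFault(String message, Throwable cause) {
        // compareAndSet succeeds for exactly one caller, so the
        // "disabling snapshots" message fires once even under concurrent faults.
        if (faultOccurred.compareAndSet(false, true)) {
            disableLogCount.incrementAndGet();
        }
        return delegate.handleFault(message, cause);
    }

    boolean snapshotsDisabled() {
        return faultOccurred.get();
    }
}
```

Every fault still reaches the delegate (so metrics and ERROR logging are preserved); only the one-time "disabling snapshots" notice is deduplicated.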
```diff
@@ -147,7 +163,9 @@ class BrokerMetadataListener(
 
   private def maybeStartSnapshot(): Unit = {
     snapshotter.foreach { snapshotter =>
-      if (snapshotter.maybeStartSnapshot(_highestTimestamp, _delta.apply())) {
+      if (metadataFaultOccurred.get()) {
+        trace("Not starting metadata snapshot since we previously had an error")
```
Author (Member):
I went with TRACE here to avoid spamming the logs with this message on each metadata commit.

Contributor:
Honestly, I think this should be WARN or ERROR to draw our attention to it.

Author (Member):
My concern is that in an emergency scenario when things are going badly, we could lose important logs due to log rotation if we're spamming a bunch of the same error/warn messages.

Contributor:
Yeah, that's fair. I guess it's OK like it is for now. I am going to refactor this as part of the effort to unify Image handling between broker + controller.
```diff
+      } else if (snapshotter.maybeStartSnapshot(_highestTimestamp, _delta.apply())) {
         _bytesSinceLastSnapshot = 0L
       }
     }
```
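The effect of that gate is that, once a fault has been recorded, snapshot attempts become no-ops and the on-disk metadata log is preserved for debugging. A small hypothetical sketch of the gating logic (names assumed for illustration; this is not the real MetadataSnapshotter API):

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Illustrative stand-in for the snapshot path: once the shared fault flag is
// set, maybeStartSnapshot() skips snapshotting entirely, so the accumulated
// byte counter is never reset and no snapshot truncates the metadata log.
class SnapshotGate {
    private final AtomicBoolean metadataFaultOccurred;
    private long bytesSinceLastSnapshot = 100L; // pretend some bytes have accumulated

    SnapshotGate(AtomicBoolean faultFlag) {
        this.metadataFaultOccurred = faultFlag;
    }

    /** Returns true if a snapshot was started. */
    boolean maybeStartSnapshot() {
        if (metadataFaultOccurred.get()) {
            // In the PR this branch only emits a TRACE log, to avoid
            // spamming the same message on every metadata commit.
            return false;
        }
        bytesSinceLastSnapshot = 0L; // snapshot "started"
        return true;
    }

    long bytesSinceLastSnapshot() { return bytesSinceLastSnapshot; }
}
```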
```diff
@@ -307,11 +325,21 @@ class BrokerMetadataListener(
 
   private def publish(publisher: MetadataPublisher): Unit = {
     val delta = _delta
-    _image = _delta.apply()
+    try {
+      _image = _delta.apply()
```
Member:
Note that it is possible for

Author (Member):
Rewinding and re-applying does sound useful for some kind of automatic error mitigation, but I think it will be quite a bit of work. As it stands, I believe the broker can only process metadata going forward. I can think of a degenerate case we have today where
```diff
+    } catch {
+      case t: Throwable =>
+        // If we cannot apply the delta, this publish event will throw and we will not publish a new image.
+        // The broker will continue applying metadata records and attempt to publish again.
+        throw metadataLoadingFaultHandler.handleFault(s"Error applying metadata delta $delta", t)
+    }
+
     _delta = new MetadataDelta(_image)
     if (isDebugEnabled) {
       debug(s"Publishing new metadata delta $delta at offset ${_image.highestOffsetAndEpoch().offset}.")
     }
 
+    // This publish call is done with its own try-catch and fault handler
     publisher.publish(delta, _image)
```
Member:
Is this correct? If there was an error in
Is there a way we can write tests for this code and scenarios so that we can increase our confidence that this code behaves as we expect?

Author (Member):
Thanks, good catch. I missed a
I agree that more tests would be very useful as we harden this code path. I'll see what I can come up with for this PR, and we can continue adding more tests after 3.3.
```diff
 
     // Update the metrics since the publisher handled the latest image
```
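The error flow in publish relies on handleFault returning an exception that the caller rethrows, so a failed delta application both records the fault and aborts before the broken image replaces the current one. A minimal sketch of that flow under those assumptions (class and method names here are hypothetical, not the Kafka classes):

```java
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicBoolean;

// Sketch of the publish-time error flow: if applying the delta fails, the
// fault handler is invoked (flipping the snapshot-disable flag) and the
// exception it returns is thrown, so the previously published image survives.
class ImagePublisher {
    final AtomicBoolean faultOccurred = new AtomicBoolean(false);
    String publishedImage = null;

    RuntimeException handleFault(String message, Throwable cause) {
        faultOccurred.set(true);
        return new RuntimeException(message, cause);
    }

    // applyDelta stands in for MetadataDelta.apply(), which builds the new image.
    void publish(Callable<String> applyDelta) {
        String image;
        try {
            image = applyDelta.call();
        } catch (Exception e) {
            // publishedImage is left untouched; the broker keeps the old image
            // and will attempt to publish again on the next committed batch.
            throw handleFault("Error applying metadata delta", e);
        }
        publishedImage = image;
    }
}
```

The key property is that the `throw` happens before any assignment to the published image, so a corrupt delta can never become the broker's visible metadata state.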
This comment applies to this code: I think this code attempts to read and replay the entire committed log. I wonder if this code should be more conservative if it encounters an error replaying a record, and only read the current batch before updating the image. Note that this code is used for both snapshots and log segments. For snapshots, the entire snapshot needs to be in one delta update.