KAFKA-8586: Fail source tasks when producers fail to send records #6993
rhauch merged 5 commits into apache:trunk
@wicknicks would you mind taking a look?
@rayokota would you mind taking a look?
@C0urante do we know what errors cause the
@wicknicks no, this is not the result of overriding any producer configs. As detailed in the Jira, the producer only retries on retriable errors, and
wicknicks
left a comment
some questions/comments
@wicknicks I've rebased on the latest
wicknicks
left a comment
Modulo some comments where you wanted external inputs, LGTM.
Thanks @wicknicks! @rhauch could you take a look at this when you get a chance?
rhauch
left a comment
Looks pretty good, though I have one comment and one question about potentially removing the second call to maybeThrowProducerSendException().
Changed Connect's `WorkerSourceTask` to capture non-retriable exceptions from the `producer.send(...)` (e.g., authentication or authorization errors) and to fail the connector task when such an error is encountered. Modified the existing unit tests to verify this functionality. Note that most producer errors are retriable, and Connect will (by default) set up each producer with 1 max in-flight message and infinite retries. This change only affects non-retriable errors.
…rs fail to send records (apache#6993) TICKET = KAFKA-8586 LI_DESCRIPTION = EXIT_CRITERIA = HASH [1a5062c] ORIGINAL_DESCRIPTION = Changed Connect's `WorkerSourceTask` to capture non-retriable exceptions from the `producer.send(...)` (e.g., authentication or authorization errors) and to fail the connector task when such an error is encountered. Modified the existing unit tests to verify this functionality. Note that most producer errors are retriable, and Connect will (by default) set up each producer with 1 max in-flight message and infinite retries. This change only affects non-retriable errors. (cherry picked from commit 1a5062c)
…ache#6993) Changed Connect's `WorkerSourceTask` to capture non-retriable exceptions from the `producer.send(...)` (e.g., authentication or authorization errors) and to fail the connector task when such an error is encountered. Modified the existing unit tests to verify this functionality. Note that most producer errors are retriable, and Connect will (by default) set up each producer with 1 max in-flight message and infinite retries. This change only affects non-retriable errors. (cherry picked from commit 237e83d)
…ords (apache#6993)" This reverts commit 80f5799
…ache#6993) Changed Connect's `WorkerSourceTask` to capture non-retriable exceptions from the `producer.send(...)` (e.g., authentication or authorization errors) and to fail the connector task when such an error is encountered. Modified the existing unit tests to verify this functionality. Note that most producer errors are retriable, and Connect will (by default) set up each producer with 1 max in-flight message and infinite retries. This change only affects non-retriable errors. (cherry picked from commit 3aa9f99)
…ache#6993) (#239) Changed Connect's `WorkerSourceTask` to capture non-retriable exceptions from the `producer.send(...)` (e.g., authentication or authorization errors) and to fail the connector task when such an error is encountered. Modified the existing unit tests to verify this functionality. Note that most producer errors are retriable, and Connect will (by default) set up each producer with 1 max in-flight message and infinite retries. This change only affects non-retriable errors. (cherry picked from commit 3aa9f99)
Jira
Previously, if the producer for a source task failed to send a record with a non-retriable error, the record would be silently skipped over. The source task would be allowed to commit offsets for the skipped record, and its status would remain at `RUNNING`.

The changes here cause source tasks to transition to the `FAILED` state if their producers fail to send a record with a non-retriable error, and they also change the logic for offset commits to wait for confirmation that records have made it to Kafka before their offsets can be committed.

Tested by running Connect unit tests locally.
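
To make the described behavior concrete, here is a minimal, dependency-free sketch of the pattern. It is not the code from this PR. The class `SourceTaskSketch`, the nested `RecordSender` interface, the `String` records, and the use of `IllegalStateException` are all hypothetical stand-ins chosen so the example runs without Kafka on the classpath. Only the method name `maybeThrowProducerSendException()` is taken from the review discussion above. The send callback remembers the first failure, the send loop rethrows it before sending anything else, and only acknowledged records become eligible for an offset commit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

// Hypothetical sketch; names and types are illustrative, not Connect's real classes.
class SourceTaskSketch {

    // Stand-in for a producer: calls the callback with null on success,
    // or with the exception once a send has failed for good.
    interface RecordSender {
        void send(String record, Consumer<Exception> callback);
    }

    private final RecordSender sender;
    // First failure reported by a send callback, if any.
    private final AtomicReference<Exception> producerSendException = new AtomicReference<>();
    // Records whose delivery has been confirmed; only these may be committed.
    private final List<String> acked = new ArrayList<>();

    SourceTaskSketch(RecordSender sender) {
        this.sender = sender;
    }

    // Sends records in order, failing fast if an earlier send already failed.
    void sendRecords(List<String> records) {
        for (String record : records) {
            maybeThrowProducerSendException();
            sender.send(record, error -> {
                if (error != null) {
                    // Remember the failure instead of silently skipping the record.
                    producerSendException.compareAndSet(null, error);
                } else {
                    onAck(record);
                }
            });
        }
        // Surface a failure from the last record in the batch as well.
        maybeThrowProducerSendException();
    }

    // Rethrows a captured send failure so the caller can mark the task as failed.
    void maybeThrowProducerSendException() {
        Exception error = producerSendException.get();
        if (error != null) {
            throw new IllegalStateException("Producer failed to send a record", error);
        }
    }

    private synchronized void onAck(String record) {
        acked.add(record);
    }

    // Snapshot of the records that are safe to include in an offset commit.
    synchronized List<String> committableRecords() {
        return new ArrayList<>(acked);
    }
}
```

With a sender that rejects one record, `sendRecords` throws as soon as the failure is visible, later records are never sent, and `committableRecords()` contains only the records acknowledged before the failure. In a real worker the callback would typically run on the producer's I/O thread, which is why the sketch stores the failure in an `AtomicReference` rather than a plain field.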