Skip to content

KAFKA-15556: Remove NetworkClientDelegate methods isUnavailable, maybeThrowAuthFailure, and tryConnect#15020

Closed
Phuc-Hong-Tran wants to merge 3 commits intoapache:trunkfrom
Phuc-Hong-Tran:trunk
Closed

KAFKA-15556: Remove NetworkClientDelegate methods isUnavailable, maybeThrowAuthFailure, and tryConnect#15020
Phuc-Hong-Tran wants to merge 3 commits intoapache:trunkfrom
Phuc-Hong-Tran:trunk

Conversation

@Phuc-Hong-Tran
Copy link
Copy Markdown
Contributor

Change:

*Refactored AbstractFetch so only Fetcher will check for node status when creating fetch requests.

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

@Phuc-Hong-Tran
Copy link
Copy Markdown
Contributor Author

Phuc-Hong-Tran commented Dec 15, 2023

@philipnee, I closed the other PR since closing and opening it again doesn't trigger the CI to start. Here is the context for this PR: #14406 (comment)

@Phuc-Hong-Tran
Copy link
Copy Markdown
Contributor Author

@philipnee, all builds succeed, test failures looks unrelated. PTAL, thanks.

@Phuc-Hong-Tran
Copy link
Copy Markdown
Contributor Author

@philipnee @kirktrue, Please take a look if you guys have some free time. Thanks in advance.

Copy link
Copy Markdown
Contributor

@philipnee philipnee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @Phuc-Hong-Tran - thanks for taking time raising this PR. I apologize for not writing the the jira ticket well here, so I recommend to rethink how you would approach this refactor. In short - we don't need to handle isUnavailable and auth failure in the request manager, so it is ok to make the default implementation to just return false, and perform no ops for both methods. The reason is networkClientDelegate already handle these scenario, just in a slight different order to the LegacyKafkaConsumer.

}

protected Map<Node, FetchSessionHandler.FetchRequestData> prepareCloseFetchSessionRequests() {
protected Map<Node, FetchSessionHandler.FetchRequestData> prepareCloseFetchSessionRequests(boolean checkNodeAvailability) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

final boolean checkNodeAvailability

final Node fetchTarget = cluster.nodeById(fetchTargetNodeId);

if (fetchTarget == null || isUnavailable(fetchTarget)) {
boolean fetchTargetAvailability = checkNodeAvailability ? (fetchTarget == null || isUnavailable(fetchTarget)) : fetchTarget == null;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

final boolean

  • i would also call this isFetchTargetAvailable
  • let's not use inline if. this makes the code harder to read.

log.trace("Skipping fetch for partition {} because node {} is awaiting reconnect backoff", partition, node);
} else if (nodesWithPendingFetchRequests.contains(node.id())) {
log.trace("Skipping fetch for partition {} because previous request to {} has not been processed", partition, node);
if (checkNodeAvailability) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it possible to avoid nested if?


return pollInternal(
prepareFetchRequests(),
prepareFetchRequests(checkNodeAvailability),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just pass false. the var above is useless.


return pollInternal(
prepareCloseFetchSessionRequests(),
prepareCloseFetchSessionRequests(checkNodeAvailability),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

*/
public synchronized int sendFetches() {
final Map<Node, FetchSessionHandler.FetchRequestData> fetchRequests = prepareFetchRequests();
boolean checkNodeAvailability = true;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above


client.backoff(node, 500);
assertEquals(0, sendFetches());
assertEquals(1, sendFetches());
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are you changing the tests?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we're not checking the node availablity using the networkClientDelegate in FetchRequestManager, the request that was made would pass through the check and made it way to unsentRequest. After the backoff the request should be sent already.

Should I not change the test and make new one instead?

* that have no existing requests in flight.
*/
protected Map<Node, FetchSessionHandler.FetchRequestData> prepareFetchRequests() {
protected Map<Node, FetchSessionHandler.FetchRequestData> prepareFetchRequests(boolean checkNodeAvailability) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

final boolean

@Phuc-Hong-Tran
Copy link
Copy Markdown
Contributor Author

Phuc-Hong-Tran commented Jan 3, 2024

@philipnee, Thanks for the comments. When you said I should rethink about the approach, are you suggesting that I should start over from the idea phase?

@philipnee
Copy link
Copy Markdown
Contributor

hi @Phuc-Hong-Tran - yes, the abstractFetch implementation is based on the LegacyKafkaConsumer and therefore requires connection probing. We don't need that in the AsyncKafkaConsumer as it is being done right before sending out the requests.

@Phuc-Hong-Tran
Copy link
Copy Markdown
Contributor Author

I understand, will get another PR out ASAP.

@Phuc-Hong-Tran
Copy link
Copy Markdown
Contributor Author

@philipnee, PTAL, thanks in advance.

@Phuc-Hong-Tran
Copy link
Copy Markdown
Contributor Author

@philipnee, is there anything else that I should change?

@philipnee
Copy link
Copy Markdown
Contributor

hi @Phuc-Hong-Tran - thanks for the PR. Could you also clean up the isUnavailable and maybeThrowAuthFailure methods in the networkClientDelegate. I believe they aren't being used anywhere.

@Phuc-Hong-Tran
Copy link
Copy Markdown
Contributor Author

@philipnee Will do

@Phuc-Hong-Tran
Copy link
Copy Markdown
Contributor Author

@philipnee All done

@kirktrue
Copy link
Copy Markdown
Contributor

@Phuc-Hong-Tran—is this PR ready for review?

@Phuc-Hong-Tran
Copy link
Copy Markdown
Contributor Author

@kirktrue, it's ready for review

@github-actions
Copy link
Copy Markdown

This PR is being marked as stale since it has not had any activity in 90 days. If you would like to keep this PR alive, please ask a committer for review. If the PR has merge conflicts, please update it with the latest from trunk (or appropriate release branch)

If this PR is no longer valid or desired, please feel free to close it. If no activity occurs in the next 30 days, it will be automatically closed.

@github-actions github-actions Bot added the stale Stale PRs label Jun 25, 2024
@mjsax mjsax added the consumer label Jul 22, 2024
@philipnee philipnee added the KIP-848 The Next Generation of the Consumer Rebalance Protocol label Jul 23, 2024
* @return
*/
@Override
protected boolean isUnavailable(Node node) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does seem a bit like half a refactor. Would it be possible to remove isUnavailable and maybeThrowAuthFailure entirely from AbstractFetch?

@github-actions github-actions Bot removed the stale Stale PRs label Jan 5, 2025
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Apr 5, 2025

This PR is being marked as stale since it has not had any activity in 90 days. If you
would like to keep this PR alive, please leave a comment asking for a review. If the PR has
merge conflicts, update it with the latest from the base branch.

If you are having difficulty finding a reviewer, please reach out on the [mailing list](https://kafka.apache.org/contact).

If this PR is no longer valid or desired, please feel free to close it. If no activity occurs in the next 30 days, it will be automatically closed.

@github-actions github-actions Bot added the stale Stale PRs label Apr 5, 2025
@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 5, 2025

This PR has been closed since it has not had any activity in 120 days. If you feel like this
was a mistake, or you would like to continue working on it, please feel free to re-open the
PR and ask for a review.

@github-actions github-actions Bot added the closed-stale PRs that were closed due to inactivity label May 5, 2025
@github-actions github-actions Bot closed this May 5, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

clients closed-stale PRs that were closed due to inactivity consumer KIP-848 The Next Generation of the Consumer Rebalance Protocol stale Stale PRs

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants