Skip to content

KAFKA-5235: GetOffsetShell: new KafkaConsumer API, support for multiple topics, minimized number of requests to server#3051

Closed
tashoyan wants to merge 44 commits intoapache:trunkfrom
tashoyan:trunk
Closed

KAFKA-5235: GetOffsetShell: new KafkaConsumer API, support for multiple topics, minimized number of requests to server#3051
tashoyan wants to merge 44 commits intoapache:trunkfrom
tashoyan:trunk

Conversation

@tashoyan
Copy link
Copy Markdown
Contributor

@tashoyan tashoyan commented May 14, 2017

This PR addresses two improvements:

KAFKA-5235 GetOffsetShell: retrieve offsets for all given topics and partitions with single request to the broker
KAFKA-5234 GetOffsetShell: retrieve offsets for multiple topics with single request

  1. Previous implementation used SimpleConsumer to get offsets and old Producer API to get topic/partition metadata. Previous implementation determined a leader broker for each partition and then requested the leader for offsets. In total, it did as many requests to the broker as the number of partitions (plus a request to Zookeeper for metadata).
    New implementation kafka-get-offsets.sh uses KafkaConsumer API. It makes at most two requests to the broker: 1) to query existing topics and partitions, 2) to grab all requested offsets. New implementation correctly handles non-existing topics and partitions asked by user:

kafka-get-offsets.sh --bootstrap-servers vm:9092 --topics AAA,ZZZ --partitions 0,1
AAA:0:7
AAA:1:Partition not found
ZZZ:0:Topic not found

  1. Previously, user could get offsets for one topic only. Now user can get offsets for many topics at once:
    kafka-get-offsets.sh --bootstrap-servers vm:9092 --topics AAA,ZZZ
    Moreover, now user is able to retrieve offsets for all topics - this is the default when no topics specified:
    kafka-get-offsets.sh --bootstrap-servers vm:9092
    Thanks to this feature, there is no need anymore to retrieve all topics by means of kafka-topics.sh.
    When no topics specified, the new kafka-get-offsets.sh tool takes into account only user-level topics and ignores Kafka-internal topics (i.e. consumer offsets). This behavior can be altered via a special command line argument:
    kafka-get-offsets.sh --bootstrap-servers vm:9092 --include-internal-topics

  2. New kafka-get-offsets.sh tool is consistent with other console tools with respect to command line argument names. In addition, kafka-get-offsets.sh tool gives the possibility to pass an arbitrary setting to KafkaConsumer via --consumer-property argument. Old command line arguments are marked as deprecated with appropriate warning messages.

I hope, now kafka-get-offsets.sh is easier in use and gives performance improvement.
@junrao I suppose you may want to review.

tashoyan added 30 commits May 1, 2017 22:52
…is new implementation. Now it works for replicated topics.
… are available in the key. Remove unused constant.
…e tools. Deprecate old args and display warnings.
…g topic. Update tests to distinguish the two error handling paths.
@asfgit
Copy link
Copy Markdown

asfgit commented Jun 16, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/kafka-pr-jdk8-scala2.12/5378/
Test PASSed (JDK 8 and Scala 2.12).

@tashoyan
Copy link
Copy Markdown
Contributor Author

@guozhangwang would you review this PR please? I hope it won't take much time. Why keep it awaiting in the queue for months?

@ijuma
Copy link
Copy Markdown
Member

ijuma commented Jun 22, 2017

@tashoyan Thanks for the PR and sorry for the delay. Have you checked if the ConsumerGroupCommand doesn't have the required functionality?

@tashoyan
Copy link
Copy Markdown
Contributor Author

@ijuma ConsumerGroupCommand is a different tool. It presents information on a per-consumer-group basis. It shows the offset of the specified consumer group in the specified topic. ConsumerGroupCommand cannot show the latest offset in the topic or the offset by a timestamp. Also it cannot show offsets for multiple topics.

@tashoyan
Copy link
Copy Markdown
Contributor Author

@junrao @guozhangwang @gwenshap @ijuma may I kindly ask you to review this PR?

@asfgit
Copy link
Copy Markdown

asfgit commented Jul 31, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/kafka-pr-jdk7-scala2.11/6440/
Test FAILed (JDK 7 and Scala 2.11).

@asfgit
Copy link
Copy Markdown

asfgit commented Jul 31, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/kafka-pr-jdk8-scala2.12/6425/
Test FAILed (JDK 8 and Scala 2.12).

@asfgit
Copy link
Copy Markdown

asfgit commented Jul 31, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/kafka-pr-jdk7-scala2.11/6444/
Test PASSed (JDK 7 and Scala 2.11).

@asfgit
Copy link
Copy Markdown

asfgit commented Jul 31, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/kafka-pr-jdk8-scala2.12/6429/
Test PASSed (JDK 8 and Scala 2.12).

@alexsavio
Copy link
Copy Markdown

Is there any other way of getting offset information without GetOffsetShell? If not, this is a quite important PR, isn't it? Right now, GetOffsetShell doesn't work with SASL_SSL, and I can't find any solution for this. Thanks for the amazing work!

@tashoyan
Copy link
Copy Markdown
Contributor Author

tashoyan commented Sep 7, 2017

I also think that this PR is important. We need this functionality for our monitoring tools and for performance metrics. I hope, finally we will have it in the product.

@ijuma
Copy link
Copy Markdown
Member

ijuma commented May 26, 2018

@tashoyan Sorry for the delay. Since you have made changes to the command line parameters, can you please submit a KIP (https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals)?

@tashoyan
Copy link
Copy Markdown
Contributor Author

@ijuma I don't have access to create KIP. Could you give it to me? I wrote to dev mailgroup, but no answer.

@tashoyan
Copy link
Copy Markdown
Contributor Author

tashoyan commented Jun 1, 2018

KIP-308

@ijuma
Copy link
Copy Markdown
Member

ijuma commented Jun 14, 2018

Thanks @tashoyan. I did a minimal PR for inclusion in 2.0.0 and added you as a co-author:

#5220

ijuma added a commit to ijuma/kafka that referenced this pull request Jun 14, 2018
This does the minimal amount of work so that the tool
relies on public non-deprecated APIs (i.e. so that
it doesn't rely on old clients code).

Additional improvements have been proposed via
KIP-308.

There are a few other PRs that touch this class with
overlapping goals:

- apache#2891
- apache#3051
- apache#3320

One of them remains relevant in the context of KIP-308, but
the others are not. I included the authors of the 3 PRs as
co-authors.

Co-authored-by: Arseniy Tashoyan <tashoyan@gmail.com>
Co-authored-by: Vahid Hashemian <vahidhashemian@us.ibm.com>
Co-authored-by: Mohammed Amine GARMES
Co-authored-by: Ismael Juma <ismael@juma.me.uk>
ijuma added a commit to ijuma/kafka that referenced this pull request Jun 14, 2018
This does the minimal amount of work so that the tool
relies on public non-deprecated APIs (i.e. so that
it doesn't rely on old clients code).

Additional improvements have been proposed via
KIP-308.

There are a few other PRs that touch this class with
overlapping goals:

- apache#2891
- apache#3051
- apache#3320

One of them remains relevant in the context of KIP-308, but
the others are not. I included the authors of the 3 PRs as
co-authors.

Co-authored-by: Arseniy Tashoyan <tashoyan@gmail.com>
Co-authored-by: Vahid Hashemian <vahidhashemian@us.ibm.com>
Co-authored-by: Mohammed Amine GARMES
Co-authored-by: Ismael Juma <ismael@juma.me.uk>
ijuma added a commit that referenced this pull request Jun 14, 2018
This does the minimal amount of work so that the tool
relies on public non-deprecated APIs (i.e. it no longer
relies on Scala clients code).

Additional improvements (not included here) have
been proposed via KIP-308.

There are a few other PRs that touch this class with
overlapping goals:

- #2891
- #3051
- #3320

One of them remains relevant in the context of KIP-308, but
the others have been superseded. I included the authors of
the 3 PRs as co-authors.

Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>, Vahid Hashemian <vahidhashemian@us.ibm.com>, Manikumar Reddy <manikumar.reddy@gmail.com>

Co-authored-by: Arseniy Tashoyan <tashoyan@gmail.com>
Co-authored-by: Vahid Hashemian <vahidhashemian@us.ibm.com>
Co-authored-by: Mohammed Amine GARMES
Co-authored-by: Ismael Juma <ismael@juma.me.uk>
vahidhashemian added a commit to vahidhashemian/kafka that referenced this pull request Jun 14, 2018
This does the minimal amount of work so that the tool
relies on public non-deprecated APIs (i.e. it no longer
relies on Scala clients code).

Additional improvements (not included here) have
been proposed via KIP-308.

There are a few other PRs that touch this class with
overlapping goals:

- apache#2891
- apache#3051
- apache#3320

One of them remains relevant in the context of KIP-308, but
the others have been superseded. I included the authors of
the 3 PRs as co-authors.

Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>, Vahid Hashemian <vahidhashemian@us.ibm.com>, Manikumar Reddy <manikumar.reddy@gmail.com>

Co-authored-by: Arseniy Tashoyan <tashoyan@gmail.com>
Co-authored-by: Vahid Hashemian <vahidhashemian@us.ibm.com>
Co-authored-by: Mohammed Amine GARMES
Co-authored-by: Ismael Juma <ismael@juma.me.uk>
@tashoyan
Copy link
Copy Markdown
Contributor Author

Thanks @ijuma
Should I close this PR?

@ijuma
Copy link
Copy Markdown
Member

ijuma commented Jun 16, 2018

Maybe the easiest is to close and submit a new one once the KIP has been voted.

ying-zheng pushed a commit to ying-zheng/kafka that referenced this pull request Jul 6, 2018
This does the minimal amount of work so that the tool
relies on public non-deprecated APIs (i.e. it no longer
relies on Scala clients code).

Additional improvements (not included here) have
been proposed via KIP-308.

There are a few other PRs that touch this class with
overlapping goals:

- apache#2891
- apache#3051
- apache#3320

One of them remains relevant in the context of KIP-308, but
the others have been superseded. I included the authors of
the 3 PRs as co-authors.

Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>, Vahid Hashemian <vahidhashemian@us.ibm.com>, Manikumar Reddy <manikumar.reddy@gmail.com>

Co-authored-by: Arseniy Tashoyan <tashoyan@gmail.com>
Co-authored-by: Vahid Hashemian <vahidhashemian@us.ibm.com>
Co-authored-by: Mohammed Amine GARMES
Co-authored-by: Ismael Juma <ismael@juma.me.uk>
@dajac
Copy link
Copy Markdown
Member

dajac commented Oct 24, 2020

Closing as this is addressed in #9430 (KIP-635).

@dajac dajac closed this Oct 24, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants