KAFKA-13354: Topic metrics count request rate inconsistently with other request metrics #11474

Closed
dongjinleekr wants to merge 9 commits into apache:trunk from dongjinleekr:feature/KAFKA-13354

@dongjinleekr
Contributor

I inspected this issue a little bit and found the following:

  1. There are 18 metrics in kafka.server:type=BrokerTopicMetrics. 4 of them should be counted per request but are actually counted per TopicPartition:

    • kafka.server:type=BrokerTopicMetrics,name=TotalProduceRequestsPerSec
    • kafka.server:type=BrokerTopicMetrics,name=TotalFetchRequestsPerSec
    • kafka.server:type=BrokerTopicMetrics,name=FailedProduceRequestsPerSec
    • kafka.server:type=BrokerTopicMetrics,name=FailedFetchRequestsPerSec
  2. 5 of them are omitted from the documentation (see: KAFKA-13436: Omitted BrokerTopicMetrics metrics in the documentation #11473):

    • kafka.server:type=BrokerTopicMetrics,name=TotalProduceRequestsPerSec
    • kafka.server:type=BrokerTopicMetrics,name=TotalFetchRequestsPerSec
    • kafka.server:type=BrokerTopicMetrics,name=FailedProduceRequestsPerSec
    • kafka.server:type=BrokerTopicMetrics,name=FailedFetchRequestsPerSec
    • kafka.server:type=BrokerTopicMetrics,name=BytesRejectedPerSec
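
To illustrate the counting difference described in item 1, here is a minimal sketch. Counter, TopicStats, TopicPartition, and RequestCounting are hypothetical stand-ins written for this example; they are not Kafka's BrokerTopicStats or its meter classes, and the per-partition variant is only an assumption about the current behavior based on the description above.

```scala
import scala.collection.mutable

// Hypothetical stand-in for a meter: it only counts mark() calls.
final class Counter {
  private var n = 0L
  def mark(): Unit = n += 1
  def count: Long = n
}

// Hypothetical stand-in for per-topic stats plus an all-topics aggregate.
final class TopicStats {
  private val perTopic = mutable.Map.empty[String, Counter]
  val allTopics = new Counter
  def topic(name: String): Counter = perTopic.getOrElseUpdate(name, new Counter)
}

// Hypothetical, simplified topic-partition key for this sketch.
final case class TopicPartition(topic: String, partition: Int)

object RequestCounting {
  // Assumed problematic behavior: one mark per TopicPartition in the request.
  def markPerPartition(stats: TopicStats, request: Seq[TopicPartition]): Unit =
    request.foreach { tp =>
      stats.topic(tp.topic).mark()
      stats.allTopics.mark()
    }

  // Intended behavior: one mark per request for each distinct topic,
  // and one mark per request for the all-topics counter.
  def markPerRequest(stats: TopicStats, request: Seq[TopicPartition]): Unit = {
    request.map(_.topic).distinct.foreach(t => stats.topic(t).mark())
    stats.allTopics.mark()
  }
}
```

For a single request touching foo-0, foo-1, and bar-0, the per-partition variant marks foo twice and the all-topics counter three times, while the per-request variant marks foo, bar, and the all-topics counter once each.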

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

@dongjinleekr
Contributor Author

@splett2 Is this approach what you had in mind?

Contributor

@splett2 left a comment

Thanks for the PR, left a few comments. Can we add some ReplicaManager unit tests?

Contributor

We would want this metric to be incremented once per request instead of once per TopicPartition (TP).
Ditto for totalProduceRequestRate, totalFetchRequestRate, and failedFetchRequestRate.

Otherwise the allTopicsStats will be inconsistent with the sum of the individual topic stats.

Contributor Author

(After some thinking) Oh, you are right. I omitted them. 😓

Comment on lines 961 to 965
Contributor

nit: maybe something like
requestedTopics.map(brokerTopicStats.topicStats).foreach(_.totalProduceRequestRate.mark())
flows a bit better.

Member

The downside of using 'map' is that it will create an intermediate collection.

Contributor

@splett2 Nov 8, 2021

This is a good point, although I would imagine that the set of topics per request is usually small. In that case it may be fine to keep the update code as is. Maybe change:

case topic => to just topic =>, as I don't think the case is necessary when iterating over a set.

Contributor Author

All: in short, just replace case topic => with topic =>; do I understand correctly?
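
The variants discussed in this thread can be compared in a small sketch. Counter, MarkVariants, and the statsFor function are hypothetical names for this example; statsFor stands in for a per-topic lookup such as brokerTopicStats.topicStats, and nothing here is taken from the actual patch.

```scala
// Hypothetical stand-in for a meter: it only counts mark() calls.
final class Counter {
  private var n = 0L
  def mark(): Unit = n += 1
  def count: Long = n
}

object MarkVariants {
  // Variant from the nit: map first builds an intermediate Set[Counter],
  // then foreach marks each element. Note that mapping a Set yields a Set,
  // so elements that compare equal would collapse into one.
  def viaMap(topics: Set[String], statsFor: String => Counter): Unit =
    topics.map(statsFor).foreach(_.mark())

  // Plain lambda: marks each topic's counter without an intermediate collection.
  def viaForeach(topics: Set[String], statsFor: String => Counter): Unit =
    topics.foreach(topic => statsFor(topic).mark())

  // Pattern-matching lambda: behaves the same as viaForeach here, because
  // `case topic` matches every element; the `case` keyword adds nothing.
  def viaCase(topics: Set[String], statsFor: String => Counter): Unit =
    topics.foreach { case topic => statsFor(topic).mark() }
}
```

With distinct counters per topic, all three variants mark each requested topic exactly once; they differ only in allocation and readability, which matches the suggestion to keep foreach and drop the case keyword.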

@dongjinleekr
Contributor Author

@dajac @splett2 Here is the update. I updated/added some tests to check whether the metrics work correctly.

In addition, I found that some methods and variables in ReplicaManagerTest are shared across tests rather than separated per test, such as the topic name foo. Once this PR is merged, I will open an additional PR to improve this.

@github-actions

This PR is being marked as stale since it has not had any activity in 90 days. If you
would like to keep this PR alive, please leave a comment asking for a review. If the PR has
merge conflicts, update it with the latest from the base branch.

If you are having difficulty finding a reviewer, please reach out on the [mailing list](https://kafka.apache.org/contact).

If this PR is no longer valid or desired, please feel free to close it. If no activity occurs in the next 30 days, it will be automatically closed.

@github-actions bot added the stale (Stale PRs) label Dec 25, 2024
@github-actions

This PR has been closed since it has not had any activity in 120 days. If you feel like this
was a mistake, or you would like to continue working on it, please feel free to re-open the
PR and ask for a review.

@github-actions bot added the closed-stale (PRs that were closed due to inactivity) label Jan 25, 2025
@github-actions bot closed this Jan 25, 2025