Expose Kafka metrics for User Event consumer as json endpoint#4665

Merged
chetanmeh merged 1 commit into apache:master from chetanmeh:expose-kafka-metrics
Oct 7, 2019

Conversation

@chetanmeh
Member

@chetanmeh chetanmeh commented Oct 3, 2019

This PR exposes the metrics of a specific Kafka producer or consumer over the /metrics/kafka endpoint.

Description

For now this feature is enabled for the User Event service, where the metrics of the event consumer's Kafka consumer instance are exposed. These metrics provide insight into the Kafka consumer's processing stats.

The HTTP endpoint is not exposed to end users. It is meant for developers and system admins, who can access it directly to understand the performance characteristics of the consumer at any given time.
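Direct access could look like the following minimal sketch. It assumes the service is reachable over plain HTTP from inside the network; the helper names `kafka_metrics_url` and `fetch_kafka_metrics` and the address in the usage note are hypothetical and not part of this PR.

```python
import json
import urllib.request


def kafka_metrics_url(base_url):
    """Build the metrics URL from a service base URL (hypothetical helper)."""
    return base_url.rstrip("/") + "/metrics/kafka"


def fetch_kafka_metrics(base_url, timeout=5.0):
    """Issue a plain GET against /metrics/kafka and decode the JSON array."""
    with urllib.request.urlopen(kafka_metrics_url(base_url), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, `fetch_kafka_metrics("http://localhost:9095")` would return a list of metric entries, where the host and port are placeholders for wherever the User Event service listens in a given deployment.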

This PR also registers the KamonMetricsReporter (introduced via #4481) with the User Event consumer, so that some of these metrics are exposed as Kamon metrics as well.

A typical response from the /metrics/kafka endpoint looks like the following (the full response has more than 100 entries):

[
  {
    "description": "The number of responses received per second",
    "group": "consumer-node-metrics",
    "name": "response-rate",
    "tags": {
      "client-id": "event-consumer",
      "node-id": "node--1"
    },
    "value": 0.05848409232688709
  },
  {
    "description": "The total number of connections with successful authentication",
    "group": "consumer-metrics",
    "name": "successful-authentication-total",
    "tags": {
      "client-id": "event-consumer"
    },
    "value": 0
  },
  {
    "description": "The maximum size of any request sent.",
    "group": "consumer-metrics",
    "name": "request-size-max",
    "tags": {
      "client-id": "event-consumer"
    },
    "value": 179
  },
  {
    "description": "The total number of bytes consumed",
    "group": "consumer-fetch-manager-metrics",
    "name": "bytes-consumed-total",
    "tags": {
      "client-id": "event-consumer"
    },
    "value": 0
  }
]
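Since the response is a flat array, a consumer of the endpoint will probably want to index it. The sketch below is one way to do that, using two entries copied from the sample above. The function names `index_metrics` and `by_group` are hypothetical. Tags are part of the lookup key on the assumption that the same group and name can appear once per node, as the `node-id` tag suggests.

```python
import json

# Two entries copied from the sample response, reflowed for brevity.
SAMPLE = """
[
  {"description": "The number of responses received per second",
   "group": "consumer-node-metrics", "name": "response-rate",
   "tags": {"client-id": "event-consumer", "node-id": "node--1"},
   "value": 0.05848409232688709},
  {"description": "The maximum size of any request sent.",
   "group": "consumer-metrics", "name": "request-size-max",
   "tags": {"client-id": "event-consumer"},
   "value": 179}
]
"""


def index_metrics(entries):
    """Map (group, name, sorted tag pairs) -> value for quick lookups."""
    return {
        (e["group"], e["name"], tuple(sorted(e["tags"].items()))): e["value"]
        for e in entries
    }


def by_group(entries, group):
    """Return only the entries that belong to one metric group."""
    return [e for e in entries if e["group"] == group]


metrics = json.loads(SAMPLE)
index = index_metrics(metrics)
key = ("consumer-metrics", "request-size-max", (("client-id", "event-consumer"),))
print(index[key])  # 179
print([e["name"] for e in by_group(metrics, "consumer-node-metrics")])  # ['response-rate']
```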

Later we also plan to enable this feature for the Activation Persister Service (#4632).

Related issue and scope

  • I opened an issue to propose and discuss this change (#????)

My changes affect the following components

  • API
  • Controller
  • Message Bus (e.g., Kafka)
  • Loadbalancer
  • Invoker
  • Intrinsic actions (e.g., sequences, conductors)
  • Data stores (e.g., CouchDB)
  • Tests
  • Deployment
  • CLI
  • General tooling
  • Documentation

Types of changes

  • Bug fix (generally a non-breaking change which closes an issue).
  • Enhancement or new feature (adds new functionality).
  • Breaking change (a bug fix or enhancement which changes existing behavior).

Checklist:

  • I signed an Apache CLA.
  • I reviewed the style guides and followed the recommendations (Travis CI will check :).
  • I added tests to cover my changes.
  • My changes require further changes to the documentation.
  • I updated the documentation where necessary.

@chetanmeh
Member Author

For the complete metrics JSON, see kafka-metrics.txt.

@codecov-io

codecov-io commented Oct 3, 2019

Codecov Report

Merging #4665 into master will decrease coverage by 6.15%.
The diff coverage is 81.25%.

@@            Coverage Diff             @@
##           master    #4665      +/-   ##
==========================================
- Coverage   84.59%   78.43%   -6.16%     
==========================================
  Files         184      185       +1     
  Lines        8438     8454      +16     
  Branches      591      589       -2     
==========================================
- Hits         7138     6631     -507     
- Misses       1300     1823     +523
Impacted Files                                          Coverage Δ
...enwhisk/connector/kafka/KamonMetricsReporter.scala   0% <0%> (ø) ⬆️
...pache/openwhisk/connector/kafka/KafkaMetrics.scala   86.66% <86.66%> (ø)
...core/database/cosmosdb/RxObservableImplicits.scala   0% <0%> (-100%) ⬇️
...ore/database/cosmosdb/cache/CacheInvalidator.scala   0% <0%> (-100%) ⬇️
...core/database/cosmosdb/CosmosDBArtifactStore.scala   0% <0%> (-96.17%) ⬇️
...tabase/cosmosdb/cache/CacheInvalidatorConfig.scala   0% <0%> (-94.74%) ⬇️
...sk/core/database/cosmosdb/CosmosDBViewMapper.scala   0% <0%> (-92.6%) ⬇️
...e/database/cosmosdb/cache/ChangeFeedListener.scala   0% <0%> (-86.67%) ⬇️
...e/database/cosmosdb/cache/KafkaEventProducer.scala   0% <0%> (-76.48%) ⬇️
...whisk/core/database/cosmosdb/CosmosDBSupport.scala   0% <0%> (-74.08%) ⬇️
... and 10 more

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6139b00...a3b91dd. Read the comment docs.

Member

@rabbah rabbah left a comment

should the admin interfaces have some basic auth protection out of the box?

@chetanmeh
Member Author

should the admin interfaces have some basic auth protection out of the box?

For now I think it's fine. The two exposed endpoints serve metrics only (they support GET with no state change). Furthermore, such microservices are meant to be reachable only within the network, not from outside.

@chetanmeh chetanmeh merged commit 30813d0 into apache:master Oct 7, 2019
@chetanmeh chetanmeh deleted the expose-kafka-metrics branch October 7, 2019 09:22
BillZong pushed a commit to BillZong/openwhisk that referenced this pull request Nov 18, 2019