
Conversation

@merlimat
Contributor

Motivation

Expose Pulsar broker metrics in a format suitable for the Prometheus monitoring platform.

Modifications

Added a new REST endpoint (/admin/broker-stats/metrics-prometheus) that can be called to fetch the stats.

Metrics are returned in the text exposition format defined at https://prometheus.io/docs/instrumenting/exposition_formats/
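
For illustration, a minimal sketch of what such an endpoint could look like with JAX-RS. The class name, method name, and sample metric lines below are made up for the example, not the actual Pulsar implementation:

```java
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.Produces;
import javax.ws.rs.core.MediaType;
import javax.ws.rs.core.StreamingOutput;

@Path("/broker-stats")
public class MetricsEndpointSketch {

    @GET
    @Path("/metrics-prometheus")
    @Produces(MediaType.TEXT_PLAIN) // Prometheus text exposition format
    public StreamingOutput getMetricsForPrometheus() {
        return output -> {
            Writer writer = new OutputStreamWriter(output, StandardCharsets.UTF_8);
            // Illustrative sample lines; the real endpoint streams all broker,
            // namespace, and topic metrics in this format.
            writer.write("# TYPE pulsar_topics_count gauge\n");
            writer.write("pulsar_topics_count{cluster=\"standalone\",namespace=\"public/default\"} 4\n");
            writer.flush();
        };
    }
}
```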

Result

Prometheus can easily be used to monitor a Pulsar cluster and create time-series dashboards for namespaces.

@merlimat merlimat added the type/feature label Mar 14, 2017
@merlimat merlimat added this to the 1.17 milestone Mar 14, 2017
@merlimat merlimat self-assigned this Mar 14, 2017
@merlimat
Contributor Author

/cc @agarman. I also have some dashboards in the works.

@agarman

agarman commented Mar 15, 2017

That's great. Sorry my team's participation is currently in limbo.

@merlimat merlimat modified the milestones: 1.17, 1.18 Mar 31, 2017
Contributor

@rdhabalia rdhabalia left a comment


LGTM, just minor comments.

Contributor


should we also check isSorted(other.boundaries)?

Contributor Author


Since the other is also a StatsBuckets, the sorted check has already been performed in its own constructor.
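
For context, a hedged sketch of the invariant being relied on here; the class shape and names are illustrative, not the actual StatsBuckets code. Boundaries are validated once at construction, so any other instance's boundaries are already known to be sorted:

```java
import java.util.Arrays;

public class StatsBucketsSketch {
    private final long[] boundaries;

    public StatsBucketsSketch(long... boundaries) {
        // Single point of validation: every instance holds sorted boundaries.
        if (!isSorted(boundaries)) {
            throw new IllegalArgumentException("Bucket boundaries must be sorted");
        }
        this.boundaries = Arrays.copyOf(boundaries, boundaries.length);
    }

    public void addAll(StatsBucketsSketch other) {
        // No need to re-check other.boundaries: the constructor already
        // enforced the invariant when 'other' was created.
        if (!Arrays.equals(boundaries, other.boundaries)) {
            throw new IllegalArgumentException("Bucket boundaries must match");
        }
        // ... merge counts ...
    }

    private static boolean isSorted(long[] values) {
        for (int i = 1; i < values.length; i++) {
            if (values[i] < values[i - 1]) {
                return false;
            }
        }
        return true;
    }
}
```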

Contributor


NamespaceStatsAggregator already has a thread-local localNamespaceStats, so what if we make PrometheusMetricsGenerator a singleton and move NamespaceStatsAggregator to a class variable? That way we don't have to create new PrometheusMetricsGenerator and NamespaceStatsAggregator objects every time. Would that be a problem?

Contributor Author


Fixed, made everything static.
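
For illustration, a minimal sketch (with made-up class and field names) of the pattern being discussed: a static entry point plus a thread-local, reusable object, so nothing new is allocated per request:

```java
public class PrometheusMetricsGeneratorSketch {

    // One reusable buffer per calling thread; reset before each use instead of
    // allocating a new object on every request.
    private static final ThreadLocal<StringBuilder> localBuffer =
            ThreadLocal.withInitial(StringBuilder::new);

    public static String generate() {
        StringBuilder buf = localBuffer.get();
        buf.setLength(0);
        // The real generator aggregates per-namespace stats here; this sketch
        // just emits a single illustrative line.
        buf.append("pulsar_topics_count{cluster=\"standalone\"} 4\n");
        return buf.toString();
    }
}
```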

Contributor


If pulsar().getExecutor() has 20 threads in its pool and each of them keeps an AggregatedNamespaceStats as a thread-local, does it make sense to use an executor with fewer threads, to reduce the number of stats objects held in thread-locals?

Contributor Author


Good point. We could have a single-thread executor for such background tasks. I was looking at the /broker-stats/metrics endpoint, and we're doing the generation from the web thread, which could be 1 out of 32 :)
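
A hedged sketch of that idea, assuming a dedicated single-thread executor (names are illustrative): only one thread ever runs the generation, so at most one set of thread-local stats objects exists and web-server threads are not tied up.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class MetricsTaskExecutorSketch {

    // Single thread dedicated to background stats generation.
    private static final ExecutorService METRICS_EXECUTOR =
            Executors.newSingleThreadExecutor(r -> {
                Thread t = new Thread(r, "prometheus-metrics-generator");
                t.setDaemon(true);
                return t;
            });

    public static CompletableFuture<String> generateAsync() {
        // The supplier stands in for the real metrics generation; because it
        // always runs on the same thread, only that thread ever retains a
        // thread-local stats object.
        return CompletableFuture.supplyAsync(
                () -> "pulsar_topics_count{cluster=\"standalone\"} 4\n",
                METRICS_EXECUTOR);
    }
}
```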

Contributor


I think it should be i < buckets.length

Contributor Author


Good catch, fixed.
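
For clarity, a tiny self-contained illustration of the loop-bound fix (the array and method names are made up):

```java
public class BucketLoopSketch {

    static long sumBuckets(long[] buckets) {
        long total = 0;
        // Correct bound: i < buckets.length. A smaller bound would drop the
        // last bucket; i <= buckets.length would read past the end.
        for (int i = 0; i < buckets.length; i++) {
            total += buckets[i];
        }
        return total;
    }
}
```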

@merlimat
Contributor Author

@rdhabalia Updated, PTAL

Contributor

@rdhabalia rdhabalia left a comment


LGTM.. just minor comments. 👍

@SuppressWarnings("restriction")
@Override
public double get() {
    // Reports the configured maximum direct memory (JDK-internal API)
    return sun.misc.VM.maxDirectMemory();
}
Contributor


Should we use JvmMetrics.getJvmDirectMemoryUsed(), since we know that sun.misc.VM.maxDirectMemory() may give a wrong result?

Contributor Author


This is to get the configured max direct memory. For the used memory, I'm using JvmMetrics.getJvmDirectMemoryUsed() a couple of lines above :)

Contributor


Oops.. my bad.
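
For reference, a hedged, portable way to read the *used* direct memory with standard JMX beans. This is not Pulsar's JvmMetrics helper, just an illustration of the same idea; the *configured maximum* is the separate sun.misc.VM.maxDirectMemory() gauge shown in the excerpt above.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;

public class DirectMemoryUsedSketch {

    /** Approximate bytes of direct (off-heap) buffer memory currently in use. */
    static long directMemoryUsed() {
        long used = 0;
        for (BufferPoolMXBean pool :
                ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                used += pool.getMemoryUsed();
            }
        }
        return used;
    }
}
```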

import javax.ws.rs.PathParam;
import javax.ws.rs.Produces;
import javax.ws.rs.WebApplicationException;
import javax.ws.rs.container.AsyncResponse;
Contributor


It seems there are multiple unused imports in this class.

Contributor Author


Fixed

@merlimat merlimat merged commit e93a4c3 into apache:master May 31, 2017
hrsakai pushed a commit to hrsakai/pulsar that referenced this pull request Dec 10, 2020
hangc0276 pushed a commit to hangc0276/pulsar that referenced this pull request May 26, 2021
michaeljmarshall added a commit that referenced this pull request Sep 7, 2022
### Motivation

When a Pulsar topic is unloaded from a broker, certain metrics related to that topic will appear to remain active for the broker for 5 minutes. This is confusing for troubleshooting because it makes the topic appear to be owned by multiple brokers for a short period of time. See below for a way to reproduce this behavior.

In order to solve this "zombie" metric problem, I propose we remove the timestamps that get exported with each Prometheus metric served by the broker.

### Analysis

Since we introduced Prometheus metrics in #294, we have exported a timestamp along with most metrics. This is an optional, valid part of the spec defined [here](https://prometheus.io/docs/instrumenting/exposition_formats/#comments-help-text-and-type-information). However, after our adoption of Prometheus metrics, the Prometheus project released version 2.0 with a significant improvement to its concept of staleness. In short, before 2.0, a metric that appeared in one scrape but not the next (which often happens for topics that are unloaded) would essentially inherit its most recent value for a 5-minute window; if there is no sample in the past 5 minutes, the metric becomes "stale" and isn't reported. Starting in 2.0, new logic considers a value stale the very first time it is not reported in a scrape. Importantly, this new behavior is only available if you do not export timestamps with metrics, as documented here: https://prometheus.io/docs/prometheus/latest/querying/basics/#staleness. We want to use the new behavior because it gives better insight into all topic metrics, which are subject to move between brokers at any time.
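
For concreteness, the difference on the wire is just an optional trailing millisecond timestamp on each sample line. A small illustration (the topic name, value, and timestamp are made up):

```java
public class TimestampedSampleSketch {
    public static void main(String[] args) {
        // With an explicit timestamp: Prometheus honors it (when honor_timestamps
        // is true) and the pre-2.0 5-minute staleness window applies.
        String withTimestamp =
                "pulsar_in_messages_total{topic=\"persistent://public/default/t\"} 42 1662500000000";
        // Without a timestamp: Prometheus assigns the scrape time, and the 2.x
        // per-scrape staleness handling applies as soon as the series disappears.
        String withoutTimestamp =
                "pulsar_in_messages_total{topic=\"persistent://public/default/t\"} 42";
        System.out.println(withTimestamp);
        System.out.println(withoutTimestamp);
    }
}
```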

This presentation https://www.youtube.com/watch?v=GcTzd2CLH7I and slide deck https://promcon.io/2017-munich/slides/staleness-in-prometheus-2-0.pdf document the feature in detail. This blog post was also helpful: https://www.robustperception.io/staleness-and-promql/.

Additional motivation comes from mailing list threads like this one https://groups.google.com/g/prometheus-users/c/8OFAwp1OEcY. It says:

> Note, however, that adding timestamps is an extremely niche use case. Most of the users who think they need it should actually not do it.
>
> The main use cases within that tiny niche are federation and mirroring the data from another monitoring system.

The Prometheus Go client also indicates a similar motivation: https://pkg.go.dev/github.com/prometheus/client_golang/prometheus#NewMetricWithTimestamp.

The OpenMetrics project also recommends against exporting timestamps: https://github.com/OpenObservability/OpenMetrics/blob/main/specification/OpenMetrics.md#exposing-timestamps.

As such, I think we are not a niche use case, and we should not add timestamps to our metrics.

### Reproducing the problem

1. Run any 2.x version of Prometheus (I used 2.31.0) along with the following scrape config:
```yaml
  - job_name: broker
    honor_timestamps: true
    scrape_interval: 30s
    scrape_timeout: 10s
    metrics_path: /metrics
    scheme: http
    follow_redirects: true
    static_configs:
      - targets: ["localhost:8080"]
```
2. Start pulsar standalone on the same machine. I used a recently compiled version of master.
3. Publish messages to a topic.
4. Observe the `pulsar_in_messages_total` metric for the topic in the Prometheus UI (localhost:9090).
5. Stop the producer.
6. Unload the topic from the broker.
7. Optionally, `curl` the metrics endpoint to verify that the topic’s `pulsar_in_messages_total` metric is no longer reported.
8. Watch the metric continue to be reported in Prometheus for 5 additional minutes.

When you set `honor_timestamps: false`, the metric stops getting reported right after the topic is unloaded, which is the desired behavior.

### Modifications

* Remove all timestamps from metrics
* Fix affected tests and test files (some of those tests were in the proxy and the function worker, but no code was changed for those modules)

### Verifying this change

This change is accompanied by updated tests.

### Does this pull request potentially affect one of the following parts:

This is technically a breaking change to the metrics, though I would consider it a bug fix at this point. I will discuss it on the mailing list to ensure it gets proper visibility.

Given how frequently Pulsar changes which metrics are exposed between each scrape, I think this is an important fix that should be cherry picked to older release branches. Technically, we can avoid cherry picking this change if we advise users to set `honor_timestamps: false`. However, I think it is better to just remove them.

### Documentation
- [x] `doc-not-needed`
Technoboy- pushed a commit that referenced this pull request Feb 8, 2023
liangyepianzhou pushed a commit that referenced this pull request Feb 10, 2023
coderzc pushed a commit that referenced this pull request Feb 28, 2023
nicoloboschi referenced this pull request in datastax/pulsar Feb 28, 2023
Annavar-satish pushed a commit to pandio-com/pulsar that referenced this pull request Mar 6, 2023