Skip to content

Fix Prometheus server crash on listening to already used port#1986

Merged
lalitb merged 12 commits intoopen-telemetry:mainfrom
lalitb:prom-server-error
Feb 16, 2023
Merged

Fix Prometheus server crash on listening to already used port#1986
lalitb merged 12 commits intoopen-telemetry:mainfrom
lalitb:prom-server-error

Conversation

@lalitb
Copy link
Copy Markdown
Member

@lalitb lalitb commented Feb 15, 2023

Fixes #1903

Changes

Please provide a brief description of the changes here.

  • Handle exception thrown by CivetServer (used by prometheus-cpp) if the port to listen is already used.
  • Update documentation for Prometheus example (remove reference of PeriodicPushMetricReader) as per prometheus design changes in Convert Prometheus Exporter to Pull MetricReader #1953.
  • Change default Prometheus listening port to 9464 in example.

It's difficult to add unit-test to simulate port usage. So tested changes as below:

Step 1: Start a http server at particular port (say 8000):

$python3 -m http.server --cgi 8000
Serving HTTP on 0.0.0.0 port 8000 (http://0.0.0.0:8000/) ...

Step 2: Try running Prometheus example to invoke Prometheus exporter on this used port 8000:

$ ./prometheus_example all localhost:8000
PrometheusExporter example program running ...
[Error] File: /<path>/opentelemetry-cpp/exporters/prometheus/src/exporter.cc:26 [Prometheus Exporter] Can't initialize prometheus exposer with endpoint: localhost:8000
Error: null context when constructing CivetServer. Possible problem binding to port.
...
[Warning] File: /<path>/opentelemetry-cpp/sdk/src/metrics/metric_reader.cc:47 MetricReader::Shutdown - Cannot invoke shutdown twice!
$

This fails gracefully.

Step 3: Try running Prometheus example to invoke Prometheus exporter on some unused port (say 8001):

$ ./prometheus_example all localhost:8001
PrometheusExporter example program running ...
...
$

This is successful.

For significant contributions please make sure you have completed the following items:

  • CHANGELOG.md updated for non-trivial changes
  • Unit tests have been added
  • Changes in public API reviewed

@lalitb lalitb requested a review from a team February 15, 2023 07:43
@lalitb lalitb changed the title Fix Prometheus server crash on listening to already used port ( #1903) Fix Prometheus server crash on listening to already used port Feb 15, 2023
@codecov
Copy link
Copy Markdown

codecov Bot commented Feb 15, 2023

Codecov Report

Merging #1986 (074ce2d) into main (e47dec5) will increase coverage by 0.03%.
The diff coverage is n/a.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #1986      +/-   ##
==========================================
+ Coverage   87.11%   87.13%   +0.03%     
==========================================
  Files         166      166              
  Lines        4597     4597              
==========================================
+ Hits         4004     4005       +1     
+ Misses        593      592       -1     
Impacted Files Coverage Δ
sdk/src/trace/batch_span_processor.cc 91.48% <0.00%> (+0.78%) ⬆️

Copy link
Copy Markdown
Member

@marcalff marcalff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, setting the exporter to shutdown looks like the best solution.

Comment thread exporters/prometheus/src/collector.cc Outdated
@esigo esigo added the ok-to-merge The PR is ok to merge (has two approves or raised by a maintainer/approver and has one approve) label Feb 15, 2023
Copy link
Copy Markdown
Member

@esigo esigo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM
Thanks for the fix :)

@lalitb lalitb merged commit a6211fa into open-telemetry:main Feb 16, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ok-to-merge The PR is ok to merge (has two approves or raised by a maintainer/approver and has one approve)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Prometheus example crashes

3 participants