Please try to fill out as much of the information below as you can. Thank you!
Which version contains the bug?
0.2.0 & development
Describe the bug
Hello there
We run check_prometheus in ‘alert -P’ mode because we only want problems in the output. Without ‘-P’, the SMS/email notification gets too long, since all 100+ alerts are listed whenever an error occurs.
We compared version ‘v0.2.0’ with ‘development’ and noticed that their ‘alert’ output differs. In one respect both behave the same: when every alert in our Prometheus is ‘Inactive’ (i.e. OK), ‘alert -P’ returns ‘UNKNOWN’ with exit code 3.
Can you check if this is the same for you?
Thanks for your great work
How to recreate the bug?
Version: 0.2.0 (Check against a cluster without alerts)
./check_prometheus_v0.2.0_Linux_x86_64 -H metrics.clean-cluster.example.com -p 80 alert
[OK] - Alerts inactive | total=123 firing=0 pending=0 inactive=123
./check_prometheus_v0.2.0_Linux_x86_64 -H metrics.clean-cluster.example.com -p 80 alert -P
[UNKNOWN] - 0 Alerts: 0 Firing - 0 Pending - 0 Inactive
|
# echo $?
3
Version: 0.2.0 (Check against a cluster with a problem)
./check_prometheus_v0.2.0_Linux_x86_64 -H metrics.some-probelm-cluster.example.com -p 80 alert
[CRITICAL] - 126 Alerts: 1 Firing - 0 Pending - 125 Inactive
\_[OK] [FluxHelmReleaseNotReady] is inactive
\_[OK] [FluxGitRepositorySyncFailed] is inactive
...
\_[CRITICAL] [Watchdog] is firing - value: 1.00
\_[OK] [InfoInhibitor] is inactive
..
\_[OK] [K3sCertificateExpiration] is inactive
| total=126 firing=1 pending=0 inactive=125
./check_prometheus_v0.2.0_Linux_x86_64 -H metrics.some-probelm-cluster.example.com -p 80 alert -P
[CRITICAL] - 1 Alerts: 1 Firing - 0 Pending - 0 Inactive
\_[CRITICAL] [Watchdog] is firing - value: 1.00
|
Version: development (Check against a cluster without alerts)
./check_prometheus_develop -H metrics.clean-cluster.example.com -p 80 alert
[OK] - 125 Alerts: 0 Firing - 0 Pending - 125 Inactive
\_ [OK] [FluxHelmReleaseNotReady] is inactive
\_ [OK] [FluxGitRepositorySyncFailed] is inactive
\_ [OK] [InfoInhibitor] is inactive
..
\_ [OK] [K3sCertificateExpiration] is inactive
|total=125 firing=0 pending=0 inactive=125
./check_prometheus_develop -H metrics.clean-cluster.example.com -p 80 alert -P
[UNKNOWN] - 0 Alerts: 0 Firing - 0 Pending - 0 Inactive
# echo $?
3
Version: development (Check against a cluster with a problem)
./check_prometheus_develop -H metrics.some-probelm-cluster.example.com -p 80 alert
[CRITICAL] - 126 Alerts: 1 Firing - 0 Pending - 125 Inactive
\_ [OK] [FluxHelmReleaseNotReady] is inactive
\_ [OK] [FluxGitRepositorySyncFailed] is inactive
..
\_ [CRITICAL] [Watchdog] is firing - value: 1.00
\_ [OK] [InfoInhibitor] is inactive
..
\_ [OK] [K3sCertificateExpiration] is inactive
|total=126 firing=1 pending=0 inactive=125
./check_prometheus_develop -H metrics.some-probelm-cluster.example.com -p 80 alert -P
[CRITICAL] - 1 Alerts: 1 Firing - 0 Pending - 0 Inactive
\_ [CRITICAL] [Watchdog] is firing - value: 1.00
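Until this is clarified, a wrapper along the following lines might serve as a stopgap on our side. It is only a sketch: the function name remap_problems_only is made up for illustration and is not part of check_prometheus, and it assumes the summary line looks exactly like the ‘alert -P’ output shown above. It remaps only that one "0 Alerts" UNKNOWN result to OK and passes everything else through unchanged. It cannot tell "all alerts inactive" apart from "no alert rules loaded at all", so it may hide a real problem.

```shell
#!/bin/sh
# Hypothetical wrapper helper; remap_problems_only is not part of check_prometheus.
# Arguments: $1 = plugin output, $2 = plugin exit code.
remap_problems_only() {
    output=$1
    status=$2
    case "$status:$output" in
        # Only the exact "no alerts left after filtering" summary is remapped.
        3:"[UNKNOWN] - 0 Alerts: 0 Firing - 0 Pending - 0 Inactive"*)
            printf '%s\n' "[OK] - 0 Alerts: 0 Firing - 0 Pending - 0 Inactive"
            return 0
            ;;
    esac
    # Everything else (CRITICAL, WARNING, other UNKNOWNs) passes through unchanged.
    printf '%s\n' "$output"
    return "$status"
}
```

In a wrapper script, the plugin output would be captured with `output=$(./check_prometheus_develop -H metrics.clean-cluster.example.com -p 80 alert -P)` and then handed over with `remap_problems_only "$output" "$?"`, so the wrapper exits with the remapped status.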