Skip to content

Conversation

@christophermichaeljohnston
Copy link
Contributor

Remove runner_id and runner_name from metrics as these are unique for each ephemeral runner cause a new metric for every action.

fixes #3670

@christophermichaeljohnston
Copy link
Contributor Author

job_workflow_ref also causes horrible high cardinality so I removed it

@christophermichaeljohnston
Copy link
Contributor Author

christophermichaeljohnston commented Oct 25, 2024

The result of this are metrics that function as actual counters:

gha_completed_jobs_total{enterprise="",event_name="push",job_name="test",job_result="succeeded",organization="...",repository="..."} 3

@IanMoroney
Copy link

@nikola-jokic , this is another cardinality fix for metrics.
Could it be reviewed and merged?

@Link- Link- self-assigned this Dec 4, 2024
@Link- Link- added the attention Requires attention label Dec 4, 2024
@Link- Link- added this to the gha-runner-scale-set-0.10.0 milestone Dec 4, 2024
@jameshounshell
Copy link

jameshounshell commented Feb 14, 2025

If possible this would be great if it could be implemented as a command line flag to change what labels are included in the metric. I think that runner_id and runner_name should be removed by default, but I would like to keep job_workflow_ref.

This issue demonstrates how we could have a sane default of labels (I'll concede job_workflow_ref to be off) but I'd like to be able to enable it for my use case.

Another great use case is reducing the cardinality of the histograms gha_job_execution_duration and gha_job_startup_duration which currently have 48 buckets. It might be good to be able to supply our own buckets until a consensus is reached about fewer, more reasonable, buckets.

edit: The old runners included job_workflow_ref and though high'ish cardinaility there are only a finite amount of workflow names, not increasing forever like runner_id. re: #3185

@jameshounshell
Copy link

Small update, last week I deployed code identical to this (minus removing job_workflow_ref), plus the bucket fixes and it's been working nicely.

@nikola-jokic nikola-jokic self-requested a review as a code owner February 25, 2025 09:10
@nikola-jokic
Copy link
Collaborator

Hey, I think this is a very good addition. I agree, these labels are just causing problems. Thanks for testing it @jameshounshell !

@rekha-prakash-maersk
Copy link

rekha-prakash-maersk commented Feb 26, 2025

@nikola-jokic It would be really helpful in terms of user experience and performance if this could be added to next release ? we run the SelfHosted runner in large scale and the experience is not great because the metrics are missing often due to high memory usage of listener pod and listeners restarting few times day due to cpu and memory pileup, though I have allocated more resources to it.

@nikola-jokic
Copy link
Collaborator

Hey @rekha-prakash-maersk,

That is why I fixed the linting errors myself, so I can be sure this will go in for the next release ☺️

Copy link
Collaborator

@nikola-jokic nikola-jokic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@nikola-jokic nikola-jokic merged commit ddc872d into actions:master Feb 26, 2025
15 checks passed
@rekha-prakash-maersk
Copy link

@nikola-jokic awesome, thanks! can't wait for next release!!

@jameshounshell
Copy link

Sorry I was bogged down and missed this. I still wanted to keep job_workflow_ref despite it's high cardinality (not infinite). I'll work up a PR to make it an optional label. Maybe we can discuss it's merits on that PR.

@nikola-jokic
Copy link
Collaborator

Hey @jameshounshell,

We are starting to work on metrics customizations. We are already designing a solution so please wait for that and we would appreciate your feedback!

Okabe-Junya added a commit to mercari/actions-runner-controller that referenced this pull request Jan 5, 2026
* Updates: runner to v2.318.0 container-hooks to v0.6.1 (actions#3684)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Updates: runner to v2.319.0 (actions#3702)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Updates: runner to v2.319.1 (actions#3708)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Bassem Dghaidi <568794+Link-@users.noreply.github.com>

* Add exponential backoff when generating runner reg tokens (actions#3724)

* Updates: runner to v2.320.0 (actions#3763)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Updates: runner to v2.321.0 container-hooks to v0.6.2 (actions#3809)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Fix ARC e2e tests (actions#3836)

* Make EphemeralRunnerController MaxConcurrentReconciles configurable (actions#3832)

Co-authored-by: Bassem Dghaidi <568794+Link-@users.noreply.github.com>

* Make EphemeralRunnerReconciler create runner pods earlier (actions#3831)

Co-authored-by: Bassem Dghaidi <568794+Link-@users.noreply.github.com>

* Bump github.com/bradleyfalzon/ghinstallation/v2 from 2.8.0 to 2.12.0 (actions#3837)

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Bassem Dghaidi <568794+Link-@users.noreply.github.com>

* Update docs with details for the dashboard visualizations (actions#3696)

Co-authored-by: Bassem Dghaidi <568794+Link-@users.noreply.github.com>

* Make k8s client rate limiter parameters configurable (actions#3848)

Co-authored-by: Taketoshi Fujiwara <t-b-fujiwara@mercari.com>

* Bump golang.org/x/crypto from 0.22.0 to 0.31.0 (actions#3844)

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Bassem Dghaidi <568794+Link-@users.noreply.github.com>

* Prepare `0.10.0` release (actions#3849)

* Fix helm chart bug related to `runnerMaxConcurrentReconciles` (actions#3858)

* Prepare `0.10.1` release (actions#3859)

* Update dependabot config to group packages (& include actions eco) (actions#3880)

* Fix template tests and add go test on gha-validate-chart (actions#3886)

* cmd/ghalistener/config: export Validate (actions#3870)

Co-authored-by: Han-Wen Nienhuys <hanwenn@gmail.com>

* Updated dead link (actions#3830)

Co-authored-by: Nikola Jokic <jokicnikola07@gmail.com>

* docs: end markdown code block correctly (actions#3736)

* Clarify syntax for `githubConfigSecret` (actions#3812)

Co-authored-by: Bassem Dghaidi <568794+Link-@users.noreply.github.com>

* Bump golang.org/x/net from 0.25.0 to 0.33.0 (actions#3881)

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Nikola Jokic <jokicnikola07@gmail.com>

* Updates: runner to v2.322.0 (actions#3893)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Sanitize labels ending in hyphen, underscore, and dot (actions#3664)

* metrics cardinality for ghalistener (actions#3671)

Co-authored-by: Bassem Dghaidi <568794+Link-@users.noreply.github.com>
Co-authored-by: Nikola Jokic <jokicnikola07@gmail.com>

* Rename log from target/actual to build/autoscalingRunnerSet version (actions#3957)

* Use Ready from the pod conditions when setting it to the EphemeralRunner (actions#3891)

* AutoscalingRunnerSet env: not Rendering correctly (actions#3826)

Co-authored-by: Nikola Jokic <jokicnikola07@gmail.com>

* Drop verbose flag from runner scale set init-dind-externals copy (actions#3805)

* Include custom annotations and labels to all resources created by `gha-runner-scale-set` chart (actions#3934)

* Remove old githubrunnerscalesetlistener, remove warning and fix config bug (actions#3937)

* Wrap errors in controller helper methods and swap logic in cleanups (actions#3960)

* Clean up as much as possible in a single pass for the EphemeralRunner reconciler (actions#3941)

* Use gha-runner-scale-set-controller.chart instead of .Chart.Version (actions#3729)

Co-authored-by: Nikola Jokic <jokicnikola07@gmail.com>

* Trim volume and container helpers in gha-runner-scale-set (actions#3807)

Co-authored-by: Bassem Dghaidi <568794+Link-@users.noreply.github.com>

* Small readme updates for readability  (actions#3860)

* Update all dependencies, conforming to the new controller-runtime API (actions#3949)

* feat: allow namespace overrides (actions#3797)

Signed-off-by: Jesús Fernández <7312236+fernandezcuesta@users.noreply.github.com>
Co-authored-by: Nikola Jokic <jokicnikola07@gmail.com>

* chore: Added `OwnerReferences` during resource creation for `EphemeralRunnerSet`, `EphemeralRunner`, and `EphemeralRunnerPod` (actions#3575)

* Updates: runner to v2.323.0 (actions#3976)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Bump github.com/golang-jwt/jwt/v4 from 4.5.1 to 4.5.2 (actions#3984)

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Add events role permission to leader_election_role (actions#3988)

* Create configurable metrics (actions#3975)

* Prepare 0.11.0 release (actions#3992)

* Fix busy runners metric (actions#4016)

* Bump the gomod group across 1 directory with 7 updates (actions#4008)

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Nikola Jokic <jokicnikola07@gmail.com>

* Include more context to errors raised by github/actions client (actions#4032)

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Pin third party actions (actions#3981)

* upgrade(golangci-lint): v2.1.2  (actions#4023)

Signed-off-by: karamaru-alpha <mrnk3078@gmail.com>

* Revised dashboard (actions#4022)

* feat(helm): move dind to sidecar (actions#3842)

* Fix code block fences (actions#3140)

Co-authored-by: Mosè Giordano <giordano@users.noreply.github.com>

* Add missing backtick to metrics.serviceMonitor.namespace line to correct formatting (actions#3790)

* Bump go version (actions#4075)

* Create backoff mechanism for failed runners and allow re-creation of failed ephemeral runners (actions#4059)

* Updates: runner to v2.324.0 container-hooks to v0.7.0 (actions#4086)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Fix docker lint warnings (actions#4074)

* Relax version requirements to allow patch version mismatch (actions#4080)

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Refactor resource naming removing unnecessary calculations (actions#4076)

* Allow use of client id as an app id (actions#4057)

* Updates: runner to v2.325.0 (actions#4109)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Add job_workflow_ref label to listener metrics (actions#4054)

Signed-off-by: rskmm0chang <rskmm0chang@hatena.ne.jp>

* Bump github.com/cloudflare/circl from 1.6.0 to 1.6.1 (actions#4118)

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Add startup probe to dind side-car (actions#4117)

* Avoid nil point when config.Metrics is nil and expose all metrics if none are configured (actions#4101)

Co-authored-by: Nikola Jokic <jokicnikola07@gmail.com>

* Add response body to error when fetching access token (actions#4005)

Co-authored-by: mluffman <mluffman@thoughtmachine.net>
Co-authored-by: Nikola Jokic <jokicnikola07@gmail.com>

* Delete config secret when listener pod gets deleted (actions#4033)

Co-authored-by: Nikola Jokic <jokicnikola07@gmail.com>

* Azure Key Vault integration to resolve secrets (actions#4090)

* Bump github.com/golang-jwt/jwt/v5 from 5.2.1 to 5.2.2 (actions#4120)

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Prepare 0.12.0 release (actions#4122)

* Bump build-push-action to 6.18.0 (actions#4123)

* Remove cache for build-push-action (actions#4124)

* Fix indentation of startupProbe attributes in dind sidecar (actions#4126)

* Fix dind sidecar template (actions#4128)

* Remove duplicate float64 call (actions#4139)

* Remove check if runner exists after exit code 0 (actions#4142)

* Explicitly requeue during backoff ephemeral runner (actions#4152)

* Prepare 0.12.1 release (actions#4153)

* Update CodeQL workflow for v3 (global-run-codeql.yaml) (actions#4157)

* Bump the actions group across 1 directory with 5 updates (actions#4160)

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* feat(runner): add ubuntu 24.04 support (actions#3598)

* Fix image pull secrets list arguments in the chart (actions#4164)

* Remove workflow actions version comments since upgrades are done via dependabot (actions#4161)

* Updates: runner to v2.326.0 (actions#4176)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Update example GitHub URLs in values.yaml to include an example for enterprise account-level runners (actions#4181)

* Add Missing Languages to CodeQL Advanced Configuration (actions#4179)

* Updates: runner to v2.327.0 (actions#4185)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Remove deprecated preserveUnknownFields from CRDs (actions#4135)

* Updates: runner to v2.327.1 (actions#4188)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Remove JIT config from ephemeral runner status field (actions#4191)

* Fix usage of underscore in Runner Scale Set name (actions#3545)

Co-authored-by: Nikola Jokic <jokicnikola07@gmail.com>

* Bump docker/login-action from 3.4.0 to 3.5.0 in the actions group (actions#4196)

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump actions/checkout from 4 to 5 in the actions group (actions#4205)

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Updates: runner to v2.328.0 (actions#4209)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Requeue if create pod returns already exists error (actions#4201)

* docs: fix repo path typo (actions#4229)

* Update CODEOWNERS (actions#4251)

* Update CODEOWNERS to include new maintainer (actions#4253)

* Remove ephemeral runner when exit code != 0 and is patched with the job (actions#4239)

* Add workflow name and target labels (actions#4240)

* Bump the actions group across 1 directory with 5 updates (actions#4262)

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Introduce new kubernetes-novolume mode (actions#4250)

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Ensure ephemeral runner is deleted from the service on exit != 0 (actions#4260)

* docs: fix broken Grafana dashboard JSON path (actions#4270)

* Potential fix for code scanning alert no. 3: Workflow does not contain permissions (actions#4273)

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>

* Potential fix for code scanning alert no. 1: Workflow does not contain permissions (actions#4274)

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com>
Co-authored-by: jiaren-wu <190862939+jiaren-wu@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Bump all dependencies (actions#4266)

* Bump the gomod group across 1 directory with 4 updates (actions#4277)

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Prepare 0.13.0 release (actions#4280)

* Revert "gha: customize client-go rate limiter params (#4)"

This reverts commit 8728190.
Keep several instrumentations

* Revert "gha: make MaxConcurrentReconciles for each reconciler configurable (#1)"

This reverts commit 057a1e7.

* chore(chart): bump version to 0.13.0-rc.1 for testing

---------

Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Jesús Fernández <7312236+fernandezcuesta@users.noreply.github.com>
Signed-off-by: karamaru-alpha <mrnk3078@gmail.com>
Signed-off-by: rskmm0chang <rskmm0chang@hatena.ne.jp>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Bassem Dghaidi <568794+Link-@users.noreply.github.com>
Co-authored-by: Yusuke Kuoka <ykuoka@gmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Ken Muse <kenmuse@users.noreply.github.com>
Co-authored-by: Taketoshi Fujiwara <t-b-fujiwara@mercari.com>
Co-authored-by: Rob Herley <robherley@github.com>
Co-authored-by: Nikola Jokic <jokicnikola07@gmail.com>
Co-authored-by: Han-Wen Nienhuys <hanwen@engflow.com>
Co-authored-by: Han-Wen Nienhuys <hanwenn@gmail.com>
Co-authored-by: Matteo Bianchi <37507190+mbianchidev@users.noreply.github.com>
Co-authored-by: James Ward <james@notjam.es>
Co-authored-by: John Wesley Walker III <81404201+jww3@users.noreply.github.com>
Co-authored-by: &es <84567633+and-es@users.noreply.github.com>
Co-authored-by: Chris Johnston <christophermichaeljohnston@gmail.com>
Co-authored-by: thinkbiggerltd <46003245+thinkbiggerltd@users.noreply.github.com>
Co-authored-by: Cees-Jan Kiewiet <ceesjank@gmail.com>
Co-authored-by: Mikey Smet <26899585+Mikey032@users.noreply.github.com>
Co-authored-by: Patrick Vickery <167798809+pvickery-ParamountCommerce@users.noreply.github.com>
Co-authored-by: Salman Chishti <salmanmkc@GitHub.com>
Co-authored-by: J. Fernández <7312236+fernandezcuesta@users.noreply.github.com>
Co-authored-by: kahirokunn <okinakahiro@gmail.com>
Co-authored-by: David Maxwell <davidmaxwell77@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Ryosei Karaki <38310693+karamaru-alpha@users.noreply.github.com>
Co-authored-by: Borislav Velkov <velkov.borislav@gmail.com>
Co-authored-by: Mosè Giordano <765740+giordano@users.noreply.github.com>
Co-authored-by: Mosè Giordano <giordano@users.noreply.github.com>
Co-authored-by: scodef <52595068+scodef@users.noreply.github.com>
Co-authored-by: Ryo Sakamoto <4330349+rskmm0chang@users.noreply.github.com>
Co-authored-by: Tingluo Huang <tingluohuang@github.com>
Co-authored-by: Nash Luffman <nashluffman@gmail.com>
Co-authored-by: mluffman <mluffman@thoughtmachine.net>
Co-authored-by: Wim Fournier <github@fournier.nl>
Co-authored-by: Jeev B <jeevb@users.noreply.github.com>
Co-authored-by: Mark Huijgen <48476507+mhuijgen@users.noreply.github.com>
Co-authored-by: calx <108835591+null-calx@users.noreply.github.com>
Co-authored-by: adjn <104127038+adjn@users.noreply.github.com>
Co-authored-by: Ho Kim <ho.kim@ulagbulag.io>
Co-authored-by: Cory Calahan <corycalahan@github.com>
Co-authored-by: Kylie Stradley <4666485+KyFaSt@users.noreply.github.com>
Co-authored-by: Alex Hatzenbuhler <hatz@hey.com>
Co-authored-by: clechevalli <56869366+clechevalli@users.noreply.github.com>
Co-authored-by: zkpepe <shim.growers_6u@icloud.com>
Co-authored-by: Dennis Stone <densto88@github.com>
Co-authored-by: Berat Postalcioglu <bpstlcgl@gmail.com>
Co-authored-by: Jiaren Wu <jiaren-wu@github.com>
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com>
Co-authored-by: jiaren-wu <190862939+jiaren-wu@users.noreply.github.com>
Co-authored-by: Junya Okabe <junya@mercari.com>
unpollito pushed a commit to DistruApp/actions-runner-controller that referenced this pull request Jan 21, 2026
Co-authored-by: Bassem Dghaidi <568794+Link-@users.noreply.github.com>
Co-authored-by: Nikola Jokic <jokicnikola07@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

attention Requires attention gha-runner-scale-set Related to the gha-runner-scale-set mode

Projects

None yet

Development

Successfully merging this pull request may close these issues.

ghalistener high cardinality metrics

6 participants