[metrics] Introduce metrics API #2

Merged
YapingLi04 merged 3 commits into main from metricsapi on Feb 9, 2026

@YapingLi04 YapingLi04 commented Oct 3, 2025

See the design doc by @ikruglov

This PR introduces the metrics API framework, adds some basic system-wide and per-unit metrics, and adds a basic CLI. The PR is broken into two commits, as described below.

First commit

The first commit includes:

  • Metrics API definitions
  • Code to set up the varlink server
  • The describe method, which shows all the metric families
  • The list method, which lists all the metrics
  • Type definitions related to MetricFamily
  • Common code to build JSON objects
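
As a rough illustration of what a client of the varlink server would exchange, here is a minimal sketch of varlink message framing: each message is a JSON object terminated by a single NUL byte, and a call may set "more" to ask for a stream of replies. The method name io.systemd.Metrics.List and the helper names are assumptions made for this example only; the PR text does not spell out the interface name.

```python
import json


def encode_call(method, parameters=None, more=False):
    """Serialize one varlink call: a JSON object followed by a NUL byte."""
    msg = {"method": method, "parameters": parameters or {}}
    if more:
        msg["more"] = True  # ask the server to stream several replies
    return json.dumps(msg).encode("utf-8") + b"\0"


def decode_replies(buffer):
    """Split a byte buffer into complete NUL-terminated replies.

    Returns (replies, rest), where rest holds any incomplete trailing bytes
    that should be kept until more data arrives.
    """
    *frames, rest = buffer.split(b"\0")
    return [json.loads(frame) for frame in frames if frame], rest


# Hypothetical method name, used only for illustration.
call = encode_call("io.systemd.Metrics.List", more=True)
```

A real client would write `call` to the server's socket and feed the bytes it reads back through `decode_replies` until a reply arrives without "continues" set.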

Second commit

The second commit adds some basic metrics, a basic CLI (systemd-report) that
lists the metrics, and integration tests.

System-wide metrics:

  • units_by_type_total
  • units_by_state_total

Two per-unit metrics:

  • unit_active_state
  • unit_load_state

A service state metric:

  • nrestarts

Deviations from the original design

  • Introduced a top-level "object" field for ease of filtering. Instead of carrying the unit inside fields, as in fields: { unit: "foo", unit_type: "service" }, a metric that refers to a unit now has object: foo.service as a top-level field.
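
To show why a top-level "object" helps with filtering, here is a small sketch that selects everything reported for one unit with a single comparison. The entries mirror the shapes in the sample outputs of this PR; the helper name metrics_for_object is hypothetical.

```python
metrics = [
    {"name": "io.systemd.Manager.unit_active_state",
     "object": "multi-user.target", "value": "active"},
    {"name": "io.systemd.Manager.unit_load_state",
     "object": "multi-user.target", "value": "loaded"},
    {"name": "io.systemd.Manager.nrestarts",
     "object": "user@0.service", "value": 0},
    # System-wide metrics carry no "object", only "fields".
    {"name": "io.systemd.Manager.units_by_type_total",
     "value": 52, "fields": {"type": "target"}},
]


def metrics_for_object(entries, obj):
    """Keep only entries whose top-level "object" equals obj."""
    return [entry for entry in entries if entry.get("object") == obj]


selected = metrics_for_object(metrics, "multi-user.target")
# Map the short metric name (text after the last dot) to its value.
by_name = {entry["name"].rsplit(".", 1)[-1]: entry["value"] for entry in selected}
```

With the earlier layout, the same lookup would have needed to combine two values from "fields" (unit and unit_type) before comparing.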

Sample outputs

units_by_type_total:

{
        "name" : "io.systemd.Manager.units_by_type_total",
        "value" : 52,
        "fields" : {
                "type" : "target"
        }
}
{
        "name" : "io.systemd.Manager.units_by_type_total",
        "value" : 82,
        "fields" : {
                "type" : "device"
        }
}
{
        "name" : "io.systemd.Manager.units_by_type_total",
        "value" : 2,
        "fields" : {
                "type" : "automount"
        }
}

units_by_state_total:

{
        "name" : "io.systemd.Manager.units_by_state_total",
        "value" : 216,
        "fields" : {
                "state" : "active"
        }
}
{
        "name" : "io.systemd.Manager.units_by_state_total",
        "value" : 0,
        "fields" : {
                "state" : "reloading"
        }
}
{
        "name" : "io.systemd.Manager.units_by_state_total",
        "value" : 120,
        "fields" : {
                "state" : "inactive"
        }
}

unit_active_state:

{
        "name" : "io.systemd.Manager.unit_active_state",
        "object" : "multi-user.target",
        "value" : "active"
}
{
        "name" : "io.systemd.Manager.unit_active_state",
        "object" : "systemd-sysusers.service",
        "value" : "inactive"
}

unit_load_state:

{
        "name" : "io.systemd.Manager.unit_load_state",
        "object" : "multi-user.target",
        "value" : "loaded"
}

nrestarts:

{
        "name" : "io.systemd.Manager.nrestarts",
        "object" : "user@0.service",
        "value" : 0
}
{
        "name" : "io.systemd.Manager.nrestarts",
        "object" : "user-runtime-dir@0.service",
        "value" : 0
}
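
The sample outputs above are a sequence of pretty-printed JSON objects rather than a single JSON array, so a plain json.loads on the whole text would fail. The following is a minimal sketch of one way to consume such output, assuming the text has already been captured into a string; the helper names iter_json_objects and totals_by_field are made up for this example.

```python
import json
from collections import defaultdict

SAMPLE = """
{
        "name" : "io.systemd.Manager.units_by_state_total",
        "value" : 216,
        "fields" : {
                "state" : "active"
        }
}
{
        "name" : "io.systemd.Manager.units_by_state_total",
        "value" : 0,
        "fields" : {
                "state" : "reloading"
        }
}
{
        "name" : "io.systemd.Manager.nrestarts",
        "object" : "user@0.service",
        "value" : 0
}
"""


def iter_json_objects(text):
    """Yield each top-level JSON value from a concatenated stream."""
    decoder = json.JSONDecoder()
    pos = 0
    while True:
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            return
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


def totals_by_field(entries, name, field):
    """Sum "value" per distinct fields[field] for the metric called name."""
    totals = defaultdict(int)
    for entry in entries:
        if entry.get("name") == name and field in entry.get("fields", {}):
            totals[entry["fields"][field]] += entry["value"]
    return dict(totals)


entries = list(iter_json_objects(SAMPLE))
by_state = totals_by_field(
    entries, "io.systemd.Manager.units_by_state_total", "state")
```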

@YapingLi04 YapingLi04 force-pushed the metricsapi branch 4 times, most recently from f3f0f23 to a58483a Compare October 3, 2025 13:34
@YapingLi04 YapingLi04 changed the title [metrics api] Introduce metrics API [metrics] Introduce metrics API Oct 3, 2025
@YapingLi04 YapingLi04 force-pushed the metricsapi branch 4 times, most recently from a08c235 to d752c3d Compare October 7, 2025 18:10
@YapingLi04 YapingLi04 force-pushed the metricsapi branch 4 times, most recently from dcf9f85 to 22c1de9 Compare October 24, 2025 11:17
@YapingLi04 YapingLi04 force-pushed the metricsapi branch 3 times, most recently from fb58f07 to d522db9 Compare October 30, 2025 14:41
@YapingLi04 YapingLi04 force-pushed the metricsapi branch 2 times, most recently from 435e127 to 0d405ec Compare November 9, 2025 14:02
@YapingLi04 YapingLi04 force-pushed the metricsapi branch 3 times, most recently from 067174f to f4f0de4 Compare November 25, 2025 11:42
@YapingLi04 YapingLi04 force-pushed the metricsapi branch 6 times, most recently from ed64627 to 9d0b19e Compare January 21, 2026 17:52
@YapingLi04 YapingLi04 force-pushed the metricsapi branch 2 times, most recently from 7ae31ca to 56eeda4 Compare February 2, 2026 12:54
This commit introduces the shared code for the metrics API framework:

- Metrics API definitions
- Code to set up the varlink server
- The describe method, which shows all the metric families
- The list method, which lists all the metrics
- Type definitions related to MetricFamily
- Common code to build JSON objects
This commit adds some basic metrics and integration tests.

System-wide metrics:
- units_by_type_total: target/device/automount etc.
- units_by_state_total: active/reloading/inactive etc.

Two per-unit metrics which show the current state of a unit:
- unit_active_state
- unit_load_state

A metric for service state:
- nrestarts

systemd-report will list all the metrics.
@YapingLi04 YapingLi04 merged commit e047394 into main Feb 9, 2026
33 of 44 checks passed
YapingLi04 pushed a commit that referenced this pull request Mar 9, 2026
Fix a typo which causes a segfault when processing a user record
with matchHostname when it's an array instead of a simple string:

$ echo '{"userName":"crashhostarray","perMachine":[{"matchHostname":["host1","host2"],"locked":false}]}' | userdbctl -F -
Segmentation fault         (core dumped)

$ coredumpctl info
...
       Message: Process 1172301 (userdbctl) of user 1000 dumped core.

                Module libz.so.1 from rpm zlib-ng-2.3.3-1.fc43.x86_64
                Module libcrypto.so.3 from rpm openssl-3.5.4-2.fc43.x86_64
                Stack trace of thread 1172301:
                #0  0x00007fded7b3a656 __strcmp_evex (libc.so.6 + 0x159656)
                #1  0x00007fded7e95397 per_machine_hostname_match (libsystemd-shared-260.so + 0x295397)
                #2  0x00007fded7e955b5 per_machine_match (libsystemd-shared-260.so + 0x2955b5)
                #3  0x00007fded7e957c6 dispatch_per_machine (libsystemd-shared-260.so + 0x2957c6)
                #4  0x00007fded7e96c97 user_record_load (libsystemd-shared-260.so + 0x296c97)
                #5  0x000000000040572d display_user (/home/fsumsal/repos/@systemd/systemd/build/userdbctl + 0x572d)
                #6  0x00007fded7ea9727 dispatch_verb (libsystemd-shared-260.so + 0x2a9727)
                #7  0x000000000041077c run (/home/fsumsal/repos/@systemd/systemd/build/userdbctl + 0x1077c)
                #8  0x00000000004107ce main (/home/fsumsal/repos/@systemd/systemd/build/userdbctl + 0x107ce)
                #9  0x00007fded79e45b5 __libc_start_call_main (libc.so.6 + 0x35b5)
                #10 0x00007fded79e4668 __libc_start_main@@GLIBC_2.34 (libc.so.6 + 0x3668)
                #11 0x00000000004038d5 _start (/home/fsumsal/repos/@systemd/systemd/build/userdbctl + 0x38d5)
                ELF object binary architecture: AMD x86-64
YapingLi04 pushed a commit that referenced this pull request Mar 9, 2026
The fido2_hmac_salt/fido2_hmac_credential/recovery_key fields kept
leaking memory as the array itself wasn't deallocated after deallocating
each of its elements' data:

$ build-san/userdbctl -F fuzz-corpus-userdb/auth-fido2.json
...
=================================================================
==1292840==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 112 byte(s) in 1 object(s) allocated from:
    #0 0x7f56f00e5e4b in realloc.part.0 (/lib64/libasan.so.8+0xe5e4b) (BuildId: 25975f766867e9e604dc5a71a8befeaed3301942)
    #1 0x7f56ed869e42 in greedy_realloc ../src/basic/alloc-util.c:65
    #2 0x7f56ed7ff5e9 in dispatch_fido2_hmac_salt ../src/shared/user-record.c:836
    #3 0x7f56edd73cbc in sd_json_dispatch_full ../src/libsystemd/sd-json/sd-json.c:5204
    #4 0x7f56edd745fc in sd_json_dispatch ../src/libsystemd/sd-json/sd-json.c:5276
    #5 0x7f56ed80100b in dispatch_privileged ../src/shared/user-record.c:998
    #6 0x7f56edd73cbc in sd_json_dispatch_full ../src/libsystemd/sd-json/sd-json.c:5204
    #7 0x7f56edd745fc in sd_json_dispatch ../src/libsystemd/sd-json/sd-json.c:5276
    #8 0x7f56ed80622c in user_record_load ../src/shared/user-record.c:1697
    #9 0x000000408c15 in display_user ../src/userdb/userdbctl.c:447
    #10 0x7f56ed83cc9a in dispatch_verb ../src/shared/verbs.c:137
    #11 0x00000041df2b in run ../src/userdb/userdbctl.c:1908
    #12 0x00000041dfbe in main ../src/userdb/userdbctl.c:1911
    #13 0x7f56ec8105b4 in __libc_start_call_main (/lib64/libc.so.6+0x35b4) (BuildId: 2b5beec0fd24fe9c9f43eddfdd5facf0b8a1b805)
    #14 0x7f56ec810667 in __libc_start_main@@GLIBC_2.34 (/lib64/libc.so.6+0x3667) (BuildId: 2b5beec0fd24fe9c9f43eddfdd5facf0b8a1b805)
    #15 0x000000404a44 in _start (/home/fsumsal/repos/@systemd/systemd/build-san/userdbctl+0x404a44) (BuildId: 19e8b7e7b7038d2cea20bc18a55bea2a9e4406d5)

Direct leak of 64 byte(s) in 1 object(s) allocated from:
    #0 0x7f56f00e5e4b in realloc.part.0 (/lib64/libasan.so.8+0xe5e4b) (BuildId: 25975f766867e9e604dc5a71a8befeaed3301942)
    #1 0x7f56ed869e42 in greedy_realloc ../src/basic/alloc-util.c:65
    #2 0x7f56ed7fe779 in dispatch_fido2_hmac_credential_array ../src/shared/user-record.c:775
    #3 0x7f56edd73cbc in sd_json_dispatch_full ../src/libsystemd/sd-json/sd-json.c:5204
    #4 0x7f56edd745fc in sd_json_dispatch ../src/libsystemd/sd-json/sd-json.c:5276
    #5 0x7f56ed80622c in user_record_load ../src/shared/user-record.c:1697
    #6 0x000000408c15 in display_user ../src/userdb/userdbctl.c:447
    #7 0x7f56ed83cc9a in dispatch_verb ../src/shared/verbs.c:137
    #8 0x00000041df2b in run ../src/userdb/userdbctl.c:1908
    #9 0x00000041dfbe in main ../src/userdb/userdbctl.c:1911
    #10 0x7f56ec8105b4 in __libc_start_call_main (/lib64/libc.so.6+0x35b4) (BuildId: 2b5beec0fd24fe9c9f43eddfdd5facf0b8a1b805)
    #11 0x7f56ec810667 in __libc_start_main@@GLIBC_2.34 (/lib64/libc.so.6+0x3667) (BuildId: 2b5beec0fd24fe9c9f43eddfdd5facf0b8a1b805)
    #12 0x000000404a44 in _start (/home/fsumsal/repos/@systemd/systemd/build-san/userdbctl+0x404a44) (BuildId: 19e8b7e7b7038d2cea20bc18a55bea2a9e4406d5)

SUMMARY: AddressSanitizer: 176 byte(s) leaked in 2 allocation(s).
YapingLi04 pushed a commit that referenced this pull request Mar 9, 2026
…d#40979)

Fix a typo which causes a segfault when processing a user record
with `matchHostname` when it's an array instead of a simple string:

```
$ echo '{"userName":"crashhostarray","perMachine":[{"matchHostname":["host1","host2"],"locked":false}]}' | userdbctl -F -
Segmentation fault         (core dumped)

$ coredumpctl info
...
       Message: Process 1172301 (userdbctl) of user 1000 dumped core.

                Module libz.so.1 from rpm zlib-ng-2.3.3-1.fc43.x86_64
                Module libcrypto.so.3 from rpm openssl-3.5.4-2.fc43.x86_64
                Stack trace of thread 1172301:
                #0  0x00007fded7b3a656 __strcmp_evex (libc.so.6 + 0x159656)
                #1  0x00007fded7e95397 per_machine_hostname_match (libsystemd-shared-260.so + 0x295397)
                #2  0x00007fded7e955b5 per_machine_match (libsystemd-shared-260.so + 0x2955b5)
                #3  0x00007fded7e957c6 dispatch_per_machine (libsystemd-shared-260.so + 0x2957c6)
                #4  0x00007fded7e96c97 user_record_load (libsystemd-shared-260.so + 0x296c97)
                #5  0x000000000040572d display_user (/home/fsumsal/repos/@systemd/systemd/build/userdbctl + 0x572d)
                #6  0x00007fded7ea9727 dispatch_verb (libsystemd-shared-260.so + 0x2a9727)
                #7  0x000000000041077c run (/home/fsumsal/repos/@systemd/systemd/build/userdbctl + 0x1077c)
                #8  0x00000000004107ce main (/home/fsumsal/repos/@systemd/systemd/build/userdbctl + 0x107ce)
                #9  0x00007fded79e45b5 __libc_start_call_main (libc.so.6 + 0x35b5)
                #10 0x00007fded79e4668 __libc_start_main@@GLIBC_2.34 (libc.so.6 + 0x3668)
                #11 0x00000000004038d5 _start (/home/fsumsal/repos/@systemd/systemd/build/userdbctl + 0x38d5)
                ELF object binary architecture: AMD x86-64
```
YapingLi04 pushed a commit that referenced this pull request Mar 19, 2026
Fix a typo which causes a segfault when processing a user record
with matchHostname when it's an array instead of a simple string:

$ echo '{"userName":"crashhostarray","perMachine":[{"matchHostname":["host1","host2"],"locked":false}]}' | userdbctl -F -
Segmentation fault         (core dumped)

$ coredumpctl info
...
       Message: Process 1172301 (userdbctl) of user 1000 dumped core.

                Module libz.so.1 from rpm zlib-ng-2.3.3-1.fc43.x86_64
                Module libcrypto.so.3 from rpm openssl-3.5.4-2.fc43.x86_64
                Stack trace of thread 1172301:
                #0  0x00007fded7b3a656 __strcmp_evex (libc.so.6 + 0x159656)
                #1  0x00007fded7e95397 per_machine_hostname_match (libsystemd-shared-260.so + 0x295397)
                #2  0x00007fded7e955b5 per_machine_match (libsystemd-shared-260.so + 0x2955b5)
                #3  0x00007fded7e957c6 dispatch_per_machine (libsystemd-shared-260.so + 0x2957c6)
                #4  0x00007fded7e96c97 user_record_load (libsystemd-shared-260.so + 0x296c97)
                #5  0x000000000040572d display_user (/home/fsumsal/repos/@systemd/systemd/build/userdbctl + 0x572d)
                #6  0x00007fded7ea9727 dispatch_verb (libsystemd-shared-260.so + 0x2a9727)
                #7  0x000000000041077c run (/home/fsumsal/repos/@systemd/systemd/build/userdbctl + 0x1077c)
                #8  0x00000000004107ce main (/home/fsumsal/repos/@systemd/systemd/build/userdbctl + 0x107ce)
                #9  0x00007fded79e45b5 __libc_start_call_main (libc.so.6 + 0x35b5)
                #10 0x00007fded79e4668 __libc_start_main@@GLIBC_2.34 (libc.so.6 + 0x3668)
                #11 0x00000000004038d5 _start (/home/fsumsal/repos/@systemd/systemd/build/userdbctl + 0x38d5)
                ELF object binary architecture: AMD x86-64

(cherry picked from commit 1e2517b)
YapingLi04 pushed a commit that referenced this pull request Mar 19, 2026
The fido2_hmac_salt/fido2_hmac_credential/recovery_key fields kept
leaking memory as the array itself wasn't deallocated after deallocating
each of its elements' data:

$ build-san/userdbctl -F fuzz-corpus-userdb/auth-fido2.json
...
=================================================================
==1292840==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 112 byte(s) in 1 object(s) allocated from:
    #0 0x7f56f00e5e4b in realloc.part.0 (/lib64/libasan.so.8+0xe5e4b) (BuildId: 25975f766867e9e604dc5a71a8befeaed3301942)
    #1 0x7f56ed869e42 in greedy_realloc ../src/basic/alloc-util.c:65
    #2 0x7f56ed7ff5e9 in dispatch_fido2_hmac_salt ../src/shared/user-record.c:836
    #3 0x7f56edd73cbc in sd_json_dispatch_full ../src/libsystemd/sd-json/sd-json.c:5204
    #4 0x7f56edd745fc in sd_json_dispatch ../src/libsystemd/sd-json/sd-json.c:5276
    #5 0x7f56ed80100b in dispatch_privileged ../src/shared/user-record.c:998
    #6 0x7f56edd73cbc in sd_json_dispatch_full ../src/libsystemd/sd-json/sd-json.c:5204
    #7 0x7f56edd745fc in sd_json_dispatch ../src/libsystemd/sd-json/sd-json.c:5276
    #8 0x7f56ed80622c in user_record_load ../src/shared/user-record.c:1697
    #9 0x000000408c15 in display_user ../src/userdb/userdbctl.c:447
    #10 0x7f56ed83cc9a in dispatch_verb ../src/shared/verbs.c:137
    #11 0x00000041df2b in run ../src/userdb/userdbctl.c:1908
    #12 0x00000041dfbe in main ../src/userdb/userdbctl.c:1911
    #13 0x7f56ec8105b4 in __libc_start_call_main (/lib64/libc.so.6+0x35b4) (BuildId: 2b5beec0fd24fe9c9f43eddfdd5facf0b8a1b805)
    #14 0x7f56ec810667 in __libc_start_main@@GLIBC_2.34 (/lib64/libc.so.6+0x3667) (BuildId: 2b5beec0fd24fe9c9f43eddfdd5facf0b8a1b805)
    #15 0x000000404a44 in _start (/home/fsumsal/repos/@systemd/systemd/build-san/userdbctl+0x404a44) (BuildId: 19e8b7e7b7038d2cea20bc18a55bea2a9e4406d5)

Direct leak of 64 byte(s) in 1 object(s) allocated from:
    #0 0x7f56f00e5e4b in realloc.part.0 (/lib64/libasan.so.8+0xe5e4b) (BuildId: 25975f766867e9e604dc5a71a8befeaed3301942)
    #1 0x7f56ed869e42 in greedy_realloc ../src/basic/alloc-util.c:65
    #2 0x7f56ed7fe779 in dispatch_fido2_hmac_credential_array ../src/shared/user-record.c:775
    #3 0x7f56edd73cbc in sd_json_dispatch_full ../src/libsystemd/sd-json/sd-json.c:5204
    #4 0x7f56edd745fc in sd_json_dispatch ../src/libsystemd/sd-json/sd-json.c:5276
    #5 0x7f56ed80622c in user_record_load ../src/shared/user-record.c:1697
    #6 0x000000408c15 in display_user ../src/userdb/userdbctl.c:447
    #7 0x7f56ed83cc9a in dispatch_verb ../src/shared/verbs.c:137
    #8 0x00000041df2b in run ../src/userdb/userdbctl.c:1908
    #9 0x00000041dfbe in main ../src/userdb/userdbctl.c:1911
    #10 0x7f56ec8105b4 in __libc_start_call_main (/lib64/libc.so.6+0x35b4) (BuildId: 2b5beec0fd24fe9c9f43eddfdd5facf0b8a1b805)
    #11 0x7f56ec810667 in __libc_start_main@@GLIBC_2.34 (/lib64/libc.so.6+0x3667) (BuildId: 2b5beec0fd24fe9c9f43eddfdd5facf0b8a1b805)
    #12 0x000000404a44 in _start (/home/fsumsal/repos/@systemd/systemd/build-san/userdbctl+0x404a44) (BuildId: 19e8b7e7b7038d2cea20bc18a55bea2a9e4406d5)

SUMMARY: AddressSanitizer: 176 byte(s) leaked in 2 allocation(s).
(cherry picked from commit 3c7bd94)