[AGENTRUN-558] Rust checks code - Add ability to run Rust-based checks through shared libraries by Enzu83 · Pull Request #42351 · DataDog/datadog-agent

Enzu83 · 2025-10-27T10:00:02Z

NOTE: This PR contains the draft of http_check in Rust. It works well for a few use cases.

What does this PR do?

This PR adds code to write and compile Rust-based checks, with a simple Rust check as an example.

Motivation

Describe how you validated your changes

Additional Notes

Check out this PR for the Go part (shared library loader):
- [AGENTRUN-866] Add ability to run Rust-based checks through shared libraries #39676

agent-platform-auto-pr · 2025-10-27T10:52:18Z

Static quality checks

✅ Please find below the results from static quality gates
Comparison made with ancestor c7d9740
📊 Static Quality Gates Dashboard

Successful checks

Info

	Quality gate	Change	Size (prev → curr → max)
✅	agent_deb_amd64	N/A	N/A → 705.336 → 708.410
✅	agent_deb_amd64_fips	N/A	N/A → 700.621 → 704.000
✅	agent_heroku_amd64	N/A	N/A → 326.967 → 329.530
✅	agent_msi	N/A	N/A → 571.509 → 982.080
✅	agent_rpm_amd64	N/A	N/A → 705.322 → 708.380
✅	agent_rpm_amd64_fips	N/A	N/A → 700.608 → 703.990
✅	agent_rpm_arm64	N/A	N/A → 686.901 → 693.520
✅	agent_rpm_arm64_fips	N/A	N/A → 682.988 → 688.480
✅	agent_suse_amd64	N/A	N/A → 705.322 → 708.380
✅	agent_suse_amd64_fips	N/A	N/A → 700.608 → 703.990
✅	agent_suse_arm64	N/A	N/A → 686.901 → 693.520
✅	agent_suse_arm64_fips	N/A	N/A → 682.988 → 688.480
✅	docker_agent_amd64	N/A	N/A → 767.552 → 770.720
✅	docker_agent_arm64	N/A	N/A → 773.686 → 780.200
✅	docker_agent_jmx_amd64	N/A	N/A → 958.463 → 961.600
✅	docker_agent_jmx_arm64	N/A	N/A → 953.380 → 959.800
✅	docker_cluster_agent_amd64	N/A	N/A → 180.766 → 181.080
✅	docker_cluster_agent_arm64	N/A	N/A → 196.619 → 198.490
✅	docker_cws_instrumentation_amd64	N/A	N/A → 7.135 → 7.180
✅	docker_cws_instrumentation_arm64	N/A	N/A → 6.689 → 6.920
✅	docker_dogstatsd_amd64	N/A	N/A → 38.812 → 39.380
✅	docker_dogstatsd_arm64	N/A	N/A → 37.128 → 37.940
✅	dogstatsd_deb_amd64	N/A	N/A → 30.031 → 30.610
✅	dogstatsd_deb_arm64	N/A	N/A → 28.176 → 29.110
✅	dogstatsd_rpm_amd64	N/A	N/A → 30.031 → 30.610
✅	dogstatsd_suse_amd64	N/A	N/A → 30.031 → 30.610
✅	iot_agent_deb_amd64	N/A	N/A → 43.045 → 43.290
✅	iot_agent_deb_arm64	N/A	N/A → 40.158 → 40.920
✅	iot_agent_deb_armhf	N/A	N/A → 40.744 → 41.030
✅	iot_agent_rpm_amd64	N/A	N/A → 43.046 → 43.290
✅	iot_agent_suse_amd64	N/A	N/A → 43.046 → 43.290

On-wire sizes (compressed)

	Quality gate	Change	Size (prev → curr → max)
✅	agent_deb_amd64	N/A	N/A → 173.329 → 174.490
✅	agent_deb_amd64_fips	N/A	N/A → 172.267 → 173.750
✅	agent_heroku_amd64	N/A	N/A → 87.118 → 88.450
✅	agent_msi	N/A	N/A → 142.930 → 143.020
✅	agent_rpm_amd64	N/A	N/A → 176.094 → 177.660
✅	agent_rpm_amd64_fips	N/A	N/A → 174.975 → 176.600
✅	agent_rpm_arm64	N/A	N/A → 159.349 → 161.260
✅	agent_rpm_arm64_fips	N/A	N/A → 158.781 → 160.550
✅	agent_suse_amd64	N/A	N/A → 176.094 → 177.660
✅	agent_suse_amd64_fips	N/A	N/A → 174.975 → 176.600
✅	agent_suse_arm64	N/A	N/A → 159.349 → 161.260
✅	agent_suse_arm64_fips	N/A	N/A → 158.781 → 160.550
✅	docker_agent_amd64	N/A	N/A → 261.087 → 262.450
✅	docker_agent_arm64	N/A	N/A → 250.102 → 252.630
✅	docker_agent_jmx_amd64	N/A	N/A → 329.750 → 331.080
✅	docker_agent_jmx_arm64	N/A	N/A → 314.728 → 317.270
✅	docker_cluster_agent_amd64	N/A	N/A → 63.864 → 64.490
✅	docker_cluster_agent_arm64	N/A	N/A → 60.142 → 61.170
✅	docker_cws_instrumentation_amd64	N/A	N/A → 2.994 → 3.330
✅	docker_cws_instrumentation_arm64	N/A	N/A → 2.726 → 3.090
✅	docker_dogstatsd_amd64	N/A	N/A → 15.028 → 15.820
✅	docker_dogstatsd_arm64	N/A	N/A → 14.351 → 14.830
✅	dogstatsd_deb_amd64	N/A	N/A → 7.945 → 8.790
✅	dogstatsd_deb_arm64	N/A	N/A → 6.822 → 7.710
✅	dogstatsd_rpm_amd64	N/A	N/A → 7.956 → 8.800
✅	dogstatsd_suse_amd64	N/A	N/A → 7.956 → 8.800
✅	iot_agent_deb_amd64	N/A	N/A → 11.274 → 12.040
✅	iot_agent_deb_arm64	N/A	N/A → 9.637 → 10.450
✅	iot_agent_deb_armhf	N/A	N/A → 9.836 → 10.620
✅	iot_agent_rpm_amd64	N/A	N/A → 11.294 → 12.060
✅	iot_agent_suse_amd64	N/A	N/A → 11.294 → 12.060

cit-pr-commenter · 2025-10-27T10:55:23Z

Regression Detector

Regression Detector Results

Metrics dashboard
Target profiles
Run ID: 6324e5c9-c7b0-420c-954e-9aca4f0ecfaf

Baseline: c7d9740
Comparison: be1c6eb
Diff

Optimization Goals: ✅ No significant changes detected

Experiments ignored for regressions

Regressions in experiments with settings containing erratic: true are ignored.

perf	experiment	goal	Δ mean %	Δ mean % CI	trials	links
➖	docker_containers_cpu	% cpu utilization	+1.22	[-1.74, +4.18]	1	Logs

Fine details of change detection per experiment

perf	experiment	goal	Δ mean %	Δ mean % CI	trials	links
➖	quality_gate_logs	% cpu utilization	+1.46	[+0.01, +2.91]	1	Logs bounds checks dashboard
➖	docker_containers_cpu	% cpu utilization	+1.22	[-1.74, +4.18]	1	Logs
➖	quality_gate_metrics_logs	memory utilization	+0.64	[+0.43, +0.86]	1	Logs bounds checks dashboard
➖	ddot_metrics	memory utilization	+0.50	[+0.29, +0.72]	1	Logs
➖	otlp_ingest_logs	memory utilization	+0.34	[+0.25, +0.43]	1	Logs
➖	otlp_ingest_metrics	memory utilization	+0.31	[+0.16, +0.46]	1	Logs
➖	file_tree	memory utilization	+0.19	[+0.13, +0.25]	1	Logs
➖	file_to_blackhole_100ms_latency	egress throughput	+0.01	[-0.03, +0.05]	1	Logs
➖	file_to_blackhole_0ms_latency	egress throughput	+0.01	[-0.48, +0.49]	1	Logs
➖	ddot_metrics_sum_delta	memory utilization	+0.01	[-0.20, +0.21]	1	Logs
➖	ddot_metrics_sum_cumulativetodelta_exporter	memory utilization	+0.00	[-0.23, +0.24]	1	Logs
➖	file_to_blackhole_500ms_latency	egress throughput	+0.00	[-0.39, +0.39]	1	Logs
➖	tcp_dd_logs_filter_exclude	ingress throughput	-0.00	[-0.09, +0.09]	1	Logs
➖	uds_dogstatsd_to_api	ingress throughput	-0.00	[-0.13, +0.12]	1	Logs
➖	file_to_blackhole_1000ms_latency	egress throughput	-0.01	[-0.41, +0.40]	1	Logs
➖	uds_dogstatsd_to_api_v3	ingress throughput	-0.02	[-0.14, +0.10]	1	Logs
➖	quality_gate_idle_all_features	memory utilization	-0.04	[-0.08, -0.00]	1	Logs bounds checks dashboard
➖	uds_dogstatsd_20mb_12k_contexts_20_senders	memory utilization	-0.13	[-0.19, -0.07]	1	Logs
➖	ddot_logs	memory utilization	-0.25	[-0.31, -0.18]	1	Logs
➖	ddot_metrics_sum_cumulative	memory utilization	-0.26	[-0.43, -0.10]	1	Logs
➖	docker_containers_memory	memory utilization	-0.32	[-0.39, -0.25]	1	Logs
➖	quality_gate_idle	memory utilization	-0.35	[-0.39, -0.30]	1	Logs bounds checks dashboard
➖	tcp_syslog_to_blackhole	ingress throughput	-1.13	[-1.19, -1.06]	1	Logs

Bounds Checks: ✅ Passed

perf	experiment	bounds_check_name	replicates_passed	links
✅	docker_containers_cpu	simple_check_run	10/10
✅	docker_containers_memory	memory_usage	10/10
✅	docker_containers_memory	simple_check_run	10/10
✅	file_to_blackhole_0ms_latency	lost_bytes	10/10
✅	file_to_blackhole_0ms_latency	memory_usage	10/10
✅	file_to_blackhole_1000ms_latency	lost_bytes	10/10
✅	file_to_blackhole_1000ms_latency	memory_usage	10/10
✅	file_to_blackhole_100ms_latency	lost_bytes	10/10
✅	file_to_blackhole_100ms_latency	memory_usage	10/10
✅	file_to_blackhole_500ms_latency	lost_bytes	10/10
✅	file_to_blackhole_500ms_latency	memory_usage	10/10
✅	quality_gate_idle	intake_connections	10/10	bounds checks dashboard
✅	quality_gate_idle	memory_usage	10/10	bounds checks dashboard
✅	quality_gate_idle_all_features	intake_connections	10/10	bounds checks dashboard
✅	quality_gate_idle_all_features	memory_usage	10/10	bounds checks dashboard
✅	quality_gate_logs	intake_connections	10/10	bounds checks dashboard
✅	quality_gate_logs	lost_bytes	10/10	bounds checks dashboard
✅	quality_gate_logs	memory_usage	10/10	bounds checks dashboard
✅	quality_gate_metrics_logs	cpu_usage	10/10	bounds checks dashboard
✅	quality_gate_metrics_logs	intake_connections	10/10	bounds checks dashboard
✅	quality_gate_metrics_logs	lost_bytes	10/10	bounds checks dashboard
✅	quality_gate_metrics_logs	memory_usage	10/10	bounds checks dashboard

Explanation

Confidence level: 90.00%
Effect size tolerance: |Δ mean %| ≥ 5.00%

Performance changes are noted in the perf column of each table:

✅ = significantly better comparison variant performance
❌ = significantly worse comparison variant performance
➖ = no significant change in performance

A regression test is an A/B test of target performance in a repeatable rig, where "performance" is measured as "comparison variant minus baseline variant" for an optimization goal (e.g., ingress throughput). Due to intrinsic variability in measuring that goal, we can only estimate its mean value for each experiment; we report uncertainty in that value as a 90.00% confidence interval denoted "Δ mean % CI".

For each experiment, we decide whether a change in performance is a "regression" -- a change worth investigating further -- if all of the following criteria are true:

Its estimated |Δ mean %| ≥ 5.00%, indicating the change is big enough to merit a closer look.
Its 90.00% confidence interval "Δ mean % CI" does not contain zero, indicating that if our statistical model is accurate, there is at least a 90.00% chance there is a difference in performance between baseline and comparison variants.
Its configuration does not mark it "erratic".
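
The three criteria above can be sketched as a small predicate. This is only an illustration of the decision rule as described in this comment, not the Regression Detector's actual implementation; the `Experiment` struct, its field names, and `is_regression` are hypothetical names introduced here.

```rust
/// Hypothetical summary of one experiment row ("Δ mean %" and "Δ mean % CI").
#[derive(Debug, Clone, Copy)]
struct Experiment {
    /// Estimated Δ mean %, comparison minus baseline.
    delta_mean_pct: f64,
    /// Lower bound of the 90% confidence interval.
    ci_low: f64,
    /// Upper bound of the 90% confidence interval.
    ci_high: f64,
    /// Whether the experiment configuration is marked `erratic: true`.
    erratic: bool,
}

/// Effect size tolerance quoted in the explanation: |Δ mean %| ≥ 5.00%.
const EFFECT_SIZE_TOLERANCE: f64 = 5.0;

/// Returns true only when all three stated criteria hold.
fn is_regression(e: &Experiment) -> bool {
    // 1. The change is big enough to merit a closer look.
    let big_enough = e.delta_mean_pct.abs() >= EFFECT_SIZE_TOLERANCE;
    // 2. The confidence interval does not contain zero.
    let ci_excludes_zero = e.ci_low > 0.0 || e.ci_high < 0.0;
    // 3. The experiment is not marked erratic.
    big_enough && ci_excludes_zero && !e.erratic
}
```

Under this sketch, the tcp_syslog_to_blackhole row (-1.13, CI [-1.19, -1.06]) is not flagged: its interval excludes zero, but the change is below the 5.00% tolerance.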

Replicate Execution Details

We run multiple replicates for each experiment/variant. However, we allow replicates to be automatically retried if there are any failures, up to 8 times, at which point the replicate is marked dead and we are unable to run analysis for the entire experiment. We call each of these attempts at running replicates a replicate execution. This section lists all replicate executions that failed due to the target crashing or being oom killed.

Note: In the tables below, we bucket failures by experiment, variant, and failure type. For each bucket we list the replicate indexes that failed, with an annotation showing how many times that replicate failed with the given failure mode. In the example below, the baseline variant of the experiment named experiment_with_failures had two replicates that failed by oom kills: replicate 0, which failed 8 executions, and replicate 1, which failed 6 executions, both with the same failure mode.

Experiment	Variant	Replicates	Failure	Logs	Debug Dashboard
experiment_with_failures	baseline	0 (x8) 1 (x6)	Oom killed		Debug Dashboard

The debug dashboard links will take you to a debugging dashboard specifically designed to investigate replicate execution failures.

❌ Retried Profiling Replicate Execution Failures (target internal profiling)

Note: Profiling replicas may still be executing. See the debug dashboard for up to date status.

Experiment	Variant	Replicates	Failure	Debug Dashboard
quality_gate_idle_all_features	baseline	11 (x4)	Oom killed	Debug Dashboard
quality_gate_idle_all_features	comparison	11 (x3)	Oom killed	Debug Dashboard

CI Pass/Fail Decision

✅ Passed. All Quality Gates passed.

quality_gate_idle_all_features, bounds check intake_connections: 10/10 replicas passed. Gate passed.
quality_gate_idle_all_features, bounds check memory_usage: 10/10 replicas passed. Gate passed.
quality_gate_logs, bounds check lost_bytes: 10/10 replicas passed. Gate passed.
quality_gate_logs, bounds check memory_usage: 10/10 replicas passed. Gate passed.
quality_gate_logs, bounds check intake_connections: 10/10 replicas passed. Gate passed.
quality_gate_idle, bounds check intake_connections: 10/10 replicas passed. Gate passed.
quality_gate_idle, bounds check memory_usage: 10/10 replicas passed. Gate passed.
quality_gate_metrics_logs, bounds check intake_connections: 10/10 replicas passed. Gate passed.
quality_gate_metrics_logs, bounds check memory_usage: 10/10 replicas passed. Gate passed.
quality_gate_metrics_logs, bounds check cpu_usage: 10/10 replicas passed. Gate passed.
quality_gate_metrics_logs, bounds check lost_bytes: 10/10 replicas passed. Gate passed.

…braries (#39676)

### What does this PR do?

⚠️ This new feature is experimental ⚠️

This PR introduces a new way of running checks, through shared libraries. They are Rust-based and loaded at runtime by a new checks loader. You can read the documentation [here](https://datadoghq.atlassian.net/wiki/spaces/ARUN/pages/5479301643/Running+shared+library+checks+in+the+Agent).

The Rust-based checks API is minimal for now: only the submit functions are available. The API can be expanded quite easily by adding new fields to the C structure `aggregator_t`.

Also, this PR moves the Go submit functions (like `SubmitMetric`) from the collector Python package to a new package (named `aggregator`). That way both Python and shared library checks can use callbacks from this package. These Go submit functions were in the Python package only because Python checks were the only ones using them, but their scope is larger than just Python checks.

### Motivation

Provide a new way of writing checks to improve Agent performance and to rely a bit less on the Python runtime.

### Describe how you validated your changes

A few unit tests for the new checks loader and the shared library checks implementation (for the Go part). An e2e test for Linux and Windows that loads and runs a simple shared library check that submits one metric.

### Possible Drawbacks / Trade-offs

### Additional Notes

The Rust part of this feature (where shared libraries are compiled from) is on this PR:
- #42351

Co-authored-by: pgimalac <pierre.gimalac@datadoghq.com>
Co-authored-by: maxime.chambre <maxime.chambre@datadoghq.com>
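
The commit message for #39676 says the check API is a C structure of submit callbacks (`aggregator_t`) that can grow by adding fields, and later commits in this PR mention copying that structure on the Rust side and guarding against null C strings. A minimal sketch of that general pattern follows. Everything in it is an assumption made for illustration: the `Aggregator` layout, the `submit_metric` field and its signature, `cstr_to_string`, and `run_example_check` are hypothetical and are not taken from the PR's code.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::{c_char, c_double};

/// Hypothetical callback signature: (check id, metric name, value).
type SubmitMetricFn = extern "C" fn(*const c_char, *const c_char, c_double);

/// Hypothetical Rust mirror of a C struct of callbacks.
/// New API functions would be added as new fields.
#[repr(C)]
#[derive(Clone, Copy)]
struct Aggregator {
    submit_metric: SubmitMetricFn,
}

/// Converts a borrowed C string to an owned Rust string.
/// Returns None for a null pointer instead of dereferencing it.
fn cstr_to_string(ptr: *const c_char) -> Option<String> {
    if ptr.is_null() {
        return None;
    }
    // SAFETY: ptr is non-null; the caller must pass a NUL-terminated string.
    Some(unsafe { CStr::from_ptr(ptr) }.to_string_lossy().into_owned())
}

/// Hypothetical check body: copies the callback table so later writes by the
/// caller cannot affect it, then submits a single metric.
fn run_example_check(agg: *const Aggregator, check_id: *const c_char) -> Result<(), String> {
    if agg.is_null() {
        return Err("aggregator pointer is null".to_string());
    }
    // SAFETY: agg is non-null; the caller must pass a valid, initialized struct.
    let agg: Aggregator = unsafe { *agg };
    let name = CString::new("example.metric").map_err(|e| e.to_string())?;
    (agg.submit_metric)(check_id, name.as_ptr(), 1.0);
    Ok(())
}
```

Copying the struct by value is cheap here because it only holds function pointers, which is one reason a callback table is a convenient shape for this kind of boundary.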

dd-octo-sts · 2026-02-24T16:32:32Z

This pull request has been automatically marked as stale because it has not had activity in the past 15 days.

It will be closed in 30 days if no further activity occurs. If this pull request is still relevant, adding a comment or pushing new commits will keep it open. Also, you can always reopen the pull request if you missed the window.

Thank you for your contributions!

dd-octo-sts · 2026-03-27T04:54:10Z

This pull request was automatically closed because it has been stale for 15 days with no activity.

If this pull request is still relevant, please reopen it or create a new pull request with updated information.

Thanks!

initial commit

8ad7f83

github-actions Bot added the long review (PR is complex, plan time to review it) and team/agent-runtimes labels Oct 27, 2025

Enzu83 mentioned this pull request Oct 27, 2025

[AGENTRUN-866] Add ability to run Rust-based checks through shared libraries #39676

Merged

Enzu83 changed the title ~~[AGENTRUN-558] Rust-based checks source code~~ [AGENTRUN-558] Rust checks code - Add ability to run Rust-based checks through shared libraries Oct 27, 2025

Enzu83 and others added 9 commits October 29, 2025 14:39

docs: add TODO comment on what should be refactored

f478368

feat: add API function to get the shared library check version

dbfe844

refactor: rename version API function

d511f21

fix: keep the same name for the version function symbol

86482ce

Merge branch 'main' into maxime.chambre/shared-library-check-rustcheck

29759ff

chore: update readme and renamed variables

f33a7cd

docs: update and clarify readme

039c22c

Merge branch 'main' into maxime.chambre/shared-library-check-rustcheck

d088c8f

feat: example check now send service check and event

b5f8160

refactor: simplify code, split example check in multiple files

7f1f1c1

Enzu83 force-pushed the maxime.chambre/shared-library-check-rustcheck branch from 08f686b to 7f1f1c1 Compare January 14, 2026 14:06

Merge branch 'main' into maxime.chambre/shared-library-check-rustcheck

ffd27d2

Enzu83 force-pushed the maxime.chambre/shared-library-check-rustcheck branch from 001168a to 8e6885e Compare January 14, 2026 19:28

feat: basic crate to test shared library checks by printing the submitted payloads

90af960

Enzu83 force-pushed the maxime.chambre/shared-library-check-rustcheck branch from 8e6885e to 90af960 Compare January 15, 2026 12:20

Enzu83 added 4 commits January 15, 2026 13:58

docs: add README for sharedlibrary-tester

e11b76b

fix: segfault when converting c-string to rust-string if ptr is null

ca6d0bf

refactor: clone C-struct Aggregator when received by Rust check to be sure that the content can't be overwritten

7b29073

feat: standalone shlib tester requires path to the check config && the number of submitted payloads is print at the end

be1c6eb

Enzu83 closed this Feb 9, 2026

Enzu83 reopened this Feb 9, 2026

dd-octo-sts Bot added the internal (Identify a non-fork PR) label Feb 9, 2026

Enzu83 added 2 commits February 9, 2026 14:07

refactor: remove anyhow deps for the check macro

f5c3f27

feat: http check draft

9da2a9f

dd-octo-sts Bot added the stale label Feb 24, 2026

dd-octo-sts Bot added the auto-closed label Mar 27, 2026

dd-octo-sts Bot closed this Mar 27, 2026

Enzu83 wants to merge 19 commits into main from maxime.chambre/shared-library-check-rustcheck
