[ACTP] PAR: rshell allow-list redesign + per-env paths by julesmcrt · Pull Request #49825 · DataDog/datadog-agent

julesmcrt · 2026-04-23T17:12:31Z

What does this PR do?

Started as a bug fix and grew into a redesign of the operator-side allow-list contract for rshell, plus a new per-environment path-list feature. Three themes:

1. Bug fixes (original scope)

Fixes three bugs in the intersection layer for private_action_runner.restricted_shell.{allowed_commands,allowed_paths}:

YAML [] was treated as "operator unset." GetStringSlice returns nil for a YAML empty sequence, indistinguishable from "key absent." The kill-switch row of the truth table didn't work.
Path intersection used plain string equality. An operator entry narrower than a backend entry (e.g. /host/var/log/par-probe.txt against backend /host/var/log) was dropped instead of admitted.
Bare operator command entries silently produced empty intersections. The backend ships rshell:<name>, so an operator-written cat never matched. A startup warning now points the operator to the corrected form.
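
The containment check behind the second fix can be sketched as follows. This is a minimal illustration, assuming forward-slash paths and a `path.Clean` normalization step; the name `pathContains` comes from this PR, but the body below is a guess at the approach, not the merged implementation.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// pathContains reports whether child equals parent or is nested under it.
// Illustrative only: the real helper's signature and edge cases may differ.
func pathContains(parent, child string) bool {
	parent, child = path.Clean(parent), path.Clean(child)
	if parent == child {
		return true
	}
	if parent == "/" {
		// The root contains every absolute path.
		return strings.HasPrefix(child, "/")
	}
	// Appending the separator stops "/var/log" from matching "/var/logger".
	return strings.HasPrefix(child, parent+"/")
}

func main() {
	fmt.Println(pathContains("/host/var/log", "/host/var/log/par-probe.txt")) // true
	fmt.Println(pathContains("/var/log", "/var/logger"))                      // false
	fmt.Println(pathContains("/", "/etc"))                                    // true
}
```

Plain string equality would return false for the first call, which is the behavior the bug report describes.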

2. Contract redesign — sentinel-default, always-intersect

Replaces the three-way nil / [] / [X] slice contract with a uniform "always intersect" contract using sentinel defaults:

allowed_paths default is ["/"]. pathContains("/", X) is true for any absolute path, so the intersection passes the backend list through when the operator hasn't narrowed.
allowed_commands default is ["rshell:*"]. The wildcard token is a special-case in the operator-side intersection: when present, every backend entry in the rshell: namespace is admitted (scoped via onlyRshellPrefixedCommands).

The IsConfigured gate and the handler's nil-pass-through bypass are gone. End-user behavior is preserved: unset operator config gets the backend list as-is on both axes, explicit empty list is the kill-switch, explicit non-empty narrows.

3. Per-environment paths (new feature)

RunCommandInputs.AllowedPaths is now map[string][]string keyed by environment (default / containerized). wf-actions-server ships one task with both keys; the runner picks the relevant slice based on env.IsContainerized(), then the slice flows through the same intersection logic. This lets a single Balto rule cover both host-installed agents (which see /var/log) and containerized agents (which see /host/var/log).

⚠️ Wire-format change: allowedPaths is no longer a slice. The runner side ships in this PR; Balto / wf-actions-server need to ship the new shape concurrently. Coordinate the rollout — old tasks (slice shape) will fail to deserialize on the new runner.

Implementation notes

Pure path utilities live in helper.go: cleanPathList, reducePathListToBroadest, intersectPathLists, commonPath, onlyRshellPrefixedCommands, backendPathsForEnv.
Handler methods (filterAllowedCommands, filterAllowedPaths) and Run stay in run_command.go.
Two advisory warnings at config load surface misconfigurations: backslash entries (forward-slash contract) and non-directory entries (rshell's os.Root sandbox is directory-only).

Motivation

Discovered during end-to-end PAR validation against a prod-connected test drive. Findings and reproduction traces: experimental/.../rshell-permission-test.md. Parent PRs:

[ACTP] PAR: intersect rshell allow-lists with datadog.yaml #49536 (operator-side intersection)
DataDog/dd-source#414468 (backend injects allowedPaths into every task)

Confluence: Rshell permission model (allow lists).

Describe how you validated your changes

Unit tests. 118 tests pass across pkg/privateactionrunner/... and pkg/config/setup. Linter clean (dda inv linter.go → 0 issues).
- Helper tests in helper_test.go cover commonPath, cleanPathList, reducePathListToBroadest (including idempotence + order-independence properties), intersectPathLists, onlyRshellPrefixedCommands, and backendPathsForEnv.
- Method tests in run_command_test.go pin filterAllowedCommands and filterAllowedPaths matrices.
- YAML-backed transform tests exercise the real config-load path (the original bug #1 escaped because earlier tests used SetWithoutSource, which doesn't reproduce the nil-from-YAML behavior).

End-to-end validation against image v109530495-91d30ac1-7-arm64 on a prod-connected test drive. Behavior matched the Confluence contract (v13) on every scenario in this table. The contract has shifted since (sentinel defaults, per-env paths) — a fresh E2E pass on the latest image should follow before merge.

Scenario	Works as documented?
Operator `[]` + `[]` kill-switch	Yes — effective `[]` on both axes
Narrower commands `["rshell:cat"]`	Yes — only `cat` works
File-level path entry	Yes — WARN at load; entry silently dropped by rshell sandbox
Directory sub-path `["/host/var/log/nginx"]`	Yes — narrower wins
Bare command name `["cat"]`	Yes — WARN at load; intersection empty
Disjoint non-empty	Yes — effective `[]`, everything blocked
Prefix-sibling `/var/logger` vs `/var/log`	Yes — separator boundary honored

Additional Notes

The startup warnings (unnamespaced commands, backslash paths, non-directory paths) are advisory. Entries still flow through to the handler; the warnings make silent-failure modes observable in agent logs.
Confluence will need a follow-up update once the per-env path shape is finalized in Balto.

🤖 Generated with Claude Code

Three bugs in the operator-side tightening layer for rshell allow-lists, surfaced during PAR validation against a prod-connected test drive:

* YAML `allowed_commands: []` / `allowed_paths: []` was silently treated as "operator unset" because GetStringSlice returns a nil slice for an explicit YAML empty list. The kill-switch row of the truth table now works: the transform gates on IsConfigured and normalizes the nil into a non-nil empty slice.
* Path intersection was plain string equality, so an operator entry narrower than a backend entry (e.g. `/var/log/nginx` against the backend's `/var/log`) was dropped instead of admitted. Replaced with containment-aware "narrower wins" matching, which uses a separator-boundary check so `/var/logger` does not match `/var/log`.
* Operator command entries must match the backend's namespaced form (`rshell:<name>`). Entries written without the prefix silently fail to intersect; the transform now emits a startup warning per bare entry so the failure mode is observable rather than silent.
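
The per-entry startup warning from the last bullet could be as small as the sketch below. `unnamespacedCommands` is a hypothetical helper and the log wording is invented for illustration; only the `rshell:` prefix comes from this PR.

```go
package main

import (
	"log"
	"strings"
)

const namespacePrefix = "rshell:"

// unnamespacedCommands returns the entries that lack the rshell: prefix and
// therefore can never match a backend entry.
func unnamespacedCommands(commands []string) []string {
	var bare []string
	for _, c := range commands {
		if !strings.HasPrefix(c, namespacePrefix) {
			bare = append(bare, c)
		}
	}
	return bare
}

func main() {
	for _, c := range unnamespacedCommands([]string{"cat", "rshell:ls"}) {
		// The entry is only reported; it is not rewritten or dropped here.
		log.Printf("allowed_commands entry %q has no %q prefix and will never match; did you mean %q?",
			c, namespacePrefix, namespacePrefix+c)
	}
}
```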

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a7df84351c

datadog-datadog-prod-us1 · 2026-04-23T17:24:14Z

🎯 Code Coverage (details)
• Patch Coverage: 100.00%
• Overall Coverage: 50.18% (+0.01%)

🔗 Commit SHA: fc105fa

dd-octo-sts · 2026-04-23T17:39:00Z

Files inventory check summary

File checks results against ancestor 3eb519ee:

Results for datadog-agent_7.80.0~devel.git.272.fc105fa.pipeline.110373313-1_amd64.deb:

No change detected

PAR compiles and runs on Windows (see cmd/privateactionrunner/main_windows.go) and rshell itself uses OS-native separators for its sandbox. path.Clean treats backslashes as ordinary characters, so "C:\ProgramData\Datadog\logs" did not register as contained in "C:\ProgramData\Datadog", and the "narrower wins" intersection silently dropped it.

Normalize to forward slashes before the prefix check so the comparison is separator-agnostic. Matrix test gains three Windows cases plus a mixed-separator case.

Caught by Codex review feedback on #49825.

cit-pr-commenter-54b7da · 2026-04-23T17:59:58Z

Regression Detector

Regression Detector Results

Metrics dashboard
Target profiles
Run ID: 59bdbbfc-999d-4cb3-a126-b548d1bdbdb0

Baseline: 4159087
Comparison: 114422d
Diff

❌ Experiments with retried target crashes

This is a critical error. One or more replicates failed with a non-zero exit code. These replicates may have been retried. See Replicate Execution Details for more information.

quality_gate_idle_all_features

Optimization Goals: ✅ No significant changes detected

Experiments ignored for regressions

Regressions in experiments with settings containing erratic: true are ignored.

perf	experiment	goal	Δ mean %	Δ mean % CI	trials	links
➖	docker_containers_cpu	% cpu utilization	+2.22	[-0.77, +5.20]	1	Logs

Fine details of change detection per experiment

perf	experiment	goal	Δ mean %	Δ mean % CI	trials	links
➖	docker_containers_cpu	% cpu utilization	+2.22	[-0.77, +5.20]	1	Logs
➖	quality_gate_metrics_logs	memory utilization	+1.56	[+1.31, +1.81]	1	Logs bounds checks dashboard
➖	quality_gate_logs	% cpu utilization	+1.37	[+0.41, +2.34]	1	Logs bounds checks dashboard
➖	file_tree	memory utilization	+1.02	[+0.97, +1.06]	1	Logs
➖	otlp_ingest_metrics	memory utilization	+0.47	[+0.31, +0.63]	1	Logs
➖	quality_gate_idle	memory utilization	+0.20	[+0.15, +0.25]	1	Logs bounds checks dashboard
➖	ddot_metrics_sum_cumulative	memory utilization	+0.09	[-0.07, +0.25]	1	Logs
➖	docker_containers_memory	memory utilization	+0.06	[-0.05, +0.17]	1	Logs
➖	file_to_blackhole_1000ms_latency	egress throughput	+0.06	[-0.36, +0.48]	1	Logs
➖	uds_dogstatsd_20mb_12k_contexts_20_senders	memory utilization	+0.04	[-0.01, +0.09]	1	Logs
➖	file_to_blackhole_0ms_latency	egress throughput	+0.02	[-0.47, +0.52]	1	Logs
➖	ddot_metrics_sum_delta	memory utilization	+0.02	[-0.17, +0.21]	1	Logs
➖	uds_dogstatsd_to_api_v3	ingress throughput	+0.01	[-0.19, +0.21]	1	Logs
➖	tcp_dd_logs_filter_exclude	ingress throughput	+0.01	[-0.10, +0.11]	1	Logs
➖	uds_dogstatsd_to_api	ingress throughput	+0.00	[-0.20, +0.20]	1	Logs
➖	file_to_blackhole_500ms_latency	egress throughput	-0.01	[-0.41, +0.39]	1	Logs
➖	file_to_blackhole_100ms_latency	egress throughput	-0.03	[-0.14, +0.08]	1	Logs
➖	quality_gate_idle_all_features	memory utilization	-0.10	[-0.14, -0.06]	1	Logs bounds checks dashboard
➖	ddot_logs	memory utilization	-0.30	[-0.37, -0.23]	1	Logs
➖	otlp_ingest_logs	memory utilization	-0.30	[-0.41, -0.19]	1	Logs
➖	ddot_metrics	memory utilization	-0.30	[-0.51, -0.10]	1	Logs
➖	ddot_metrics_sum_cumulativetodelta_exporter	memory utilization	-0.38	[-0.62, -0.15]	1	Logs
➖	tcp_syslog_to_blackhole	ingress throughput	-1.49	[-1.68, -1.31]	1	Logs

Bounds Checks: ✅ Passed

perf	experiment	bounds_check_name	replicates_passed	observed_value	links
✅	docker_containers_cpu	simple_check_run	10/10	699 ≥ 26
✅	docker_containers_memory	memory_usage	10/10	244.84MiB ≤ 370MiB
✅	docker_containers_memory	simple_check_run	10/10	583 ≥ 26
✅	file_to_blackhole_0ms_latency	memory_usage	10/10	0.16GiB ≤ 1.20GiB
✅	file_to_blackhole_0ms_latency	missed_bytes	10/10	0B = 0B
✅	file_to_blackhole_1000ms_latency	memory_usage	10/10	0.21GiB ≤ 1.20GiB
✅	file_to_blackhole_1000ms_latency	missed_bytes	10/10	0B = 0B
✅	file_to_blackhole_100ms_latency	memory_usage	10/10	0.17GiB ≤ 1.20GiB
✅	file_to_blackhole_100ms_latency	missed_bytes	10/10	0B = 0B
✅	file_to_blackhole_500ms_latency	memory_usage	10/10	0.18GiB ≤ 1.20GiB
✅	file_to_blackhole_500ms_latency	missed_bytes	10/10	0B = 0B
✅	quality_gate_idle	intake_connections	10/10	3 ≤ 4	bounds checks dashboard
✅	quality_gate_idle	memory_usage	10/10	139.97MiB ≤ 147MiB	bounds checks dashboard
✅	quality_gate_idle_all_features	intake_connections	10/10	3 ≤ 4	bounds checks dashboard
✅	quality_gate_idle_all_features	memory_usage	10/10	476.54MiB ≤ 495MiB	bounds checks dashboard
✅	quality_gate_logs	intake_connections	10/10	4 ≤ 6	bounds checks dashboard
✅	quality_gate_logs	memory_usage	10/10	179.47MiB ≤ 195MiB	bounds checks dashboard
✅	quality_gate_logs	missed_bytes	10/10	0B = 0B	bounds checks dashboard
✅	quality_gate_metrics_logs	cpu_usage	10/10	355.74 ≤ 2000	bounds checks dashboard
✅	quality_gate_metrics_logs	intake_connections	10/10	3 ≤ 6	bounds checks dashboard
✅	quality_gate_metrics_logs	memory_usage	10/10	390.37MiB ≤ 430MiB	bounds checks dashboard
✅	quality_gate_metrics_logs	missed_bytes	10/10	0B = 0B	bounds checks dashboard

Explanation

Confidence level: 90.00%
Effect size tolerance: |Δ mean %| ≥ 5.00%

Performance changes are noted in the perf column of each table:

✅ = significantly better comparison variant performance
❌ = significantly worse comparison variant performance
➖ = no significant change in performance

A regression test is an A/B test of target performance in a repeatable rig, where "performance" is measured as "comparison variant minus baseline variant" for an optimization goal (e.g., ingress throughput). Due to intrinsic variability in measuring that goal, we can only estimate its mean value for each experiment; we report uncertainty in that value as a 90.00% confidence interval denoted "Δ mean % CI".

For each experiment, we decide whether a change in performance is a "regression" -- a change worth investigating further -- if all of the following criteria are true:

Its estimated |Δ mean %| ≥ 5.00%, indicating the change is big enough to merit a closer look.
Its 90.00% confidence interval "Δ mean % CI" does not contain zero, indicating that if our statistical model is accurate, there is at least a 90.00% chance there is a difference in performance between baseline and comparison variants.
Its configuration does not mark it "erratic".

Replicate Execution Details

We run multiple replicates for each experiment/variant. However, we allow replicates to be automatically retried if there are any failures, up to 8 times, at which point the replicate is marked dead and we are unable to run analysis for the entire experiment. We call each of these attempts at running replicates a replicate execution. This section lists all replicate executions that failed due to the target crashing or being oom killed.

Note: In the below tables we bucket failures by experiment, variant, and failure type. For each of these buckets we list out the replicate indexes that failed with an annotation signifying how many times said replicate failed with the given failure mode. In the below example the baseline variant of the experiment named experiment_with_failures had two replicates that failed by oom kills. Replicate 0, which failed 8 executions, and replicate 1 which failed 6 executions, all with the same failure mode.

Experiment	Variant	Replicates	Failure	Logs	Debug Dashboard
experiment_with_failures	baseline	0 (x8) 1 (x6)	Oom killed		Debug Dashboard

The debug dashboard links will take you to a debugging dashboard specifically designed to investigate replicate execution failures.

❌ Retried Normal Replicate Execution Failures (non-profiling)

Experiment	Variant	Replicates	Failure	Debug Dashboard
quality_gate_idle_all_features	comparison	3	Oom killed	Debug Dashboard

CI Pass/Fail Decision

✅ Passed. All Quality Gates passed.

quality_gate_logs, bounds check missed_bytes: 10/10 replicas passed. Gate passed.
quality_gate_logs, bounds check intake_connections: 10/10 replicas passed. Gate passed.
quality_gate_logs, bounds check memory_usage: 10/10 replicas passed. Gate passed.
quality_gate_metrics_logs, bounds check memory_usage: 10/10 replicas passed. Gate passed.
quality_gate_metrics_logs, bounds check missed_bytes: 10/10 replicas passed. Gate passed.
quality_gate_metrics_logs, bounds check cpu_usage: 10/10 replicas passed. Gate passed.
quality_gate_metrics_logs, bounds check intake_connections: 10/10 replicas passed. Gate passed.
quality_gate_idle, bounds check intake_connections: 10/10 replicas passed. Gate passed.
quality_gate_idle, bounds check memory_usage: 10/10 replicas passed. Gate passed.
quality_gate_idle_all_features, bounds check intake_connections: 10/10 replicas passed. Gate passed.
quality_gate_idle_all_features, bounds check memory_usage: 10/10 replicas passed. Gate passed.

Reverts the backslash-to-slash normalization in pathContains and defines the operator-side allow-list as forward-slash only. PAR currently ships Linux-style paths from Balto; Windows-native paths in datadog.yaml have no use case today, and adding cross-platform containment introduced more surface than it fixed.

The failure mode is still observable: rshellAllowedPaths now logs a warning for each entry containing a backslash, mirroring the unnamespaced-command warning. Entries still flow through to the handler (where the intersection drops them against the Linux-style backend list) so the operator's written config is not silently rewritten.

Addresses codex review on #49825: agreed Windows was a real bug, but concluded the cleaner fix is to narrow the contract rather than carry a cross-platform normalization helper.
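
A warn-only check of this kind might look like the following sketch. `backslashEntries` and the log message are illustrative inventions; the point is that the configured list is inspected and reported on, never modified.

```go
package main

import (
	"log"
	"strings"
)

// backslashEntries returns the configured paths that contain a backslash.
// The input slice is left untouched so the operator's config is not rewritten.
func backslashEntries(paths []string) []string {
	var flagged []string
	for _, p := range paths {
		if strings.Contains(p, `\`) {
			flagged = append(flagged, p)
		}
	}
	return flagged
}

func main() {
	configured := []string{"/var/log", `C:\ProgramData\Datadog`}
	for _, p := range backslashEntries(configured) {
		log.Printf("allowed_paths entry %q contains a backslash; only forward-slash paths are matched", p)
	}
}
```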

rshell's AllowedPaths sandbox is built on os.Root, which represents a directory handle. File entries are silently skipped by rshell at runner creation and produce permission-denied for every open with no operator-facing message. An operator who wrote `allowed_paths: [/var/log/app.log]` would see their intended "narrow to this file" config produce a deny-everything kill-switch instead.

Stat each operator-configured path at config load; warn and drop entries that exist but are not directories. Entries that don't exist yet are left in place (rshell's own warning at task time covers that case). nil-vs-empty semantics are preserved so unset and explicit `[]` continue to have their distinct meanings downstream.

Dropping non-directory entries at config load was redundant with rshell's own sandbox filter (os.Root refuses non-directory entries at runner creation, same end-user behavior either way). Replaced filterNonDirectoryPaths with warnNonDirectoryPaths: same observability win at startup, no duplicate filtering, no package-level stat stub, no nil-vs-empty bookkeeping. Confluence page updated to match.

Replace the "nil = pass-through, [] = kill-switch, [X] = intersect" three-way contract with a uniform "always intersect" contract on both axes, using sentinel default values for the pass-through case:

- allowed_paths defaults to ["/"]. pathContains("/", X) is true for any absolute X, so the intersection returns the backend list as-is when the operator has not narrowed.
- allowed_commands defaults to ["rshell:*"]. The wildcard token is handled as a special case in filterAllowedCommands: when present, every backend entry in the "rshell:" namespace is admitted.

The transform drops the IsConfigured gate on both axes: GetStringSlice plus a nil → []string{} normalization for YAML `[]` is enough now that the default carries the sentinel. The handler drops the operatorPathsFilterEnabled / operatorCommandsFilterEnabled bools and their bypass branches; the intersection runs unconditionally.

End-user behavior is unchanged: unset operator config gets the backend list as-is on both axes, explicit empty list is the kill-switch, and explicit non-empty narrows. The change moves the three-way state out of the handler API and into the config schema, where the default value encodes "no narrowing".

Tests updated: setup tests now expect the sentinel as the default value; transform tests use sentinel for the unset case; handler matrix and end-to-end tests pass the sentinel where they previously relied on nil-pass-through. Added explicit tests for the wildcard match (coexistence with literal entries, dedup, namespace boundary).

dd-octo-sts · 2026-04-27T17:13:30Z

Static quality checks

✅ Please find below the results from static quality gates
Comparison made with ancestor 3eb519e
📊 Static Quality Gates Dashboard
🔗 SQG Job

Successful checks

Info

	Quality gate	Change	Size (prev → curr → max)
✅	agent_deb_amd64	+24.07 KiB (0.00% increase)	738.993 → 739.016 → 750.310
✅	agent_deb_amd64_fips	+24.07 KiB (0.00% increase)	697.419 → 697.443 → 702.690
✅	agent_msi	+23.0 KiB (0.00% increase)	604.156 → 604.178 → 620.770
✅	agent_rpm_amd64	+24.07 KiB (0.00% increase)	738.977 → 739.000 → 750.280
✅	agent_rpm_amd64_fips	+24.07 KiB (0.00% increase)	697.403 → 697.426 → 702.670
✅	agent_rpm_arm64	+16.53 KiB (0.00% increase)	717.073 → 717.089 → 724.050
✅	agent_rpm_arm64_fips	+12.53 KiB (0.00% increase)	678.528 → 678.540 → 684.460
✅	agent_suse_amd64	+24.07 KiB (0.00% increase)	738.977 → 739.000 → 750.280
✅	agent_suse_amd64_fips	+24.07 KiB (0.00% increase)	697.403 → 697.426 → 702.670
✅	agent_suse_arm64	+16.53 KiB (0.00% increase)	717.073 → 717.089 → 724.050
✅	agent_suse_arm64_fips	+12.53 KiB (0.00% increase)	678.528 → 678.540 → 684.460
✅	docker_agent_amd64	+20.07 KiB (0.00% increase)	799.451 → 799.471 → 805.870
✅	docker_agent_arm64	+16.76 KiB (0.00% increase)	802.357 → 802.373 → 809.730
✅	docker_agent_jmx_amd64	+20.32 KiB (0.00% increase)	990.371 → 990.390 → 996.590
✅	docker_agent_jmx_arm64	+16.98 KiB (0.00% increase)	982.055 → 982.072 → 989.410
✅	iot_agent_deb_arm64	+4.0 KiB (0.01% increase)	41.361 → 41.365 → 42.560

15 successful checks with minimal change (< 2 KiB)

	Quality gate	Current Size
✅	agent_heroku_amd64	309.103 MiB
✅	docker_cluster_agent_amd64	206.269 MiB
✅	docker_cluster_agent_arm64	220.383 MiB
✅	docker_cws_instrumentation_amd64	7.142 MiB
✅	docker_cws_instrumentation_arm64	6.689 MiB
✅	docker_dogstatsd_amd64	39.347 MiB
✅	docker_dogstatsd_arm64	37.565 MiB
✅	dogstatsd_deb_amd64	30.001 MiB
✅	dogstatsd_deb_arm64	28.142 MiB
✅	dogstatsd_rpm_amd64	30.001 MiB
✅	dogstatsd_suse_amd64	30.001 MiB
✅	iot_agent_deb_amd64	44.372 MiB
✅	iot_agent_deb_armhf	42.097 MiB
✅	iot_agent_rpm_amd64	44.373 MiB
✅	iot_agent_suse_amd64	44.373 MiB

On-wire sizes (compressed)

	Quality gate	Change	Size (prev → curr → max)
✅	agent_deb_amd64	+36.76 KiB (0.02% increase)	174.972 → 175.008 → 179.160
✅	agent_deb_amd64_fips	+27.75 KiB (0.02% increase)	166.635 → 166.662 → 174.440
✅	agent_heroku_amd64	neutral	74.908 MiB → 80.310
✅	agent_msi	neutral	139.289 MiB → 147.550
✅	agent_rpm_amd64	-34.67 KiB (0.02% reduction)	176.962 → 176.928 → 182.080
✅	agent_rpm_amd64_fips	+4.29 KiB (0.00% increase)	168.030 → 168.035 → 174.140
✅	agent_rpm_arm64	+7.09 KiB (0.00% increase)	159.271 → 159.278 → 163.610
✅	agent_rpm_arm64_fips	-11.03 KiB (0.01% reduction)	151.467 → 151.456 → 156.850
✅	agent_suse_amd64	-34.67 KiB (0.02% reduction)	176.962 → 176.928 → 182.080
✅	agent_suse_amd64_fips	+4.29 KiB (0.00% increase)	168.030 → 168.035 → 174.140
✅	agent_suse_arm64	+7.09 KiB (0.00% increase)	159.271 → 159.278 → 163.610
✅	agent_suse_arm64_fips	-11.03 KiB (0.01% reduction)	151.467 → 151.456 → 156.850
✅	docker_agent_amd64	+32.96 KiB (0.01% increase)	267.115 → 267.147 → 272.990
✅	docker_agent_arm64	+21.18 KiB (0.01% increase)	254.146 → 254.166 → 261.470
✅	docker_agent_jmx_amd64	+18.83 KiB (0.01% increase)	335.769 → 335.787 → 341.610
✅	docker_agent_jmx_arm64	+8.57 KiB (0.00% increase)	318.797 → 318.805 → 326.050
✅	docker_cluster_agent_amd64	+5.27 KiB (0.01% increase)	72.297 → 72.302 → 73.460
✅	docker_cluster_agent_arm64	neutral	67.759 MiB → 68.680
✅	docker_cws_instrumentation_amd64	neutral	2.999 MiB → 3.330
✅	docker_cws_instrumentation_arm64	neutral	2.729 MiB → 3.090
✅	docker_dogstatsd_amd64	neutral	15.230 MiB → 15.870
✅	docker_dogstatsd_arm64	-6.39 KiB (0.04% reduction)	14.549 → 14.543 → 14.890
✅	dogstatsd_deb_amd64	neutral	7.936 MiB → 8.830
✅	dogstatsd_deb_arm64	neutral	6.820 MiB → 7.750
✅	dogstatsd_rpm_amd64	neutral	7.947 MiB → 8.840
✅	dogstatsd_suse_amd64	neutral	7.947 MiB → 8.840
✅	iot_agent_deb_amd64	neutral	11.675 MiB → 13.210
✅	iot_agent_deb_arm64	neutral	9.982 MiB → 11.620
✅	iot_agent_deb_armhf	neutral	10.186 MiB → 11.780
✅	iot_agent_rpm_amd64	neutral	11.693 MiB → 13.230
✅	iot_agent_suse_amd64	neutral	11.693 MiB → 13.230

matt-dz · 2026-04-27T17:26:55Z

+	}
+}
+
+func warnNonDirectoryPaths(paths []string) {


does rshell not output an error in stderr whenever a path is not accessible? not sure we need this here

rshell considers all paths to be directories, it won't accept files
this warning is a nice to have but not required

dd-gplassard · 2026-04-28T09:07:14Z

+	// the backend-injected lists. By default, they act as a no-op, allowing
+	// everything: the backend is the only filter.
+	//
+	// To allow none, use an explicit YAML empty list.
+	//
+	//   - allowed_paths defaults to ["/"].
+	//   - allowed_commands defaults to ["rshell:*"]. The wildcard token is
+	//     handled as a special case in the operator-side intersection: when
+	//     it appears in the operator list, every backend command in the
+	//     "rshell:" namespace is admitted.
+	config.BindEnvAndSetDefault(PARRestrictedShellAllowedPaths, []string{RShellPathAllowAll})


It doesn't seem possible to block everything from env variables only? Only through an explicit YAML empty list?
IMO we should have parity between what we can configure through YAML and env variables. What we did with the actions allowlist is add a private_action_runner.default_actions_enabled property, which is true by default (but we also had to maintain backward compatibility, so it's slightly different).
Are there other cases in the agent?

Done — env parser now accepts JSON-array form (["a","b"], []) alongside CSV, same shape as process_config.custom_sensitive_words. DD_..._PATHS=[] is the env kill-switch; invalid JSON logs an error and falls back to nil (kill-switch downstream, fail-secure).

Mirrors the process_config.custom_sensitive_words shape: the env parser accepts both CSV ("a,b") and JSON-array (["a","b"], []) forms for both rshell axes. Lets operators express the kill-switch via env (DD_..._PATHS=[]) and handles bracketed input cleanly instead of splitting it as a CSV with literal brackets attached.

AlexandreYang · 2026-04-28T12:18:35Z

+	RShellCommandAllowAllWildcard      = RShellCommandNamespacePrefix + "*"
+	RShellPathAllowAll                 = "/"
+	RShellPathAllowMapContainerizedKey = "containerized"
+	RShellPathAllowMapBareMetalKey     = "bare_metal"


(nit)

Is bare_metal the right term? It seems to be just anything "non-container" (e.g. VM hosts are not bare-metal)

Shall we just use other?

moving to default

AlexandreYang · 2026-04-28T12:25:41Z

+	return commands
+}
+
+func warnUnnamespacedCommands(commands []string) {


It seems that we are leaking rshell logic into datadog-agent.

Shall we handle this in rshell instead?

The issue I see is that if, in the future, we have another client (other than datadog-agent) using rshell, it won't benefit from those improvements and we might need to duplicate the logic.

Same for onlyRshellPrefixedCommands() code.

and reducePathListToBroadest()

Note that, in rshell we already have some Warnings mechanism: DataDog/rshell#200 (cc @matt-dz )

Ok for warnUnnamespacedCommands

We can move onlyRshellPrefixedCommands if we change the default value of the yaml to * (currently rshell:*)

reducePathListToBroadest would be nice to have in rshell directly, but idk if we can compute the intersection of yaml allow list and backend allow list without passing each into reducePathListToBroadest first

discussed via DM, fine to merge this first for 7.79 and address issue for 7.80 via issue in https://github.com/DataDog/rshell/issues

DataDog/rshell#201

TestRshellHappyFlow was sending allowedPaths as a flat []string, but the new RunCommandInputs.AllowedPaths is map[string][]string keyed by bare_metal/containerized. The strict JSON unmarshal in ExtractInputs rejected the old shape, so the happy-flow task errored out and the exit code 0 assertion failed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

AlexandreYang

approved,

the issue discussed here https://github.com/DataDog/datadog-agent/pull/49825/changes#r3154072023 will be addressed separately for 7.80

Backend wire format ships the per-env paths map keyed by 'default' (not 'bare_metal'). Rename the constant and string value to match.

After renaming RShellPathAllowMapBareMetalKey to RShellPathAllowMapDefaultKey, the longest key in the "unknown keys are ignored" case shrank, leaving the literal "some_future_env" two spaces over-aligned. gofmt fixes the gutter.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

dd-octo-sts · 2026-04-29T12:13:41Z

⚠️ Automatic backport to 7.79.x failed.

This usually happens when the cherry-pick has merge conflicts and needs manual resolution.

To backport manually, run:

git fetch
git worktree add .worktrees/backport-7.79.x 7.79.x
cd .worktrees/backport-7.79.x
git switch --create backport-49825-to-7.79.x
git cherry-pick -x --mainline 1 114422d410dcb37adb298c791221410fd591a493
git push --set-upstream origin backport-49825-to-7.79.x

Workflow logs: https://github.com/DataDog/datadog-agent/actions/runs/25108113120

dd-octo-sts · 2026-04-29T13:03:40Z

⚠️ Automatic backport to 7.79.x failed.

julesmcrt · 2026-04-29T13:53:20Z

Extensive testing done by @matt-dz

Commands axis (allowed_commands):
  C1   unset                                    ✅           ✅
  C2   explicit []  (kill-switch)               ✅           ✅
  C3   subset  ["rshell:cat"]                   ✅           ✅
  C4   sentinel only  ["rshell:*"]              ✅           ✅
  C5   sentinel + extras                        ✅           ✅
  C6   duplicates                               ✅           ✅
  C7   bare command (no rshell: prefix)         ✅           ✅
                                                ↑ verifies WARN at config load + empty intersection

Paths axis (allowed_paths):
  P1   unset                                    ✅           ✅
  P2   explicit []  (kill-switch)               ✅           ✅
  P3   subset  (one entry from backend list)    ✅           ✅
  P4   sentinel only  ["/"]                     ✅           ✅
  P5   sentinel + extras                        ✅           ✅
  P6   duplicates                               ✅           ✅
  P7   broadest-path reduction (config-side)    ✅           ✅
  P8   narrower-wins (bug #2 fix)               ✅           ✅
                                                ↑ headline test: operator entry narrower
                                                  than backend is kept, not dropped

…ths (#49825) (#50090)

Backport of #49825 to `7.79.x`.

Conflicts resolved:
- `pkg/privateactionrunner/adapters/config/BUILD.bazel` dropped (7.79.x predates the Bazel migration).
- `test/new-e2e/go.{mod,sum}` retidied against 7.79.x; adds `pkg/config/setup` + indirect deps at `v0.79.0-rc.3`.

Co-authored-by: spencer.gilbert <spencer.gilbert@datadoghq.com>

julesmcrt requested a review from a team as a code owner April 23, 2026 17:12

julesmcrt requested a review from ihssane-yb April 23, 2026 17:12

dd-octo-sts Bot added the internal (Identify a non-fork PR) label Apr 23, 2026

github-actions Bot added the medium review (PR review might take time) label Apr 23, 2026

dd-octo-sts Bot added the team/action-platform label Apr 23, 2026

chatgpt-codex-connector Bot reviewed Apr 23, 2026

Comment thread pkg/privateactionrunner/bundles/remoteaction/rshell/run_command.go Outdated

julesmcrt added the changelog/no-changelog (No changelog entry needed), qa/rc-required (Only for a PR that requires validation on the Release Candidate), and backport/7.79.x (Automatically create a backport PR to the 7.79.x branch once the PR is merged) labels Apr 23, 2026

julesmcrt added 4 commits April 23, 2026 21:05

julesmcrt requested a review from a team as a code owner April 27, 2026 07:23

julesmcrt requested a review from s-alad April 27, 2026 07:23

dd-octo-sts Bot added the team/agent-configuration label Apr 27, 2026

s-alad approved these changes Apr 27, 2026

julesmcrt added 2 commits April 27, 2026 18:32

Only send relevant paths to rshell depending on the env (containerized VS bare metal). Refactor the intersection logic. Add new defaults for cleaner logic.

e9889be

Merge remote-tracking branch 'origin/main' into jules.macret/Q/rshell/fix-allow-list-bugs

1ca9b4e

github-actions Bot added the long review (PR is complex, plan time to review it) label and removed the medium review (PR review might take time) label Apr 27, 2026

julesmcrt changed the title ~~rshell: fix operator allow-list intersection bugs~~ [ACTP] PAR: rshell allow-list redesign + per-env paths Apr 27, 2026

Bzl gazelle

f7905c3

julesmcrt requested a review from a team as a code owner April 27, 2026 17:13

dd-octo-sts Bot added the team/agent-build label Apr 27, 2026

aiuto approved these changes Apr 27, 2026

This was referenced Apr 27, 2026

[ACTP] PAR: fix rshell allow-list intersection bugs #49945

Open

[ACTP] PAR: warn on misconfigured rshell allowed_paths entries #49948

Open

matt-dz reviewed Apr 27, 2026

This was referenced Apr 27, 2026

[ACTP] PAR: redesign rshell allow-list contract (sentinel defaults) #49949

Open

[ACTP] PAR: per-environment paths for rshell allow-list #49950

Open

dd-gplassard approved these changes Apr 28, 2026

julesmcrt added 2 commits April 28, 2026 11:27

Update comment

5e5a45e

AlexandreYang reviewed Apr 28, 2026

AlexandreYang approved these changes Apr 28, 2026

julesmcrt mentioned this pull request Apr 28, 2026

Move helpers from datadog-agent to rshell DataDog/rshell#201

Open

julesmcrt and others added 3 commits April 28, 2026 15:07

bazel run //:go_mod_tidy_all

bae63c5

rshell: rename bare_metal allowlist key to default

15033e4

Backend wire format ships the per-env paths map keyed by 'default' (not 'bare_metal'). Rename the constant and string value to match.
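
A minimal sketch of the key selection this rename affects, assuming the two keys named in the PR description (`default` and `containerized`). The constant names and `pathsForEnv` are hypothetical, and the `containerized` flag stands in for the runner's `env.IsContainerized()` call so the example stays self-contained.

```go
package main

import "fmt"

// Keys of the per-environment paths map as shipped by the backend.
// Constant names are illustrative, not the ones used in the PR.
const (
	envKeyDefault       = "default"
	envKeyContainerized = "containerized"
)

// pathsForEnv (hypothetical) picks the slice matching the runtime
// environment. A missing key yields a nil slice; how the real runner
// treats that case is not shown here.
func pathsForEnv(allowed map[string][]string, containerized bool) []string {
	key := envKeyDefault
	if containerized {
		key = envKeyContainerized
	}
	return allowed[key]
}

func main() {
	// One task carrying both keys, as described for wf-actions-server.
	task := map[string][]string{
		envKeyDefault:       {"/var/log"},
		envKeyContainerized: {"/host/var/log"},
	}
	fmt.Println(pathsForEnv(task, false)) // prints [/var/log]
	fmt.Println(pathsForEnv(task, true))  // prints [/host/var/log]
}
```

The selected slice would then go through the same operator-side intersection as any other backend path list.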

gh-worker-dd-mergequeue-cf854d Bot merged commit 114422d into main Apr 29, 2026
475 checks passed

gh-worker-dd-mergequeue-cf854d Bot deleted the jules.macret/Q/rshell/fix-allow-list-bugs branch April 29, 2026 12:12

github-actions Bot added this to the 7.80.0 milestone Apr 29, 2026

julesmcrt mentioned this pull request Apr 29, 2026

[Backport 7.79.x] [ACTP] PAR: rshell allow-list redesign + per-env paths (#49825) #50090

Merged

julesmcrt commented Apr 23, 2026 • edited by matt-dz

chatgpt-codex-connector Bot left a comment

💡 Codex Review

datadog-datadog-prod-us1 Bot commented Apr 23, 2026 • edited by datadog-prod-us1-6 Bot

dd-octo-sts Bot commented Apr 23, 2026 • edited

Files inventory check summary

Results for datadog-agent_7.80.0~devel.git.272.fc105fa.pipeline.110373313-1_amd64.deb:

cit-pr-commenter-54b7da Bot commented Apr 23, 2026 • edited

Regression Detector

Regression Detector Results

❌ Experiments with retried target crashes

Optimization Goals: ✅ No significant changes detected

Experiments ignored for regressions

Fine details of change detection per experiment

Bounds Checks: ✅ Passed

Explanation

Replicate Execution Details

❌ Retried Normal Replicate Execution Failures (non-profiling)

CI Pass/Fail Decision

dd-octo-sts Bot commented Apr 27, 2026 • edited

Static quality checks

Info

AlexandreYang Apr 28, 2026 • edited

AlexandreYang Apr 28, 2026 • edited

AlexandreYang left a comment

dd-octo-sts Bot commented Apr 29, 2026
