Backport #48127 and #48236#48488
Conversation
<!-- dd-meta {"pullId":"c2960337-1304-4d38-b522-5285faa7a280","source":"chat","resourceId":"e1770736-c3aa-4752-a8c6-77212765f0c7","workflowId":"a58e22a0-26bb-4586-bcd6-bd4c67edd5e1","codeChangeId":"a58e22a0-26bb-4586-bcd6-bd4c67edd5e1","sourceType":"slack"} -->
Add restricted shell allowed path config
The rshell runner currently hardcodes allowed paths (`/var/log`) in code, which makes policy changes require code changes and deploys. This change moves path policy into configuration so operators can manage rshell filesystem access through `datadog.yaml` and environment variables.
- `go test ./pkg/privateactionrunner/bundles/remoteaction/rshell`
- `go test -tags test ./pkg/config/setup -run 'TestPrivateActionRunner|TestRestrictedShell'`
- `go test -tags test ./pkg/privateactionrunner/adapters/config -run 'TestFromDDConfig|TestFromDDConfigRestrictedShellAllowedPaths'`
- `Format` tool run (goimports/gofmt on changed Go files)
- `Lint` tool run (golangci-lint execution attempted, but failed due to local toolchain mismatch: golangci-lint built with go1.24 while repo targets go1.25.7)
This preserves existing rshell default behavior by defaulting `restricted_shell.allowed_paths` to `/var/log`.
---
PR by Bits - [View session in Datadog](https://app.datadoghq.com/code/e1770736-c3aa-4752-a8c6-77212765f0c7)
Comment @DataDog to request changes
Co-authored-by: datadog-datadog-prod-us1[bot] <88084959+datadog-datadog-prod-us1[bot]@users.noreply.github.com>
Co-authored-by: alexandre.yang <alexandre.yang@datadoghq.com>
…ntainer-aware prefixing"" (#48462) Reverts #48460 Revert + this test fix: All tests pass. Here's a summary of the fix: Root cause: Both filepath.Join calls used OS path separators. On Windows, filepath.Join("/host", "/var/log") produces \host\var\log, breaking tests that expected Linux-style paths. Similarly, filepath.Join("/host", "/proc") produced \host\proc which didn't match the mock key /host/proc, causing resolveProcPath to fall back to /proc. Fix: Replaced filepath.Join with path.Join (POSIX package) in three files: - pkg/config/setup/privateactionrunner.go — container path prefix joining - pkg/config/setup/privateactionrunner_test.go — test assertion using same join - pkg/privateactionrunner/bundles/remoteaction/rshell/run_command.go — resolveProcPath These are Linux container paths (/host/...) that must always use forward slashes regardless of the host OS. Co-authored-by: alexandre.yang <alexandre.yang@datadoghq.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 38eac76b86
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if env.IsContainerized() { | ||
| for i, v := range defaultPaths { | ||
| hostPath := path.Join(containerizedPathPrefix, v) | ||
| defaultPaths[i] = hostPath |
There was a problem hiding this comment.
Preserve /var/log fallback when /host mount is absent
This unconditionally rewrites the default restricted-shell allowlist to /host/var/log in containerized mode, even when that host mount is not present. In those environments (for example Fargate or sidecars without host-volume mounts), runCommand now passes a non-existent directory to interp.AllowedPaths, and runner construction fails with failed to create runner, so every rshell command is rejected. We should gate the /host rewrite on path existence (or fall back to /var/log) the same way proc-path resolution already does.
Useful? React with 👍 / 👎.
Files inventory check summaryFile checks results against ancestor 183e910c: Results for datadog-agent_7.78.0~rc.3.git.10.38eac76.pipeline.104743243-1_amd64.deb:No change detected |
Static quality checks✅ Please find below the results from static quality gates Successful checksInfo
12 successful checks with minimal change (< 2 KiB)
On-wire sizes (compressed)
|
Regression DetectorRegression Detector ResultsMetrics dashboard Baseline: 183e910 Optimization Goals: ✅ No significant changes detected
|
| perf | experiment | goal | Δ mean % | Δ mean % CI | trials | links |
|---|---|---|---|---|---|---|
| ➖ | docker_containers_cpu | % cpu utilization | +0.49 | [-2.57, +3.55] | 1 | Logs |
Fine details of change detection per experiment
| perf | experiment | goal | Δ mean % | Δ mean % CI | trials | links |
|---|---|---|---|---|---|---|
| ➖ | quality_gate_logs | % cpu utilization | +1.71 | [+0.10, +3.32] | 1 | Logs bounds checks dashboard |
| ➖ | docker_containers_cpu | % cpu utilization | +0.49 | [-2.57, +3.55] | 1 | Logs |
| ➖ | ddot_metrics_sum_cumulativetodelta_exporter | memory utilization | +0.37 | [+0.14, +0.59] | 1 | Logs |
| ➖ | otlp_ingest_logs | memory utilization | +0.30 | [+0.20, +0.40] | 1 | Logs |
| ➖ | file_tree | memory utilization | +0.23 | [+0.18, +0.29] | 1 | Logs |
| ➖ | file_to_blackhole_500ms_latency | egress throughput | +0.06 | [-0.33, +0.46] | 1 | Logs |
| ➖ | ddot_logs | memory utilization | +0.05 | [-0.00, +0.11] | 1 | Logs |
| ➖ | file_to_blackhole_1000ms_latency | egress throughput | +0.04 | [-0.39, +0.47] | 1 | Logs |
| ➖ | otlp_ingest_metrics | memory utilization | +0.04 | [-0.12, +0.20] | 1 | Logs |
| ➖ | uds_dogstatsd_to_api | ingress throughput | +0.03 | [-0.19, +0.26] | 1 | Logs |
| ➖ | uds_dogstatsd_to_api_v3 | ingress throughput | +0.00 | [-0.20, +0.21] | 1 | Logs |
| ➖ | ddot_metrics_sum_cumulative | memory utilization | +0.00 | [-0.14, +0.15] | 1 | Logs |
| ➖ | tcp_dd_logs_filter_exclude | ingress throughput | -0.00 | [-0.11, +0.11] | 1 | Logs |
| ➖ | ddot_metrics | memory utilization | -0.02 | [-0.21, +0.17] | 1 | Logs |
| ➖ | quality_gate_idle_all_features | memory utilization | -0.03 | [-0.07, +0.00] | 1 | Logs bounds checks dashboard |
| ➖ | file_to_blackhole_0ms_latency | egress throughput | -0.04 | [-0.52, +0.45] | 1 | Logs |
| ➖ | file_to_blackhole_100ms_latency | egress throughput | -0.04 | [-0.14, +0.05] | 1 | Logs |
| ➖ | docker_containers_memory | memory utilization | -0.08 | [-0.15, -0.00] | 1 | Logs |
| ➖ | quality_gate_metrics_logs | memory utilization | -0.09 | [-0.32, +0.15] | 1 | Logs bounds checks dashboard |
| ➖ | ddot_metrics_sum_delta | memory utilization | -0.13 | [-0.29, +0.04] | 1 | Logs |
| ➖ | quality_gate_idle | memory utilization | -0.16 | [-0.21, -0.11] | 1 | Logs bounds checks dashboard |
| ➖ | uds_dogstatsd_20mb_12k_contexts_20_senders | memory utilization | -0.28 | [-0.34, -0.22] | 1 | Logs |
| ➖ | tcp_syslog_to_blackhole | ingress throughput | -0.70 | [-0.84, -0.57] | 1 | Logs |
Bounds Checks: ✅ Passed
| perf | experiment | bounds_check_name | replicates_passed | observed_value | links |
|---|---|---|---|---|---|
| ✅ | docker_containers_cpu | simple_check_run | 10/10 | 726 ≥ 26 | |
| ✅ | docker_containers_memory | memory_usage | 10/10 | 277.44MiB ≤ 370MiB | |
| ✅ | docker_containers_memory | simple_check_run | 10/10 | 702 ≥ 26 | |
| ✅ | file_to_blackhole_0ms_latency | memory_usage | 10/10 | 0.19GiB ≤ 1.20GiB | |
| ✅ | file_to_blackhole_0ms_latency | missed_bytes | 10/10 | 0B = 0B | |
| ✅ | file_to_blackhole_1000ms_latency | memory_usage | 10/10 | 0.23GiB ≤ 1.20GiB | |
| ✅ | file_to_blackhole_1000ms_latency | missed_bytes | 10/10 | 0B = 0B | |
| ✅ | file_to_blackhole_100ms_latency | memory_usage | 10/10 | 0.20GiB ≤ 1.20GiB | |
| ✅ | file_to_blackhole_100ms_latency | missed_bytes | 10/10 | 0B = 0B | |
| ✅ | file_to_blackhole_500ms_latency | memory_usage | 10/10 | 0.21GiB ≤ 1.20GiB | |
| ✅ | file_to_blackhole_500ms_latency | missed_bytes | 10/10 | 0B = 0B | |
| ✅ | quality_gate_idle | intake_connections | 10/10 | 3 = 3 | bounds checks dashboard |
| ✅ | quality_gate_idle | memory_usage | 10/10 | 174.82MiB ≤ 175MiB | bounds checks dashboard |
| ✅ | quality_gate_idle_all_features | intake_connections | 10/10 | 2 ≤ 3 | bounds checks dashboard |
| ✅ | quality_gate_idle_all_features | memory_usage | 10/10 | 492.27MiB ≤ 550MiB | bounds checks dashboard |
| ✅ | quality_gate_logs | intake_connections | 10/10 | 3 ≤ 6 | bounds checks dashboard |
| ✅ | quality_gate_logs | memory_usage | 10/10 | 212.22MiB ≤ 220MiB | bounds checks dashboard |
| ✅ | quality_gate_logs | missed_bytes | 10/10 | 0B = 0B | bounds checks dashboard |
| ✅ | quality_gate_metrics_logs | cpu_usage | 10/10 | 369.91 ≤ 2000 | bounds checks dashboard |
| ✅ | quality_gate_metrics_logs | intake_connections | 10/10 | 4 ≤ 6 | bounds checks dashboard |
| ✅ | quality_gate_metrics_logs | memory_usage | 10/10 | 419.55MiB ≤ 475MiB | bounds checks dashboard |
| ✅ | quality_gate_metrics_logs | missed_bytes | 10/10 | 0B = 0B | bounds checks dashboard |
Explanation
Confidence level: 90.00%
Effect size tolerance: |Δ mean %| ≥ 5.00%
Performance changes are noted in the perf column of each table:
- ✅ = significantly better comparison variant performance
- ❌ = significantly worse comparison variant performance
- ➖ = no significant change in performance
A regression test is an A/B test of target performance in a repeatable rig, where "performance" is measured as "comparison variant minus baseline variant" for an optimization goal (e.g., ingress throughput). Due to intrinsic variability in measuring that goal, we can only estimate its mean value for each experiment; we report uncertainty in that value as a 90.00% confidence interval denoted "Δ mean % CI".
For each experiment, we decide whether a change in performance is a "regression" -- a change worth investigating further -- if all of the following criteria are true:
-
Its estimated |Δ mean %| ≥ 5.00%, indicating the change is big enough to merit a closer look.
-
Its 90.00% confidence interval "Δ mean % CI" does not contain zero, indicating that if our statistical model is accurate, there is at least a 90.00% chance there is a difference in performance between baseline and comparison variants.
-
Its configuration does not mark it "erratic".
CI Pass/Fail Decision
✅ Passed. All Quality Gates passed.
- quality_gate_metrics_logs, bounds check cpu_usage: 10/10 replicas passed. Gate passed.
- quality_gate_metrics_logs, bounds check intake_connections: 10/10 replicas passed. Gate passed.
- quality_gate_metrics_logs, bounds check missed_bytes: 10/10 replicas passed. Gate passed.
- quality_gate_metrics_logs, bounds check memory_usage: 10/10 replicas passed. Gate passed.
- quality_gate_idle_all_features, bounds check intake_connections: 10/10 replicas passed. Gate passed.
- quality_gate_idle_all_features, bounds check memory_usage: 10/10 replicas passed. Gate passed.
- quality_gate_idle, bounds check memory_usage: 10/10 replicas passed. Gate passed.
- quality_gate_idle, bounds check intake_connections: 10/10 replicas passed. Gate passed.
- quality_gate_logs, bounds check missed_bytes: 10/10 replicas passed. Gate passed.
- quality_gate_logs, bounds check intake_connections: 10/10 replicas passed. Gate passed.
- quality_gate_logs, bounds check memory_usage: 10/10 replicas passed. Gate passed.
89c54a2
into
7.78.x
Backport:
#48127
#48462
Motivation:
Need to start pentest for the Restricted Shell feature.
The backport is low risk since those PAR action are behind FF.