Skip to content

move apm language detection config under apm_config.instrumentation#23232

Closed
adel121 wants to merge 1 commit intomainfrom
adelhajhassan/move_language_detection_config_under_apm_config
Closed

move apm language detection config under apm_config.instrumentation#23232
adel121 wants to merge 1 commit intomainfrom
adelhajhassan/move_language_detection_config_under_apm_config

Conversation

@adel121
Copy link
Copy Markdown
Contributor

@adel121 adel121 commented Feb 28, 2024

What does this PR do?

Move APM language detection configurations under apm_config in config package.

Motivation

Achieve separation between:

  • language detection performed b process-agent (detecting languages on the level of processes)
  • language detection performed by core agent and cluster agent (collecting process-level languages and aggregating them by deployment-level).

This change has no impact on the functional aspects of the feature.

Additional Notes

  • This should be merged in 7.52.0 because it is related to a new feature and changing feature configuration in the future in this manner will break existing deployments with old configurations.

Possible Drawbacks / Trade-offs

  • Should not change behaviour at all, it is only renaming environment variables and changing bindings in config.

Describe how to test/QA your changes

Same QA as this PR.

@adel121 adel121 added this to the 7.52.0 milestone Feb 28, 2024
@adel121 adel121 requested review from a team as code owners February 28, 2024 09:17
@adel121 adel121 added the changelog/no-changelog No changelog entry needed label Feb 28, 2024
@adel121 adel121 force-pushed the adelhajhassan/move_language_detection_config_under_apm_config branch from 786cb78 to e756063 Compare February 28, 2024 09:48
@pr-commenter
Copy link
Copy Markdown

pr-commenter Bot commented Feb 28, 2024

Bloop Bleep... Dogbot Here

Regression Detector Results

Run ID: da519242-f00c-464f-8444-ba51ac83e295
Baseline: e0749a8
Comparison: e756063
Total CPUs: 7

Performance changes are noted in the perf column of each table:

  • ✅ = significantly better comparison variant performance
  • ❌ = significantly worse comparison variant performance
  • ➖ = no significant change in performance

Experiments with missing or malformed data

  • basic_py_check

Usually, this warning means that there is no usable optimization goal data for that experiment, which could be a result of misconfiguration.

No significant changes in experiment optimization goals

Confidence level: 90.00%
Effect size tolerance: |Δ mean %| ≥ 5.00%

There were no significant changes in experiment optimization goals at this confidence level and effect size tolerance.

Experiments ignored for regressions

Regressions in experiments with settings containing erratic: true are ignored.

perf experiment goal Δ mean % Δ mean % CI
file_to_blackhole % cpu utilization +0.24 [-6.31, +6.79]

Fine details of change detection per experiment

perf experiment goal Δ mean % Δ mean % CI
otel_to_otel_logs ingress throughput +0.94 [+0.30, +1.57]
file_tree memory utilization +0.76 [+0.63, +0.88]
process_agent_real_time_mode memory utilization +0.35 [+0.30, +0.39]
tcp_syslog_to_blackhole ingress throughput +0.28 [+0.22, +0.35]
file_to_blackhole % cpu utilization +0.24 [-6.31, +6.79]
process_agent_standard_check_with_stats memory utilization +0.18 [+0.13, +0.22]
uds_dogstatsd_to_api ingress throughput +0.00 [-0.00, +0.00]
tcp_dd_logs_filter_exclude ingress throughput +0.00 [-0.00, +0.00]
trace_agent_json ingress throughput -0.03 [-0.07, +0.00]
trace_agent_msgpack ingress throughput -0.05 [-0.06, -0.03]
process_agent_standard_check memory utilization -0.23 [-0.27, -0.20]
idle memory utilization -0.34 [-0.38, -0.30]
uds_dogstatsd_to_api_cpu % cpu utilization -2.08 [-3.48, -0.68]

Explanation

A regression test is an A/B test of target performance in a repeatable rig, where "performance" is measured as "comparison variant minus baseline variant" for an optimization goal (e.g., ingress throughput). Due to intrinsic variability in measuring that goal, we can only estimate its mean value for each experiment; we report uncertainty in that value as a 90.00% confidence interval denoted "Δ mean % CI".

For each experiment, we decide whether a change in performance is a "regression" -- a change worth investigating further -- if all of the following criteria are true:

  1. Its estimated |Δ mean %| ≥ 5.00%, indicating the change is big enough to merit a closer look.

  2. Its 90.00% confidence interval "Δ mean % CI" does not contain zero, indicating that if our statistical model is accurate, there is at least a 90.00% chance there is a difference in performance between baseline and comparison variants.

  3. Its configuration does not mark it "erratic".

ownersLanguages = *newOwnersLanguages()
languageTTL = config.Datadog.GetDuration("language_detection.cleanup.language_ttl")
cleanupPeriod := config.Datadog.GetDuration("language_detection.cleanup.period")
languageTTL = config.Datadog.GetDuration("apm_config.instrumentation.language_detection.language_ttl")
Copy link
Copy Markdown
Contributor

@clamoriniere clamoriniere Feb 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I though apm_config was limited to the trace-agent configuration.
if we use it in the cluster-agent it means that we need to load the trace-agent config

Comment thread pkg/config/setup/apm.go
Comment on lines +91 to +98
config.BindEnvAndSetDefault("apm_config.instrumentation.language_detection.client_period", "10s", "DD_APM_INSTRUMENTATION_LANGUAGE_DETECTION_CLIENT_PERIOD")
// cleanup period represents how frequently we check for expired languages and remove them
config.BindEnvAndSetDefault("apm_config.instrumentation.language_detection.cleanup_period", "10m", "DD_APM_INSTRUMENTATION_LANGUAGE_DETECTION_CLEANUP_PERIOD")
// TTL refresh period represents how frequently we refresh the TTL of detected languages
config.BindEnvAndSetDefault("apm_config.instrumentation.language_detection.ttl_refresh_period", "20m", "DD_APM_INSTRUMENTATION_LANGUAGE_DETECTION_TTL_REFRESH_PERIOD")
// language TTL represents the TTL that is set for each language when it is detected
// it is also used when refreshing the expiration timestamp of the language
config.BindEnvAndSetDefault("apm_config.instrumentation.language_detection.language_ttl", "30m", "DD_APM_INSTRUMENTATION_LANGUAGE_DETECTION_LANGUAGE_TTL")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this parameters are not specific to APM. since we can imagine we have other usage of the langage information stored on top of the Deployment

I would suggestion to have a dedicated config section for it.

language_detection.resources_configuration.client_period

@adel121
Copy link
Copy Markdown
Contributor Author

adel121 commented Feb 29, 2024

close in favour of this PR

@adel121 adel121 closed this Feb 29, 2024
@dd-devflow dd-devflow Bot deleted the adelhajhassan/move_language_detection_config_under_apm_config branch August 29, 2024 00:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants