parametric tests for sampling-related span tagging by emmettbutler · Pull Request #1487 · DataDog/system-tests

emmettbutler · 2023-08-11T17:17:26Z

Description

This change adds some parametric tests that exercise various trace sampling configurations and check span tags related to sampling on the results. The configurations are the same that are tested here in the dd-trace-py test suite.

This document is the primary source describing the behavior asserted by these tests.

Motivation

I was making some significant changes to dd-trace-py's sampling behavior and realized that neither I nor my team are sure what the behavior of these span tags is supposed to be. These tests let us see the actual behavior of all of the tracer implementations with respect to sampling tags, which will enable dd-trace-py to align its behavior with the most common behavior among other implementations.

Reviewer checklist

Check what scenarios are modified. If needed, add the relevant label (run-parametric-scenario, run-profiling-scenario...). If this PR modifies any system-tests internal, then add the run-all-scenarios label (more info).
CI is green
- If not, failing jobs are not related to this change (and you are 100% sure about this statement)
if any of build-some-image label is present
1. is the image labl have been updated ?
2. just before merging, locally build and push the image to hub.docker.com
if a scenario is added (or removed), add (or remove) it in system-test-dasboard nightly

…entations

mabdinur

Left some nits but overall this change looks good. I will need @dgoffredo help validating the sampling test cases

dgoffredo

Thanks for putting this together. It's a hairy topic worth ironing out.

I ended up having to do some research in order to review this code. I spoke with Lucas to clarify my understanding of _dd.p.* and its relationship with partial flushing, and I reread the agent code to refresh my memory on _dd.*_psr. See my comments in the diff.

…iki/spaces/APM/pages/2564915820/Trace+Ingestion+Mechanisms

…g-tags-tests-from-doc

…ts-from-doc update sampling tags tests based on wiki

This change makes the tracer's sampling tagging behavior conform to the expectations described in DataDog/system-tests#1487 ## How to review Use [this view](https://github.com/DataDog/dd-trace-py/pull/6693/files?file-filters%5B%5D=.py&show-viewed-files=true) to look at the changes to Python files, since it's a lot more responsive in the browser than the default. 1. Read these [system-tests](https://github.com/DataDog/system-tests/blob/8b0bdc3392792fa45fb97845f777688abb052a85/tests/parametric/test_sampling_span_tags.py) to understand the behavior they validate. 2. Read `tests/integration/test_sampling.py` to see the library tests that perform the same validations. 3. Look at `snapshots/test_sampling_with*` to see how test expectations have changed in light of these validations. 4. Examine `ddtrace/sampler.py` to see how it implements this behavior, especially the function call `from_datadog_sampler`. 4. Glance at some of the hundreds of other snapshots to confirm that their changes make sense in light of what you've learned. ## Change Summary Filter the file list by `snapshots/test_sampling_with` to find the snapshot files that directly validate the behavior tested by [these system-tests](https://github.com/DataDog/system-tests/blob/8b0bdc3392792fa45fb97845f777688abb052a85/tests/parametric/test_sampling_span_tags.py#L74). Changes to these snapshot files include: * Removing the `.dm` tag from spans not first in their chunks * Adding the `.dm` tag to first-in-chunk spans that had their sampling decision set manually (`-4`) or by the tracer (`-3`) * Removing the `_psr` tags from spans to which it doesn't apply * Adding three test cases that involve the `DD_TRACE_RATE_LIMIT` configuration My intent is that each of these snapshot file changes maps directly on to one of the "reasons" listed for the test skips in the [system-tests](https://github.com/DataDog/system-tests/blob/8b0bdc3392792fa45fb97845f777688abb052a85/tests/parametric/test_sampling_span_tags.py#L74). Find the corresponding test code for each snapshot in `tests/integration/test_sampling.py`. `ddtrace/tracer.py` is changed to avoid propagating the `.dm` tag to child spans that exist in the same execution as their parents. `ddtrace/span.py` is changed to remove `_psr` tags from spans to which they don't apply. For example, `_dd.p.dm: -1` indicates that the sampling decision was made on the basis of a sample rate from the agent, which is the only case in which `agent_psr` is meaningful. When `dm != -1`, `agent_psr` should not be set. There are significant changes to `ddtrace/sampler.py`. The most fundamental of these is the [change](https://github.com/DataDog/dd-trace-py/pull/6693/files?file-filters%5B%5D=.py&show-viewed-files=true#diff-51bda2f09879ed6e54b15cc9da5ed1da04023d3465018dd7794d35384dd85409L357-R356) from the use of inheritance between sampler classes to the use of dependency injection via the new `from_datadog_sampler` method. This fixes a subtle bug in which executions of `sample()` were calling methods of subclasses rather than the intended superclass methods. Avoiding the use of `super()` sidesteps the situation that led to this issue and makes it straightforward to adjust the logic that chooses between the various sampler implementations. This also allows the removal of some hard-to-follow `isinstance` and `super` calls. In `sampler.py`, rate limiting is applied unconditionally as a second stage after a sampling decision has been made. In `sampler.py`, the `agent_psr` tag is only applied when a sample rate received from the agent was actually used to make the sampling decision. All changes other than the ones listed here are changes to test expectations to make existing tests conform to the new behavior. A majority of these expectation changes were made by a script that * Removed the `.dm` tag from any span with `parent_id != 0` * Removed the `agent_psr` tag from any span with `_dd.p.dm != -1` ## Risk Because these changes conform to the behavior in the system-tests, I think their risk is pretty low. The worst-case scenario is that this change causes inaccurate metrics in Datadog products, in which case it can be reverted. The performance impact indicated by the bot comment is [probably](#6693 (review)) a result of the removal of an early exit from `sample()`. In my view it's worthwhile to trade CPU time for code clarity in this case. ## Other Stuff Once this is included in a release, we can promote DataDog/system-tests#1535 to "ready for review". I've run those tests locally and told them to install the tracer from this branch. ## Checklist - [x] Change(s) are motivated and described in the PR description. - [x] Testing strategy is described if automated tests are not included in the PR. - [x] Risk is outlined (performance impact, potential for breakage, maintainability, etc). - [x] Change is maintainable (easy to change, telemetry, documentation). - [x] [Library release note guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html) are followed. If no release note is required, add label `changelog/no-changelog`. - [x] Documentation is included (in-code, generated user docs, [public corp docs](https://github.com/DataDog/documentation/)). - [x] Backport labels are set (if [applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)) ## Reviewer Checklist - [x] Title is accurate. - [x] No unnecessary changes are introduced. - [x] Description motivates each change. - [x] Avoids breaking [API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces) changes unless absolutely necessary. - [x] Testing strategy adequately addresses listed risk(s). - [x] Change is maintainable (easy to change, telemetry, documentation). - [x] Release note makes sense to a user of the library. - [x] Reviewer has explicitly acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment. - [x] Backport labels are set in a manner that is consistent with the [release branch maintenance policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)

emmettbutler added 2 commits August 10, 2023 13:14

sketch test that validates tags related to sampling

355207c

set expectations to match the most common results among tracer implem…

52f2952

…entations

emmettbutler requested a review from a team as a code owner August 11, 2023 17:17

emmettbutler marked this pull request as draft August 11, 2023 17:17

emmettbutler added the run-parametric-scenario label Aug 11, 2023

emmettbutler added 10 commits August 11, 2023 10:30

trigger

d33a3ca

add tests for simple trace sampling configurations

7acbb18

deduplicate and adjust test assertions

698e5b5

fix assert condition

003f384

mark some python bugs

b545e52

mark more bugs

6575214

change some expectations and skips

769ea2d

add more skips to reflect tracer behaviors

6b4faa5

more refinements

03032b6

more skips

c2096e3

emmettbutler requested review from brettlangdon and majorgreys August 16, 2023 13:55

emmettbutler marked this pull request as ready for review August 16, 2023 13:55

Merge branch 'main' into emmett.butler/sampling-tags

a28e95a

emmettbutler changed the title ~~Emmett.butler/sampling tags~~ parametric tests for sampling-related span tagging Aug 16, 2023

emmettbutler requested review from dgoffredo and mabdinur and removed request for majorgreys August 16, 2023 14:08