feat(llmobs): add tag source:otel to evals if DD_TRACE_OTEL_ENABLED=true #15538
ZStriker19 merged 5 commits into main
## Bootstrap import analysis

Comparison of import times between this PR and base.

### Summary

- The average import time from this PR is: 249 ± 3 ms.
- The average import time from base is: 254 ± 4 ms.
- The import time difference between this PR and base is: -4.4 ± 0.2 ms.

### Import time breakdown

The following import paths have shrunk:
## Performance SLOs

Comparing candidate zachg/llmobs_otel_evals_update (6b4daa3) with baseline main (fd1f2f9)

### 📈 Performance Regressions (2 suites)

#### 📈 iastaspectsospath - 24/24

| Benchmark | Time | Time SLO | Time vs SLO | Time vs baseline | Memory | Memory SLO | Memory vs SLO | Memory vs baseline |
|---|---|---|---|---|---|---|---|---|
| ospathbasename_aspect | 5.137µs | <10.000µs | -48.6% | 📈 +21.6% | 38.633MB | <41.000MB | -5.8% | +5.1% |
| ospathbasename_noaspect | 1.079µs | <10.000µs | -89.2% | -0.7% | 38.633MB | <41.000MB | -5.8% | +5.9% |
| ospathjoin_aspect | 6.009µs | <10.000µs | -39.9% | ~same | 38.633MB | <41.000MB | -5.8% | +5.1% |
| ospathjoin_noaspect | 2.285µs | <10.000µs | -77.1% | +0.3% | 38.614MB | <41.000MB | -5.8% | +5.0% |
| ospathnormcase_aspect | 3.519µs | <10.000µs | -64.8% | +1.0% | 38.614MB | <41.000MB | -5.8% | +4.9% |
| ospathnormcase_noaspect | 0.568µs | <10.000µs | -94.3% | -2.2% | 38.633MB | <41.000MB | -5.8% | +5.9% |
| ospathsplit_aspect | 4.856µs | <10.000µs | -51.4% | -0.3% | 38.653MB | <41.000MB | -5.7% | +5.6% |
| ospathsplit_noaspect | 1.601µs | <10.000µs | -84.0% | +0.5% | 38.633MB | <41.000MB | -5.8% | +5.0% |
| ospathsplitdrive_aspect | 3.721µs | <10.000µs | -62.8% | +0.4% | 38.594MB | <41.000MB | -5.9% | +4.8% |
| ospathsplitdrive_noaspect | 0.701µs | <10.000µs | -93.0% | +0.3% | 38.555MB | <41.000MB | -6.0% | +4.8% |
| ospathsplitext_aspect | 4.622µs | <10.000µs | -53.8% | ~same | 38.633MB | <41.000MB | -5.8% | +5.1% |
| ospathsplitext_noaspect | 1.385µs | <10.000µs | -86.1% | -0.5% | 38.535MB | <41.000MB | -6.0% | +5.7% |

#### 📈 telemetryaddmetric - 30/30

| Benchmark | Time | Time SLO | Time vs SLO | Time vs baseline | Memory | Memory SLO | Memory vs SLO | Memory vs baseline |
|---|---|---|---|---|---|---|---|---|
| 1-count-metric-1-times | 3.522µs | <20.000µs | -82.4% | 📈 +14.0% | 34.859MB | <35.500MB | 🟡 -1.8% | +5.2% |
| 1-count-metrics-100-times | 207.495µs | <220.000µs | -5.7% | -2.0% | 34.839MB | <35.500MB | 🟡 -1.9% | +5.1% |
| 1-distribution-metric-1-times | 3.396µs | <20.000µs | -83.0% | -1.3% | 34.859MB | <35.500MB | 🟡 -1.8% | +5.2% |
| 1-distribution-metrics-100-times | 221.695µs | <230.000µs | -3.6% | -0.5% | 34.780MB | <35.500MB | -2.0% | +4.9% |
| 1-gauge-metric-1-times | 2.228µs | <20.000µs | -88.9% | -0.2% | 34.800MB | <35.500MB | 🟡 -2.0% | +4.6% |
| 1-gauge-metrics-100-times | 137.966µs | <150.000µs | -8.0% | +0.5% | 34.800MB | <35.500MB | 🟡 -2.0% | +4.8% |
| 1-rate-metric-1-times | 3.215µs | <20.000µs | -83.9% | -1.3% | 34.898MB | <35.500MB | 🟡 -1.7% | +5.2% |
| 1-rate-metrics-100-times | 220.569µs | <250.000µs | -11.8% | -0.7% | 34.859MB | <35.500MB | 🟡 -1.8% | +4.9% |
| 100-count-metrics-100-times | 20.880ms | <22.000ms | -5.1% | -0.3% | 34.780MB | <35.500MB | -2.0% | +4.4% |
| 100-distribution-metrics-100-times | 2.277ms | <2.550ms | -10.7% | -0.9% | 34.819MB | <35.500MB | 🟡 -1.9% | +4.7% |
| 100-gauge-metrics-100-times | 1.428ms | <1.550ms | -7.9% | +0.9% | 34.760MB | <35.500MB | -2.1% | +4.6% |
| 100-rate-metrics-100-times | 2.275ms | <2.550ms | -10.8% | +0.4% | 34.800MB | <35.500MB | 🟡 -2.0% | +4.6% |
| flush-1-metric | 4.679µs | <20.000µs | -76.6% | +0.2% | 35.114MB | <35.500MB | 🟡 -1.1% | +4.7% |
| flush-100-metrics | 175.253µs | <250.000µs | -29.9% | ~same | 35.193MB | <35.500MB | 🟡 -0.9% | +4.8% |
| flush-1000-metrics | 2.168ms | <2.500ms | -13.3% | -0.3% | 36.058MB | <36.500MB | 🟡 -1.2% | +5.0% |

### 🟡 Near SLO Breach (15 suites)

#### 🟡 coreapiscenario - 10/10 (1 unstable)
@codex review
PROFeNoM
left a comment
LGTM
Just left one question
…rue (#15538)

## Description

Auto-add `source:otel` tag to LLMObs evaluations when OTel tracing is enabled

When `DD_TRACE_OTEL_ENABLED=true`, automatically adds `source:otel` tag to all submitted evaluations. This allows the backend to wait ~3 minutes for OTel span conversion before discarding unmatched evaluations.

### Changes

- Add `source:otel` tag in `submit_evaluation()` when OTel tracing is enabled

## Testing

Tested manually and also added tests:

- `test_submit_evaluation_adds_source_otel_when_otel_enabled`
- `test_submit_evaluation_no_source_otel_when_otel_disabled`

## Risks

<!-- Note any risks associated with this change, or "None" if no risks -->

## Additional Notes

<!-- Any other information that would be helpful for reviewers -->
## Description

Auto-add `source:otel` tag to LLMObs evaluations when OTel tracing is enabled

When `DD_TRACE_OTEL_ENABLED=true`, automatically adds `source:otel` tag to all submitted evaluations. This allows the backend to wait ~3 minutes for OTel span conversion before discarding unmatched evaluations.

### Changes

- Add `source:otel` tag in `submit_evaluation()` when OTel tracing is enabled

## Testing

Tested manually and also added tests:

- `test_submit_evaluation_adds_source_otel_when_otel_enabled`
- `test_submit_evaluation_no_source_otel_when_otel_disabled`

## Risks

## Additional Notes
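
For readers who want the described behavior in concrete terms, here is a minimal sketch of the tagging logic. It is not the ddtrace implementation: `with_source_tag`, `_otel_enabled`, the dict-based tag format, and the set of values treated as "enabled" are all assumptions made for illustration. Only `DD_TRACE_OTEL_ENABLED` and the `source:otel` tag come from this PR.

```python
import os
from typing import Dict, Mapping, Optional

OTEL_ENV_VAR = "DD_TRACE_OTEL_ENABLED"


def _otel_enabled(environ: Mapping[str, str]) -> bool:
    # Assumption: "true" and "1" (case-insensitive) mean enabled.
    # The real configuration parsing in ddtrace may differ.
    return environ.get(OTEL_ENV_VAR, "").strip().lower() in ("true", "1")


def with_source_tag(
    tags: Optional[Dict[str, str]] = None,
    environ: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Return a copy of `tags`, adding source=otel when OTel tracing is on.

    Hypothetical helper: it stands in for the step that submit_evaluation()
    is described as performing before an evaluation is sent.
    """
    environ = os.environ if environ is None else environ
    merged = dict(tags or {})  # copy so the caller's dict is not mutated
    if _otel_enabled(environ):
        # Assumption: the tag is set unconditionally. Whether the real code
        # keeps a caller-supplied "source" tag is not stated in the PR.
        merged["source"] = "otel"
    return merged
```

Passing the environment as a parameter keeps the sketch easy to exercise in both the enabled and disabled cases, which mirrors the two tests listed above: `with_source_tag({"env": "prod"}, {"DD_TRACE_OTEL_ENABLED": "true"})` includes the `source` tag, and the same call with an empty environment does not.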