feat(ai, llmobs): properly support ToolLoopAgent via existing patching#7571
Conversation
Overall package sizeSelf size: 4.79 MB Dependency sizes| name | version | self size | total size | |------|---------|-----------|------------| | import-in-the-middle | 2.0.6 | 81.92 kB | 816.75 kB | | dc-polyfill | 0.1.10 | 26.73 kB | 26.73 kB |🤖 This report was automatically generated by heaviest-objects-in-the-universe |
This comment has been minimized.
This comment has been minimized.
| // while we don't want to patch the noopSpan more than once, we do want to treat each as a | ||
| // fresh instance. However, this is really not necessary for non-noop spans, but not sure | ||
| // how to differentiate. | ||
| const freshSpan = Object.create(span) // TODO: does this cause memory leaks? |
There was a problem hiding this comment.
see note here - not sure if we'd really leak here since we should be 1:1 with already-created spans that will be GC'd anyways, so this should be as well.
i also thought about trying to detect the no-op span via a WeakSet, potentially with just one entry, as the only time we should see the same span reference here is that case, and then do this cloning if so. Otherwise, if from a custom or real OpenTelemetry tracer, all spans should be unique.
There was a problem hiding this comment.
Couldn't we just check if it was patched instead of just making it so that it's always patched through a wrapper?
There was a problem hiding this comment.
yeah i had tried that, but the problem is that the first time it's patched some of the variables the span instance patching depends on (name, attributes) are already bound to the wrapper function, which we want to be different on subsequent spans. the problem is, for no-op cases where a user does not provide their own tracer, the ai sdk uses the same global span instance, instead of a new one per span start of the no-op tracer. so i found that just creating a fresh object was ok for this.
BenchmarksBenchmark execution time: 2026-02-24 17:25:22 Comparing candidate commit a9ab279 in PR branch Found 0 performance improvements and 0 performance regressions! Performance is the same for 230 metrics, 30 unstable metrics. |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #7571 +/- ##
==========================================
+ Coverage 80.30% 80.32% +0.02%
==========================================
Files 733 734 +1
Lines 31565 31547 -18
==========================================
- Hits 25348 25341 -7
+ Misses 6217 6206 -11
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
rochdev
left a comment
There was a problem hiding this comment.
LGTM, although there are some bits in the instrumentation that would benefit from moving to Orchestrion.
|
ty for adding support to this! We very much need this at our organization. |
curious what your thoughts are on what else could be moved. i think the only things left in the actual "instrumentation" file are instance patching, which could (maybe?) be a common use case still with orchestrion. with that said, to resolve the issue, i'm going to merge for now |
#7571) wip - mostly working everything works but later 5.x versions in tests test fixes add tests for toolloopagent generating spans remove load publish remove hook file instead of deleting, with comment address review comment use sets instead of arrays for orchestrion supported version checks Merge branch 'master' into sabrenner/vercel-ai-use-orchestrion Co-authored-by: sam.brenner <sam.brenner@datadoghq.com>
|
Thank you @sabrenner ! |
#7571) wip - mostly working everything works but later 5.x versions in tests test fixes add tests for toolloopagent generating spans remove load publish remove hook file instead of deleting, with comment address review comment use sets instead of arrays for orchestrion supported version checks Merge branch 'master' into sabrenner/vercel-ai-use-orchestrion Co-authored-by: sam.brenner <sam.brenner@datadoghq.com>
What does this PR do?
Properly supports
ToolLoopAgentinvocations by properly patching the traced methods.Before, what we were doing was individually patching
generateText,generateStream, etc. and patching the tracer, or defaulting to a no-op one, or not doing anything if explicitly disabled. The problem here is theToolLoopAgentis defined in the same file, and since we're patching the exports, it was never hitting the patched functions.Now, we're using our orchestrion-js implementation to rewrite the source to patch
getTracerandselectTelemetryAttributes, such that anytime a function calls those functions to perform telemetry operations, we patch them at that point (with the same patching de-duping logic we already had). This means that we should properly supportToolLoopAgent, since it seems it's a relatively thin abstraction layer over just callinggenerateTextandstreamText.Motivation
Closes #7146
MLOB-5644
Additional Notes
ToolLoopAgentrerankin a follow-up, should be an easy win, just don't want to pollute this PR with out-of-scope stuff (should now not require an instrumentation/patching change)