feat(llmobs): trace text-based bedrock converse api #12560

lievan · 2025-02-27T22:42:19Z

This PR supports instrumenting LLM spans for bedrock's Converse method. This PR does not touch ConverseStream, but we document it’s behavior in test_llmobs_converse_stream.

Its helpful to review the bedrock request syntax, and the response syntax

Example bedrock code snippet:

  response = bedrock_runtime.converse(
      system=[{
          "text": "You are an app that creates play lists for a radio station that plays rock and pop music. Only return song names and the artist. "
      }],
      modelId=MODEL_ID,
      messages=messages,
      inferenceConfig=…
      toolConfig=…
  )

Manual QA

Example with tool calls
Example without tool calls

Data this PR traces

System prompts in meta.input.messages[0].content with system role
Text based input in meta.input.messages[i].content with user role
Text based output in meta.input.messages[i].content with assistant role
Tool call outputs in meta.output.messages[0].tool_calls
Inference parameter metadata max_tokens and temperature
stop_reason

Implementation details:

We register a separate trace handler for processing bedrock converse responses.

core.on("botocore.bedrock.process_response_converse", _on_botocore_bedrock_process_response_converse)

This is to avoid the code-path that does extra post-processing of invoke model responses before it's ready for llmobs_set_tags.

Converse still relies on the same trace handler for processing 1) request input 2) bedrock exceptions.

Cassettes

I chose to use cassettes since there were some difficulties with mocking out the bedrock calls with respx. There are some authentication steps that happen within the botocore library before the mocked LLM call, leading me to run into errors like:

E           botocore.exceptions.ClientError: An error occurred (UnrecognizedClientException) when calling the Converse operation: The security token included in the request is invalid.
E           botocore.exceptions.ClientError: An error occurred (MissingAuthenticationTokenException) when calling the Converse operation: Missing Authentication Token

This means we needed to mock out or find a way to skip the internal authentication steps, which would cause the test to be dependent on non-bedrock parts of the botocore library which may be subject to change. In my opinion, this makes cassettes the better option.

To Do

Support converse stream
Support more inference params like top_p and stop_sequences

Checklist

PR author has checked that all the criteria below are met
The PR description includes an overview of the change
The PR description articulates the motivation for the change
The change includes tests OR the PR description describes a testing strategy
The PR description notes risks associated with the change, if any
Newly-added code is easy to change
The change follows the library release note guidelines
The change includes or references documentation updates if necessary
Backport labels are set (if applicable)

Reviewer Checklist

Reviewer has checked that all the criteria below are met
Title is accurate
All changes are related to the pull request's stated goal
Avoids breaking API changes
Testing strategy adequately addresses listed risks
Newly-added code is easy to change
Release note makes sense to a user of the library
If necessary, author has acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment
Backport labels are set in a manner that is consistent with the release branch maintenance policy

ddtrace/llmobs/_integrations/bedrock.py

ddtrace/contrib/internal/botocore/patch.py

ddtrace/contrib/internal/botocore/services/bedrock.py

ddtrace/contrib/internal/botocore/patch.py

ddtrace/llmobs/_integrations/bedrock.py

github-actions · 2025-02-27T22:42:52Z

CODEOWNERS have been resolved as:

.riot/requirements/15f7356.txt                                          @DataDog/apm-python
.riot/requirements/1ecd900.txt                                          @DataDog/apm-python
.riot/requirements/5295cd7.txt                                          @DataDog/apm-python
.riot/requirements/df0b19d.txt                                          @DataDog/apm-python
.riot/requirements/e1342cb.txt                                          @DataDog/apm-python
releasenotes/notes/bedrock-converse-api-20dd255c1ee18cf4.yaml           @DataDog/apm-python
tests/contrib/botocore/bedrock_cassettes/bedrock_converse.yaml          @DataDog/ml-observability
tests/contrib/botocore/bedrock_cassettes/bedrock_converse_error.yaml    @DataDog/ml-observability
tests/contrib/botocore/bedrock_cassettes/bedrock_converse_stream.yaml   @DataDog/ml-observability
tests/snapshots/tests.contrib.botocore.test_bedrock.test_converse.json  @DataDog/apm-python
ddtrace/_trace/trace_handlers.py                                        @DataDog/apm-sdk-api-python
ddtrace/contrib/internal/botocore/patch.py                              @DataDog/apm-core-python @DataDog/apm-idm-python
ddtrace/contrib/internal/botocore/services/bedrock.py                   @DataDog/ml-observability
ddtrace/llmobs/_integrations/bedrock.py                                 @DataDog/ml-observability
ddtrace/llmobs/_integrations/utils.py                                   @DataDog/ml-observability
riotfile.py                                                             @DataDog/apm-python
tests/contrib/botocore/bedrock_utils.py                                 @DataDog/ml-observability
tests/contrib/botocore/test.py                                          @DataDog/apm-core-python @DataDog/apm-idm-python
tests/contrib/botocore/test_bedrock.py                                  @DataDog/ml-observability
tests/contrib/botocore/test_bedrock_llmobs.py                           @DataDog/ml-observability

pr-commenter · 2025-02-28T16:59:12Z

Benchmarks

Benchmark execution time: 2025-03-12 21:37:42

Comparing candidate commit 224d6b0 in PR branch evan.li/claude-code-converse-api with baseline commit 8723688 in branch main.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 282 metrics, 2 unstable metrics.

datadog-dd-trace-py-rkomorn · 2025-02-28T22:26:43Z

Datadog Report

Branch report: evan.li/claude-code-converse-api
Commit report: e700fca
Test service: dd-trace-py

✅ 0 Failed, 43 Passed, 290 Skipped, 49.39s Total duration (5m 5.68s time saved)

…aude-code-converse-api

ddtrace/_trace/trace_handlers.py

ddtrace/contrib/internal/botocore/services/bedrock.py

tests/contrib/botocore/test.py

…aude-code-converse-api

ddtrace/contrib/internal/botocore/services/bedrock.py

ddtrace/llmobs/_integrations/bedrock.py

releasenotes/notes/bedrock-converse-api-20dd255c1ee18cf4.yaml

ddtrace/_trace/trace_handlers.py

ddtrace/llmobs/_integrations/bedrock.py

ddtrace/llmobs/_integrations/openai.py

ddtrace/contrib/internal/botocore/services/bedrock.py

ddtrace/llmobs/_integrations/bedrock.py

ddtrace/llmobs/_integrations/utils.py

Yun-Kim

Just one suggestion but otherwise LGTM!

ddtrace/contrib/internal/botocore/services/bedrock.py

This reverts commit d5f52da.

prompt 2

ab48b9b

datadog-datadog-prod-us1 bot reviewed Feb 27, 2025

View reviewed changes

lievan added 2 commits February 28, 2025 10:55

clean up

cdd129d

more cleanup

c2510fd

lievan changed the title ~~feat(llmobs): trace bedrock converse api~~ feat(llmobs): trace text-based bedrock converse api Feb 28, 2025

lievan added 2 commits February 28, 2025 13:59

bedrock integration should not need to access tags

0fe6aae

fix token extraction

e700fca

lievan added 13 commits March 2, 2025 17:48

test refactors

a5fd4ab

Merge branch 'main' of github.com:DataDog/dd-trace-py into evan.li/cl…

72a8b07

…aude-code-converse-api

working tests

9e56c32

clean up

acdf649

decouple from span tags

5956d3f

refactor to decouple i/o parsing

209fc8f

make the tests more readable

a97b1e8

rm uneeded changes

45d08b0

default total tokens

128f936

default token tokens

5a050b2

lockfiles

e7241d2

clarify comment

55dc627

rel note

4b4ef98

lievan commented Mar 3, 2025

View reviewed changes

ddtrace/_trace/trace_handlers.py Outdated Show resolved Hide resolved

lievan commented Mar 3, 2025

View reviewed changes

ddtrace/contrib/internal/botocore/services/bedrock.py Outdated Show resolved Hide resolved

lievan added 5 commits March 3, 2025 11:15

clarify commetn

0dd9323

fix bedrock tests

4348eda

add back vcr stub

5ee2e39

reqs

6ea621f

fix nonetype int error

fee7e8a

try skipping

3b1358b

lievan commented Mar 6, 2025

View reviewed changes

tests/contrib/botocore/test.py Outdated Show resolved Hide resolved

lievan added 5 commits March 6, 2025 08:39

fix system prompt parsing

1c88cb2

fix snapshot

894a4aa

dont skip actually try to fix test

41789bd

Merge branch 'main' of github.com:DataDog/dd-trace-py into evan.li/cl…

7ba36c6

…aude-code-converse-api

add back the ski[

bb0d848

Yun-Kim reviewed Mar 11, 2025

View reviewed changes

lievan added 10 commits March 11, 2025 17:12

address comments

06fd9a6

extract out a common util function

30a6a03

safer tool use

c0b45aa

fix output

359a317

make i/o more consistent

5123f29

token usage cleanup

ed42bed

none checks for tokens

69fcb11

none checks for tokens

a818ad4

make sure catch none/empty string case

ec28d98

clean up usage code

275d0f9

Yun-Kim reviewed Mar 11, 2025

View reviewed changes

lievan added 3 commits March 11, 2025 19:00

remove accidental change

cfbc760

suggestions

d41153e

accidental openai change

ef243ab

rachelyangdog approved these changes Mar 12, 2025

View reviewed changes

Yun-Kim approved these changes Mar 12, 2025

View reviewed changes

ddtrace/contrib/internal/botocore/services/bedrock.py Outdated Show resolved Hide resolved

ddtrace/contrib/internal/botocore/services/bedrock.py Show resolved Hide resolved

lievan and others added 2 commits March 12, 2025 15:52

change tk count fn

248fa2e

Merge branch 'main' into evan.li/claude-code-converse-api

224d6b0

lievan enabled auto-merge (squash) March 12, 2025 21:27

lievan merged commit d5f52da into main Mar 12, 2025
821 of 822 checks passed

lievan deleted the evan.li/claude-code-converse-api branch March 12, 2025 21:39

lievan pushed a commit that referenced this pull request Mar 20, 2025

Revert "feat(llmobs): trace text-based bedrock converse api (#12560)"

218d2c9

This reverts commit d5f52da.

feat(llmobs): trace text-based bedrock converse api #12560

feat(llmobs): trace text-based bedrock converse api #12560

Uh oh!

Conversation

lievan commented Feb 27, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Manual QA

Data this PR traces

Implementation details:

Cassettes

To Do

Checklist

Reviewer Checklist

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

github-actions bot commented Feb 27, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

pr-commenter bot commented Feb 28, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Benchmarks

Uh oh!

datadog-dd-trace-py-rkomorn bot commented Feb 28, 2025

Datadog Report

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Yun-Kim left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

lievan commented Feb 27, 2025 •

edited

Loading

github-actions bot commented Feb 27, 2025 •

edited

Loading

pr-commenter bot commented Feb 28, 2025 •

edited

Loading