@realark realark commented Oct 15, 2025

Braintrust-friendly attachments, conforming to otel genai semconv

tl;dr

  • Most SDK users won't directly handle attachments because the instrumentation will do the right thing for them
  • The SDK will provide an Attachment util which serializes attachments in a way Braintrust will recognize
  • This util is for writing instrumentation or filling gaps in custom situations
  • The SDK will send the full attachment base64-encoded; the backend will upload the file to s3 and replace the message with a pointer to the s3 object
  • In the future, we can have the SDK do the upload and conversion without any breaking changes

to implement otel support in an SDK

  • Provide an Attachment util which:
    • has convenience functions for creating base64 data
    • has a json serializer idiomatic to the SDK's language
  • Use this utility to replace vendor-specific attachment messages with Braintrust attachment messages

Attachment Examples

  Attachment attachment = Attachment.ofFile(ContentType.IMAGE_JPEG, "/path/to/file.jpeg");
  JsonSerializer<Attachment> jacksonSerializer = Attachment.createSerializer();
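
To make the shape of such a util concrete, here is a minimal sketch using only the JDK. It is not the SDK's actual `Attachment` class: `AttachmentSketch`, `ofBytes`, `toDataUrl`, and `toJson` are hypothetical names, and the hand-rolled JSON stands in for the idiomatic serializer (Jackson in the Java SDK).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;

// Hypothetical stand-in for the SDK's Attachment util; all names are illustrative.
final class AttachmentSketch {
    private final String contentType;
    private final byte[] data;

    private AttachmentSketch(String contentType, byte[] data) {
        this.contentType = contentType;
        this.data = data.clone();
    }

    // Convenience factory for bytes already in memory.
    static AttachmentSketch ofBytes(String contentType, byte[] data) {
        return new AttachmentSketch(contentType, data);
    }

    // Convenience factory that reads the whole file into memory.
    static AttachmentSketch ofFile(String contentType, Path path) throws IOException {
        return new AttachmentSketch(contentType, Files.readAllBytes(path));
    }

    // Encodes the payload as a data URL: data:<content type>;base64,<payload>
    String toDataUrl() {
        return "data:" + contentType + ";base64," + Base64.getEncoder().encodeToString(data);
    }

    // Serializes to a base64_attachment message part.
    // Assumes the content type contains no characters that need JSON escaping.
    String toJson() {
        return "{\"type\":\"base64_attachment\",\"content\":\"" + toDataUrl() + "\"}";
    }
}
```

For example, `AttachmentSketch.ofBytes("text/plain", "hi".getBytes(StandardCharsets.UTF_8)).toDataUrl()` yields `data:text/plain;base64,aGk=`.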

The instrumentation will use the Attachment util to map vendor-specific message parts. For example, an OpenAI image_url part:

{
  "image_url": {
    "url": "data:image/jpeg;base64,SOME_BASE64",
    "detail": "high",
    "valid": true
  },
  "type": "image_url",
  "valid": true
}

--->
{ type: "base64_attachment", content: "data:image/jpeg;base64,SOME_BASE64" }
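
The mapping above could be sketched roughly as follows, with message parts modeled as plain maps. `ImageUrlMapperSketch` and its methods are hypothetical names, and returning non-matching parts unchanged is an assumption rather than documented behavior.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical mapper; the real instrumentation may be structured differently.
final class ImageUrlMapperSketch {
    private ImageUrlMapperSketch() {}

    // Maps an OpenAI-style image_url part to a base64_attachment part.
    // Parts that are not image_url, or whose URL is not a base64 data URL,
    // are returned unchanged (an assumption).
    static Map<String, Object> toBraintrustPart(Map<String, Object> part) {
        if (!"image_url".equals(part.get("type"))) {
            return part;
        }
        Object imageUrl = part.get("image_url");
        if (!(imageUrl instanceof Map)) {
            return part;
        }
        Object url = ((Map<?, ?>) imageUrl).get("url");
        if (!(url instanceof String) || !isBase64DataUrl((String) url)) {
            return part;
        }
        Map<String, Object> mapped = new LinkedHashMap<>();
        mapped.put("type", "base64_attachment");
        mapped.put("content", url);
        return mapped;
    }

    // True for URLs of the form data:<content type>;base64,<payload>
    static boolean isBase64DataUrl(String url) {
        int comma = url.indexOf(',');
        return url.startsWith("data:") && comma > 0
                && url.substring(0, comma).endsWith(";base64");
    }
}
```

Vendor-only fields such as `detail` are dropped here because the target part only carries `type` and `content`.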

details

Otel genai semconv does not define a format for attachments or images, so we have to map vendor-specific formats to something Braintrust can understand. We do this with our own GenericPart message, as defined in the otel input/output message json schemas.

Braintrust recognizes three attachment types.

  • raw base64 data: { type: "base64_attachment", content: "SOME_BASE64" }
  • stored in Braintrust s3: { type: "braintrust_attachment", filename, key, content_type }
  • stored external to Braintrust: { type: "inline_attachment", filename, src, content_type }
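
As a rough illustration of the three shapes, the check below treats every field listed above as required for its type. That is an assumption (the description does not say which fields are optional), and `AttachmentPartsSketch` is a hypothetical name. It needs Java 9+ for `Map.of` and `List.of`.

```java
import java.util.List;
import java.util.Map;

// Hypothetical helper; field requirements are assumed from the list above.
final class AttachmentPartsSketch {
    private AttachmentPartsSketch() {}

    // Keys expected for each attachment type.
    static final Map<String, List<String>> REQUIRED_KEYS = Map.of(
            "base64_attachment", List.of("content"),
            "braintrust_attachment", List.of("filename", "key", "content_type"),
            "inline_attachment", List.of("filename", "src", "content_type"));

    // True when the part has a known attachment type and all of its expected keys.
    static boolean isAttachment(Map<String, ?> part) {
        Object type = part.get("type");
        List<String> required = type instanceof String ? REQUIRED_KEYS.get(type) : null;
        return required != null && part.keySet().containsAll(required);
    }
}
```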

Currently, the SDK will send base64_attachment and the backend will convert it to braintrust_attachment.

In the future, the SDK may do this conversion before sending traces. If we decide to go that route, this will not require any code changes for SDK users. It will happen somewhere in the implementation (probably a span processor).
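
Wherever the conversion ends up running, it amounts to decoding the payload, storing it, and swapping the part for a pointer. The sketch below shows that step in isolation. The `Uploader` interface, the caller-supplied `filename`, and the `application/octet-stream` fallback are all assumptions made for illustration; this is not the backend's or the SDK's actual code.

```java
import java.util.Base64;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical conversion step; names and defaults are illustrative only.
final class AttachmentUploadSketch {
    private AttachmentUploadSketch() {}

    // Hypothetical storage hook: stores the bytes (e.g. in s3) and returns the object key.
    interface Uploader {
        String upload(byte[] data, String contentType);
    }

    // Replaces a base64_attachment part with a braintrust_attachment pointer.
    // Any other part is returned unchanged.
    static Map<String, Object> toPointer(Map<String, Object> part, Uploader uploader, String filename) {
        Object raw = part.get("content");
        if (!"base64_attachment".equals(part.get("type")) || !(raw instanceof String)) {
            return part;
        }
        String content = (String) raw;
        String contentType = "application/octet-stream"; // assumed default for raw base64
        String payload = content;
        int comma = content.indexOf(',');
        if (content.startsWith("data:") && comma > 0) {
            String header = content.substring(5, comma); // e.g. image/jpeg;base64
            int semi = header.indexOf(';');
            contentType = semi >= 0 ? header.substring(0, semi) : header;
            payload = content.substring(comma + 1);
        }
        byte[] bytes = Base64.getDecoder().decode(payload);
        String key = uploader.upload(bytes, contentType);

        Map<String, Object> pointer = new LinkedHashMap<>();
        pointer.put("type", "braintrust_attachment");
        pointer.put("filename", filename);
        pointer.put("key", key);
        pointer.put("content_type", contentType);
        return pointer;
    }
}
```

Because the input and output are both ordinary message parts, moving this step from the backend into a span processor would not change what SDK users write.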

@realark realark force-pushed the ark/BRA-3203-otel-attachments branch from 84d9301 to 2078b44 Compare October 15, 2025 23:53
Schema references for the GenericPart message:

- https://opentelemetry.io/docs/specs/semconv/registry/attributes/gen-ai/#gen-ai-input-messages
- https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-input-messages.json
- https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-output-messages.json

@realark realark force-pushed the ark/BRA-3203-otel-attachments branch from 2078b44 to 5530bd7 Compare October 16, 2025 00:07
realark commented Oct 17, 2025

This work will be merged into the new repo

@realark realark closed this Oct 17, 2025