fix (ai/telemetry): serialize UInt8Arrays as base64 for inner telemetry spans#6482

Merged
lgrammel merged 4 commits into v5 from v5-fix/json-stringify-image on May 26, 2025
Conversation

@lgrammel
Collaborator

Background

generateObject, generateText, streamText, and streamObject currently call JSON.stringify on the input messages. If the input messages contain an image, it is most likely normalized into a Uint8Array.

JSON.stringify does not do the obvious thing with TypedArrays, including Uint8Array.

// this returns '{"0":1,"1":2,"2":3}', where one would expect '[1,2,3]'
JSON.stringify(new Uint8Array([1, 2, 3]));

In practice, this bloats images by roughly 5-15x depending on the original image size. For Laminar, for example, a span containing three average-sized images cannot be sent because it exceeds the (reasonably high) gRPC payload size limit of our traces endpoint.
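The bloat factor is easy to reproduce: stringify a random byte buffer both ways and compare lengths. This is an illustrative sketch only (the exact ratio depends on the byte values and on how long the index keys get), but it lands comfortably inside the 5-15x range:

```typescript
// Fill a buffer with random bytes, as a stand-in for image data.
const raw = new Uint8Array(100_000);
for (let i = 0; i < raw.length; i++) raw[i] = Math.floor(Math.random() * 256);

// Naive JSON.stringify produces '{"0":…,"1":…,…}' — every byte becomes
// an index/value pair, so each byte costs ~10+ characters.
const asJson = JSON.stringify(raw).length;

// Base64 costs ~1.33 characters per byte.
const asBase64 = Buffer.from(raw).toString('base64').length;

console.log(`JSON is ${(asJson / asBase64).toFixed(1)}x larger than base64`);
```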

From the MDN docs:

// TypedArray
JSON.stringify([new Int8Array([1]), new Int16Array([1]), new Int32Array([1])]);
// '[{"0":1},{"0":1},{"0":1}]'
JSON.stringify([
  new Uint8Array([1]),
  new Uint8ClampedArray([1]),
  new Uint16Array([1]),
  new Uint32Array([1]),
]);
// '[{"0":1},{"0":1},{"0":1},{"0":1}]'
JSON.stringify([new Float32Array([1]), new Float64Array([1])]);
// '[{"0":1},{"0":1}]'
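The fix this PR takes, then, is to base64-encode the bytes before stringifying. A minimal side-by-side of the two outputs (using Node's `Buffer`):

```typescript
const bytes = new Uint8Array([1, 2, 3, 255]);

// Naive stringify: object-style output, one key/value pair per byte.
const naive = JSON.stringify(bytes);
// naive === '{"0":1,"1":2,"2":3,"3":255}'

// Base64 instead: compact, ~1.33 chars per byte.
const base64 = Buffer.from(bytes).toString('base64');
// base64 === 'AQID/w=='
```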

Summary

Added a function that maps over the messages in a LanguageModelV1Prompt and over the content parts within each message, replacing each Uint8Array with a raw base64 string.

This function is called when recordSpan is invoked for the inner (doStream/doGenerate) span in generateObject, generateText, streamText, and streamObject.
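In outline, the mapping looks like the sketch below. This is not the exact SDK code; the function name `stringifyForTelemetry` and the narrowed `ContentPart` type are stand-ins for illustration, and only the image branch matters here:

```typescript
// Simplified stand-ins for the SDK's prompt types.
type ContentPart =
  | { type: 'text'; text: string }
  | { type: 'image'; image: Uint8Array | string; mimeType?: string };

type Message = { role: string; content: string | ContentPart[] };

// Replace Uint8Array image data with base64 strings, then stringify,
// so the span attribute holds compact base64 instead of '{"0":…}' objects.
function stringifyForTelemetry(messages: Message[]): string {
  const mapped = messages.map((message) => ({
    ...message,
    content: Array.isArray(message.content)
      ? message.content.map((part) =>
          part.type === 'image' && part.image instanceof Uint8Array
            ? { ...part, image: Buffer.from(part.image).toString('base64') }
            : part,
        )
      : message.content,
  }));
  return JSON.stringify(mapped);
}
```

Text parts and already-base64 string images pass through unchanged.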

Verification

Ran this small script against a local Laminar instance and logged the telemetry payloads (span attributes) on the backend to verify that the images are indeed base64-encoded.

import { Laminar, getTracer } from '@lmnr-ai/lmnr'

Laminar.initialize();

import { openai } from '@ai-sdk/openai'
import { generateText, generateObject, streamText, streamObject, tool } from "ai";
import { z } from "zod";
import dotenv from "dotenv";

dotenv.config();

const handle = async () => {
  const imageUrl = "https://upload.wikimedia.org/wikipedia/commons/b/bc/CoinEx.png"
  const imageData = await fetch(imageUrl)
    .then(response => response.arrayBuffer())
    .then(buffer => Buffer.from(buffer).toString('base64'));

  const o = streamObject({
    schema: z.object({
      text: z.string(),
      companyName: z.string().optional().nullable(),
    }),
    messages: [
      {
        role: "user",
        content: [
          {
            type: "text",
            text: "Describe this image briefly"
          },
          {
            type: "image",
            image: imageData,
            mimeType: "image/png"
          }
        ]
      }
    ],
    model: openai("gpt-4.1-nano"),
    experimental_telemetry: {
      isEnabled: true,
      tracer: getTracer()
    }
  });

  for await (const chunk of o.fullStream) {
    console.log(chunk);
  }
  await Laminar.shutdown();
};

handle();

Related Issues

Fixes #6210
Continues #6377

@lgrammel lgrammel self-assigned this May 26, 2025
@lgrammel lgrammel marked this pull request as ready for review May 26, 2025 03:06
@lgrammel lgrammel merged commit 7324c21 into v5 May 26, 2025
9 checks passed
@lgrammel lgrammel deleted the v5-fix/json-stringify-image branch May 26, 2025 03:13