
Cloudflare provider analyzeImage() returns empty string for Workers AI vision models #53

@stackbilt-admin

Description

Summary

LLMProviders.analyzeImage() with the cloudflare provider returns { content: "", message: "" } when called through the Workers AI binding in a Cloudflare Worker. The vision step completes without throwing but produces no usable text, silently breaking downstream consumers.

Reproduction

Context: foodfiles POST /v2/recipes/analyze (apps/api/src/routes/recipes.ts)

const llm = new LLMProviders({
  cloudflare: { ai: c.env.AI },
  defaultProvider: "cloudflare",
  costOptimization: true,
  enableCircuitBreaker: true,
});
const visionResult = await llm.analyzeImage({
  image: { data: base64, mimeType: imageFile.type },
  prompt: "Describe this food image in detail...",
  maxTokens: 512,
});
// visionResult.content === ""
// visionResult.message === ""

Model selected by getDefaultVisionModel() for cloudflare: @cf/meta/llama-3.2-11b-vision-instruct

Root cause hypothesis

attachImagesToLastUserMessage() (cloudflare.ts) formats the image as a { type: "image_url", image_url: { url: "data:image/jpeg;base64,..." } } content part in the messages array.

Workers AI's llama-3.2-11b-vision-instruct appears to return { choices: [{ message: { content: null } }] } for this format when called via the Workers binding (not the REST API). The extractText() null-content path returns "".

The Workers AI binding expects the raw format for vision:

ai.run('@cf/meta/llama-3.2-11b-vision-instruct', {
  image: [...], // number[] / Uint8Array
  prompt: "...",
  max_tokens: 512,
})
// → { response: "description text" }

The REST API and the binding have different input shapes for this model. The provider only implements the messages/image_url path.
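A minimal sketch of the shape difference described above. The helper name base64ToByteArray and both literal payloads are illustrative, not actual provider code:

```typescript
// Illustrative helper: convert the base64 payload used by the
// messages/image_url path into the number[] the binding reportedly expects.
function base64ToByteArray(base64: string): number[] {
  // Strip a data-URL prefix if present (e.g. "data:image/jpeg;base64,")
  const raw = base64.includes(",") ? base64.split(",")[1] : base64;
  return Array.from(Buffer.from(raw, "base64"));
}

// Shape the provider sends today (messages/image_url):
const messagesInput = {
  messages: [{
    role: "user",
    content: [
      { type: "text", text: "Describe this food image in detail..." },
      { type: "image_url", image_url: { url: "data:image/jpeg;base64,AAAA" } },
    ],
  }],
};

// Shape the binding appears to expect for this model (raw):
const rawInput = {
  image: base64ToByteArray("data:image/jpeg;base64,AAAA"),
  prompt: "Describe this food image in detail...",
  max_tokens: 512,
};
```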

Evidence

  • Direct external call to tarotscript-worker.blue-pine-edf6.workers.dev/v2/recipes/analyze with a hand-crafted image_analysis string succeeds — confirming downstream is fine.
  • Tarotscript returned 400 "image_analysis is required" when the llm-providers vision step was allowed to pass through its empty string, confirming visionResult.content and visionResult.message are both "".
  • extractText() handles chatContent === null by returning "" (cloudflare.ts ~line 562) — no error thrown, silent failure.
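The silent-failure path can be reconstructed roughly as below. This is an illustrative sketch of the hypothesized behavior, not the actual cloudflare.ts implementation:

```typescript
// Illustrative reconstruction of the hypothesized extractText() null path.
// The real implementation lives in cloudflare.ts (~line 562).
type ChatResponse = {
  choices?: Array<{ message?: { content?: string | null } }>;
};

function extractText(res: ChatResponse): string {
  const content = res.choices?.[0]?.message?.content;
  // null content falls through to "" — no error is thrown,
  // which is why the failure is silent downstream
  return content ?? "";
}
```

Logging or throwing on null content here would surface the incompatibility at the provider layer instead of as a downstream 400.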

Suggested fix

For Workers AI vision models, use the raw binding format:

ai.run(model, {
  image: Array.from(imageBytes), // number[]
  prompt: request.messages[lastUserIdx].textContent,
  max_tokens: request.maxTokens,
})

and map the { response: string } return back through formatResponse. The messages/image_url path can remain for REST API consumers.
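A sketch of that mapping, assuming the binding returns { response: string }. The function and result-type names are illustrative, not existing formatResponse internals:

```typescript
// Illustrative adapter: map the binding's { response: string } shape
// into the { content, message } result downstream consumers read.
interface VisionResult {
  content: string;
  message: string;
}

function fromRawBindingResponse(raw: { response?: string }): VisionResult {
  const text = raw.response ?? "";
  if (text === "") {
    // Fail loudly instead of propagating an empty string downstream
    throw new Error("Workers AI vision model returned an empty response");
  }
  return { content: text, message: text };
}
```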

Alternatively, detect at runtime whether the AI binding is present and branch the format accordingly.
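The runtime branch could look like the following sketch. The format labels and detection heuristic (presence of a run() method on the configured binding) are assumptions for illustration:

```typescript
// Illustrative runtime branch: if a Workers AI binding is configured,
// use the raw vision format; otherwise fall back to the REST
// messages/image_url format.
type VisionFormat = "raw-binding" | "messages-image-url";

function pickVisionFormat(config: { ai?: unknown; apiToken?: string }): VisionFormat {
  // A Workers AI binding exposes run(); its presence signals the raw path
  const hasBinding = typeof (config.ai as { run?: unknown })?.run === "function";
  return hasBinding ? "raw-binding" : "messages-image-url";
}
```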

Version

@stackbilt/llm-providers ^1.5.0 (foodfiles dependency)
