DurableAgent: Support LanguageModelV3ToolResultOutput for multimodal tool results (images, files) #848

## Summary

Currently, `DurableAgent.executeTool` always wraps tool results as a `type: 'text'` output by JSON-stringifying them, even when the tool returns a properly typed `LanguageModelV3ToolResultOutput` with multimodal content (e.g., images, files).

This prevents tools from returning images or files that the LLM can "see" via vision capabilities.

## Current Behavior

In `durable-agent.js`, the `executeTool` function unconditionally stringifies tool results:

```
const toolResult = await execute(parsedInput, { toolCallId, messages, experimental_context });
return {
    type: 'tool-result',
    toolCallId: toolCall.toolCallId,
    toolName: toolCall.toolName,
    output: {
        type: 'text',
        value: JSON.stringify(toolResult) ?? '',  // <-- Always stringified!
    },
};
```

This means if a tool returns:
```
return {
  type: 'content',
  value: [
    { type: 'text', text: 'Here is the image' },
    { type: 'file-data', data: base64ImageData, mediaType: 'image/jpeg' },
  ],
};
```
the whole object is stringified into a single text value, so the LLM receives the base64 data as plain text and never "sees" the image.
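
To make the effect concrete, the sketch below reproduces the wrapping step in isolation. `wrapAsText` is a hypothetical helper that mirrors the `output` construction shown above; it is not part of `DurableAgent`, and the base64 string is a placeholder.

```ts
// Hypothetical helper mirroring the `output` construction in executeTool.
function wrapAsText(toolResult: unknown): { type: 'text'; value: string } {
  return { type: 'text', value: JSON.stringify(toolResult) ?? '' };
}

// A tool result that is already shaped like a multimodal output.
const toolResult = {
  type: 'content',
  value: [
    { type: 'text', text: 'Here is the image' },
    { type: 'file-data', data: 'BASE64_PLACEHOLDER', mediaType: 'image/jpeg' },
  ],
};

const output = wrapAsText(toolResult);

// The structured parts survive only as characters inside one string.
console.log(output.type);         // 'text'
console.log(typeof output.value); // 'string'
```

The model therefore receives a single text part containing the serialized JSON rather than a file part it can interpret as an image.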

## Expected Behavior

`executeTool` should detect when the tool result is already a valid `LanguageModelV3ToolResultOutput` and pass it through without modification.

## Proposed Solution

```
const toolResult = await execute(parsedInput, { toolCallId, messages, experimental_context });

// Check if tool result is already a LanguageModelV3ToolResultOutput
if (isToolResultOutput(toolResult)) {
  return {
    type: 'tool-result',
    toolCallId: toolCall.toolCallId,
    toolName: toolCall.toolName,
    output: toolResult,  // Pass through as-is
  };
}

// Otherwise, wrap as text (current behavior)
return {
    type: 'tool-result',
    ...
};

function isToolResultOutput(result: unknown): result is LanguageModelV3ToolResultOutput {
  if (typeof result !== 'object' || result === null) return false;
  const r = result as { type?: string };
  return ['text', 'json', 'content', 'error-text', 'error-json', 'execution-denied'].includes(r.type ?? '');
}
```
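
One caveat with checking only `type`: a tool that returns ordinary domain data which happens to include a matching `type` field (for example `{ type: 'text', body: '...' }`) would be passed through unwrapped. A possible refinement is to also check the payload field. The sketch below assumes that `text` and `error-text` carry a string `value`, `content` carries an array `value`, `json` and `error-json` carry a `value` key, and `execution-denied` needs no payload. These shapes are assumptions and should be verified against the actual `LanguageModelV3ToolResultOutput` definition. `isToolResultOutputStrict` is a hypothetical name, and it returns a plain `boolean` so that the example stays self-contained.

```ts
// Hypothetical stricter variant of isToolResultOutput. The per-type payload
// rules are assumptions; check them against the real type definition.
function isToolResultOutputStrict(result: unknown): boolean {
  if (typeof result !== 'object' || result === null) return false;
  const r = result as { type?: unknown; value?: unknown };
  switch (r.type) {
    case 'text':
    case 'error-text':
      return typeof r.value === 'string';
    case 'content':
      return Array.isArray(r.value);
    case 'json':
    case 'error-json':
      return 'value' in r;
    case 'execution-denied':
      return true;
    default:
      return false;
  }
}

// A well-formed multimodal output is passed through.
console.log(isToolResultOutputStrict({ type: 'content', value: [] })); // true
// Domain data that merely has a `type` field falls back to text wrapping.
console.log(isToolResultOutputStrict({ type: 'text', body: 'hello' })); // false
```

Even with this check, a tool whose own data looks exactly like an output object cannot be told apart from one, so an explicit opt-in may still be worth considering.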

## Use Case

Building AI agents that can:

- Generate images via tools (e.g., image generation APIs)
- Fetch and display images to the LLM for vision analysis
- Return file attachments (PDFs, etc.) in tool results


