7 changes: 7 additions & 0 deletions .changeset/thin-eagles-serve.md
@@ -0,0 +1,7 @@
---
'@ai-sdk/google-vertex': patch
'@example/ai-core': patch
'@ai-sdk/google': patch
---

Added reasoning and code execution support to the Google providers
192 changes: 192 additions & 0 deletions content/providers/01-ai-sdk-providers/15-google-generative-ai.mdx
@@ -143,6 +143,14 @@ The following optional provider options are available for Google Generative AI m
- `BLOCK_ONLY_HIGH`
- `BLOCK_NONE`

- **useSearchGrounding** _boolean_

  Optional. When enabled, the model will [use Google Search to ground the response](https://ai.google.dev/gemini-api/docs/grounding).

- **useCodeExecution** _boolean_

  Optional. When enabled, the model will make use of a code execution tool that [enables the model to generate and run Python code](https://ai.google.dev/gemini-api/docs/code-execution).

- **responseModalities** _string[]_

  The modalities to use for the response. The following modalities are supported: `TEXT`, `IMAGE`. When not defined or empty, the model defaults to returning only text.

@@ -396,6 +404,190 @@ const { sources } = await generateText({
});
```

### Code Execution

With [Code Execution](https://ai.google.dev/gemini-api/docs/code-execution), certain models can generate and execute Python code to perform calculations, solve problems, or provide more accurate information.

To enable this feature, set `useCodeExecution: true` in the `providerOptions` for the Google provider:

```ts highlight="7-11"
import { google } from '@ai-sdk/google';
import { generateText } from 'ai';

async function main() {
  const result = await generateText({
    model: google('gemini-2.5-flash-preview-04-17'),
    providerOptions: {
      google: {
        useCodeExecution: true,
      },
    },
    prompt:
      'Calculate the 20th Fibonacci number. Then find the nearest palindrome to it.',
  });

  // Process result.content which may include file and text parts
  // (see example below)
  console.log('Final aggregated text:', result.text);
}

main();
```

When Code Execution is enabled, the model's response will surface the generated code and its execution results as distinct parts within the output:

- **Generated Python Code**: This is represented as a `file` content part (or stream part).
- `type`: `'file'`
- `mediaType`: `'text/x-python'`
- `data`: A base64-encoded string of the Python code that the model generated and executed.
- **Code Execution Result**: This is represented as a `text` content part (or stream part).
- `type`: `'text'`
- `text`: A formatted string detailing the execution `outcome` (e.g., "OUTCOME_OK") and the `output` from the code. The format is typically: `Execution Result (Outcome: <OUTCOME>):\n<OUTPUT>`.
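
Since the generated code arrives as an opaque base64 string, it helps to see how the `data` field decodes. A minimal sketch using Node's `Buffer`; the payload below is hypothetical, standing in for a real `file` part's `data`:

```ts
// Hypothetical base64 payload; in practice this value comes from a
// `file` content part's `data` field.
const data: string = Buffer.from('print("hello")', 'utf8').toString('base64');

// Decode the base64 string back into the Python source the model generated.
const pythonCode = Buffer.from(data, 'base64').toString('utf8');
console.log(pythonCode); // prints: print("hello")
```

`atob` works as well in runtimes that provide it, but `Buffer.from(data, 'base64')` handles non-ASCII source code more reliably.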

#### `generateText` with Code Execution

When using `generateText`, the `result.content` array will contain these `file` (for executable code) and `text` (for execution results) parts interspersed with other text parts generated by the model.

Here's how you can process these parts:

```ts
import { google } from '@ai-sdk/google';
import { generateText } from 'ai';
import 'dotenv/config';

async function main() {
  const result = await generateText({
    model: google('gemini-2.5-flash-preview-04-17'),
    providerOptions: {
      google: {
        useCodeExecution: true,
      },
    },
    maxOutputTokens: 2048,
    prompt:
      'Calculate the 20th Fibonacci number. Then find the nearest palindrome to it.',
  });

  console.log('Processing content parts:');
  for (const part of result.content) {
    switch (part.type) {
      case 'file': {
        // This is the executableCode part
        process.stdout.write(
          '\x1b[33m' + // Yellow color for "file"
            part.type +
            '\x1b[34m: ' + // Blue color for mediaType
            part.mediaType + // Should be 'text/x-python'
            '\x1b[0m', // Reset color
        );
        console.log(); // Newline
        // Data is base64-encoded Python code
        console.log('Code:\n', atob(part.data as string));
        break;
      }
      case 'text': {
        // This can be a regular text part or a codeExecutionResult
        process.stdout.write(
          '\x1b[34m' + part.type + '\x1b[0m', // Blue color for "text"
        );
        console.log(); // Newline
        console.log(part.text); // Contains model's text or formatted execution result
        break;
      }
    }
  }

  process.stdout.write('\n\n--- Full Response Details ---\n');
  console.log('Aggregated Text:', result.text);
  console.log('Warnings:', result.warnings);
  console.log('Token usage:', result.usage);
  console.log('Finish reason:', result.finishReason);
}

main().catch(console.error);
```

#### Streaming Code Execution Details (`streamText`)

When using `streamText` with `useCodeExecution: true`, the generated Python code and its execution results are streamed as distinct part types:

- **Generated Python Code**: Arrives as a stream part where `delta.type === 'file'`.
- `delta.mediaType` will be `'text/x-python'`.
- `delta.data` will be the base64-encoded Python code string.
- **Code Execution Result**: Arrives as a stream part where `delta.type === 'text'`.
- `delta.text` will contain the formatted string with the execution outcome and output (e.g., `Execution Result (Outcome: OUTCOME_OK):\nOutput...`).

Here's an example of how you might process the stream:

```ts
import { google } from '@ai-sdk/google';
import { streamText } from 'ai';
import 'dotenv/config';

async function main() {
  const result = streamText({
    model: google('gemini-2.5-flash-preview-04-17'),
    providerOptions: {
      google: {
        useCodeExecution: true,
      },
    },
    maxOutputTokens: 10000,
    prompt:
      'Calculate the 20th Fibonacci number. Then find the nearest palindrome to it.',
  });

  let fullResponse = '';
  console.log('Streaming content parts:');

  for await (const delta of result.fullStream) {
    switch (delta.type) {
      case 'file': {
        // This is the executableCode part
        process.stdout.write(
          '\x1b[33m' + // Yellow color for "file"
            delta.type +
            '\x1b[34m: ' + // Blue color for mediaType
            delta.mediaType + // Should be 'text/x-python'
            '\x1b[0m', // Reset color
        );
        console.log(); // Newline
        // Data is base64-encoded Python code
        console.log('Code:\n', atob(delta.data as string));
        break;
      }
      case 'text': {
        // This can be a regular text part or a codeExecutionResult
        process.stdout.write(
          '\x1b[34m' + delta.type + '\x1b[0m', // Blue color for "text"
        );
        console.log(); // Newline
        console.log(delta.text); // Contains model's text or formatted execution result
        fullResponse += delta.text;
        break;
      }
      // Other stream part types like 'reasoning', 'tool-call-delta', 'tool-call',
      // 'stream-start', 'finish', and 'error' can be handled here if needed.
    }
  }

  process.stdout.write('\n\n--- Full Response Details ---\n');
  console.log('Aggregated Text from Stream:', fullResponse);
  console.log('Warnings:', await result.warnings);
  console.log('Token usage:', await result.usage);
  console.log('Finish reason:', await result.finishReason);
}

main().catch(console.error);
```
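
If you need the outcome and output separately, the formatted execution-result string can be split back apart. A sketch, assuming the `Execution Result (Outcome: <OUTCOME>):\n<OUTPUT>` format described above (the helper name is hypothetical):

```ts
// Hypothetical helper: splits the formatted execution-result text into
// its outcome and output components.
function parseExecutionResult(
  text: string,
): { outcome: string; output: string } | null {
  const match = text.match(
    /^Execution Result \(Outcome: ([^)]+)\):\n([\s\S]*)$/,
  );
  return match ? { outcome: match[1], output: match[2] } : null;
}

const parsed = parseExecutionResult(
  'Execution Result (Outcome: OUTCOME_OK):\n6765\n',
);
console.log(parsed); // { outcome: 'OUTCOME_OK', output: '6765\n' }
```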

<Note>
Code Execution capabilities and specific model support are subject to Google's
offerings. Always refer to the [official Google AI
documentation](https://ai.google.dev/gemini-api/docs/code-execution) for the
most current information on compatible models and features.
</Note>

### Image Outputs

The model `gemini-2.0-flash-exp` supports image generation. Images are exposed as files in the response.