diff --git a/README.md b/README.md index d76779d8..14c53229 100644 --- a/README.md +++ b/README.md @@ -23,14 +23,25 @@ Foundry Local lets you embed generative AI directly into your applications — n Key benefits include: - **Self-contained SDK** — Ship AI features without requiring users to install any external dependencies. +- **Chat AND Audio in one runtime** — Text generation and speech-to-text (Whisper) through a single SDK — no need for separate tools like `whisper.cpp` + `llama.cpp`. - **Easy-to-use CLI** — Explore models and experiment locally before integrating with your app. - **Optimized models out-of-the-box** — State-of-the-art quantization and compression deliver both performance and quality. - **Small footprint** — Leverages [ONNX Runtime](https://onnxruntime.ai/); a high performance inference runtime (written in C++) that has minimal disk and memory requirements. -- **Automatic hardware acceleration** — Leverage GPUs and NPUs when available, with seamless fallback to CPU. -- **Model distribution** — Popular open-source models hosted in the cloudwith automatic downloading and updating. +- **Automatic hardware acceleration** — Leverage GPUs and NPUs when available, with seamless fallback to CPU. Zero hardware detection code needed. +- **Model distribution** — Popular open-source models hosted in the cloud with automatic downloading and updating. - **Multi-platform support** — Windows, macOS (Apple silicon), Linux and Android. - **Bring your own models** — Add and run custom models alongside the built-in catalog. +### Supported Tasks + +| Task | Model Aliases | API | +|------|--------------|-----| +| Chat / Text Generation | `phi-3.5-mini`, `qwen2.5-0.5b`, `qwen2.5-coder-0.5b`, etc. | Chat Completions | +| Audio Transcription (Speech-to-Text) | `whisper-tiny` | Audio Transcription | + +> [!NOTE] +> Foundry Local is a **unified local AI runtime** — it replaces the need for separate tools like `whisper.cpp`, `llama.cpp`, or `ollama`. 
One SDK handles both chat and audio, with automatic hardware acceleration (NPU > GPU > CPU). + ## 🚀 Quickstart ### Explore with the CLI @@ -196,10 +207,41 @@ Explore complete working examples in the [`samples/`](samples/) folder: | Sample | Description | |--------|-------------| -| [**cs/**](samples/cs/) | C# examples using the .NET SDK | -| [**js/**](samples/js/) | JavaScript/Node.js examples | +| [**cs/**](samples/cs/) | C# examples using the .NET SDK (includes audio transcription) | +| [**js/**](samples/js/) | JavaScript/Node.js examples (chat, audio transcription, tool calling) | | [**python/**](samples/python/) | Python examples using the OpenAI-compatible API | +#### Audio Transcription (Speech-to-Text) + +The SDK also supports audio transcription via Whisper models. Use `model.createAudioClient()` to transcribe audio files on-device: + +```javascript +import { FoundryLocalManager } from 'foundry-local-sdk'; + +const manager = FoundryLocalManager.create({ appName: 'MyApp' }); + +// Download and load the Whisper model +const whisperModel = await manager.catalog.getModel('whisper-tiny'); +await whisperModel.download(); +await whisperModel.load(); + +// Transcribe an audio file +const audioClient = whisperModel.createAudioClient(); +audioClient.settings.language = 'en'; +const result = await audioClient.transcribe('recording.wav'); +console.log('Transcription:', result.text); + +// Or stream in real-time +await audioClient.transcribeStreaming('recording.wav', (chunk) => { + process.stdout.write(chunk.text); +}); + +await whisperModel.unload(); +``` + +> [!TIP] +> A single `FoundryLocalManager` can manage both chat and audio models simultaneously. See the [chat-and-audio sample](samples/js/chat-and-audio-foundry-local/) for a complete example that transcribes audio then analyzes it with a chat model. + ## Manage This section provides an overview of how to manage Foundry Local, including installation, upgrading, and removing the application. 
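Reviewer note on the "NPU > GPU > CPU" fallback described in the README changes above: it amounts to a first-available priority selection, which the SDK performs internally. A minimal standalone sketch of the idea (illustrative only — the device names and `selectDevice` helper are assumptions, not SDK API):

```javascript
// Illustrative only: the Foundry Local SDK does this selection internally.
// Pick the first available execution provider in priority order,
// falling back to CPU, which is always assumed present.
const PRIORITY = ["npu", "gpu", "cpu"];

function selectDevice(available) {
  for (const device of PRIORITY) {
    if (available.includes(device)) return device;
  }
  return "cpu"; // last resort
}

console.log(selectDevice(["gpu", "cpu"])); // → "gpu"
console.log(selectDevice(["cpu"]));        // → "cpu"
```

This is why the samples need no hardware detection code: the same model alias resolves to the best variant for whatever accelerators the machine exposes.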
diff --git a/docs/README.md b/docs/README.md index 356533ef..6515e353 100644 --- a/docs/README.md +++ b/docs/README.md @@ -6,4 +6,21 @@ Documentation for Foundry Local can be found in the following resources: - SDK Reference: - [C# SDK Reference](../sdk_v2/cs/README.md): This documentation provides detailed information about the C# SDK for Foundry Local, including API references, usage examples, and best practices for integrating Foundry Local into your applications. - [JavaScript SDK Reference](../sdk_v2/js/README.md): This documentation offers detailed information about the JavaScript SDK for Foundry Local, including API references, usage examples, and best practices for integrating Foundry Local into your web applications. -- [Foundry Local Lab](https://github.com/Microsoft-foundry/foundry-local-lab): This GitHub repository contains a lab designed to help you learn how to use Foundry Local effectively. It includes hands-on exercises, sample code, and step-by-step instructions to guide you through the process of setting up and using Foundry Local in various scenarios. \ No newline at end of file +- [Foundry Local Lab](https://github.com/Microsoft-foundry/foundry-local-lab): This GitHub repository contains a lab designed to help you learn how to use Foundry Local effectively. It includes hands-on exercises, sample code, and step-by-step instructions to guide you through the process of setting up and using Foundry Local in various scenarios. + +## Supported Capabilities + +Foundry Local is a unified local AI runtime that supports both **text generation** and **speech-to-text** through a single SDK: + +| Capability | Model Aliases | SDK API | +|------------|--------------|---------| +| Chat Completions (Text Generation) | `phi-3.5-mini`, `qwen2.5-0.5b`, etc. 
| `model.createChatClient()` | +| Audio Transcription (Speech-to-Text) | `whisper-tiny` | `model.createAudioClient()` | + +## Samples + +- [JavaScript: Chat (Hello Foundry Local)](../samples/js/hello-foundry-local/) — Basic chat completions +- [JavaScript: Audio Transcription](../samples/js/audio-transcription-foundry-local/) — Speech-to-text with Whisper +- [JavaScript: Chat + Audio](../samples/js/chat-and-audio-foundry-local/) — Unified chat and audio in one app +- [JavaScript: Tool Calling](../samples/js/tool-calling-foundry-local/) — Function calling with local models +- [C#: Getting Started](../samples/cs/GettingStarted/) — C# SDK examples including audio transcription diff --git a/samples/js/audio-transcription-foundry-local/README.md b/samples/js/audio-transcription-foundry-local/README.md new file mode 100644 index 00000000..949e40f5 --- /dev/null +++ b/samples/js/audio-transcription-foundry-local/README.md @@ -0,0 +1,39 @@ +# Sample: Audio Transcription with Foundry Local + +This sample demonstrates how to use Foundry Local for **speech-to-text (audio transcription)** using the Whisper model — entirely on-device, with no cloud services required. + +## What This Shows + +- Loading the `whisper-tiny` model via the Foundry Local SDK +- Transcribing an audio file (`.wav`, `.mp3`, etc.) to text +- Both standard and streaming transcription modes +- Automatic hardware acceleration (NPU > GPU > CPU) + +## Prerequisites + +- [Foundry Local](https://github.com/microsoft/Foundry-Local) installed on your machine +- Node.js 18+ + +## Getting Started + +Install the Foundry Local SDK: + +```bash +npm install foundry-local-sdk +``` + +Place an audio file (e.g., `recording.wav` or `recording.mp3`) in the project directory, then run: + +```bash +node src/app.js +``` + +## How It Works + +The Foundry Local SDK handles everything: +1. **Model discovery** — finds the best `whisper-tiny` variant for your hardware +2. 
**Model download** — downloads the model if not already cached +3. **Model loading** — loads the model into memory with optimized hardware acceleration +4. **Transcription** — runs Whisper inference entirely on-device + +No need for `whisper.cpp`, `@huggingface/transformers`, or any other separate STT tool. diff --git a/samples/js/audio-transcription-foundry-local/package.json b/samples/js/audio-transcription-foundry-local/package.json new file mode 100644 index 00000000..48665306 --- /dev/null +++ b/samples/js/audio-transcription-foundry-local/package.json @@ -0,0 +1,11 @@ +{ + "name": "audio-transcription-foundry-local", + "type": "module", + "description": "Audio transcription (speech-to-text) sample using Foundry Local", + "scripts": { + "start": "node src/app.js" + }, + "dependencies": { + "foundry-local-sdk": "latest" + } +} diff --git a/samples/js/audio-transcription-foundry-local/src/app.js b/samples/js/audio-transcription-foundry-local/src/app.js new file mode 100644 index 00000000..e9a8c7bc --- /dev/null +++ b/samples/js/audio-transcription-foundry-local/src/app.js @@ -0,0 +1,64 @@ +// Copyright (c) Microsoft Corporation. All rights reserved. +// Licensed under the MIT License. + +import { FoundryLocalManager } from "foundry-local-sdk"; +import path from "path"; + +// The Whisper model alias for audio transcription +const alias = "whisper-tiny"; + +async function main() { + console.log("Initializing Foundry Local SDK..."); + const manager = FoundryLocalManager.create({ + appName: "AudioTranscriptionSample", + logLevel: "info", + }); + + // Get the Whisper model from the catalog + const catalog = manager.catalog; + const model = await catalog.getModel(alias); + if (!model) { + throw new Error( + `Model "${alias}" not found. 
Run "foundry model list" to see available models.` + ); + } + + // Download the model if not already cached + if (!model.isCached) { + console.log(`Downloading model "${alias}"...`); + await model.download((progress) => { + process.stdout.write(`\rDownload progress: ${progress.toFixed(1)}%`); + }); + console.log("\nDownload complete."); + } + + // Load the model into memory + console.log(`Loading model "${model.id}"...`); + await model.load(); + console.log("Model loaded.\n"); + + // Create an audio client for transcription + const audioClient = model.createAudioClient(); + audioClient.settings.language = "en"; + + // Update this path to point to your audio file + const audioFilePath = path.resolve("recording.mp3"); + + // --- Standard transcription --- + console.log("=== Standard Transcription ==="); + const result = await audioClient.transcribe(audioFilePath); + console.log("Transcription:", result.text); + + // --- Streaming transcription --- + console.log("\n=== Streaming Transcription ==="); + await audioClient.transcribeStreaming(audioFilePath, (chunk) => { + process.stdout.write(chunk.text); + }); + console.log("\n"); + + // Clean up + await model.unload(); + console.log("Done."); +} + +main().catch(console.error); diff --git a/samples/js/chat-and-audio-foundry-local/README.md b/samples/js/chat-and-audio-foundry-local/README.md new file mode 100644 index 00000000..23de6629 --- /dev/null +++ b/samples/js/chat-and-audio-foundry-local/README.md @@ -0,0 +1,39 @@ +# Sample: Chat + Audio Transcription with Foundry Local + +This sample demonstrates how to use Foundry Local as a **unified AI runtime** for both **text generation (chat)** and **speech-to-text (audio transcription)** — all on-device, with a single SDK managing both models. 
+ +## What This Shows + +- Using a single `FoundryLocalManager` to manage both chat and audio models +- Transcribing an audio file using the `whisper-tiny` model +- Analyzing the transcription using the `phi-3.5-mini` chat model +- Automatic hardware acceleration for both models — zero hardware detection code needed + +## Why Foundry Local? + +Without Foundry Local, building an app with both chat and speech-to-text typically requires: +- A separate STT library (`whisper.cpp`, `@huggingface/transformers`) +- A separate LLM runtime (`llama.cpp`, `node-llama-cpp`) +- Custom hardware detection code for each runtime (~200+ lines) +- Separate model download and caching logic + +With Foundry Local, you get **one SDK, one service, both capabilities** — and the hardware detection is automatic. + +## Prerequisites + +- [Foundry Local](https://github.com/microsoft/Foundry-Local) installed on your machine +- Node.js 18+ + +## Getting Started + +Install the Foundry Local SDK: + +```bash +npm install foundry-local-sdk +``` + +Place an audio file (`recording.mp3`) in the project directory, then run: + +```bash +node src/app.js +``` diff --git a/samples/js/chat-and-audio-foundry-local/package.json b/samples/js/chat-and-audio-foundry-local/package.json new file mode 100644 index 00000000..a91ecda3 --- /dev/null +++ b/samples/js/chat-and-audio-foundry-local/package.json @@ -0,0 +1,11 @@ +{ + "name": "chat-and-audio-foundry-local", + "type": "module", + "description": "Unified chat + audio transcription sample using Foundry Local", + "scripts": { + "start": "node src/app.js" + }, + "dependencies": { + "foundry-local-sdk": "latest" + } +} diff --git a/samples/js/chat-and-audio-foundry-local/src/app.js b/samples/js/chat-and-audio-foundry-local/src/app.js new file mode 100644 index 00000000..b3084816 --- /dev/null +++ b/samples/js/chat-and-audio-foundry-local/src/app.js @@ -0,0 +1,103 @@ +// Copyright (c) Microsoft Corporation. All rights reserved. +// Licensed under the MIT License. 
+ +import { FoundryLocalManager } from "foundry-local-sdk"; +import path from "path"; + +// Model aliases +const CHAT_MODEL = "phi-3.5-mini"; +const WHISPER_MODEL = "whisper-tiny"; + +async function main() { + console.log("Initializing Foundry Local SDK..."); + const manager = FoundryLocalManager.create({ + appName: "ChatAndAudioSample", + logLevel: "info", + }); + + const catalog = manager.catalog; + + // --- Load both models --- + console.log("\n--- Loading models ---"); + + const chatModel = await catalog.getModel(CHAT_MODEL); + if (!chatModel) { + throw new Error( + `Chat model "${CHAT_MODEL}" not found. Run "foundry model list" to see available models.` + ); + } + + const whisperModel = await catalog.getModel(WHISPER_MODEL); + if (!whisperModel) { + throw new Error( + `Whisper model "${WHISPER_MODEL}" not found. Run "foundry model list" to see available models.` + ); + } + + // Download models if not cached + if (!chatModel.isCached) { + console.log(`Downloading ${CHAT_MODEL}...`); + await chatModel.download((progress) => { + process.stdout.write(`\r ${CHAT_MODEL}: ${progress.toFixed(1)}%`); + }); + console.log(); + } + + if (!whisperModel.isCached) { + console.log(`Downloading ${WHISPER_MODEL}...`); + await whisperModel.download((progress) => { + process.stdout.write(`\r ${WHISPER_MODEL}: ${progress.toFixed(1)}%`); + }); + console.log(); + } + + // Load both models into memory + console.log(`Loading ${CHAT_MODEL}...`); + await chatModel.load(); + console.log(`Loading ${WHISPER_MODEL}...`); + await whisperModel.load(); + console.log("Both models loaded.\n"); + + // --- Step 1: Transcribe audio --- + console.log("=== Step 1: Audio Transcription ==="); + const audioClient = whisperModel.createAudioClient(); + audioClient.settings.language = "en"; + + // Update this path to point to your audio file + const audioFilePath = path.resolve("recording.mp3"); + const transcription = await audioClient.transcribe(audioFilePath); + console.log("You said:", 
transcription.text); + + // --- Step 2: Analyze with chat model --- + console.log("\n=== Step 2: AI Analysis ==="); + const chatClient = chatModel.createChatClient(); + chatClient.settings.temperature = 0.7; + chatClient.settings.maxTokens = 500; + + // Summarize the transcription + console.log("Generating summary...\n"); + await chatClient.completeStreamingChat( + [ + { + role: "system", + content: + "You are a helpful assistant. Summarize the following transcribed audio and extract key themes and action items.", + }, + { role: "user", content: transcription.text }, + ], + (chunk) => { + const content = chunk.choices?.[0]?.message?.content; + if (content) { + process.stdout.write(content); + } + } + ); + console.log("\n"); + + // --- Clean up --- + await chatModel.unload(); + await whisperModel.unload(); + console.log("Done."); +} + +main().catch(console.error); diff --git a/sdk_v2/js/examples/audio-transcription.ts b/sdk_v2/js/examples/audio-transcription.ts new file mode 100644 index 00000000..7fddf2d8 --- /dev/null +++ b/sdk_v2/js/examples/audio-transcription.ts @@ -0,0 +1,103 @@ +// ------------------------------------------------------------------------- +// Copyright (c) Microsoft Corporation. All rights reserved. +// Licensed under the MIT License. 
+// ------------------------------------------------------------------------- + +import { FoundryLocalManager } from '../src/index.js'; +import path from 'path'; + +async function main() { + let modelToLoad: any = null; + + try { + // Initialize the Foundry Local SDK + console.log('Initializing Foundry Local SDK...'); + + const manager = FoundryLocalManager.create({ + appName: 'FoundryLocalAudioExample', + logLevel: 'info' + }); + console.log('✓ SDK initialized successfully'); + + // Explore available models + console.log('\nFetching available models...'); + const catalog = manager.catalog; + const models = await catalog.getModels(); + + console.log(`Found ${models.length} models:`); + for (const model of models) { + const variants = model.variants.map((v: any) => v.id).join(', '); + console.log(` - ${model.alias} (variants: ${variants})`); + } + + const modelAlias = 'whisper-tiny'; + + // Get the Whisper model + console.log(`\nLoading model ${modelAlias}...`); + modelToLoad = await catalog.getModel(modelAlias); + if (!modelToLoad) { + throw new Error(`Model ${modelAlias} not found`); + } + + // Download if not cached + if (!modelToLoad.isCached) { + console.log('Downloading model...'); + await modelToLoad.download((progress: number) => { + process.stdout.write(`\rDownload: ${progress.toFixed(1)}%`); + }); + console.log(); + } + + await modelToLoad.load(); + console.log('✓ Model loaded'); + + // Create audio client + console.log('\nCreating audio client...'); + const audioClient = modelToLoad.createAudioClient(); + + // Configure settings + audioClient.settings.language = 'en'; + audioClient.settings.temperature = 0.0; // deterministic results + + console.log('✓ Audio client created'); + + // Audio file path — update this to point to your audio file + const audioFilePath = path.join(process.cwd(), '..', 'testdata', 'Recording.mp3'); + + // Example: Standard transcription + console.log('\nTesting standard transcription...'); + const result = await 
audioClient.transcribe(audioFilePath); + console.log('\nTranscription result:'); + console.log(result.text); + + // Example: Streaming transcription + console.log('\nTesting streaming transcription...'); + await audioClient.transcribeStreaming(audioFilePath, (chunk: any) => { + process.stdout.write(chunk.text); + }); + console.log('\n'); + + // Unload the model + console.log('Unloading model...'); + await modelToLoad.unload(); + console.log(`✓ Model unloaded`); + + console.log('\n✓ Audio transcription example completed successfully'); + + } catch (error) { + console.log('Error running example:', error); + if (error instanceof Error && error.stack) { + console.log(error.stack); + } + // Best-effort cleanup + if (modelToLoad) { + try { await modelToLoad.unload(); } catch { /* ignore */ } + } + process.exit(1); + } +} + +// Run the example +main().catch(console.error); + +export { main }; diff --git a/sdk_v2/js/examples/responses.ts b/sdk_v2/js/examples/responses.ts new file mode 100644 index 00000000..fa8a6d93 --- /dev/null +++ b/sdk_v2/js/examples/responses.ts @@ -0,0 +1,135 @@ +// ------------------------------------------------------------------------- +// Copyright (c) Microsoft Corporation. All rights reserved. +// Licensed under the MIT License. 
+// ------------------------------------------------------------------------- + +import { FoundryLocalManager, getOutputText } from '../src/index.js'; +import type { StreamingEvent, FunctionToolDefinition, FunctionCallItem } from '../src/types.js'; + +async function main() { + try { + // Initialize the Foundry Local SDK + console.log('Initializing Foundry Local SDK...'); + const manager = FoundryLocalManager.create({ + appName: 'ResponsesExample', + logLevel: 'info' + }); + console.log('✓ SDK initialized'); + + // Load a model + const modelAlias = 'MODEL_ALIAS'; // Replace with a valid model alias + const catalog = manager.catalog; + const model = await catalog.getModel(modelAlias); + await model.load(); + console.log(`✓ Model ${model.id} loaded`); + + // Start the web service (required for Responses API) + manager.startWebService(); + console.log(`✓ Web service running at ${manager.urls[0]}`); + + // Create a ResponsesClient + const client = manager.createResponsesClient(model.id); + client.settings.temperature = 0.7; + client.settings.maxOutputTokens = 500; + + // ================================================================= + // Example 1: Basic text response + // ================================================================= + console.log('\n--- Example 1: Basic text response ---'); + const response = await client.create('What is the capital of France?'); + + console.log(`Status: ${response.status}`); + console.log(`Response: ${getOutputText(response)}`); + + // ================================================================= + // Example 2: Streaming response + // ================================================================= + console.log('\n--- Example 2: Streaming response ---'); + process.stdout.write('Response: '); + await client.createStreaming( + 'Write a short haiku about code.', + (event: StreamingEvent) => { + if (event.type === 'response.output_text.delta') { + process.stdout.write(event.delta); + } + } + ); + console.log('\n'); + + // 
================================================================= + // Example 3: Multi-turn with previous_response_id + // ================================================================= + console.log('--- Example 3: Multi-turn conversation ---'); + client.settings.store = true; + + const turn1 = await client.create('My name is Alice. Remember it.'); + console.log(`Turn 1 (ID: ${turn1.id}): done`); + + const turn2 = await client.create('What is my name?', { + previous_response_id: turn1.id, + }); + console.log(`Turn 2: ${getOutputText(turn2)}`); + + // ================================================================= + // Example 4: Tool calling + // ================================================================= + console.log('\n--- Example 4: Tool calling ---'); + const tools: FunctionToolDefinition[] = [{ + type: 'function', + name: 'get_weather', + description: 'Get the current weather for a location.', + parameters: { + type: 'object', + properties: { + location: { type: 'string', description: 'City name' }, + }, + required: ['location'], + }, + }]; + + const toolResponse = await client.create( + 'What is the weather in Seattle?', + { tools, tool_choice: 'required' } + ); + + // Find the function call in the output + const funcCall = toolResponse.output.find( + (o): o is FunctionCallItem => o.type === 'function_call' + ); + + if (funcCall) { + console.log(`Tool call: ${funcCall.name}(${funcCall.arguments})`); + + // Simulate providing the tool result and continuing + const finalResponse = await client.create([ + { type: 'function_call_output', call_id: funcCall.call_id, output: '72°F, sunny' }, + ], { previous_response_id: toolResponse.id, tools }); + + console.log(`Final: ${getOutputText(finalResponse)}`); + } + + // ================================================================= + // Example 5: Get & delete stored response + // ================================================================= + console.log('\n--- Example 5: Get & delete stored 
response ---'); + const stored = await client.create('Hello!'); + console.log(`Created: ${stored.id}`); + + const retrieved = await client.get(stored.id); + console.log(`Retrieved: ${retrieved.id}, status: ${retrieved.status}`); + + const deleted = await client.delete(stored.id); + console.log(`Deleted: ${deleted.deleted}`); + + // Cleanup + manager.stopWebService(); + await model.unload(); + console.log('\n✓ Example completed successfully'); + + } catch (error) { + console.error('Error:', error instanceof Error ? error.message : error); + process.exit(1); + } +} + +main(); diff --git a/sdk_v2/js/src/foundryLocalManager.ts b/sdk_v2/js/src/foundryLocalManager.ts index ed56103c..bc408f78 100644 --- a/sdk_v2/js/src/foundryLocalManager.ts +++ b/sdk_v2/js/src/foundryLocalManager.ts @@ -2,8 +2,7 @@ import { Configuration, FoundryLocalConfig } from './configuration.js'; import { CoreInterop } from './detail/coreInterop.js'; import { ModelLoadManager } from './detail/modelLoadManager.js'; import { Catalog } from './catalog.js'; -import { ChatClient } from './openai/chatClient.js'; -import { AudioClient } from './openai/audioClient.js'; +import { ResponsesClient } from './openai/responsesClient.js'; /** * The main entry point for the Foundry Local SDK. @@ -87,4 +86,27 @@ export class FoundryLocalManager { this._urls = []; } } + + /** + * Whether the web service is currently running. + */ + public get isWebServiceRunning(): boolean { + return this._urls.length > 0; + } + + /** + * Creates a ResponsesClient for interacting with the Responses API. + * The web service must be started first via `startWebService()`. + * @param modelId - Optional default model ID for requests. + * @returns A ResponsesClient instance. + * @throws Error - If the web service is not running. + */ + public createResponsesClient(modelId?: string): ResponsesClient { + if (this._urls.length === 0) { + throw new Error( + 'Web service is not running. 
Call startWebService() before creating a ResponsesClient.' + ); + } + return new ResponsesClient(this._urls[0], modelId); + } } diff --git a/sdk_v2/js/src/imodel.ts b/sdk_v2/js/src/imodel.ts index 5797ce3b..be0913d6 100644 --- a/sdk_v2/js/src/imodel.ts +++ b/sdk_v2/js/src/imodel.ts @@ -1,5 +1,6 @@ import { ChatClient } from './openai/chatClient.js'; import { AudioClient } from './openai/audioClient.js'; +import { ResponsesClient } from './openai/responsesClient.js'; export interface IModel { get id(): string; @@ -15,4 +16,11 @@ export interface IModel { createChatClient(): ChatClient; createAudioClient(): AudioClient; + /** + * Creates a ResponsesClient for interacting with the model via the Responses API. + * Unlike createChatClient/createAudioClient (which use FFI), the Responses API + * is HTTP-based, so the web service base URL must be provided. + * @param baseUrl - The base URL of the Foundry Local web service. + */ + createResponsesClient(baseUrl: string): ResponsesClient; } diff --git a/sdk_v2/js/src/index.ts b/sdk_v2/js/src/index.ts index 1af50af8..7d7ee17a 100644 --- a/sdk_v2/js/src/index.ts +++ b/sdk_v2/js/src/index.ts @@ -6,6 +6,7 @@ export { ModelVariant } from './modelVariant.js'; export type { IModel } from './imodel.js'; export { ChatClient, ChatClientSettings } from './openai/chatClient.js'; export { AudioClient, AudioClientSettings } from './openai/audioClient.js'; +export { ResponsesClient, ResponsesClientSettings, getOutputText } from './openai/responsesClient.js'; export { ModelLoadManager } from './detail/modelLoadManager.js'; /** @internal */ export { CoreInterop } from './detail/coreInterop.js'; diff --git a/sdk_v2/js/src/model.ts b/sdk_v2/js/src/model.ts index c2848524..daac6558 100644 --- a/sdk_v2/js/src/model.ts +++ b/sdk_v2/js/src/model.ts @@ -1,6 +1,7 @@ import { ModelVariant } from './modelVariant.js'; import { ChatClient } from './openai/chatClient.js'; import { AudioClient } from './openai/audioClient.js'; +import { ResponsesClient } 
from './openai/responsesClient.js'; import { IModel } from './imodel.js'; /** @@ -146,4 +147,13 @@ export class Model implements IModel { public createAudioClient(): AudioClient { return this.selectedVariant.createAudioClient(); } + + /** + * Creates a ResponsesClient for interacting with the model via the Responses API. + * @param baseUrl - The base URL of the Foundry Local web service. + * @returns A ResponsesClient instance. + */ + public createResponsesClient(baseUrl: string): ResponsesClient { + return this.selectedVariant.createResponsesClient(baseUrl); + } } diff --git a/sdk_v2/js/src/modelVariant.ts b/sdk_v2/js/src/modelVariant.ts index 7c8b8023..4d3e2bee 100644 --- a/sdk_v2/js/src/modelVariant.ts +++ b/sdk_v2/js/src/modelVariant.ts @@ -3,6 +3,7 @@ import { ModelLoadManager } from './detail/modelLoadManager.js'; import { ModelInfo } from './types.js'; import { ChatClient } from './openai/chatClient.js'; import { AudioClient } from './openai/audioClient.js'; +import { ResponsesClient } from './openai/responsesClient.js'; import { IModel } from './imodel.js'; /** @@ -127,4 +128,13 @@ export class ModelVariant implements IModel { public createAudioClient(): AudioClient { return new AudioClient(this._modelInfo.id, this.coreInterop); } + + /** + * Creates a ResponsesClient for interacting with the model via the Responses API. + * @param baseUrl - The base URL of the Foundry Local web service. + * @returns A ResponsesClient instance. 
+ */ + public createResponsesClient(baseUrl: string): ResponsesClient { + return new ResponsesClient(baseUrl, this._modelInfo.id); + } } diff --git a/sdk_v2/js/src/openai/responsesClient.ts b/sdk_v2/js/src/openai/responsesClient.ts new file mode 100644 index 00000000..711efb78 --- /dev/null +++ b/sdk_v2/js/src/openai/responsesClient.ts @@ -0,0 +1,489 @@ +import { + ResponseCreateParams, + ResponseObject, + ResponseToolChoice, + TruncationStrategy, + TextConfig, + ReasoningConfig, + FunctionToolDefinition, + StreamingEvent, + InputItemsListResponse, + DeleteResponseResult, + ResponseInputItem, + MessageItem, + ContentPart, +} from '../types.js'; + +/** + * Extracts the text content from an assistant message in a Response. + * Equivalent to OpenAI Python SDK's `response.output_text`. + * + * @param response - The Response object. + * @returns The concatenated text from the first assistant message, or an empty string. + */ +export function getOutputText(response: ResponseObject): string { + for (const item of response.output) { + if (item.type === 'message' && (item as MessageItem).role === 'assistant') { + const content = (item as MessageItem).content; + if (typeof content === 'string') return content; + if (Array.isArray(content)) { + return content + .filter((p: ContentPart) => 'text' in p) + .map((p: ContentPart) => (p as { text: string }).text) + .join(''); + } + } + } + return ''; +} + +/** + * Configuration settings for the Responses API client. + * Properties use camelCase in JS and are serialized to snake_case for the API. + */ +export class ResponsesClientSettings { + /** System-level instructions to guide the model. 
*/ + instructions?: string; + temperature?: number; + topP?: number; + maxOutputTokens?: number; + frequencyPenalty?: number; + presencePenalty?: number; + toolChoice?: ResponseToolChoice; + truncation?: TruncationStrategy; + parallelToolCalls?: boolean; + store?: boolean; + metadata?: Record<string, string>; + reasoning?: ReasoningConfig; + text?: TextConfig; + seed?: number; + + /** + * Serializes settings into an OpenAI Responses API-compatible request object. + * @internal + */ + _serialize(): Partial<ResponseCreateParams> { + const filterUndefined = (obj: any): any => + Object.fromEntries(Object.entries(obj).filter(([_, v]) => v !== undefined)); + + const result: Record<string, unknown> = { + instructions: this.instructions, + temperature: this.temperature, + top_p: this.topP, + max_output_tokens: this.maxOutputTokens, + frequency_penalty: this.frequencyPenalty, + presence_penalty: this.presencePenalty, + tool_choice: this.toolChoice, + truncation: this.truncation, + parallel_tool_calls: this.parallelToolCalls, + store: this.store, + metadata: this.metadata, + reasoning: this.reasoning ? filterUndefined(this.reasoning) : undefined, + text: this.text ? filterUndefined(this.text) : undefined, + seed: this.seed, + }; + + // Filter out undefined properties + return filterUndefined(result) as Partial<ResponseCreateParams>; + } +} + +/** + * Client for the OpenAI Responses API served by Foundry Local's embedded web service. + * + * Unlike ChatClient/AudioClient (which use FFI via CoreInterop), the Responses API + * is HTTP-only. This client uses fetch() for all operations and parses Server-Sent Events + * for streaming. + * + * Create via `FoundryLocalManager.createResponsesClient()` or + * `model.createResponsesClient(baseUrl)`. 
+ * + * @example + * ```typescript + * const manager = FoundryLocalManager.create({ appName: 'MyApp' }); + * manager.startWebService(); + * const client = manager.createResponsesClient('my-model-id'); + * + * // Non-streaming + * const response = await client.create('Hello, world!'); + * console.log(response.output); + * + * // Streaming + * await client.createStreaming('Tell me a story', (event) => { + * if (event.type === 'response.output_text.delta') { + * process.stdout.write(event.delta); + * } + * }); + * ``` + */ +export class ResponsesClient { + private baseUrl: string; + private modelId?: string; + + /** + * Configuration settings for responses. + */ + public settings = new ResponsesClientSettings(); + + /** + * @param baseUrl - The base URL of the Foundry Local web service (e.g. "http://127.0.0.1:5273"). + * @param modelId - Optional default model ID. Can be overridden per-request via options. + */ + constructor(baseUrl: string, modelId?: string) { + if (!baseUrl || typeof baseUrl !== 'string' || baseUrl.trim() === '') { + throw new Error('baseUrl must be a non-empty string.'); + } + // Strip trailing slashes for consistent URL construction + let url = baseUrl; + while (url.endsWith('/')) { + url = url.slice(0, -1); + } + this.baseUrl = url; + this.modelId = modelId; + } + + // ======================================================================== + // Public API + // ======================================================================== + + /** + * Creates a model response (non-streaming). + * @param input - A string prompt or array of input items. + * @param options - Additional request parameters that override client settings. + * The `model` field is optional here if a default model was set in the constructor. + * @returns The completed Response object. Check `response.status` and `response.error` + * even on success — the server returns HTTP 200 for model-level failures too. 
*/ + public async create( + input: string | ResponseInputItem[], + options?: Partial<ResponseCreateParams> + ): Promise<ResponseObject> { + this.validateInput(input); + if (options?.tools) { + this.validateTools(options.tools); + } + + const body = this.buildRequest(input, { ...options, stream: false }); + + const response = await this.fetchJson<ResponseObject>( + '/v1/responses', + { method: 'POST', body: JSON.stringify(body) } + ); + return response; + } + + /** + * Creates a model response with streaming via Server-Sent Events. + * @param input - A string prompt or array of input items. + * @param callback - Called for each streaming event received. + * @param options - Additional request parameters that override client settings. + */ + public async createStreaming( + input: string | ResponseInputItem[], + callback: (event: StreamingEvent) => void, + options?: Partial<ResponseCreateParams> + ): Promise<void> { + this.validateInput(input); + if (options?.tools) { + this.validateTools(options.tools); + } + if (!callback || typeof callback !== 'function') { + throw new Error('Callback must be a valid function.'); + } + + const body = this.buildRequest(input, { ...options, stream: true }); + + const res = await this.doFetch('/v1/responses', { + method: 'POST', + headers: { 'Content-Type': 'application/json', 'Accept': 'text/event-stream' }, + body: JSON.stringify(body), + }); + + if (!res.body) { + throw new Error('Streaming response has no body.'); + } + + let error: Error | null = null; + + await this.parseSSEStream(res.body, (event: StreamingEvent) => { + if (error) return; + + try { + callback(event); + } catch (e) { + error = new Error( + `User callback threw an error: ${e instanceof Error ? e.message : String(e)}`, + { cause: e } + ); + } + }); + + if (error) { + throw error; + } + } + + /** + * Retrieves a stored response by ID. + * @param responseId - The ID of the response to retrieve. + * @returns The Response object, or throws if not found. 
*/ + public async get(responseId: string): Promise<ResponseObject> { + this.validateId(responseId, 'responseId'); + return this.fetchJson<ResponseObject>( + `/v1/responses/${encodeURIComponent(responseId)}`, + { method: 'GET' } + ); + } + + /** + * Deletes a stored response by ID. + * @param responseId - The ID of the response to delete. + * @returns The deletion result. + */ + public async delete(responseId: string): Promise<DeleteResponseResult> { + this.validateId(responseId, 'responseId'); + return this.fetchJson<DeleteResponseResult>( + `/v1/responses/${encodeURIComponent(responseId)}`, + { method: 'DELETE' } + ); + } + + /** + * Cancels an in-progress response. + * @param responseId - The ID of the response to cancel. + * @returns The cancelled Response object. + */ + public async cancel(responseId: string): Promise<ResponseObject> { + this.validateId(responseId, 'responseId'); + return this.fetchJson<ResponseObject>( + `/v1/responses/${encodeURIComponent(responseId)}/cancel`, + { method: 'POST' } + ); + } + + /** + * Retrieves input items for a stored response. + * @param responseId - The ID of the response. + * @returns The list of input items. + */ + public async getInputItems(responseId: string): Promise<InputItemsListResponse> { + this.validateId(responseId, 'responseId'); + return this.fetchJson<InputItemsListResponse>( + `/v1/responses/${encodeURIComponent(responseId)}/input_items`, + { method: 'GET' } + ); + } + + // ======================================================================== + // Internal helpers + // ======================================================================== + + /** + * Builds the full request body by merging input, settings, and per-call options. + */ + private buildRequest( + input: string | ResponseInputItem[], + options?: Partial<ResponseCreateParams> + ): ResponseCreateParams { + const model = options?.model ?? this.modelId; + if (!model || typeof model !== 'string' || model.trim() === '') { + throw new Error( + 'Model must be specified either in the constructor, via createResponsesClient(modelId), or in options.model.' 
+ ); + } + + const serializedSettings = this.settings._serialize(); + + // Merge order: model+input → settings defaults → per-call overrides + return { + model, + input, + ...serializedSettings, + ...options, + }; + } + + /** + * Validates that input is a non-empty string or a non-empty array of items. + */ + private validateInput(input: string | ResponseInputItem[]): void { + if (input === null || input === undefined) { + throw new Error('Input cannot be null or undefined.'); + } + if (typeof input === 'string') { + if (input.trim() === '') { + throw new Error('Input string cannot be empty.'); + } + return; + } + if (Array.isArray(input)) { + if (input.length === 0) { + throw new Error('Input items array cannot be empty.'); + } + for (const item of input) { + if (!item || typeof item !== 'object') { + throw new Error('Each input item must be a non-null object.'); + } + if (typeof (item as any).type !== 'string' || (item as any).type.trim() === '') { + throw new Error('Each input item must have a "type" property that is a non-empty string.'); + } + } + return; + } + throw new Error('Input must be a string or an array of input items.'); + } + + /** + * Validates that tools array is properly formed. + * Follows the same pattern as ChatClient.validateTools. + */ + private validateTools(tools: FunctionToolDefinition[]): void { + if (!Array.isArray(tools)) { + throw new Error('Tools must be an array if provided.'); + } + for (const tool of tools) { + if (!tool || typeof tool !== 'object' || Array.isArray(tool)) { + throw new Error('Each tool must be a non-null object with a valid "type" and "name".'); + } + if (tool.type !== 'function') { + throw new Error('Each tool must have type "function".'); + } + if (typeof tool.name !== 'string' || tool.name.trim() === '') { + throw new Error('Each tool must have a "name" property that is a non-empty string.'); + } + } + } + + /** + * Validates that a string ID parameter is non-empty and within length bounds. 
*/ + private validateId(id: string, paramName: string): void { + if (!id || typeof id !== 'string' || id.trim() === '') { + throw new Error(`${paramName} must be a non-empty string.`); + } + if (id.length > 1024) { + throw new Error(`${paramName} exceeds maximum length (1024).`); + } + } + + /** + * Performs a fetch and parses the JSON response, handling errors. + */ + private async fetchJson<T>(path: string, init: RequestInit): Promise<T> { + const res = await this.doFetch(path, { + ...init, + headers: { + 'Content-Type': 'application/json', + ...(init.headers || {}), + }, + }); + + const text = await res.text(); + try { + return JSON.parse(text) as T; + } catch { + throw new Error(`Failed to parse response JSON: ${text.substring(0, 200)}`); + } + } + + /** + * Low-level fetch wrapper with error handling. + */ + private async doFetch(path: string, init: RequestInit): Promise<Response> { + const url = `${this.baseUrl}${path}`; + let res: Response; + try { + res = await fetch(url, init); + } catch (e) { + throw new Error( + `Network error calling ${init.method ?? 'GET'} ${path}: ${e instanceof Error ? e.message : String(e)}`, + { cause: e } + ); + } + + if (!res.ok) { + const errorText = await res.text().catch(() => res.statusText); + throw new Error( + `Responses API error (${res.status}): ${errorText}` + ); + } + + return res; + } + + /** + * Parses a Server-Sent Events stream from the fetch response body. + * Format: "event: {type}\ndata: {json}\n\n" + * Terminal signal: "data: [DONE]\n\n" + * Per SSE spec, multiple data: lines within a single event are joined with \n. 
*/ + private async parseSSEStream( + body: ReadableStream<Uint8Array>, + onEvent: (event: StreamingEvent) => void + ): Promise<void> { + const reader = body.getReader(); + const decoder = new TextDecoder(); + const bufferParts: string[] = []; + let parseError: Error | null = null; + + try { + while (true) { + const { done, value } = await reader.read(); + if (done) break; + + bufferParts.push(decoder.decode(value, { stream: true })); + const buffer = bufferParts.join(''); + + // Process complete SSE blocks (separated by double newlines) + const blocks = buffer.split('\n\n'); + // Keep the last (potentially incomplete) block for next iteration + const incomplete = blocks.pop() ?? ''; + bufferParts.length = 0; + if (incomplete) bufferParts.push(incomplete); + + for (const block of blocks) { + if (parseError) break; + + const trimmed = block.trim(); + if (!trimmed) continue; + + // Check for terminal signal + if (trimmed === 'data: [DONE]') { + return; + } + + // Parse SSE fields — per spec, multiple data: lines are joined with \n + const dataLines: string[] = []; + for (const line of trimmed.split('\n')) { + if (line.startsWith('data: ')) { + dataLines.push(line.slice(6)); + } else if (line === 'data:') { + dataLines.push(''); + } + // 'event:' field is informational; the type is inside the JSON data + } + const eventData = dataLines.length > 0 ? dataLines.join('\n') : undefined; + + if (eventData) { + try { + const parsed = JSON.parse(eventData) as StreamingEvent; + onEvent(parsed); + } catch (e) { + parseError = new Error( + `Failed to parse streaming event: ${e instanceof Error ? 
e.message : String(e)}`, + { cause: e } + ); + } + } + } + } + } finally { + reader.releaseLock(); + } + + if (parseError) { + throw parseError; + } + } +} diff --git a/sdk_v2/js/src/types.ts b/sdk_v2/js/src/types.ts index 2b298c84..3a781a64 100644 --- a/sdk_v2/js/src/types.ts +++ b/sdk_v2/js/src/types.ts @@ -62,3 +62,353 @@ export interface ToolChoice { type: string; name?: string; } + +// ============================================================================ +// Responses API Types +// Aligned with OpenAI Responses API / OpenResponses spec and +// neutron-server src/FoundryLocalCore/Core/Responses/Contracts/ +// ============================================================================ + +/** Status of a Response object. */ +export type ResponseStatus = 'queued' | 'in_progress' | 'completed' | 'failed' | 'incomplete' | 'cancelled'; + +/** Role of a message in the Responses API. */ +export type MessageRole = 'system' | 'user' | 'assistant' | 'developer'; + +/** Status of an individual response item. */ +export type ResponseItemStatus = 'in_progress' | 'completed' | 'incomplete'; + +/** Controls which tool the model should use. */ +export type ResponseToolChoice = 'none' | 'auto' | 'required' | ResponseToolChoiceFunction; + +export interface ResponseToolChoiceFunction { + type: 'function'; + name: string; +} + +/** Truncation strategy. */ +export type TruncationStrategy = 'auto' | 'disabled'; + +/** Service tier. 
*/ +export type ServiceTier = 'default' | 'auto' | 'flex' | 'priority'; + +// --- Content Parts --- + +export interface InputTextContent { + type: 'input_text'; + text: string; +} + +export interface OutputTextContent { + type: 'output_text'; + text: string; + annotations?: Annotation[]; + logprobs?: LogProb[]; +} + +export interface RefusalContent { + type: 'refusal'; + refusal: string; +} + +export type ContentPart = InputTextContent | OutputTextContent | RefusalContent; + +export interface Annotation { + type: string; + start_index: number; + end_index: number; +} + +export interface UrlCitationAnnotation extends Annotation { + type: 'url_citation'; + url: string; + title: string; +} + +export interface LogProb { + token: string; + logprob: number; + bytes?: number[]; +} + +// --- Function Tools --- + +export interface FunctionToolDefinition { + type: 'function'; + name: string; + description?: string; + parameters?: Record<string, unknown>; + strict?: boolean; +} + +// --- Response Items (input & output) --- + +export interface MessageItem { + type: 'message'; + id?: string; + role: MessageRole; + content: string | ContentPart[]; + status?: ResponseItemStatus; +} + +export interface FunctionCallItem { + type: 'function_call'; + id?: string; + call_id: string; + name: string; + arguments: string; + status?: ResponseItemStatus; +} + +export interface FunctionCallOutputItem { + type: 'function_call_output'; + id?: string; + call_id: string; + output: string | ContentPart[]; + status?: ResponseItemStatus; +} + +export interface ItemReference { + type: 'item_reference'; + id: string; +} + +export interface ReasoningItem { + type: 'reasoning'; + id?: string; + content?: ContentPart[]; + encrypted_content?: string; + summary?: string; + status?: ResponseItemStatus; +} + +export type ResponseInputItem = MessageItem | FunctionCallItem | FunctionCallOutputItem | ItemReference | ReasoningItem; +export type ResponseOutputItem = MessageItem | FunctionCallItem | ReasoningItem; + +// --- 
Reasoning & Text Config --- + +export interface ReasoningConfig { + effort?: string; + summary?: string; +} + +export interface TextFormat { + type: string; + name?: string; + description?: string; + schema?: unknown; + strict?: boolean; +} + +export interface TextConfig { + format?: TextFormat; + verbosity?: string; +} + +// --- Response Usage --- + +export interface ResponseUsage { + input_tokens: number; + output_tokens: number; + total_tokens: number; + input_tokens_details?: { cached_tokens: number }; + output_tokens_details?: { reasoning_tokens: number }; +} + +// --- Response Error --- + +export interface ResponseError { + code: string; + message: string; +} + +export interface IncompleteDetails { + reason: string; +} + +// --- Response Create Request --- + +export interface ResponseCreateParams { + model?: string; + input?: string | ResponseInputItem[]; + instructions?: string; + previous_response_id?: string; + tools?: FunctionToolDefinition[]; + tool_choice?: ResponseToolChoice; + temperature?: number; + top_p?: number; + max_output_tokens?: number; + frequency_penalty?: number; + presence_penalty?: number; + truncation?: TruncationStrategy; + parallel_tool_calls?: boolean; + store?: boolean; + metadata?: Record<string, string>; + stream?: boolean; + reasoning?: ReasoningConfig; + text?: TextConfig; + seed?: number; + user?: string; +} + +// --- Response Object --- + +export interface ResponseObject { + id: string; + object: 'response'; + created_at: number; + completed_at?: number | null; + failed_at?: number | null; + cancelled_at?: number | null; + status: ResponseStatus; + incomplete_details?: IncompleteDetails | null; + model: string; + previous_response_id?: string | null; + instructions?: string | null; + output: ResponseOutputItem[]; + error?: ResponseError | null; + tools: FunctionToolDefinition[]; + tool_choice: ResponseToolChoice; + truncation: TruncationStrategy; + parallel_tool_calls: boolean; + text: TextConfig; + top_p: number; + temperature: number; + 
presence_penalty: number; + frequency_penalty: number; + max_output_tokens?: number | null; + reasoning?: ReasoningConfig | null; + store: boolean; + metadata?: Record<string, string> | null; + usage?: ResponseUsage | null; + user?: string | null; +} + +// --- Input Items List Response --- + +export interface InputItemsListResponse { + object: 'list'; + data: ResponseInputItem[]; +} + +// --- Delete Response --- + +export interface DeleteResponseResult { + id: string; + object: string; + deleted: boolean; +} + +// --- Streaming Events --- +// Scoped to events emitted by neutron-server (StreamingEvents.cs) + +export interface ResponseLifecycleEvent { + type: 'response.created' | 'response.queued' | 'response.in_progress' | 'response.completed' | 'response.failed' | 'response.incomplete'; + response: ResponseObject; + sequence_number: number; +} + +export interface OutputItemAddedEvent { + type: 'response.output_item.added'; + item_id: string; + output_index: number; + item: ResponseOutputItem; + sequence_number: number; +} + +export interface OutputItemDoneEvent { + type: 'response.output_item.done'; + item_id: string; + output_index: number; + item: ResponseOutputItem; + sequence_number: number; +} + +export interface ContentPartAddedEvent { + type: 'response.content_part.added'; + item_id: string; + content_index: number; + part: ContentPart; + sequence_number: number; +} + +export interface ContentPartDoneEvent { + type: 'response.content_part.done'; + item_id: string; + content_index: number; + part: ContentPart; + sequence_number: number; +} + +export interface OutputTextDeltaEvent { + type: 'response.output_text.delta'; + item_id: string; + output_index: number; + content_index: number; + delta: string; + sequence_number: number; +} + +export interface OutputTextDoneEvent { + type: 'response.output_text.done'; + item_id: string; + output_index: number; + content_index: number; + text: string; + sequence_number: number; +} + +export interface RefusalDeltaEvent { + type: 
'response.refusal.delta'; + item_id: string; + content_index: number; + delta: string; + sequence_number: number; +} + +export interface RefusalDoneEvent { + type: 'response.refusal.done'; + item_id: string; + content_index: number; + refusal: string; + sequence_number: number; +} + +export interface FunctionCallArgsDeltaEvent { + type: 'response.function_call_arguments.delta'; + item_id: string; + output_index: number; + delta: string; + sequence_number: number; +} + +export interface FunctionCallArgsDoneEvent { + type: 'response.function_call_arguments.done'; + item_id: string; + output_index: number; + arguments: string; + name: string; + sequence_number: number; +} + +export interface StreamingErrorEvent { + type: 'error'; + code?: string; + message?: string; + param?: string; + sequence_number: number; +} + +export type StreamingEvent = + | ResponseLifecycleEvent + | OutputItemAddedEvent + | OutputItemDoneEvent + | ContentPartAddedEvent + | ContentPartDoneEvent + | OutputTextDeltaEvent + | OutputTextDoneEvent + | RefusalDeltaEvent + | RefusalDoneEvent + | FunctionCallArgsDeltaEvent + | FunctionCallArgsDoneEvent + | StreamingErrorEvent; diff --git a/sdk_v2/js/test/openai/responsesClient.test.ts b/sdk_v2/js/test/openai/responsesClient.test.ts new file mode 100644 index 00000000..6e45f465 --- /dev/null +++ b/sdk_v2/js/test/openai/responsesClient.test.ts @@ -0,0 +1,571 @@ +import { describe, it, before, after } from 'mocha'; +import { expect } from 'chai'; +import { getTestManager, TEST_MODEL_ALIAS, IS_RUNNING_IN_CI } from '../testUtils.js'; +import { ResponsesClient, ResponsesClientSettings, getOutputText } from '../../src/openai/responsesClient.js'; +import type { + StreamingEvent, + FunctionToolDefinition, + ResponseInputItem, + ResponseObject, + MessageItem, +} from '../../src/types.js'; +import { FoundryLocalManager } from '../../src/foundryLocalManager.js'; +import { Model } from '../../src/model.js'; + +describe('ResponsesClient Tests', () => { + + // 
======================================================================== + // Settings serialization + // ======================================================================== + + describe('ResponsesClientSettings', () => { + it('should serialize only defined settings', () => { + const settings = new ResponsesClientSettings(); + settings.temperature = 0.5; + settings.maxOutputTokens = 200; + + const result = settings._serialize(); + + expect(result.temperature).to.equal(0.5); + expect(result.max_output_tokens).to.equal(200); + expect(result.top_p).to.be.undefined; + expect(result.frequency_penalty).to.be.undefined; + expect(result.instructions).to.be.undefined; + }); + + it('should serialize all settings including instructions', () => { + const settings = new ResponsesClientSettings(); + settings.instructions = 'You are a helpful assistant.'; + settings.temperature = 0.7; + settings.topP = 0.9; + settings.maxOutputTokens = 500; + settings.frequencyPenalty = 0.1; + settings.presencePenalty = 0.2; + settings.toolChoice = 'auto'; + settings.truncation = 'auto'; + settings.parallelToolCalls = true; + settings.store = true; + settings.metadata = { key: 'value' }; + settings.seed = 42; + + const result = settings._serialize(); + + expect(result.instructions).to.equal('You are a helpful assistant.'); + expect(result.temperature).to.equal(0.7); + expect(result.top_p).to.equal(0.9); + expect(result.max_output_tokens).to.equal(500); + expect(result.frequency_penalty).to.equal(0.1); + expect(result.presence_penalty).to.equal(0.2); + expect(result.tool_choice).to.equal('auto'); + expect(result.truncation).to.equal('auto'); + expect(result.parallel_tool_calls).to.be.true; + expect(result.store).to.be.true; + expect(result.metadata).to.deep.equal({ key: 'value' }); + expect(result.seed).to.equal(42); + }); + + it('should return empty object when no settings defined', () => { + const settings = new ResponsesClientSettings(); + const result = settings._serialize(); + 
expect(Object.keys(result).length).to.equal(0); + }); + }); + + // ======================================================================== + // getOutputText helper + // ======================================================================== + + describe('getOutputText', () => { + it('should extract text from string content', () => { + const response: ResponseObject = { + id: 'resp_1', object: 'response', created_at: 0, status: 'completed', + model: 'test', output: [ + { type: 'message', role: 'assistant', content: 'Hello world' } as MessageItem, + ], + tools: [], tool_choice: 'auto', truncation: 'disabled', + parallel_tool_calls: false, text: {}, top_p: 1, temperature: 1, + presence_penalty: 0, frequency_penalty: 0, store: false, + }; + expect(getOutputText(response)).to.equal('Hello world'); + }); + + it('should extract text from content parts array', () => { + const response: ResponseObject = { + id: 'resp_2', object: 'response', created_at: 0, status: 'completed', + model: 'test', output: [ + { + type: 'message', role: 'assistant', + content: [ + { type: 'output_text', text: 'Part 1' }, + { type: 'output_text', text: ' Part 2' }, + ], + } as MessageItem, + ], + tools: [], tool_choice: 'auto', truncation: 'disabled', + parallel_tool_calls: false, text: {}, top_p: 1, temperature: 1, + presence_penalty: 0, frequency_penalty: 0, store: false, + }; + expect(getOutputText(response)).to.equal('Part 1 Part 2'); + }); + + it('should return empty string when no assistant message', () => { + const response: ResponseObject = { + id: 'resp_3', object: 'response', created_at: 0, status: 'completed', + model: 'test', output: [], + tools: [], tool_choice: 'auto', truncation: 'disabled', + parallel_tool_calls: false, text: {}, top_p: 1, temperature: 1, + presence_penalty: 0, frequency_penalty: 0, store: false, + }; + expect(getOutputText(response)).to.equal(''); + }); + + it('should skip non-assistant messages', () => { + const response: ResponseObject = { + id: 'resp_4', 
object: 'response', created_at: 0, status: 'completed', + model: 'test', output: [ + { type: 'message', role: 'user', content: 'User msg' } as MessageItem, + { type: 'message', role: 'assistant', content: 'Assistant msg' } as MessageItem, + ], + tools: [], tool_choice: 'auto', truncation: 'disabled', + parallel_tool_calls: false, text: {}, top_p: 1, temperature: 1, + presence_penalty: 0, frequency_penalty: 0, store: false, + }; + expect(getOutputText(response)).to.equal('Assistant msg'); + }); + + it('should skip refusal content parts', () => { + const response: ResponseObject = { + id: 'resp_5', object: 'response', created_at: 0, status: 'completed', + model: 'test', output: [ + { + type: 'message', role: 'assistant', + content: [ + { type: 'refusal', refusal: 'Cannot do that' }, + { type: 'output_text', text: 'But here is something' }, + ], + } as MessageItem, + ], + tools: [], tool_choice: 'auto', truncation: 'disabled', + parallel_tool_calls: false, text: {}, top_p: 1, temperature: 1, + presence_penalty: 0, frequency_penalty: 0, store: false, + }; + expect(getOutputText(response)).to.equal('But here is something'); + }); + }); + + // ======================================================================== + // Constructor validation + // ======================================================================== + + describe('constructor', () => { + it('should create client with valid baseUrl', () => { + const client = new ResponsesClient('http://localhost:5273'); + expect(client).to.be.instanceOf(ResponsesClient); + }); + + it('should create client with baseUrl and modelId', () => { + const client = new ResponsesClient('http://localhost:5273', 'test-model'); + expect(client).to.be.instanceOf(ResponsesClient); + }); + + it('should strip trailing slash from baseUrl', () => { + const client = new ResponsesClient('http://localhost:5273/'); + expect(client).to.be.instanceOf(ResponsesClient); + }); + + it('should throw for empty baseUrl', () => { + expect(() => new 
ResponsesClient('')).to.throw('baseUrl must be a non-empty string.'); + }); + + it('should throw for null baseUrl', () => { + expect(() => new ResponsesClient(null as any)).to.throw('baseUrl must be a non-empty string.'); + }); + }); + + // ======================================================================== + // Input validation + // ======================================================================== + + describe('input validation', () => { + const client = new ResponsesClient('http://localhost:5273', 'test-model'); + + it('should throw for null input', async () => { + try { + await client.create(null as any); + expect.fail('Should have thrown'); + } catch (error) { + expect(error).to.be.instanceOf(Error); + expect((error as Error).message).to.include('Input cannot be null or undefined'); + } + }); + + it('should throw for undefined input', async () => { + try { + await client.create(undefined as any); + expect.fail('Should have thrown'); + } catch (error) { + expect(error).to.be.instanceOf(Error); + expect((error as Error).message).to.include('Input cannot be null or undefined'); + } + }); + + it('should throw for empty string input', async () => { + try { + await client.create(' '); + expect.fail('Should have thrown'); + } catch (error) { + expect(error).to.be.instanceOf(Error); + expect((error as Error).message).to.include('Input string cannot be empty'); + } + }); + + it('should throw for empty array input', async () => { + try { + await client.create([]); + expect.fail('Should have thrown'); + } catch (error) { + expect(error).to.be.instanceOf(Error); + expect((error as Error).message).to.include('Input items array cannot be empty'); + } + }); + + it('should throw for input items without type', async () => { + try { + await client.create([{ role: 'user', content: 'hi' } as any]); + expect.fail('Should have thrown'); + } catch (error) { + expect(error).to.be.instanceOf(Error); + expect((error as Error).message).to.include('must have a "type" 
property'); + } + }); + }); + + // ======================================================================== + // Tool validation + // ======================================================================== + + describe('tool validation', () => { + const client = new ResponsesClient('http://localhost:5273', 'test-model'); + + it('should throw for non-array tools', async () => { + try { + await client.create('Hello', { tools: 'not-array' as any }); + expect.fail('Should have thrown'); + } catch (error) { + expect(error).to.be.instanceOf(Error); + expect((error as Error).message).to.include('Tools must be an array'); + } + }); + + it('should throw for tools with invalid type', async () => { + try { + await client.create('Hello', { + tools: [{ type: 'invalid', name: 'test' } as any] + }); + expect.fail('Should have thrown'); + } catch (error) { + expect(error).to.be.instanceOf(Error); + expect((error as Error).message).to.include('type "function"'); + } + }); + + it('should throw for tools without name', async () => { + try { + await client.create('Hello', { + tools: [{ type: 'function' } as any] + }); + expect.fail('Should have thrown'); + } catch (error) { + expect(error).to.be.instanceOf(Error); + expect((error as Error).message).to.include('"name" property'); + } + }); + }); + + // ======================================================================== + // Streaming callback validation + // ======================================================================== + + describe('streaming callback validation', () => { + const client = new ResponsesClient('http://localhost:5273', 'test-model'); + + it('should throw for null callback', async () => { + try { + await client.createStreaming('Hello', null as any); + expect.fail('Should have thrown'); + } catch (error) { + expect(error).to.be.instanceOf(Error); + expect((error as Error).message).to.include('Callback must be a valid function'); + } + }); + + it('should throw for non-function callback', async () => { + try 
{ + await client.createStreaming('Hello', 'not-a-function' as any); + expect.fail('Should have thrown'); + } catch (error) { + expect(error).to.be.instanceOf(Error); + expect((error as Error).message).to.include('Callback must be a valid function'); + } + }); + }); + + // ======================================================================== + // Model ID validation + // ======================================================================== + + describe('model validation', () => { + it('should throw when no model specified anywhere', async () => { + const client = new ResponsesClient('http://localhost:5273'); + try { + await client.create('Hello'); + expect.fail('Should have thrown'); + } catch (error) { + expect(error).to.be.instanceOf(Error); + expect((error as Error).message).to.include('Model must be specified'); + } + }); + }); + + // ======================================================================== + // ID parameter validation + // ======================================================================== + + describe('ID parameter validation', () => { + const client = new ResponsesClient('http://localhost:5273', 'test-model'); + + const methods: Array<[string, (c: ResponsesClient, id: string) => Promise<unknown>]> = [ + ['get', (c, id) => c.get(id)], + ['delete', (c, id) => c.delete(id)], + ['cancel', (c, id) => c.cancel(id)], + ['getInputItems', (c, id) => c.getInputItems(id)], + ]; + + for (const [methodName, fn] of methods) { + it(`should throw for empty responseId on ${methodName}`, async () => { + try { + await fn(client, ''); + expect.fail('Should have thrown'); + } catch (error) { + expect(error).to.be.instanceOf(Error); + expect((error as Error).message).to.include('responseId must be a non-empty string'); + } + }); + } + + it('should throw for excessively long responseId', async () => { + const longId = 'a'.repeat(1025); + try { + await client.get(longId); + expect.fail('Should have thrown'); + } catch (error) { + 
expect(error).to.be.instanceOf(Error); + expect((error as Error).message).to.include('exceeds maximum length'); + } + }); + }); + + // ======================================================================== + // Integration tests (require running web service + loaded model) + // ======================================================================== + + describe('Integration (requires model + web service)', function() { + let manager: FoundryLocalManager; + let model: Model; + let client: ResponsesClient; + let skipped = false; + + before(async function() { + this.timeout(30000); + if (IS_RUNNING_IN_CI) { + skipped = true; + this.skip(); + return; + } + + manager = getTestManager(); + const catalog = manager.catalog; + + const cachedModels = await catalog.getCachedModels(); + const cachedVariant = cachedModels.find(m => m.alias === TEST_MODEL_ALIAS); + if (!cachedVariant) { + skipped = true; + this.skip(); + return; + } + + model = await catalog.getModel(TEST_MODEL_ALIAS); + model.selectVariant(cachedVariant.id); + await model.load(); + manager.startWebService(); + client = manager.createResponsesClient(cachedVariant.id); + client.settings.temperature = 0.0; + client.settings.maxOutputTokens = 100; + }); + + after(async function() { + if (skipped) return; + try { manager.stopWebService(); } catch { /* ignore */ } + try { await model.unload(); } catch { /* ignore */ } + }); + + it('should create a non-streaming response', async function() { + this.timeout(30000); + + const response = await client.create('What is 2 + 2? 
Answer with just the number.'); + + expect(response).to.not.be.undefined; + expect(response.id).to.be.a('string'); + expect(response.status).to.equal('completed'); + expect(response.output).to.be.an('array'); + expect(response.output.length).to.be.greaterThan(0); + + const text = getOutputText(response); + expect(text.length).to.be.greaterThan(0); + console.log(`Response: ${text}`); + }); + + it('should create a streaming response', async function() { + this.timeout(30000); + + const events: StreamingEvent[] = []; + let textAccumulated = ''; + + await client.createStreaming( + 'What is 3 + 5? Answer with just the number.', + (event) => { + events.push(event); + if (event.type === 'response.output_text.delta') { + textAccumulated += event.delta; + } + } + ); + + expect(events.length).to.be.greaterThan(0); + + // Should have lifecycle events + expect(events.find(e => e.type === 'response.created')).to.not.be.undefined; + expect(events.find(e => e.type === 'response.completed')).to.not.be.undefined; + + // Should have text deltas + expect(events.some(e => e.type === 'response.output_text.delta')).to.be.true; + + expect(textAccumulated.length).to.be.greaterThan(0); + console.log(`Streamed text: ${textAccumulated}`); + }); + + it('should create response with input items array', async function() { + this.timeout(30000); + + const input: ResponseInputItem[] = [ + { + type: 'message', + role: 'user', + content: 'What is 10 minus 3? 
Answer with just the number.', + } as MessageItem, + ]; + + const response = await client.create(input); + + expect(response.status).to.equal('completed'); + expect(response.output.length).to.be.greaterThan(0); + + const text = getOutputText(response); + expect(text.length).to.be.greaterThan(0); + console.log(`Input items response: ${text}`); + }); + + it('should use instructions from settings', async function() { + this.timeout(30000); + + const instrClient = manager.createResponsesClient(model.id); + instrClient.settings.temperature = 0.0; + instrClient.settings.maxOutputTokens = 100; + instrClient.settings.instructions = 'Always respond in exactly one word.'; + + const response = await instrClient.create('What color is the sky?'); + + expect(response.status).to.equal('completed'); + const text = getOutputText(response); + expect(text.length).to.be.greaterThan(0); + console.log(`With instructions: ${text}`); + }); + + it('should get and delete a stored response', async function() { + this.timeout(30000); + + client.settings.store = true; + try { + const createResult = await client.create('Say hello'); + expect(createResult.id).to.be.a('string'); + + // Retrieve it + const retrieved = await client.get(createResult.id); + expect(retrieved.id).to.equal(createResult.id); + expect(retrieved.status).to.equal('completed'); + + // Get input items + const inputItems = await client.getInputItems(createResult.id); + expect(inputItems).to.not.be.undefined; + expect(inputItems.data).to.be.an('array'); + + // Delete it + const deleted = await client.delete(createResult.id); + expect(deleted.deleted).to.be.true; + } finally { + client.settings.store = undefined; + } + }); + + it('should chain responses via previous_response_id', async function() { + this.timeout(30000); + + client.settings.store = true; + try { + const first = await client.create('Remember: the secret word is "banana".'); + expect(first.id).to.be.a('string'); + + const second = await client.create('What is the 
secret word?', { + previous_response_id: first.id, + }); + expect(second.previous_response_id).to.equal(first.id); + + const text = getOutputText(second); + console.log(`Chained response: ${text}`); + } finally { + client.settings.store = undefined; + } + }); + + it('should create response with tool calling', async function() { + this.timeout(30000); + + const tools: FunctionToolDefinition[] = [{ + type: 'function', + name: 'get_weather', + description: 'Get the current weather for a location.', + parameters: { + type: 'object', + properties: { + location: { type: 'string', description: 'City name' } + }, + required: ['location'] + } + }]; + + const response = await client.create( + 'What is the weather in Seattle?', + { tools, tool_choice: 'required' } + ); + + expect(response).to.not.be.undefined; + expect(response.output.length).to.be.greaterThan(0); + + const functionCall = response.output.find((o: any) => o.type === 'function_call'); + if (functionCall) { + console.log(`Tool call: ${JSON.stringify(functionCall)}`); + expect((functionCall as any).name).to.equal('get_weather'); + } + }); + }); +});