
Feature/js sdk responses api#504

Open
MaanavD wants to merge 4 commits into main from feature/js-sdk-responses-api

Conversation


@MaanavD MaanavD commented Mar 10, 2026

This pull request adds support for audio transcription (speech-to-text) with Whisper models to the Foundry Local SDK, alongside unified chat and audio capabilities. It adds new documentation, sample apps, and code examples showing how to use both chat and audio features through a single SDK, highlighting automatic hardware acceleration and simplified integration. With these updates, developers can build applications that combine text generation and speech-to-text without separate runtimes or hand-written hardware detection code.

Documentation and SDK feature updates:

  • Updated README.md and docs/README.md to clarify that Foundry Local now supports both chat (text generation) and audio transcription (speech-to-text) in a single runtime, with automatic hardware acceleration and unified APIs. Added tables listing supported tasks, model aliases, and API usage.
  • Expanded the sample listings in the documentation to include new JavaScript samples for chat, audio transcription, tool calling, and combined chat+audio apps.

New sample applications and examples:

  • Added the samples/js/audio-transcription-foundry-local sample, including a detailed README, package.json, and implementation in src/app.js, showing how to load Whisper models and transcribe audio files (standard and streaming modes).
  • Added the samples/js/chat-and-audio-foundry-local sample, with README, package.json, and implementation in src/app.js, demonstrating unified management of chat and audio models: transcribing audio, then analyzing the transcription with a chat model.

SDK example enhancements:

  • Introduced a new TypeScript example audio-transcription.ts in sdk_v2/js/examples, showcasing programmatic use of Whisper models for audio transcription, including model discovery, download, loading, and both standard and streaming transcription.
  • Added responses.ts example to sdk_v2/js/examples, demonstrating the Responses API for text generation, streaming, multi-turn conversations, tool calling, and response management.

MaanavD added 3 commits March 10, 2026 02:18
…and SDK examples

- Update README.md to prominently feature audio transcription (STT) alongside
  chat completions, including a Supported Tasks table, JS code examples for
  audio transcription and unified chat+audio, and updated Features section
- Add samples/js/audio-transcription-foundry-local: standalone Whisper STT sample
- Add samples/js/chat-and-audio-foundry-local: unified chat + audio sample
  demonstrating single FoundryLocalManager managing both model types
- Add sdk_v2/js/examples/audio-transcription.ts: TypeScript audio example
- Update docs/README.md with capabilities table and sample links

Addresses the discoverability gap where LLMs and developers do not know
Foundry Local supports audio transcription via Whisper models.
- Adopt upstream's restructured welcome section (key benefits list, collapsible
  SDK examples) while preserving audio transcription additions
- Merge Supported Tasks table, audio transcription code example, and enhanced
  samples table descriptions into upstream's new format
- Add Foundry Local Lab link from upstream to docs/README.md alongside our
  capabilities table and samples links
- Drop standalone Features section (key points already in welcome benefits)
…tibility

Add ResponsesClient to the JS SDK v2 with full CRUD support for the
Responses API served by Foundry Local's embedded web service.

New files:
- src/openai/responsesClient.ts: HTTP-based client with SSE streaming
- test/openai/responsesClient.test.ts: 35 tests (unit + integration)
- examples/responses.ts: end-to-end usage examples
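The SSE handling inside responsesClient.ts is not shown in this description; the following is a minimal sketch of parsing a streamed chunk into typed events, assuming OpenAI-style `data: <json>` frames terminated by a `data: [DONE]` sentinel. The event shape and sentinel are assumptions; the actual wire format is whatever Foundry Local's embedded web service emits.

```typescript
// Hypothetical event shape; the SDK's real event types live in src/types.ts.
interface StreamEvent {
  type: string;
  [key: string]: unknown;
}

// Parse one SSE text chunk into events. Assumes OpenAI-style
// "data: <json>" frames ending with "data: [DONE]"; comment lines
// and blank keep-alive lines are skipped.
function parseSseChunk(chunk: string): StreamEvent[] {
  const events: StreamEvent[] = [];
  for (const line of chunk.split("\n")) {
    const trimmed = line.trim();
    if (!trimmed.startsWith("data:")) continue; // ignore comments/blank lines
    const payload = trimmed.slice("data:".length).trim();
    if (payload === "[DONE]") break;            // end-of-stream sentinel
    events.push(JSON.parse(payload) as StreamEvent);
  }
  return events;
}
```

A real client would also buffer partial lines across fetch() reads, since a chunk boundary can split a frame.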

Modified files:
- src/types.ts: Responses API types (request, response, items, events)
- src/index.ts: export ResponsesClient, ResponsesClientSettings, getOutputText
- src/foundryLocalManager.ts: createResponsesClient() factory
- src/imodel.ts: createResponsesClient(baseUrl) on IModel interface
- src/model.ts, src/modelVariant.ts: delegation/implementation

Key design decisions:
- HTTP-based (fetch + SSE), not FFI, since no CoreInterop command exists
- Factory on FoundryLocalManager (owns URL) + convenience on Model
- Types scoped to neutron-server's supported feature set
- Follows ChatClient patterns: settings serialization, validation, error chaining
- getOutputText() helper matching OpenAI Python SDK's response.output_text
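The getOutputText() helper itself is not shown in the PR; a plausible sketch follows, assuming a Responses API shape where `output` is a list of items and message items carry `output_text` content parts (mirroring how the OpenAI Python SDK aggregates `response.output_text`). The interface and field names here are assumptions, not the SDK's actual types.

```typescript
// Hypothetical response shapes; the SDK's real types live in src/types.ts.
interface ContentPart {
  type: string;   // e.g. "output_text"
  text?: string;
}

interface OutputItem {
  type: string;   // e.g. "message", "function_call"
  content?: ContentPart[];
}

interface ResponseLike {
  output: OutputItem[];
}

// Concatenate every output_text part across all message items,
// analogous to the OpenAI Python SDK's `response.output_text`.
function getOutputText(response: ResponseLike): string {
  let text = "";
  for (const item of response.output) {
    if (item.type !== "message" || !item.content) continue;
    for (const part of item.content) {
      if (part.type === "output_text" && part.text) text += part.text;
    }
  }
  return text;
}
```

Non-message items such as function calls are skipped, so callers get only the model's visible text.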


The regex /\/+$/ used to strip trailing slashes from baseUrl was flagged
as a polynomial regular expression (ReDoS risk) by CodeQL. Replaced with
a simple while/endsWith/slice loop.
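The replacement described above can be sketched as follows (the function name is illustrative; in responsesClient.ts the loop may well be inlined at the point where baseUrl is normalized):

```typescript
// Strip trailing slashes without a regex, sidestepping the /\/+$/
// pattern CodeQL flagged as a polynomial-time (ReDoS-risk) regular
// expression. Each iteration removes exactly one trailing "/", so
// the loop is linear in the length of the input.
function stripTrailingSlashes(baseUrl: string): string {
  let url = baseUrl;
  while (url.endsWith("/")) {
    url = url.slice(0, -1);
  }
  return url;
}
```

Normalizing once at construction time keeps later path concatenation (`${baseUrl}/v1/responses`) free of double slashes.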
