50 changes: 46 additions & 4 deletions README.md
@@ -23,14 +23,25 @@ Foundry Local lets you embed generative AI directly into your applications — n
Key benefits include:

- **Self-contained SDK** — Ship AI features without requiring users to install any external dependencies.
- **Chat AND Audio in one runtime** — Text generation and speech-to-text (Whisper) through a single SDK — no need for separate tools like `whisper.cpp` + `llama.cpp`.
- **Easy-to-use CLI** — Explore models and experiment locally before integrating with your app.
- **Optimized models out-of-the-box** — State-of-the-art quantization and compression deliver both performance and quality.
- **Small footprint** — Leverages [ONNX Runtime](https://onnxruntime.ai/), a high-performance inference runtime (written in C++) with minimal disk and memory requirements.
- **Automatic hardware acceleration** — Leverage GPUs and NPUs when available, with seamless fallback to CPU. Zero hardware detection code needed.
- **Model distribution** — Popular open-source models hosted in the cloud with automatic downloading and updating.
- **Multi-platform support** — Windows, macOS (Apple silicon), Linux, and Android.
- **Bring your own models** — Add and run custom models alongside the built-in catalog.

### Supported Tasks

| Task | Model Aliases | API |
|------|--------------|-----|
| Chat / Text Generation | `phi-3.5-mini`, `qwen2.5-0.5b`, `qwen2.5-coder-0.5b`, etc. | Chat Completions |
| Audio Transcription (Speech-to-Text) | `whisper-tiny` | Audio Transcription |

> [!NOTE]
> Foundry Local is a **unified local AI runtime** — it replaces the need for separate tools like `whisper.cpp`, `llama.cpp`, or `ollama`. One SDK handles both chat and audio, with automatic hardware acceleration (NPU > GPU > CPU).
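
The NPU > GPU > CPU fallback behaviour can be pictured with a small plain-JavaScript sketch. This is illustrative only — the SDK performs this selection internally and exposes no such function:

```javascript
// Illustration of the NPU > GPU > CPU fallback order described above.
// NOT part of the SDK surface -- Foundry Local does this for you.
const DEVICE_PRIORITY = ["npu", "gpu", "cpu"];

function pickDevice(available) {
  // Walk the priority list and take the first accelerator present.
  for (const device of DEVICE_PRIORITY) {
    if (available.includes(device)) return device;
  }
  return "cpu"; // CPU is always available as the last resort
}

console.log(pickDevice(["gpu", "cpu"])); // "gpu"
console.log(pickDevice(["cpu"]));        // "cpu"
```

In the real runtime this means the same application code runs unchanged on an NPU laptop, a GPU workstation, or a CPU-only server.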

## 🚀 Quickstart

### Explore with the CLI
@@ -196,10 +207,41 @@ Explore complete working examples in the [`samples/`](samples/) folder:

| Sample | Description |
|--------|-------------|
| [**cs/**](samples/cs/) | C# examples using the .NET SDK (includes audio transcription) |
| [**js/**](samples/js/) | JavaScript/Node.js examples (chat, audio transcription, tool calling) |
| [**python/**](samples/python/) | Python examples using the OpenAI-compatible API |

#### Audio Transcription (Speech-to-Text)

The SDK also supports audio transcription via Whisper models. Use `model.createAudioClient()` to transcribe audio files on-device:

```javascript
import { FoundryLocalManager } from 'foundry-local-sdk';

const manager = FoundryLocalManager.create({ appName: 'MyApp' });

// Download and load the Whisper model
const whisperModel = await manager.catalog.getModel('whisper-tiny');
await whisperModel.download();
await whisperModel.load();

// Transcribe an audio file
const audioClient = whisperModel.createAudioClient();
audioClient.settings.language = 'en';
const result = await audioClient.transcribe('recording.wav');
console.log('Transcription:', result.text);

// Or stream in real-time
await audioClient.transcribeStreaming('recording.wav', (chunk) => {
  process.stdout.write(chunk.text);
});

await whisperModel.unload();
```

> [!TIP]
> A single `FoundryLocalManager` can manage both chat and audio models simultaneously. See the [chat-and-audio sample](samples/js/chat-and-audio-foundry-local/) for a complete example that transcribes audio then analyzes it with a chat model.

## Manage

This section provides an overview of how to manage Foundry Local, including installation, upgrading, and removing the application.
19 changes: 18 additions & 1 deletion docs/README.md
@@ -6,4 +6,21 @@ Documentation for Foundry Local can be found in the following resources:
- SDK Reference:
  - [C# SDK Reference](../sdk_v2/cs/README.md): API reference, usage examples, and best practices for integrating Foundry Local into your .NET applications.
  - [JavaScript SDK Reference](../sdk_v2/js/README.md): API reference, usage examples, and best practices for integrating Foundry Local into your web applications.
- [Foundry Local Lab](https://github.com/Microsoft-foundry/foundry-local-lab): A hands-on lab with exercises, sample code, and step-by-step instructions for setting up and using Foundry Local in various scenarios.

## Supported Capabilities

Foundry Local is a unified local AI runtime that supports both **text generation** and **speech-to-text** through a single SDK:

| Capability | Model Aliases | SDK API |
|------------|--------------|---------|
| Chat Completions (Text Generation) | `phi-3.5-mini`, `qwen2.5-0.5b`, etc. | `model.createChatClient()` |
| Audio Transcription (Speech-to-Text) | `whisper-tiny` | `model.createAudioClient()` |
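
For chat completions, messages use the familiar role/content shape. A minimal plain-data sketch — the `makeMessages` helper is illustrative, not an SDK API:

```javascript
// Build a messages array in the role/content shape the chat client
// consumes. `makeMessages` is an illustrative helper, not an SDK API.
function makeMessages(systemPrompt, userText) {
  return [
    { role: "system", content: systemPrompt },
    { role: "user", content: userText },
  ];
}

const messages = makeMessages("You are a helpful assistant.", "Hello!");
console.log(messages[0].role); // "system"
```

The samples below pass arrays of exactly this shape to the chat client.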

## Samples

- [JavaScript: Chat (Hello Foundry Local)](../samples/js/hello-foundry-local/) — Basic chat completions
- [JavaScript: Audio Transcription](../samples/js/audio-transcription-foundry-local/) — Speech-to-text with Whisper
- [JavaScript: Chat + Audio](../samples/js/chat-and-audio-foundry-local/) — Unified chat and audio in one app
- [JavaScript: Tool Calling](../samples/js/tool-calling-foundry-local/) — Function calling with local models
- [C#: Getting Started](../samples/cs/GettingStarted/) — C# SDK examples including audio transcription
39 changes: 39 additions & 0 deletions samples/js/audio-transcription-foundry-local/README.md
@@ -0,0 +1,39 @@
# Sample: Audio Transcription with Foundry Local

This sample demonstrates how to use Foundry Local for **speech-to-text (audio transcription)** using the Whisper model — entirely on-device, with no cloud services required.

## What This Shows

- Loading the `whisper-tiny` model via the Foundry Local SDK
- Transcribing an audio file (`.wav`, `.mp3`, etc.) to text
- Both standard and streaming transcription modes
- Automatic hardware acceleration (NPU > GPU > CPU)

## Prerequisites

- [Foundry Local](https://github.com/microsoft/Foundry-Local) installed on your machine
- Node.js 18+

## Getting Started

Install the Foundry Local SDK:

```bash
npm install foundry-local-sdk
```

Place an audio file (e.g., `recording.wav` or `recording.mp3`) in the project directory, then run:

```bash
node src/app.js
```

## How It Works

The Foundry Local SDK handles everything:
1. **Model discovery** — finds the best `whisper-tiny` variant for your hardware
2. **Model download** — downloads the model if not already cached
3. **Model loading** — loads the model into memory with optimized hardware acceleration
4. **Transcription** — runs Whisper inference entirely on-device

No need for `whisper.cpp`, `@huggingface/transformers`, or any other separate STT tool.
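
The download-if-not-cached flow in steps 1–3 can be sketched with mock objects. The names here are stand-ins; only `isCached`, `download()`, and `load()` mirror the real surface shown in `src/app.js`:

```javascript
// Mock of the cache-then-load flow. MockModel stands in for the SDK's
// model object; it is NOT the real API.
class MockModel {
  constructor(id) {
    this.id = id;
    this.isCached = false;
    this.loaded = false;
  }
  async download(onProgress) {
    for (const pct of [25, 50, 75, 100]) onProgress(pct); // fake progress
    this.isCached = true;
  }
  async load() {
    if (!this.isCached) throw new Error("download the model first");
    this.loaded = true;
  }
}

// Download only on a cache miss, then load into memory.
async function ensureReady(model) {
  if (!model.isCached) await model.download(() => {});
  await model.load();
  return model;
}

ensureReady(new MockModel("whisper-tiny")).then((m) => console.log(m.loaded));
```

Subsequent runs skip the download entirely because the cache check comes first — the same behaviour you see when re-running this sample.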
11 changes: 11 additions & 0 deletions samples/js/audio-transcription-foundry-local/package.json
@@ -0,0 +1,11 @@
{
  "name": "audio-transcription-foundry-local",
  "type": "module",
  "description": "Audio transcription (speech-to-text) sample using Foundry Local",
  "scripts": {
    "start": "node src/app.js"
  },
  "dependencies": {
    "foundry-local-sdk": "latest"
  }
}
64 changes: 64 additions & 0 deletions samples/js/audio-transcription-foundry-local/src/app.js
@@ -0,0 +1,64 @@
// Copyright (c) Microsoft Corporation. All rights reserved.
// Licensed under the MIT License.

import { FoundryLocalManager } from "foundry-local-sdk";
import path from "path";

// The Whisper model alias for audio transcription
const alias = "whisper-tiny";

async function main() {
  console.log("Initializing Foundry Local SDK...");
  const manager = FoundryLocalManager.create({
    appName: "AudioTranscriptionSample",
    logLevel: "info",
  });

  // Get the Whisper model from the catalog
  const catalog = manager.catalog;
  const model = await catalog.getModel(alias);
  if (!model) {
    throw new Error(
      `Model "${alias}" not found. Run "foundry model list" to see available models.`
    );
  }

  // Download the model if not already cached
  if (!model.isCached) {
    console.log(`Downloading model "${alias}"...`);
    await model.download((progress) => {
      process.stdout.write(`\rDownload progress: ${progress.toFixed(1)}%`);
    });
    console.log("\nDownload complete.");
  }

  // Load the model into memory
  console.log(`Loading model "${model.id}"...`);
  await model.load();
  console.log("Model loaded.\n");

  // Create an audio client for transcription
  const audioClient = model.createAudioClient();
  audioClient.settings.language = "en";

  // Update this path to point to your audio file
  const audioFilePath = path.resolve("recording.mp3");

  // --- Standard transcription ---
  console.log("=== Standard Transcription ===");
  const result = await audioClient.transcribe(audioFilePath);
  console.log("Transcription:", result.text);

  // --- Streaming transcription ---
  console.log("\n=== Streaming Transcription ===");
  await audioClient.transcribeStreaming(audioFilePath, (chunk) => {
    process.stdout.write(chunk.text);
  });
  console.log("\n");

  // Clean up
  await model.unload();
  console.log("Done.");
}

main().catch(console.error);
39 changes: 39 additions & 0 deletions samples/js/chat-and-audio-foundry-local/README.md
@@ -0,0 +1,39 @@
# Sample: Chat + Audio Transcription with Foundry Local

This sample demonstrates how to use Foundry Local as a **unified AI runtime** for both **text generation (chat)** and **speech-to-text (audio transcription)** — all on-device, with a single SDK managing both models.

## What This Shows

- Using a single `FoundryLocalManager` to manage both chat and audio models
- Transcribing an audio file using the `whisper-tiny` model
- Analyzing the transcription using the `phi-3.5-mini` chat model
- Automatic hardware acceleration for both models — zero hardware detection code needed

## Why Foundry Local?

Without Foundry Local, building an app with both chat and speech-to-text typically requires:
- A separate STT library (`whisper.cpp`, `@huggingface/transformers`)
- A separate LLM runtime (`llama.cpp`, `node-llama-cpp`)
- Custom hardware detection code for each runtime (~200+ lines)
- Separate model download and caching logic

With Foundry Local, you get **one SDK, one service, both capabilities** — and the hardware detection is automatic.

## Prerequisites

- [Foundry Local](https://github.com/microsoft/Foundry-Local) installed on your machine
- Node.js 18+

## Getting Started

Install the Foundry Local SDK:

```bash
npm install foundry-local-sdk
```

Place an audio file (`recording.mp3`) in the project directory, then run:

```bash
node src/app.js
```
11 changes: 11 additions & 0 deletions samples/js/chat-and-audio-foundry-local/package.json
@@ -0,0 +1,11 @@
{
  "name": "chat-and-audio-foundry-local",
  "type": "module",
  "description": "Unified chat + audio transcription sample using Foundry Local",
  "scripts": {
    "start": "node src/app.js"
  },
  "dependencies": {
    "foundry-local-sdk": "latest"
  }
}
103 changes: 103 additions & 0 deletions samples/js/chat-and-audio-foundry-local/src/app.js
@@ -0,0 +1,103 @@
// Copyright (c) Microsoft Corporation. All rights reserved.
// Licensed under the MIT License.

import { FoundryLocalManager } from "foundry-local-sdk";
import path from "path";

// Model aliases
const CHAT_MODEL = "phi-3.5-mini";
const WHISPER_MODEL = "whisper-tiny";

async function main() {
  console.log("Initializing Foundry Local SDK...");
  const manager = FoundryLocalManager.create({
    appName: "ChatAndAudioSample",
    logLevel: "info",
  });

  const catalog = manager.catalog;

  // --- Load both models ---
  console.log("\n--- Loading models ---");

  const chatModel = await catalog.getModel(CHAT_MODEL);
  if (!chatModel) {
    throw new Error(
      `Chat model "${CHAT_MODEL}" not found. Run "foundry model list" to see available models.`
    );
  }

  const whisperModel = await catalog.getModel(WHISPER_MODEL);
  if (!whisperModel) {
    throw new Error(
      `Whisper model "${WHISPER_MODEL}" not found. Run "foundry model list" to see available models.`
    );
  }

  // Download models if not cached
  if (!chatModel.isCached) {
    console.log(`Downloading ${CHAT_MODEL}...`);
    await chatModel.download((progress) => {
      process.stdout.write(`\r  ${CHAT_MODEL}: ${progress.toFixed(1)}%`);
    });
    console.log();
  }

  if (!whisperModel.isCached) {
    console.log(`Downloading ${WHISPER_MODEL}...`);
    await whisperModel.download((progress) => {
      process.stdout.write(`\r  ${WHISPER_MODEL}: ${progress.toFixed(1)}%`);
    });
    console.log();
  }

  // Load both models into memory
  console.log(`Loading ${CHAT_MODEL}...`);
  await chatModel.load();
  console.log(`Loading ${WHISPER_MODEL}...`);
  await whisperModel.load();
  console.log("Both models loaded.\n");

  // --- Step 1: Transcribe audio ---
  console.log("=== Step 1: Audio Transcription ===");
  const audioClient = whisperModel.createAudioClient();
  audioClient.settings.language = "en";

  // Update this path to point to your audio file
  const audioFilePath = path.resolve("recording.mp3");
  const transcription = await audioClient.transcribe(audioFilePath);
  console.log("You said:", transcription.text);

  // --- Step 2: Analyze with chat model ---
  console.log("\n=== Step 2: AI Analysis ===");
  const chatClient = chatModel.createChatClient();
  chatClient.settings.temperature = 0.7;
  chatClient.settings.maxTokens = 500;

  // Summarize the transcription
  console.log("Generating summary...\n");
  await chatClient.completeStreamingChat(
    [
      {
        role: "system",
        content:
          "You are a helpful assistant. Summarize the following transcribed audio and extract key themes and action items.",
      },
      { role: "user", content: transcription.text },
    ],
    (chunk) => {
      const content = chunk.choices?.[0]?.message?.content;
      if (content) {
        process.stdout.write(content);
      }
    }
  );
  console.log("\n");

  // --- Clean up ---
  await chatModel.unload();
  await whisperModel.unload();
  console.log("Done.");
}

main().catch(console.error);