Merged
2 changes: 2 additions & 0 deletions AGENTS.md
@@ -21,6 +21,7 @@ Main flow:
- `src/index.ts`: public exports.
- `src/KittenTTS.ts`: main SDK class and lifecycle.
- `src/KittenTTSConfig.ts`: user config and defaults.
- `src/*.web.ts`: React Native Web entrypoints and platform-specific browser implementations.
- `src/KittenTTSError.ts`: SDK error codes and helpers.
- `src/KittenModel.ts`: model names, download URLs, sizes, speed priors.
- `src/KittenVoice.ts`: voice enum and display helpers.
@@ -29,6 +30,7 @@ Main flow:
- `src/engine/TTSEngine.ts`: text-to-token-to-ONNX inference.
- `src/phonemizer/CEPhonemizer.ts`: JS/Emscripten phonemizer adapter.
- `src/audio/AudioOutput.ts`: optional playback helpers.
- `src/storage/AssetStorage.ts`: web/Node asset cache abstraction used by the web platform files.
- `vendor/cephonemizer/`: vendored C++ phonemizer source.
- `scripts/build-cephonemizer.js`: builds generated phonemizer runtime.
- `scripts/patch-onnxruntime-react-native.js`: postinstall ONNX Runtime compatibility patches.
2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -5,6 +5,8 @@
- Added Swift-parity word timing metadata via `KittenTTSResult.wordTimings`.
- Added `KittenTTS.generateStreaming()` for sentence-by-sentence generation.
- Added `tts.play(result)` so apps can inspect timings before playback.
- Added React Native Web support through browser-specific ONNX Runtime Web,
Cache API asset storage, CE phonemizer, and audio playback implementations.

## 0.8.0

42 changes: 36 additions & 6 deletions README.md
@@ -5,9 +5,9 @@
</p>

<p align="center">
On-device text-to-speech for React Native.
On-device text-to-speech for React Native and React Native Web.
<br />
Generate speech on iOS and Android without sending text to a cloud TTS API.
Generate speech on iOS, Android, and web without sending text to a cloud TTS API.
</p>

<p align="center">
@@ -21,9 +21,15 @@

> Developer preview. APIs may change between releases.

> Expo Go will not work. KittenTTS uses native modules
> (`onnxruntime-react-native` and `react-native-fs`), so Expo apps need a
> development build or a prebuilt native project.
> Expo Go will not work for native iOS/Android. KittenTTS uses native modules
> (`onnxruntime-react-native` and `react-native-fs`) on mobile, so Expo apps
> need a development build or a prebuilt native project. Web builds use
> `onnxruntime-web` and browser storage instead.

> React Native Web loads a pinned ONNX Runtime Web script and WASM assets from
> jsDelivr by default. For production apps that need CDN independence or stricter
> supply-chain controls, self-host those ONNX Runtime assets and set
> `ortWasmPath`.

## See It In Action

@@ -36,6 +42,14 @@
<strong>Device: iOS</strong> · Expo example &nbsp;&nbsp;&nbsp; <strong>Device: Android</strong> · Word timings
</p>

<p align="center">
<img src="assets/web-example.gif" alt="KittenTTS React Native Web example running in a browser" width="90%" />
</p>

<p align="center">
<strong>Web</strong> · Browser example
</p>

---

## What Is KittenTTS React Native?
@@ -60,6 +74,7 @@ No cloud. No API key. No text leaving the device for speech generation.
| --- | --- | --- |
| React Native iOS | Developer preview | [Getting started](docs/getting-started.md) |
| React Native Android | Developer preview | [Getting started](docs/getting-started.md) |
| React Native Web | Developer preview | [Getting started](docs/getting-started.md#web) |
| Expo development build | Supported | [Expo setup](docs/getting-started.md#expo-development-build) |
| Expo Go | Not supported | [Why not?](docs/troubleshooting.md#expo-go-fails) |

@@ -109,6 +124,21 @@ const tts = await KittenTTS.create({
await tts.speak('This voice is generated on the device.');
```

Play audio in a web build:

```tsx
import {
  KittenTTS,
  createBrowserAudioPlayer,
} from '@kittentts/react-native';

const tts = await KittenTTS.create({
  player: createBrowserAudioPlayer(),
});

await tts.speak('This voice is generated in the browser.');
```

[Full getting started guide →](docs/getting-started.md)

---
@@ -153,7 +183,7 @@ If the app opens in Expo Go, stop it and run `npx expo run:ios` or

## Features

- [On-device TTS inference](docs/getting-started.md) on iOS and Android.
- [On-device TTS inference](docs/getting-started.md) on iOS, Android, and web.
- [Model download and cache](docs/reference/api.md#cache-methods) with progress callbacks.
- [Bundled offline assets](docs/guides/offline-assets.md) for apps that cannot depend on a first-run download.
- [Expo development builds](docs/getting-started.md#expo-development-build); Expo Go is [not supported](docs/troubleshooting.md#expo-go-fails).
Binary file added assets/web-example.gif
34 changes: 34 additions & 0 deletions docs/getting-started.md
@@ -10,6 +10,7 @@ instance, and generate speech.
| React Native | `>= 0.72` |
| iOS | `15.1+` |
| Android | API `24+` |
| Web | modern browser with WebAssembly support |
| Node.js | `20+` recommended for examples |
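
The web requirement in the table above can be checked at runtime before creating a TTS instance. The following is a minimal sketch, not part of the SDK: `supportsWebAssembly` is a hypothetical helper name, and the check only confirms that the runtime exposes a working `WebAssembly.validate`, not that a given ONNX Runtime build will load.

```ts
// Hypothetical helper (not an SDK export): detects basic WebAssembly support.
function supportsWebAssembly(): boolean {
  // Read through globalThis so the sketch compiles without DOM typings.
  const wasm = (globalThis as any).WebAssembly;
  if (typeof wasm !== 'object' || wasm === null) return false;
  if (typeof wasm.validate !== 'function') return false;

  // Smallest valid module: the "\0asm" magic bytes followed by version 1.
  const emptyModule = new Uint8Array([
    0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,
  ]);

  try {
    return wasm.validate(emptyModule) === true;
  } catch {
    return false;
  }
}
```

An app could call this once at startup and show a fallback message instead of calling `KittenTTS.create()` when it returns `false`.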

Expo Go will not work. KittenTTS depends on native modules:
@@ -18,6 +19,8 @@ Expo Go will not work. KittenTTS depends on native modules:
- `react-native-fs`

Use a bare React Native app, an Expo development build, or a prebuilt Expo app.
React Native Web builds use `onnxruntime-web` and do not require those native
modules at runtime.

## Install

@@ -57,6 +60,35 @@ npm install react-native-sound
cd ios && pod install && cd ..
```

## Web

React Native Web builds resolve the package's browser entrypoint. The web
runtime uses `onnxruntime-web`, Cache API storage for downloaded model files,
and the same JavaScript CE phonemizer that the native platforms use.

```tsx
import {
  KittenTTS,
  createBrowserAudioPlayer,
} from '@kittentts/react-native';

const tts = await KittenTTS.create({
  player: createBrowserAudioPlayer(),
});

await tts.speak('Hello from KittenTTS on web.');
await tts.dispose();
```

The browser path also supports `generate()`, `wordTimings`, `wavData()`, and
`wavBase64()`. Pass `ortWasmPath` if your app needs to self-host ONNX Runtime
WASM assets instead of using the SDK defaults.

By default, browser builds load the pinned ONNX Runtime Web script and WASM
assets from jsDelivr. That keeps the SDK simple to drop into React Native Web,
but production apps that require tighter supply-chain control or CDN outage
isolation should self-host those files and set `ortWasmPath` to that directory.

## Generate Audio

Use `generate()` when you want audio data back without playing it immediately.
@@ -117,6 +149,8 @@ await tts.dispose();

The first `KittenTTS.create()` downloads the selected model, `voices.npz`, and
phonemizer files. Later calls reuse the device cache.
On web, cached files are stored through the browser Cache API when it is
available. Otherwise the SDK falls back to in-memory storage, which does not
persist across page reloads.
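
The Cache API with memory fallback described above could look roughly like the sketch below. This is an illustration under stated assumptions, not the contents of `src/storage/AssetStorage.ts`: the names `AssetStore`, `MemoryAssetStore`, `CacheApiAssetStore`, `createAssetStore`, and the cache name are all hypothetical. Keys are assumed to be `http(s)` asset URLs, because the Cache API rejects other schemes.

```ts
// Hypothetical interface for this sketch; the SDK's real abstraction may differ.
interface AssetStore {
  get(url: string): Promise<Uint8Array | undefined>;
  put(url: string, bytes: Uint8Array): Promise<void>;
}

// Fallback store: entries live only as long as the current page or process.
class MemoryAssetStore implements AssetStore {
  private readonly entries = new Map<string, Uint8Array>();

  async get(url: string): Promise<Uint8Array | undefined> {
    return this.entries.get(url);
  }

  async put(url: string, bytes: Uint8Array): Promise<void> {
    this.entries.set(url, bytes);
  }
}

// Persistent store backed by the browser Cache API (globalThis.caches).
class CacheApiAssetStore implements AssetStore {
  private readonly cacheName: string;

  constructor(cacheName: string) {
    this.cacheName = cacheName;
  }

  async get(url: string): Promise<Uint8Array | undefined> {
    const cache = await (globalThis as any).caches.open(this.cacheName);
    const response = await cache.match(url);
    if (!response) return undefined;
    return new Uint8Array(await response.arrayBuffer());
  }

  async put(url: string, bytes: Uint8Array): Promise<void> {
    const cache = await (globalThis as any).caches.open(this.cacheName);
    const ResponseCtor = (globalThis as any).Response;
    await cache.put(url, new ResponseCtor(bytes));
  }
}

// Picks the Cache API when the runtime exposes it, memory otherwise.
function createAssetStore(cacheName: string = 'kittentts-sketch'): AssetStore {
  const hasCacheApi =
    typeof (globalThis as any).caches?.open === 'function';
  return hasCacheApi
    ? new CacheApiAssetStore(cacheName)
    : new MemoryAssetStore();
}
```

Keeping both stores behind one async interface means the calling code does not need to know which backend is active. The trade-off is that the memory store re-downloads assets after every reload.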

Default model cache:

17 changes: 17 additions & 0 deletions docs/guides/playback.md
@@ -52,6 +52,23 @@ const tts = await KittenTTS.create({
await tts.speak('This plays through react-native-sound.');
```

## Browser Audio

React Native Web builds can use the browser audio helper:

```tsx
import {
  KittenTTS,
  createBrowserAudioPlayer,
} from '@kittentts/react-native';

const tts = await KittenTTS.create({
  player: createBrowserAudioPlayer(),
});

await tts.speak('This plays through an HTML audio element.');
```

## Generate First, Then Play

This is useful when the UI needs metadata from the generated result before