147 changes: 147 additions & 0 deletions livekit-plugins/livekit-plugins-krisp/README.md
@@ -0,0 +1,147 @@
# Krisp VIVA Plugin for LiveKit Agents

Real-time noise reduction for LiveKit voice agents using [Krisp's VIVA SDK](https://krisp.ai).

## Features

- **`KrispVivaFilterFrameProcessor`**: Real-time noise reduction FrameProcessor for audio processing

## Installation

```bash
# Install the plugin
pip install livekit-plugins-krisp

# Install krisp-audio SDK separately (required for actual usage)
```

**Note:** The `krisp-audio` package is a proprietary SDK not available on public PyPI.
It must be obtained and installed separately from Krisp (https://krisp.ai/developers/).

## Prerequisites

### Required for All Features

1. **Krisp Audio SDK**: Install the `krisp-audio` package obtained from Krisp (see the note under Installation; it is not available on public PyPI).
2. **License Key**: Obtain a license key from Krisp and set it as an environment variable:
   ```bash
   export KRISP_VIVA_SDK_LICENSE_KEY=your-license-key-here
   ```

### For Noise Reduction

1. **Noise Reduction Model**: Obtain a noise reduction or voice isolation `.kef` model file from Krisp
2. **Set environment variable**:
   ```bash
   export KRISP_VIVA_FILTER_MODEL_PATH=/path/to/noise_model.kef
   ```
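Both environment variables named above have to be present before the processor is created. A minimal sketch of a pre-flight check follows; the helper `missing_krisp_env` is hypothetical and not part of the plugin, and only the variable names come from this README.

```python
import os

# Variable names are taken from this README; the helper itself is hypothetical.
REQUIRED_ENV_VARS = (
    "KRISP_VIVA_SDK_LICENSE_KEY",
    "KRISP_VIVA_FILTER_MODEL_PATH",
)


def missing_krisp_env(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_ENV_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_krisp_env()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("Krisp environment looks complete")
```

Running this before starting the agent surfaces a missing key or model path early, instead of as an SDK error at session start.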

## Quick Start

### Human-to-Bot Noise Cancellation / Voice Isolation (Recommended)

To clean up user audio before it reaches VAD and STT, pass the FrameProcessor to RoomIO:

```python
from livekit.agents import Agent, AgentServer, AgentSession, JobContext, room_io
from livekit.plugins import krisp, openai, silero


class MyAgent(Agent):
    """Placeholder agent; replace the instructions with your own."""

    def __init__(self) -> None:
        super().__init__(instructions="You are a helpful voice assistant.")


server = AgentServer()


@server.rtc_session()
async def entrypoint(ctx: JobContext):
    # Create Krisp FrameProcessor
    processor = krisp.KrispVivaFilterFrameProcessor(
        noise_suppression_level=100,  # 0-100
        frame_duration_ms=10,
        sample_rate=16000,
    )

    session = AgentSession(
        vad=silero.VAD.load(),
        stt=openai.STT(),
        llm=openai.LLM(model="gpt-4o-mini"),
        tts=openai.TTS(),
    )

    # Start session with RoomIO and pass FrameProcessor directly
    await session.start(
        agent=MyAgent(),
        room=ctx.room,
        room_options=room_io.RoomOptions(
            audio_input=room_io.AudioInputOptions(
                sample_rate=16000,
                frame_size_ms=10,  # Must match Krisp frame_duration_ms
                noise_cancellation=processor,  # Pass FrameProcessor directly
            ),
        ),
    )
```

**Audio Pipeline:** `Room → RoomIO (with KrispVivaFilterFrameProcessor) → VAD → STT → LLM → TTS → Room`


## Configuration

### KrispVivaFilterFrameProcessor Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `model_path` | str | env var | Path to noise reduction `.kef` model |
| `noise_suppression_level` | int | 100 | Noise reduction intensity (0-100) |
| `frame_duration_ms` | int | 10 | Frame size: 10, 15, 20, 30, or 32ms |
| `sample_rate` | int or None | None | Optional; if set, the session is pre-initialized at this sample rate |

### Supported Sample Rates

8000, 16000, 24000, 32000, 44100, 48000 Hz
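The allowed values above can be checked before a processor is constructed. The sketch below is a hypothetical helper, not part of the plugin; it uses only the sample rates, frame durations, and suppression range listed in this README.

```python
# Values copied from the tables in this README.
SUPPORTED_SAMPLE_RATES = (8000, 16000, 24000, 32000, 44100, 48000)
SUPPORTED_FRAME_DURATIONS_MS = (10, 15, 20, 30, 32)


def check_krisp_config(sample_rate, frame_duration_ms, noise_suppression_level=100):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if sample_rate not in SUPPORTED_SAMPLE_RATES:
        problems.append(f"unsupported sample rate: {sample_rate}")
    if frame_duration_ms not in SUPPORTED_FRAME_DURATIONS_MS:
        problems.append(f"unsupported frame duration: {frame_duration_ms} ms")
    if not 0 <= noise_suppression_level <= 100:
        problems.append(f"suppression level out of range: {noise_suppression_level}")
    return problems


if __name__ == "__main__":
    print(check_krisp_config(16000, 10))
    print(check_krisp_config(22050, 25, 150))
```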


## Important Notes

### Frame Size Requirements

⚠️ **Frames must match the configured duration exactly**

- 10ms @ 16kHz = 160 samples
- 20ms @ 16kHz = 320 samples
- 20ms @ 32kHz = 640 samples

The filter validates frame sizes and raises `ValueError` if incorrect.
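The sample counts above follow from `samples = sample_rate * frame_duration_ms / 1000`. The sketch below reproduces that arithmetic and the kind of check described; both functions are hypothetical illustrations, not the plugin's own validation code. Note that some combinations, such as 15 ms at 44100 Hz (661.5 samples), do not yield a whole number of samples; this sketch rejects them, and how the plugin treats them is not stated here.

```python
def samples_per_frame(sample_rate: int, frame_duration_ms: int) -> int:
    """Samples per channel in one frame of the given duration."""
    samples, remainder = divmod(sample_rate * frame_duration_ms, 1000)
    if remainder:
        # Assumption of this sketch: refuse fractional frame sizes.
        raise ValueError(
            f"{frame_duration_ms} ms at {sample_rate} Hz is not a whole number of samples"
        )
    return samples


def check_frame(num_samples: int, sample_rate: int, frame_duration_ms: int) -> None:
    """Raise ValueError if a frame does not match the configured duration."""
    expected = samples_per_frame(sample_rate, frame_duration_ms)
    if num_samples != expected:
        raise ValueError(f"expected {expected} samples, got {num_samples}")


if __name__ == "__main__":
    for rate, ms in [(16000, 10), (16000, 20), (32000, 20)]:
        print(f"{ms} ms @ {rate} Hz = {samples_per_frame(rate, ms)} samples")
```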

### Resource Management

- The Krisp session is created once: at construction if `sample_rate` is provided, otherwise on first use
- Call `close()` when you are done with the processor to free its resources
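When a processor is used directly (outside an agent session), one way to make sure `close()` always runs is `contextlib.closing`, which calls `close()` on exit even if processing raises. The sketch below uses a hypothetical stand-in class so it runs without the Krisp SDK; it assumes only that the real processor exposes `process()` and `close()` as used elsewhere in this document.

```python
from contextlib import closing


class StandInProcessor:
    """Hypothetical stand-in for an object exposing process() and close()."""

    def __init__(self) -> None:
        self.closed = False

    def process(self, frame):
        # A real processor would return the filtered frame.
        return frame

    def close(self) -> None:
        self.closed = True


def run_frames(processor, frames):
    """Process all frames, closing the processor afterwards in every case."""
    with closing(processor):
        return [processor.process(frame) for frame in frames]


if __name__ == "__main__":
    proc = StandInProcessor()
    run_frames(proc, ["frame-1", "frame-2"])
    print("closed:", proc.closed)
```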

### Shared SDK Management

The plugin uses `KrispSDKManager` to manage the Krisp SDK instance:

- **Singleton Pattern**: SDK initialized only once, shared across all components and sessions
- **Reference Counting**: Tracks active users (filters)
- **Automatic Cleanup**: SDK destroyed when last component releases its reference
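The three properties above describe a reference-counted singleton. The following is a generic sketch of that pattern only; `RefCountedResource` and its counters are hypothetical and do not reflect the actual `KrispSDKManager` code or API.

```python
import threading


class RefCountedResource:
    """Generic illustration of init-once, destroy-on-last-release."""

    _lock = threading.Lock()
    _ref_count = 0
    init_calls = 0      # how many times the shared resource was initialized
    destroy_calls = 0   # how many times it was torn down

    @classmethod
    def acquire(cls) -> None:
        with cls._lock:
            if cls._ref_count == 0:
                cls.init_calls += 1  # a real manager would initialize the SDK here
            cls._ref_count += 1

    @classmethod
    def release(cls) -> None:
        with cls._lock:
            if cls._ref_count == 0:
                return  # nothing to release
            cls._ref_count -= 1
            if cls._ref_count == 0:
                cls.destroy_calls += 1  # a real manager would destroy the SDK here


if __name__ == "__main__":
    RefCountedResource.acquire()   # first user initializes
    RefCountedResource.acquire()   # second user reuses the instance
    RefCountedResource.release()
    RefCountedResource.release()   # last release triggers cleanup
    print(RefCountedResource.init_calls, RefCountedResource.destroy_calls)
```

The lock keeps the count consistent when filters are created and closed from different threads.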

## Troubleshooting

### "Krisp SDK initialization failed" or Licensing Errors
Make sure the license key is set:
```bash
export KRISP_VIVA_SDK_LICENSE_KEY=your-license-key-here
```

### "Model path must be provided"
Set the model path environment variable, or pass `model_path` to the constructor:
```bash
export KRISP_VIVA_FILTER_MODEL_PATH=/path/to/model.kef
```

### "Unsupported sample rate"
Supported: 8000, 16000, 24000, 32000, 44100, 48000 Hz


### "Frame size mismatch"
Ensure your audio frames match the configured `frame_duration_ms`.

For 20ms @ 16kHz, each frame must have exactly 320 samples.
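If your audio source delivers chunks of arbitrary length and you feed the filter directly, one option is to buffer samples and emit only exact-size frames. The sketch below is a hypothetical helper for mono int16 samples, not part of the plugin; inside an agent session, `frame_size_ms` in `AudioInputOptions` is the setting shown in this README for that purpose.

```python
from array import array


class FrameChunker:
    """Buffer mono int16 samples and emit frames of an exact size."""

    def __init__(self, sample_rate: int, frame_duration_ms: int) -> None:
        self.frame_samples = sample_rate * frame_duration_ms // 1000
        self._buffer = array("h")

    def push(self, samples):
        """Add samples; return a list of complete frames (possibly empty)."""
        self._buffer.extend(samples)
        frames = []
        while len(self._buffer) >= self.frame_samples:
            frames.append(self._buffer[: self.frame_samples])
            del self._buffer[: self.frame_samples]
        return frames


if __name__ == "__main__":
    chunker = FrameChunker(sample_rate=16000, frame_duration_ms=20)
    for chunk_len in (500, 140):
        frames = chunker.push([0] * chunk_len)
        print(chunk_len, "samples in ->", [len(f) for f in frames])
```

Leftover samples stay in the buffer until the next call completes a frame.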

### Silent output
- Verify model file is valid
- Test with known noisy audio
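To tell genuinely silent output from heavily suppressed audio, it can help to compare the signal level of input and output frames. The sketch below computes the RMS of raw 16-bit PCM bytes using only the standard library; the function is hypothetical and assumes mono int16 data in the machine's native byte order.

```python
import math
from array import array


def rms_int16(pcm_bytes: bytes) -> float:
    """Root-mean-square level of native-endian int16 PCM data."""
    samples = array("h")
    samples.frombytes(pcm_bytes)
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


if __name__ == "__main__":
    silent = array("h", [0] * 160).tobytes()
    loud = array("h", [1000, -1000] * 80).tobytes()
    print("silent frame RMS:", rms_int16(silent))
    print("loud frame RMS:", rms_int16(loud))
```

An output RMS of exactly zero on every frame, for input that contains speech, points to a configuration or model problem instead of normal suppression.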
106 changes: 106 additions & 0 deletions livekit-plugins/livekit-plugins-krisp/examples/krisp_agent_example.py
@@ -0,0 +1,106 @@
#!/usr/bin/env python3
"""
Example: Voice Agent with Krisp Noise Cancellation

This example demonstrates how to integrate Krisp noise cancellation
into a LiveKit voice agent for human-to-bot conversations.

The audio pipeline:
    Room → RoomIO (with KrispVivaFilterFrameProcessor) → VAD → STT → LLM → TTS → Room

Prerequisites:
    1. Set KRISP_VIVA_FILTER_MODEL_PATH environment variable to your .kef model file
    2. Install required packages:
       - livekit-agents (with PR #4145 support for FrameProcessor)
       - livekit-plugins-krisp
       - livekit-plugins-silero (for VAD)
       - livekit-plugins-openai (or your preferred STT/LLM/TTS)

Usage:
    python krisp_agent_example.py dev
"""

import logging

from dotenv import load_dotenv

from livekit.agents import (
    Agent,
    AgentServer,
    AgentSession,
    JobContext,
    cli,
    room_io,
)
from livekit.plugins import krisp, openai, silero

logger = logging.getLogger("krisp-agent-example")
load_dotenv()


class KrispAgent(Agent):
    """Voice agent that uses Krisp for noise cancellation."""

    def __init__(self) -> None:
        super().__init__(
            instructions=(
                "You are a helpful voice assistant. "
                "Keep your responses concise and conversational. "
                "Do not use emojis or special characters in your responses."
            ),
        )

    async def on_enter(self):
        """Called when the agent enters the session."""
        logger.info("Krisp agent entered session")
        # Generate initial greeting (uninterruptible for AEC calibration)
        self.session.generate_reply(allow_interruptions=False)


server = AgentServer()


@server.rtc_session()
async def entrypoint(ctx: JobContext):
    """Main entrypoint for the agent session."""

    # Configure the agent session
    session = AgentSession(
        vad=silero.VAD.load(),
        stt=openai.STT(model="whisper-1"),
        llm=openai.LLM(model="gpt-4o-mini"),
        tts=openai.TTS(voice="alloy"),
        allow_interruptions=True,
        min_endpointing_delay=0.5,
        max_endpointing_delay=3.0,
    )

    logger.info("Starting agent session with RoomIO and Krisp noise cancellation")

    # Create Krisp FrameProcessor for noise cancellation
    processor = krisp.KrispVivaFilterFrameProcessor(
        noise_suppression_level=100,  # 0-100, where 100 is maximum suppression
        frame_duration_ms=10,
        sample_rate=16000,  # Pre-load model at this sample rate
    )

    # Start the session with RoomIO configuration
    # IMPORTANT: frame_size_ms must match Krisp's frame_duration_ms
    await session.start(
        agent=KrispAgent(),
        room=ctx.room,
        room_options=room_io.RoomOptions(
            audio_input=room_io.AudioInputOptions(
                sample_rate=16000,  # Krisp supports: 8k, 16k, 24k, 32k, 44.1k, 48k
                num_channels=1,
                frame_size_ms=10,  # Must match Krisp frame_duration_ms (10, 15, 20, 30, or 32)
                noise_cancellation=processor,  # Pass FrameProcessor directly
            ),
        ),
    )

logger.info("✅ Krisp noise cancellation active")


if __name__ == "__main__":
    cli.run_app(server)
@@ -0,0 +1,116 @@
#!/usr/bin/env python3
"""
Minimal Example: Test Krisp Audio Filter Directly

This is a minimal example to test that Krisp filtering works
without requiring a full agent setup. Good for testing and debugging.

Prerequisites:
    1. Set KRISP_VIVA_FILTER_MODEL_PATH environment variable
    2. Install: pip install livekit-plugins-krisp krisp-audio numpy

Usage:
    python krisp_minimal_example.py
"""

import asyncio
import logging

import numpy as np

from livekit import rtc
from livekit.plugins.krisp import KrispVivaFilterFrameProcessor

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("krisp-minimal-test")


async def test_krisp_filter():
    """Test Krisp filter with synthetic audio."""

    logger.info("=" * 60)
    logger.info("Testing Krisp Audio Filter")
    logger.info("=" * 60)

    try:
        # Create the frame processor
        logger.info("\n1. Creating Krisp frame processor...")
        krisp_processor = KrispVivaFilterFrameProcessor(
            noise_suppression_level=100,
            frame_duration_ms=10,
            sample_rate=16000,
        )
        logger.info("✅ Krisp frame processor created successfully")

        # Create a test audio frame (10ms @ 16kHz = 160 samples)
        logger.info("\n2. Creating test audio frame...")
        sample_rate = 16000
        frame_duration_ms = 10
        num_samples = int(sample_rate * frame_duration_ms / 1000)

        # Generate synthetic audio (white noise + sine wave).
        # endpoint=False keeps the sample spacing at exactly 1 / sample_rate.
        t = np.linspace(0, frame_duration_ms / 1000, num_samples, endpoint=False)
        sine_wave = np.sin(2 * np.pi * 440 * t)  # 440 Hz tone
        noise = np.random.normal(0, 0.1, num_samples)  # Noise
        audio_signal = (sine_wave + noise) * 0.5

        # Convert to int16 PCM
        audio_int16 = (audio_signal * 32767).astype(np.int16)

        # Create AudioFrame
        test_frame = rtc.AudioFrame(
            data=audio_int16.tobytes(),
            sample_rate=sample_rate,
            num_channels=1,
            samples_per_channel=num_samples,
        )
        logger.info(f"✅ Test frame created: {num_samples} samples @ {sample_rate}Hz")

        # Process the frame through Krisp
        logger.info("\n3. Processing frame through Krisp frame processor...")
        filtered_frame = krisp_processor.process(test_frame)
        logger.info("✅ Frame processed successfully")

        # Verify output
        logger.info("\n4. Verifying output...")
        logger.info(f"   Input samples: {test_frame.samples_per_channel}")
        logger.info(f"   Output samples: {filtered_frame.samples_per_channel}")
        logger.info(f"   Input rate: {test_frame.sample_rate}Hz")
        logger.info(f"   Output rate: {filtered_frame.sample_rate}Hz")

        if filtered_frame.samples_per_channel == test_frame.samples_per_channel:
            logger.info("✅ Output frame size matches input")
        else:
            logger.error("❌ Output frame size mismatch!")

        # Test multiple frames
        logger.info("\n5. Processing multiple frames...")
        for i in range(10):
            _ = krisp_processor.process(test_frame)
            logger.info(f"   Frame {i + 1}/10 processed")
        logger.info("✅ Multiple frames processed successfully")

        # Cleanup
        logger.info("\n6. Cleaning up...")
        krisp_processor.close()
        logger.info("✅ Frame processor closed")

        logger.info("\n" + "=" * 60)
        logger.info("✅ ALL TESTS PASSED - Krisp frame processor is working correctly!")
        logger.info("=" * 60)

    except Exception as e:
        logger.error("\n" + "=" * 60)
        logger.error(f"❌ TEST FAILED: {e}")
        logger.error("=" * 60)
        raise


async def main():
    """Run all tests."""
    # Test frame processor
    await test_krisp_filter()


if __name__ == "__main__":
    asyncio.run(main())