Conversation

Contributor

@dhruvladia-sarvam commented Jan 23, 2026

Added:

  1. "saaras:v3" for STTT
  2. "bulbul:v3-beta" for TTS

Summary by CodeRabbit

  • New Features

    • Support for saaras:v3 STT with selectable modes: transcribe, translate, verbatim, translit, codemix; modes propagate through streaming and recognition.
    • Added bulbul:v3-beta TTS with 25+ new voices across Customer Care, Content Creation, and International categories.
  • Behavior Changes

    • Mode parameter applies only to saaras:v3; for models other than saaras:v3, mode is ignored or defaults to "transcribe" (see the usage sketch below).
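
For a concrete picture of the additions, here is a minimal usage sketch. It assumes the plugin exposes these constructor keyword arguments (language, model, mode, target_language_code, speaker) as described in the walkthrough below; the speaker value is a placeholder, and API-key handling is omitted.

from livekit.plugins import sarvam

# STT: saaras:v3 with an explicit mode (only honored for saaras:v3)
stt = sarvam.STT(
    language="hi-IN",
    model="saaras:v3",
    mode="translate",  # one of: transcribe, translate, verbatim, translit, codemix
)

# TTS: bulbul:v3-beta with one of the newly added speakers
tts = sarvam.TTS(
    model="bulbul:v3-beta",
    target_language_code="hi-IN",
    speaker="speaker-name",  # placeholder: pick a speaker listed as bulbul:v3-beta compatible
)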




Contributor

coderabbitai bot commented Jan 23, 2026

📝 Walkthrough

Walkthrough

Adds mode-aware STT support for saaras:v3 (mode enum, validation, propagation through stream/recognize paths, URL/form updates, and reconnection on option changes) and expands TTS with bulbul:v3-beta, many new speakers, and updated model–speaker compatibility and validation.

Changes

STT: mode-aware saaras:v3
  File(s): livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/stt.py
  Summary: Add saaras:v3 to SarvamSTTModels; introduce SarvamSTTModes; add mode to SarvamSTTOptions, STT.__init__, STT.stream, _recognize_impl, and SpeechStream (constructor & update_options); validate mode only for saaras:v3; include mode in websocket URL and form data when applicable; propagate mode through streams, reconnections, and logs; reject/normalize mode use for non-saaras:v3.

TTS: bulbul:v3-beta and speakers
  File(s): livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/tts.py
  Summary: Add bulbul:v3-beta to SarvamTTSModels; add many bulbul:v3-beta speakers across categories; extend MODEL_SPEAKER_COMPATIBILITY with bulbul:v3-beta mappings; allow bulbul:v3-beta in update_options validation.
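
For orientation, an illustrative sketch of the new type aliases implied by the summary above (names come from the walkthrough; the exact Literal members in stt.py and tts.py may differ):

from typing import Literal

SarvamSTTModes = Literal["transcribe", "translate", "verbatim", "translit", "codemix"]
SarvamSTTModels = Literal["saarika:v2.5", "saaras:v2.5", "saaras:v3"]  # pre-existing members assumed
SarvamTTSModels = Literal["bulbul:v2", "bulbul:v3-beta"]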

Sequence Diagram(s)

sequenceDiagram
    participant Client as "Client"
    participant STT as "STT"
    participant SpeechStream as "SpeechStream"
    participant SarvamWS as "Sarvam WS"
    Client->>STT: stream(start, mode)
    STT->>SpeechStream: create(options including mode)
    SpeechStream->>SarvamWS: open websocket (URL params include mode when model == "saaras:v3")
    SarvamWS-->>SpeechStream: transcription events
    SpeechStream-->>STT: forward transcripts/events
    Note over SpeechStream,SarvamWS: SpeechStream.update_options(mode) → validate → reconnect with new mode
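
To make the diagram concrete, a hypothetical runtime flow; the constructor and stream/update_options signatures are assumptions based on the excerpts quoted later in this review.

from livekit.plugins import sarvam

stt = sarvam.STT(language="hi-IN", model="saaras:v3", mode="transcribe")
stream = stt.stream()

# Per the note in the diagram, changing options validates the new mode and
# reconnects the websocket with it.
stream.update_options(language="hi-IN", model="saaras:v3", mode="translate")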

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested reviewers

  • chenghao-mou
  • theomonnom
  • davidzhao

Poem

🐇 A rabbit hops through code and streams,
Modes for Saaras and Bulbul dreams.
Webs reconnect and options sing,
New voices and modes take wing.
Hooray — the pipeline hums with spring!

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
  • Description Check: ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check: ✅ Passed. The title 'v3:stt and tts models' directly reflects the main changes: adding v3 model variants (saaras:v3 for STT and bulbul:v3-beta for TTS) to the Sarvam plugin.
  • Docstring Coverage: ✅ Passed. Docstring coverage is 88.89%, which meets the required threshold of 80.00%.




Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/tts.py (1)

628-638: Gate pitch/loudness by model in streaming config.

The HTTP path explicitly omits pitch and loudness for v3-beta (see lines 489-492: "not supported in v3-beta"), but the streaming config sends them unconditionally. This inconsistency will cause the API to reject v3-beta streaming sessions. Apply the same model check as the HTTP path.

♻️ Suggested adjustment
-                config_msg = {
-                    "type": "config",
-                    "data": {
-                        "target_language_code": self._opts.target_language_code,
-                        "speaker": self._opts.speaker,
-                        "pitch": self._opts.pitch,
-                        "pace": self._opts.pace,
-                        "loudness": self._opts.loudness,
-                        "enable_preprocessing": self._opts.enable_preprocessing,
-                        "model": self._opts.model,
-                    },
-                }
+                config_data = {
+                    "target_language_code": self._opts.target_language_code,
+                    "speaker": self._opts.speaker,
+                    "pace": self._opts.pace,
+                    "enable_preprocessing": self._opts.enable_preprocessing,
+                    "model": self._opts.model,
+                }
+                if self._opts.model == "bulbul:v2":
+                    config_data["pitch"] = self._opts.pitch
+                    config_data["loudness"] = self._opts.loudness
+                config_msg = {"type": "config", "data": config_data}
🤖 Fix all issues with AI agents
In `@livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/tts.py`:
- Around line 399-405: When update_options sets a new model (the block that
assigns self._opts.model), also revalidate the currently set speaker
(self._opts.speaker) if no new speaker is passed: check compatibility of the
existing speaker with the requested model and raise a ValueError if
incompatible. Implement this by adding a compatibility check (e.g., call a
helper like is_speaker_supported(model, self._opts.speaker) or inline logic)
immediately after setting model in update_options (the same scope where model,
speaker and self._opts.model are handled) so switching to "bulbul:v3-beta" with
an incompatible current speaker (e.g., "anushka") fails early rather than
causing runtime API errors.
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7fe642d and 6f83e4a.

📒 Files selected for processing (2)
  • livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/stt.py
  • livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/tts.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Format code with ruff
Run ruff linter and auto-fix issues
Run mypy type checker in strict mode
Maintain line length of 100 characters maximum
Ensure Python 3.9+ compatibility
Use Google-style docstrings

Files:

  • livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/stt.py
  • livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/tts.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: unit-tests
  • GitHub Check: type-check (3.9)
  • GitHub Check: type-check (3.13)
🔇 Additional comments (4)
livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/stt.py (1)

56-57: LGTM — model literal updated cleanly.

livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/tts.py (3)

50-50: LGTM — model enum expanded for v3-beta.


77-105: LGTM — speaker list expansion aligns with lowercased validation.


108-171: LGTM — compatibility mapping looks consistent with new speakers.


Comment on lines 399 to 405
         if model is not None:
             if not model.strip():
                 raise ValueError("Model cannot be empty")
-            if model not in ["bulbul:v2"]:
+            if model not in ["bulbul:v2", "bulbul:v3-beta"]:
                 raise ValueError(f"Unsupported model: {model}")
             self._opts.model = model

Contributor


⚠️ Potential issue | 🟠 Major

Revalidate the existing speaker when switching models.
update_options(model=...) can set bulbul:v3-beta while keeping an incompatible current speaker (e.g., "anushka"), and the mismatch isn’t checked unless speaker is also passed. This can surface as runtime API errors later.

🐛 Proposed fix
         if model is not None:
             if not model.strip():
                 raise ValueError("Model cannot be empty")
             if model not in ["bulbul:v2", "bulbul:v3-beta"]:
                 raise ValueError(f"Unsupported model: {model}")
             self._opts.model = model
+            if speaker is None and not validate_model_speaker_compatibility(
+                model, self._opts.speaker
+            ):
+                compatible = MODEL_SPEAKER_COMPATIBILITY.get(model, {}).get("all", [])
+                raise ValueError(
+                    f"Speaker '{self._opts.speaker}' is not compatible with model '{model}'. "
+                    "Please choose a compatible speaker from: "
+                    f"{', '.join(compatible)}"
+                )
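
For illustration, this is the failure mode the added check surfaces early. Model and speaker names are taken from this comment; the constructor keyword arguments are assumptions.

from livekit.plugins import sarvam

tts = sarvam.TTS(model="bulbul:v2", speaker="anushka", target_language_code="hi-IN")
# Without the check, the incompatible speaker only fails later at the API;
# with it, the next call raises ValueError immediately.
tts.update_options(model="bulbul:v3-beta")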
🤖 Prompt for AI Agents
In `@livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/tts.py` around
lines 399 - 405, When update_options sets a new model (the block that assigns
self._opts.model), also revalidate the currently set speaker
(self._opts.speaker) if no new speaker is passed: check compatibility of the
existing speaker with the requested model and raise a ValueError if
incompatible. Implement this by adding a compatibility check (e.g., call a
helper like is_speaker_supported(model, self._opts.speaker) or inline logic)
immediately after setting model in update_options (the same scope where model,
speaker and self._opts.model are handled) so switching to "bulbul:v3-beta" with
an incompatible current speaker (e.g., "anushka") fails early rather than
causing runtime API errors.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/stt.py (1)

70-95: Duplicate mode field definition will cause unexpected behavior.

The mode field is defined twice in this dataclass:

  • Line 87: mode: SarvamSTTModes | str = "transcribe"
  • Line 95: mode: Literal["translate", "transcribe", "verbatim", "translit", "codemix"] = "transcribe"

In Python dataclasses, a repeated field annotation does not raise an error; the later annotation and default silently override the earlier one, which is confusing and error-prone. Additionally, the docstring has duplicate entries for mode at lines 77 and 81.

🐛 Proposed fix: Remove duplicate field and docstring entry
 @dataclass
 class SarvamSTTOptions:
     """Options for the Sarvam.ai STT service.

     Args:
         language: BCP-47 language code, e.g., "hi-IN", "en-IN"
         model: The Sarvam STT model to use
         mode: Mode for saaras:v3 (transcribe/translate/verbatim/translit/codemix)
         base_url: API endpoint URL (auto-determined from model if not provided)
         streaming_url: WebSocket streaming URL (auto-determined from model if not provided)
         prompt: Optional prompt for STT translate (saaras models only)
-        mode: Mode for saaras:v3 (transcribe/translate/verbatim/translit/codemix)
     """

     language: str  # BCP-47 language code, e.g., "hi-IN", "en-IN"
     api_key: str
     model: SarvamSTTModels | str = "saarika:v2.5"
     mode: SarvamSTTModes | str = "transcribe"
     base_url: str | None = None
     streaming_url: str | None = None
     prompt: str | None = None  # Optional prompt for STT translate (saaras models only)
     high_vad_sensitivity: bool | None = None
     sample_rate: int = 16000
     flush_signal: bool | None = None
     input_audio_codec: str | None = None
-    mode: Literal["translate", "transcribe", "verbatim", "translit", "codemix"] = "transcribe"
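
For context, standard dataclass behavior when a field is annotated twice is that the later annotation and default quietly win, so the first definition disappears without an error. A tiny standalone demo, unrelated to the plugin code:

from dataclasses import dataclass, fields

@dataclass
class Demo:
    mode: str = "transcribe"
    mode: str = "translate"  # silently replaces the annotation and default above

print([f.name for f in fields(Demo)])  # ['mode']
print(Demo().mode)                     # translate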
🧹 Nitpick comments (3)
livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/stt.py (3)

282-284: Inconsistent pattern for checking NotGiven values.

Lines 282-284 use isinstance(x, type(NOT_GIVEN)) while the stream() method (lines 391-393) uses is_given(). For consistency and clarity, prefer the is_given() helper which is already imported.

♻️ Proposed fix: Use is_given() consistently
-        opts_language = self._opts.language if isinstance(language, type(NOT_GIVEN)) else language
-        opts_model = self._opts.model if isinstance(model, type(NOT_GIVEN)) else model
-        opts_mode = self._opts.mode if isinstance(mode, type(NOT_GIVEN)) else mode
+        opts_language = self._opts.language if not is_given(language) else language
+        opts_model = self._opts.model if not is_given(model) else model
+        opts_mode = self._opts.mode if not is_given(mode) else mode

Or alternatively, to match the pattern in stream():

-        opts_language = self._opts.language if isinstance(language, type(NOT_GIVEN)) else language
-        opts_model = self._opts.model if isinstance(model, type(NOT_GIVEN)) else model
-        opts_mode = self._opts.mode if isinstance(mode, type(NOT_GIVEN)) else mode
+        opts_language = language if is_given(language) else self._opts.language
+        opts_model = model if is_given(model) else self._opts.model
+        opts_mode = mode if is_given(mode) else self._opts.mode

560-602: update_options signature differs from the standard plugin pattern.

The current signature requires language and model as mandatory parameters:

def update_options(self, *, language: str, model: str, prompt: str | None = None, mode: str | None = None)

The standard pattern (e.g., in baseten/stt.py) uses NotGivenOr with NOT_GIVEN defaults, enabling partial updates without requiring all parameters:

def update_options(self, *, language: NotGivenOr[str] = NOT_GIVEN, ...)

If partial updates are intended to be supported, consider aligning with the standard pattern. If mandatory parameters are intentional (to always reconnect with full config), this is acceptable but worth documenting.
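
For reference, a self-contained sketch of the sentinel-based partial-update pattern the comment refers to. The real plugins use the framework's NOT_GIVEN/NotGivenOr/is_given helpers, so the names and the option set below are stand-ins, not the Sarvam plugin's actual code.

from dataclasses import dataclass

_NOT_GIVEN = object()  # stand-in for the framework's NOT_GIVEN sentinel

@dataclass
class _Opts:
    language: str = "hi-IN"
    model: str = "saaras:v3"
    mode: str = "transcribe"

class _Stream:
    def __init__(self) -> None:
        self._opts = _Opts()

    def update_options(self, *, language=_NOT_GIVEN, model=_NOT_GIVEN, mode=_NOT_GIVEN) -> None:
        # Only touch fields the caller actually passed, enabling partial updates.
        if language is not _NOT_GIVEN:
            self._opts.language = language
        if model is not _NOT_GIVEN:
            self._opts.model = model
        if mode is not _NOT_GIVEN:
            self._opts.mode = mode

s = _Stream()
s.update_options(mode="verbatim")  # language and model stay untouched
print(s._opts)  # _Opts(language='hi-IN', model='saaras:v3', mode='verbatim')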


573-584: Consider extracting the allowed modes set to reduce duplication.

The same allowed_modes set is defined in both SarvamSTTOptions.__post_init__ (lines 106-112) and here. Extracting to a module-level constant would improve maintainability.

♻️ Proposed refactor

Add at module level (near line 57):

SARVAM_V3_ALLOWED_MODES: set[str] = {"transcribe", "translate", "verbatim", "translit", "codemix"}

Then use in both validation locations:

-            allowed_modes: set[str] = {
-                "transcribe",
-                "translate",
-                "verbatim",
-                "translit",
-                "codemix",
-            }
-            if mode not in allowed_modes:
+            if mode not in SARVAM_V3_ALLOWED_MODES:
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6f83e4a and d74ce83.

📒 Files selected for processing (1)
  • livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/stt.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Format code with ruff
Run ruff linter and auto-fix issues
Run mypy type checker in strict mode
Maintain line length of 100 characters maximum
Ensure Python 3.9+ compatibility
Use Google-style docstrings

Files:

  • livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/stt.py
🧬 Code graph analysis (1)
livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/stt.py (3)
livekit-plugins/livekit-plugins-baseten/livekit/plugins/baseten/stt.py (3)
  • model (128-129)
  • update_options (168-195)
  • update_options (224-244)
livekit-plugins/livekit-plugins-mistralai/livekit/plugins/mistralai/stt.py (2)
  • model (83-84)
  • update_options (90-107)
livekit-agents/livekit/agents/utils/misc.py (1)
  • is_given (25-26)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: type-check (3.9)
  • GitHub Check: type-check (3.13)
  • GitHub Check: unit-tests
🔇 Additional comments (6)
livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/stt.py (6)

56-57: LGTM!

The new type definitions for SarvamSTTModels and SarvamSTTModes are well-defined and consistent with the validation logic used throughout the file.


105-118: LGTM!

The mode validation logic appropriately restricts mode options to saaras:v3 only, with a sensible default for other models.


173-174: LGTM!

The mode parameter is correctly added to the WebSocket URL only for the saaras:v3 model.


197-239: LGTM!

The mode parameter is properly integrated into the STT.__init__ method and correctly passed through to SarvamSTTOptions.


298-299: LGTM!

The mode is correctly added to the form data only for saaras:v3 model, consistent with the WebSocket URL building logic.


377-431: LGTM!

The stream() method correctly integrates the mode parameter with proper use of is_given() and appropriate type validation.


Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/stt.py`:
- Around line 572-590: The internal mode state becomes inconsistent because
__post_init__ forces non-v3 models to "transcribe" but update_options allows
arbitrary modes; update update_options to mirror __post_init__'s normalization:
when model != "saaras:v3" ensure self._opts.mode is set to "transcribe" (or
ignore any provided mode) and only accept other modes when model == "saaras:v3";
apply this logic in the update_options method that sets self._opts.model and
self._opts.mode so the internal state stays consistent with API gating.
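
A hedged sketch of the normalization this prompt asks for, written as a hypothetical standalone helper; the merged code implements the equivalent logic inline in update_options, per the approved comments further down.

from __future__ import annotations

ALLOWED_SAARAS_V3_MODES = {"transcribe", "translate", "verbatim", "translit", "codemix"}

def normalize_mode(model: str, mode: str | None) -> str:
    """Keep mode consistent with __post_init__: only saaras:v3 accepts non-default modes."""
    if model != "saaras:v3":
        if mode is not None:
            raise ValueError("mode is only supported when model is saaras:v3")
        return "transcribe"  # keep internal state consistent with API gating
    if mode is not None and mode not in ALLOWED_SAARAS_V3_MODES:
        raise ValueError("mode must be one of transcribe, translate, verbatim, translit, codemix")
    return mode or "transcribe"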
🧹 Nitpick comments (2)
livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/stt.py (2)

94-94: Remove commented-out dead code.

This line is a leftover from development and duplicates the mode field already defined on line 86.

🧹 Proposed fix
-    #mode: Literal["translate", "transcribe", "verbatim", "translit", "codemix"] = "transcribe"

104-117: Consider extracting allowed_modes to a module-level constant.

The same set of allowed modes is duplicated in update_options (lines 573-579). Extracting to a constant improves maintainability and ensures consistency.

♻️ Suggested refactor

Add a module-level constant near the type definitions:

# After line 57
SAARAS_V3_ALLOWED_MODES: set[str] = {
    "transcribe",
    "translate",
    "verbatim",
    "translit",
    "codemix",
}

Then use it in both __post_init__ and update_options:

         if self.model == "saaras:v3":
-            allowed_modes: set[str] = {
-                "transcribe",
-                "translate",
-                "verbatim",
-                "translit",
-                "codemix",
-            }
-            if self.mode not in allowed_modes:
+            if self.mode not in SAARAS_V3_ALLOWED_MODES:
                 raise ValueError(
                     "mode must be one of transcribe, translate, verbatim, translit, codemix"
                 )
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d74ce83 and 6fec239.

📒 Files selected for processing (1)
  • livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/stt.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Format code with ruff
Run ruff linter and auto-fix issues
Run mypy type checker in strict mode
Maintain line length of 100 characters maximum
Ensure Python 3.9+ compatibility
Use Google-style docstrings

Files:

  • livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/stt.py
🧬 Code graph analysis (1)
livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/stt.py (1)
livekit-agents/livekit/agents/utils/misc.py (1)
  • is_given (25-26)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: type-check (3.13)
  • GitHub Check: type-check (3.9)
  • GitHub Check: unit-tests
🔇 Additional comments (4)
livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/stt.py (4)

56-57: LGTM!

Type definitions for the new model and modes are well-defined using Literal types for proper type safety.


172-173: LGTM!

Mode parameter is correctly added to WebSocket URL only for saaras:v3 model.


281-284: LGTM!

Mode parameter propagation follows the same pattern as language and model, with proper fallback to instance defaults.


390-399: LGTM!

Mode handling in stream() follows the established pattern for language and model parameters, with proper type guards.


Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/stt.py (1)

255-300: Add mode value validation in non-streaming recognize.

_recognize_impl accepts any string for mode when model == "saaras:v3", allowing invalid values to reach the API instead of failing fast. The same validation used in SarvamSTTOptions.__post_init__() and streaming update_options() should be applied here before sending the request.

🛠️ Proposed fix
-        opts_mode = self._opts.mode if not is_given(mode) else mode
-        if is_given(mode) and opts_model != "saaras:v3":
-            raise ValueError("mode is only supported when model is saaras:v3")
+        opts_mode = self._opts.mode if not is_given(mode) else mode
+        if opts_model != "saaras:v3":
+            if is_given(mode):
+                raise ValueError("mode is only supported when model is saaras:v3")
+        else:
+            allowed_modes: set[str] = {
+                "transcribe",
+                "translate",
+                "verbatim",
+                "translit",
+                "codemix",
+            }
+            if opts_mode not in allowed_modes:
+                raise ValueError(
+                    "mode must be one of transcribe, translate, verbatim, translit, codemix"
+                )
🤖 Fix all issues with AI agents
In `@livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/stt.py`:
- Around line 562-606: update_options currently changes self._opts.model but
doesn't refresh endpoint fields, so reconnection uses stale URLs; modify
update_options (around the code that sets self._opts.model) to also set
self._opts.base_url and self._opts.streaming_url based on the new model value:
if model == "saaras:v3" set base_url to
"https://api.sarvam.ai/speech-to-text-translate" and streaming_url to
"wss://api.sarvam.ai/speech-to-text-translate/ws", otherwise set base_url to
"https://api.sarvam.ai/speech-to-text" and streaming_url to
"wss://api.sarvam.ai/speech-to-text/ws"; ensure these assignments occur before
triggering the reconnection/logging so the reconnect uses the updated endpoints.
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6fec239 and db17cc3.

📒 Files selected for processing (1)
  • livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/stt.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Format code with ruff
Run ruff linter and auto-fix issues
Run mypy type checker in strict mode
Maintain line length of 100 characters maximum
Ensure Python 3.9+ compatibility
Use Google-style docstrings

Files:

  • livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/stt.py
🧠 Learnings (1)
📓 Common learnings
Learnt from: CR
Repo: livekit/agents PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-16T07:44:56.353Z
Learning: Implement Model Interface Pattern for STT, TTS, LLM, and Realtime models with provider-agnostic interfaces, fallback adapters for resilience, and stream adapters for different streaming patterns
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: unit-tests
  • GitHub Check: type-check (3.9)
  • GitHub Check: type-check (3.13)
🔇 Additional comments (4)
livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/stt.py (4)

55-116: LGTM — mode validation/defaulting is clear and consistent.


157-176: Good: mode is only appended for saaras:v3.


195-237: LGTM — mode is properly plumbed into options at construction.


377-433: LGTM — stream path correctly carries mode into per-stream options.


Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/stt.py`:
- Around line 577-582: The routing in update_options currently checks
model.startswith("saaras:v2.5") and therefore misroutes saaras:v3 models; change
update_options to call and reuse the existing _get_urls_for_model(model) to
obtain the correct base and streaming URLs (or expand the condition to include
"saaras:v3") and then assign self._opts.base_url and self._opts.streaming_url
from that result so both places share the same routing logic (refer to
update_options, _get_urls_for_model, and the assignments to self._opts.base_url
/ self._opts.streaming_url).
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cb128a3 and 77c6622.

📒 Files selected for processing (1)
  • livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/stt.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Format code with ruff
Run ruff linter and auto-fix issues
Run mypy type checker in strict mode
Maintain line length of 100 characters maximum
Ensure Python 3.9+ compatibility
Use Google-style docstrings

Files:

  • livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/stt.py
🧬 Code graph analysis (1)
livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/stt.py (2)
livekit-agents/livekit/agents/stt/stt.py (1)
  • model (115-124)
livekit-plugins/livekit-plugins-cartesia/livekit/plugins/cartesia/stt.py (3)
  • model (118-119)
  • update_options (157-174)
  • update_options (212-224)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: type-check (3.9)
  • GitHub Check: type-check (3.13)
  • GitHub Check: unit-tests
🔇 Additional comments (10)
livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/stt.py (10)

56-57: LGTM!

Type definitions for models and modes are clear and well-structured.


77-86: LGTM!

Mode field is properly documented and has a sensible default value.


103-116: LGTM!

Mode validation in __post_init__ correctly enforces allowed modes for saaras:v3 and normalizes to "transcribe" for other models.


171-172: LGTM!

Mode is correctly gated to only be included in WebSocket parameters for saaras:v3.


188-230: LGTM!

Mode parameter is properly documented, accepted in constructor, and passed to options.


261-299: LGTM!

Mode parameter handling in _recognize_impl is well-designed:

  • Validates that explicit mode is only allowed for saaras:v3
  • Correctly gates form data inclusion

382-427: LGTM!

Mode parameter handling in stream() is consistent with _recognize_impl and properly validates and propagates the mode value.


585-602: LGTM!

Mode validation in update_options correctly:

  • Rejects explicit mode for non-v3 models
  • Validates mode value for v3
  • Normalizes mode to "transcribe" for non-v3 models

This properly addresses the prior review feedback about consistent mode handling.


603-613: LGTM!

Logging includes the mode value, providing good observability for debugging reconnection behavior.


130-133: Review comment is incorrect; current URL routing for saaras:v3 is correct.

According to Sarvam's official API documentation, saaras:v3 uses the regular https://api.sarvam.ai/speech-to-text endpoint. Translation functionality is controlled via the mode parameter (transcribe, translate, verbatim, translit, codemix) sent in the request, not by changing the endpoint URL. The current code correctly routes saaras:v3 to SARVAM_STT_BASE_URL. The proposed fix would incorrectly send saaras:v3 to the translate endpoint.

Likely an incorrect or invalid review comment.


Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/stt.py (1)

280-299: ⚠️ Potential issue | 🟡 Minor

Missing mode value validation for explicit mode parameter.

When mode is explicitly provided with model="saaras:v3", there's no validation that the mode value is in the allowed set. Unlike stream() which passes mode through SarvamSTTOptions (which validates in __post_init__), _recognize_impl uses the mode directly.

🛡️ Proposed fix
         opts_mode = self._opts.mode if not is_given(mode) else mode
         if is_given(mode) and opts_model != "saaras:v3":
             raise ValueError("mode is only supported when model is saaras:v3")
+        if opts_model == "saaras:v3" and is_given(mode):
+            allowed_modes = {"transcribe", "translate", "verbatim", "translit", "codemix"}
+            if opts_mode not in allowed_modes:
+                raise ValueError(
+                    "mode must be one of transcribe, translate, verbatim, translit, codemix"
+                )
🤖 Fix all issues with AI agents
In `@livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/stt.py`:
- Around line 130-133: The current branch logic routes `saaras:v3` incorrectly
because it only checks `model.startswith("saaras:v2.5")`; change the condition
in the function that returns endpoints to use `model.startswith("saaras")` so
both `saaras:v2.5` and `saaras:v3` return SARVAM_STT_TRANSLATE_BASE_URL and
SARVAM_STT_TRANSLATE_STREAMING_URL (these symbols identify the translate
endpoints), and update the else-branch comment (currently `# saarika models`) to
clarify it only covers saarika models.
🧹 Nitpick comments (1)
livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/stt.py (1)

103-116: Consider extracting allowed_modes as a module-level constant.

The allowed_modes set is defined here and duplicated in update_options (lines 583-589). Extracting it to a module-level constant would reduce duplication and ensure consistency.

♻️ Proposed refactor
+ALLOWED_SAARAS_V3_MODES: set[str] = {"transcribe", "translate", "verbatim", "translit", "codemix"}
+
 def _get_urls_for_model(model: str) -> tuple[str, str]:

Then use ALLOWED_SAARAS_V3_MODES in both __post_init__ and update_options.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 77c6622 and 10032db.

📒 Files selected for processing (1)
  • livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/stt.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Format code with ruff
Run ruff linter and auto-fix issues
Run mypy type checker in strict mode
Maintain line length of 100 characters maximum
Ensure Python 3.9+ compatibility
Use Google-style docstrings

Files:

  • livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/stt.py
🧬 Code graph analysis (1)
livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/stt.py (3)
livekit-plugins/livekit-plugins-cartesia/livekit/plugins/cartesia/stt.py (3)
  • model (118-119)
  • update_options (157-174)
  • update_options (212-224)
livekit-plugins/livekit-plugins-assemblyai/livekit/plugins/assemblyai/stt.py (3)
  • model (110-111)
  • update_options (149-174)
  • update_options (199-218)
livekit-agents/livekit/agents/utils/misc.py (1)
  • is_given (25-26)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: unit-tests
  • GitHub Check: type-check (3.13)
  • GitHub Check: type-check (3.9)
🔇 Additional comments (4)
livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/stt.py (4)

56-57: LGTM!

Type definitions for models and modes are well-defined using Literal types.


171-172: LGTM!

Mode parameter is correctly gated to saaras:v3 only.


382-402: LGTM!

Mode handling correctly validates model compatibility and passes through SarvamSTTOptions for value validation.


562-607: LGTM!

The previous review concerns have been addressed:

  • URLs are now updated via _get_urls_for_model(model) (line 577)
  • Mode is correctly normalized to "transcribe" for non-v3 models (lines 596-597)
  • Mode validation is properly gated to saaras:v3 only


Comment on lines +130 to 133
if model.startswith("saaras:v2.5"):
return SARVAM_STT_TRANSLATE_BASE_URL, SARVAM_STT_TRANSLATE_STREAMING_URL
else: # saarika models
return SARVAM_STT_BASE_URL, SARVAM_STT_STREAMING_URL
Contributor


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

cat -n livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/stt.py | sed -n '120,145p'

Repository: livekit/agents

Length of output: 1151


🏁 Script executed:

# Search for any documentation or comments about saaras models and endpoints
rg -A 5 -B 5 "saaras" livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/ --context=3

Repository: livekit/agents

Length of output: 12474


🏁 Script executed:

# Check for tests that might indicate which endpoint should be used for which model
fd -t f "test.*\.py" livekit-plugins/livekit-plugins-sarvam/ -exec grep -l "saaras\|_get_urls_for_model" {} \;

Repository: livekit/agents

Length of output: 279


🏁 Script executed:

# Look for test files or examples
fd -t f "test" livekit-plugins/livekit-plugins-sarvam/ | head -20

Repository: livekit/agents

Length of output: 40


🏁 Script executed:

# Search for README or documentation
fd -t f -e md livekit-plugins/livekit-plugins-sarvam/

Repository: livekit/agents

Length of output: 519


🏁 Script executed:

# Look for any API endpoint documentation or comments in the codebase
rg -i "saaras.*endpoint\|translate.*endpoint\|speech.*to.*text" livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/stt.py -A 2 -B 2

Repository: livekit/agents

Length of output: 40


🏁 Script executed:

# Check how the endpoints are actually used in requests
rg -A 10 "self._opts.base_url\|base_url.*=\|self\._base_url" livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/stt.py

Repository: livekit/agents

Length of output: 40


🏁 Script executed:

cd livekit-plugins/livekit-plugins-sarvam && find . -type f -name "*test*" -o -name "*example*" | head -20

Repository: livekit/agents

Length of output: 40


🏁 Script executed:

# Look for README or documentation about API endpoints
cd livekit-plugins/livekit-plugins-sarvam && find . -type f -name "*.md"

Repository: livekit/agents

Length of output: 69


🏁 Script executed:

# Check the full stt.py file around the endpoint definitions
cd livekit-plugins/livekit-plugins-sarvam && head -50 livekit/plugins/sarvam/stt.py

Repository: livekit/agents

Length of output: 1424


🏁 Script executed:

# Search for how saaras:v3 mode is used and what endpoints it expects
rg "saaras:v3" livekit-plugins/livekit-plugins-sarvam/

Repository: livekit/agents

Length of output: 1736


🏁 Script executed:

cat livekit-plugins/livekit-plugins-sarvam/README.md

Repository: livekit/agents

Length of output: 756


🏁 Script executed:

# Look for how prompts are used with different models
rg -A 5 -B 5 "prompt" livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/stt.py | head -60

Repository: livekit/agents

Length of output: 2351


🏁 Script executed:

# Check if saaras:v3 actually uses the prompt or just mode
rg "saaras:v3.*prompt\|prompt.*saaras:v3" livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/stt.py

Repository: livekit/agents

Length of output: 40


🏁 Script executed:

# Look at how parameters are sent for different models
rg -A 10 "form_data.*saaras\|if.*model.*saaras" livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/stt.py

Repository: livekit/agents

Length of output: 40


🏁 Script executed:

# Find the full context around how form_data and endpoints are used
cat -n livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/stt.py | sed -n '170,240p'

Repository: livekit/agents

Length of output: 3145


🏁 Script executed:

# Look for the async _recognize_http method where the endpoint is actually used
cat -n livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/stt.py | sed -n '155,200p'

Repository: livekit/agents

Length of output: 1977


🏁 Script executed:

# Check for WebSocket connection and how it uses the URL
cat -n livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/stt.py | sed -n '340,420p'

Repository: livekit/agents

Length of output: 4378


🏁 Script executed:

# Search the entire file for where _get_urls_for_model is called
rg -n "_get_urls_for_model" livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/stt.py

Repository: livekit/agents

Length of output: 277


🏁 Script executed:

# Look at the initialization code to see how base_url and streaming_url are determined
cat -n livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/stt.py | sed -n '570,590p'

Repository: livekit/agents

Length of output: 1097


🏁 Script executed:

# Check where the URL is actually used to make requests
cat -n livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/stt.py | sed -n '430,470p'

Repository: livekit/agents

Length of output: 1584


🏁 Script executed:

# Look for any comments about saaras:v3 or endpoint documentation
grep -n "saaras\|endpoint\|translate" livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/stt.py | head -30

Repository: livekit/agents

Length of output: 2122


🏁 Script executed:

# Look for tests or examples showing how saaras:v3 should work
find livekit-plugins/livekit-plugins-sarvam -name "*.py" -type f | xargs grep -l "test\|example" | head -5

Repository: livekit/agents

Length of output: 40


🏁 Script executed:

# Check if there are any comments or documentation about endpoint selection for saaras:v3
cat -n livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/stt.py | sed -n '1,130p'

Repository: livekit/agents

Length of output: 5244


🏁 Script executed:

# Look for any API documentation or comments about mode and endpoint
rg -B 10 "speech-to-text-translate" livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/stt.py | head -40

Repository: livekit/agents

Length of output: 549


🏁 Script executed:

# Look at the recognize_http method to see how base_url is used
cat -n livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/stt.py | sed -n '245,290p'

Repository: livekit/agents

Length of output: 2159


saaras:v3 is routed to the wrong endpoint.

The condition model.startswith("saaras:v2.5") does not match saaras:v3, causing it to be routed to the base STT endpoint instead of the translate endpoint. Since saaras:v3 supports the translate mode and accepts prompts (both documented as translate features, just like saaras:v2.5), it should use the translate endpoint. Additionally, the comment # saarika models is misleading—the else branch catches saaras:v3 as well.

Change the condition to model.startswith("saaras") to route both saaras:v2.5 and saaras:v3 to the translate endpoint, and update the comment to clarify it only covers saarika models.

🤖 Prompt for AI Agents
In `@livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/stt.py` around
lines 130 - 133, The current branch logic routes `saaras:v3` incorrectly because
it only checks `model.startswith("saaras:v2.5")`; change the condition in the
function that returns endpoints to use `model.startswith("saaras")` so both
`saaras:v2.5` and `saaras:v3` return SARVAM_STT_TRANSLATE_BASE_URL and
SARVAM_STT_TRANSLATE_STREAMING_URL (these symbols identify the translate
endpoints), and update the else-branch comment (currently `# saarika models`) to
clarify it only covers saarika models.

             if not model.strip():
                 raise ValueError("Model cannot be empty")
-            if model not in ["bulbul:v2"]:
+            if model not in ["bulbul:v2", "bulbul:v3-beta"]:
Member


this isn't a great pattern, because when you introduce new models, the plugin cannot be used without an update.

we recommend not hard-blocking model lists in the plugin; your server should be the authority here

Contributor Author


Fair, I'll remove this check
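
For what dropping the hard allowlist might look like, a hedged sketch (not the committed change): only the non-empty check stays client-side, and validation of the model id itself is deferred to the Sarvam API, which remains the authority.

def set_model(opts, model):
    """Sketch: accept any non-empty model id and let the server reject unknown ones."""
    if model is None:
        return
    if not model.strip():
        raise ValueError("Model cannot be empty")
    opts.model = model  # no client-side allowlist, so new models work without a plugin update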

if model.startswith("saaras:"):
if model.startswith("saaras:v2.5"):
return SARVAM_STT_TRANSLATE_BASE_URL, SARVAM_STT_TRANSLATE_STREAMING_URL
else: # saarika models
Member


what is the right URL for saaras:v3?

Contributor Author


"https://api.sarvam.ai/speech-to-text" and "wss://api.sarvam.ai/speech-to-text/ws" for rest and websocket respectively, that is set in variables SARVAM_STT_BASE_URL and SARVAM_STT_STREAMING_URL respectively


@devin-ai-integration devin-ai-integration bot left a comment


Devin Review found 1 potential issue.

View issue and 4 additional flags in Devin Review.



@devin-ai-integration devin-ai-integration bot left a comment


Devin Review found 1 new potential issue.

View issue and 5 additional flags in Devin Review.

