ECHO-258 switch to custom runpod deployment for whisper #168

Merged
ArindamRoy23 merged 1 commit into main from feature/echo-258-switch-to-custom-runpod-deployment-for-whisper
May 29, 2025

Conversation

ArindamRoy23 (Contributor) commented May 29, 2025

Refactor RunPod Whisper transcription configuration and improve error handling

  • Removed the unused RUNPOD_WHISPER_MODEL environment variable from the configuration.
  • Updated the scheduling for RunPod transcription responses to trigger every 30 seconds instead of every minute.
  • Changed the transcription response handling to use 'joined_text' instead of 'transcription' for the custom RunPod deployment.
  • Enhanced error handling in the transcription process to log and raise exceptions when fetching conversation chunks and details fails.
  • Updated prompt templates for multiple languages with new technical presentation examples.

Summary by CodeRabbit

  • Improvements

    • Increased the frequency of transcription response updates, providing more timely results.
    • Enhanced error handling and logging for external data fetches, improving reliability and transparency.
    • Updated the extraction of transcript text for improved compatibility with response formats.
  • Prompt Updates

    • Replaced transcription prompt templates for English, German, Spanish, French, and Dutch with richer contextual examples, better reflecting technical presentations and mixed-language scenarios.
  • Other Changes

    • Removed the requirement to specify a transcription model in environment variables and request payloads.

linear bot commented May 29, 2025

coderabbitai bot (Contributor) commented May 29, 2025

Walkthrough

This change removes the RUNPOD_WHISPER_MODEL environment variable and its usage from the RunPod transcription flow, updates error handling and logging for Directus API calls, modifies the frequency of a scheduled job, changes how transcript text is extracted from responses, and replaces all default Whisper prompt templates with detailed contextual or illustrative examples.

Changes

File(s) Change Summary
echo/server/dembrane/config.py, echo/server/dembrane/transcribe.py Removed RUNPOD_WHISPER_MODEL usage and related model specification in transcription payload.
echo/server/dembrane/transcribe.py Added error handling and logging for Directus API calls; improved debug logging for transcription.
echo/server/dembrane/scheduler.py Changed scheduled job interval from every 1 minute to every 30 seconds.
echo/server/dembrane/tasks.py Changed transcript extraction key from transcription to joined_text.
echo/server/prompt_templates/default_whisper_prompt.*.jinja Replaced all language-specific Whisper prompt templates with detailed, contextual sample prompts.
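The scheduler change in the table above can be pictured with a small, dependency-free sketch of the cron semantics. This is an illustration only; the real code registers the job with APScheduler, and the helper below is not taken from the repository.

```python
def fire_seconds(step: int) -> list[int]:
    """Seconds-of-the-minute at which a '*/step' cron seconds field fires."""
    return list(range(0, 60, step))

# Old schedule: once per minute (second 0 only).
print(fire_seconds(60))  # → [0]
# New schedule: every 30 seconds (seconds 0 and 30 of each minute).
print(fire_seconds(30))  # → [0, 30]
```

With APScheduler, this corresponds to a trigger along the lines of `CronTrigger(second="*/30")` in place of a once-per-minute trigger, assuming the reviewer's mention of CronTrigger reflects the actual implementation.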

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant Scheduler
    participant TaskProcessor
    participant RunPodAPI
    participant Directus

    Scheduler->>TaskProcessor: Trigger task_update_runpod_transcription_response (every 30s)
    TaskProcessor->>RunPodAPI: Fetch transcription response
    RunPodAPI-->>TaskProcessor: Return response with joined_text
    TaskProcessor->>Directus: Update conversation chunk with transcript
sequenceDiagram
    participant User
    participant Transcriber
    participant Directus
    participant RunPodAPI

    User->>Transcriber: Request transcription
    Transcriber->>Directus: Fetch conversation chunk/details/metadata
    alt On error
        Transcriber->>Transcriber: Log error and raise ValueError
    else Success
        Transcriber->>RunPodAPI: Queue transcription (no model specified)
        RunPodAPI-->>Transcriber: Acknowledge request
    end

Assessment against linked issues

Objective Addressed Explanation
Switch to custom RunPod deployment for Whisper (ECHO-258)

Possibly related PRs

Suggested labels

Feature

Suggested reviewers

  • spashii
  • ussaama

ArindamRoy23 requested a review from spashii May 29, 2025 10:48
ArindamRoy23 enabled auto-merge May 29, 2025 10:48
coderabbitai bot requested a review from ussaama May 29, 2025 10:48
ArindamRoy23 added this pull request to the merge queue May 29, 2025
spashii changed the title from "ECHO-258 switch-to-custom-runpod-deployment-for-whisper" to "ECHO-258 switch to custom runpod deployment for whisper" May 29, 2025
coderabbitai bot added the Feature label May 29, 2025
Merged via the queue into main with commit 1159f86 May 29, 2025
6 of 7 checks passed
coderabbitai bot left a comment

Actionable comments posted: 5

📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e979054 and 1093ee5.

📒 Files selected for processing (9)
  • echo/server/dembrane/config.py (0 hunks)
  • echo/server/dembrane/scheduler.py (1 hunks)
  • echo/server/dembrane/tasks.py (1 hunks)
  • echo/server/dembrane/transcribe.py (5 hunks)
  • echo/server/prompt_templates/default_whisper_prompt.de.jinja (1 hunks)
  • echo/server/prompt_templates/default_whisper_prompt.en.jinja (1 hunks)
  • echo/server/prompt_templates/default_whisper_prompt.es.jinja (1 hunks)
  • echo/server/prompt_templates/default_whisper_prompt.fr.jinja (1 hunks)
  • echo/server/prompt_templates/default_whisper_prompt.nl.jinja (1 hunks)
💤 Files with no reviewable changes (1)
  • echo/server/dembrane/config.py
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: ci-check-server
🔇 Additional comments (7)
echo/server/prompt_templates/default_whisper_prompt.nl.jinja (1)

1-1: LGTM! Solid prompt engineering here.

This Dutch technical presentation example with mixed English technical terms is much more contextually rich than the previous simple instruction. The realistic bilingual technical jargon will definitely improve Whisper's transcription accuracy for Dutch speakers in tech environments.

echo/server/dembrane/scheduler.py (1)

29-29: LGTM! Frequency optimization looks solid.

Bumping the RunPod transcription polling from every minute to every 30 seconds - this will definitely improve response times for users waiting on their transcriptions. Clean implementation using APScheduler's CronTrigger.

echo/server/dembrane/tasks.py (1)

937-937: LGTM! Clean adaptation to the new RunPod response schema.

Switching from transcription to joined_text field - this matches the custom RunPod deployment's response format perfectly. The change is surgical and maintains all the existing error handling around it.
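The key switch described above can be sketched as follows; the helper name and the fallback to the legacy key are illustrative assumptions, not the repository's actual code.

```python
def extract_transcript(runpod_output: dict) -> str:
    # The custom RunPod deployment returns the transcript under "joined_text";
    # fall back to the legacy "transcription" key so any in-flight responses
    # from the old endpoint still resolve.
    text = runpod_output.get("joined_text") or runpod_output.get("transcription")
    if text is None:
        raise KeyError("no transcript field in RunPod response")
    return text

print(extract_transcript({"joined_text": "hello world"}))  # → hello world
```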

echo/server/prompt_templates/default_whisper_prompt.fr.jinja (1)

1-1: ✅ Verification successful

Solid contextual prompt engineering, but verify token limits. LGTM.

This rich contextual prompt should significantly improve transcription accuracy for technical content mixing French and English. The example-driven approach is proper engineering.

Whisper has prompt token limitations that could affect performance. Run this to verify the prompt length:


🏁 Script executed:

#!/bin/bash
# Check prompt template lengths across all languages to ensure they don't exceed Whisper limits
echo "Checking prompt template lengths..."
for template in echo/server/prompt_templates/default_whisper_prompt.*.jinja; do
    if [ -f "$template" ]; then
        chars=$(cat "$template" | wc -c)
        words=$(cat "$template" | wc -w)
        echo "$(basename $template): $chars characters, $words words"
    fi
done

# Whisper typically handles up to ~224 tokens (roughly 1000-1500 characters)
echo "Note: Whisper prompt limit is typically ~224 tokens (~1000-1500 characters)"

Length of output: 2876


Prompt Length Verification Complete – All Systems Go
All default_whisper_prompt.*.jinja templates clock in between 534 – 588 characters (72 – 87 words), well below the ~1000–1500 character (~224 token) Whisper limit. No adjustments needed—proceed to launch! LGTM.

echo/server/prompt_templates/default_whisper_prompt.es.jinja (1)

1-1: Consistent prompt engineering across languages. LGTM.

The Spanish template follows the same solid pattern as the French version - contextual technical presentation framing with bilingual terminology. This consistency across language templates is proper architecture.

echo/server/dembrane/transcribe.py (2)

193-193: Clean prompt construction refactor. LGTM.

Simplifying the string concatenation with a space prefix is cleaner than the multiline approach. Good refactor.
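The refactor described above amounts to a one-line join with a space separator; the variable names here are hypothetical, not the identifiers used in transcribe.py.

```python
default_prompt = "Here is a transcript from a technical presentation."
conversation_prompt = "The speakers discuss change management."

# Single concatenation with a space prefix, replacing a multiline build-up.
whisper_prompt = default_prompt + " " + conversation_prompt
print(whisper_prompt)
```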


205-216: ✅ Verification successful

More robust error handling. LGTM.

Adding error handling around the conversation chunk lookup is solid defensive programming.

Verify that removing RUNPOD_WHISPER_MODEL doesn't break the transcription requests:


🏁 Script executed:

#!/bin/bash
# Check if RUNPOD_WHISPER_MODEL is still referenced anywhere in the codebase
echo "Searching for RUNPOD_WHISPER_MODEL references..."
rg -i "runpod.*whisper.*model|whisper.*model.*runpod" --type py

# Check the config file to see what RunPod-related variables remain
echo -e "\nRunPod configuration variables:"
rg "RUNPOD_WHISPER" echo/server/dembrane/config.py || echo "Config file not available for review"

# Check if model parameter was being used in payload construction
echo -e "\nChecking payload construction patterns:"
ast-grep --pattern 'input_payload = {
  $$$
}'

Length of output: 1853


Sequence clear: Defensive error handling ✅ & RUNPOD_WHISPER_MODEL removal safe 🚀

  • Wrapped directus.get_items in try/except (echo/server/dembrane/transcribe.py lines 205–216) to log errors and rethrow as ValueError.
  • Ripgrep scan found no RUNPOD_WHISPER_MODEL references across Python files.
  • Payload construction patterns show no usage of a model parameter.

All systems green—this change is ready for warp speed. LGTM.

@@ -1 +1 @@
-System: Du bist eine hilfreiche Transkribent. Bitte transkribiere das folgende Audio *IN DEUTSCH*.
\ No newline at end of file
+Hier is een transcriptie van een technische presentatie. De spreker gebruikt natuurlijk Nederlands met enkele Engelse technical terms zoals gebruikelijk. De presentator zegt: "In een vrij gemakkelijk network is het mogelijk om de requirements voor een change te evalueren. We gebruiken modern tooling voor ons system management. Het is quite possible dat een enkele wijziging impact heeft, maar we blijven altijd in het Nederlands praten over onze processes en workflows. De change management procedure helpt ons om disruptions te voorkomen en stability te behouden in ons environment.
\ No newline at end of file

⚠️ Potential issue

Critical issue: Wrong language content detected!

This German prompt template .de.jinja contains Dutch text instead of German content. This will seriously mess up German transcriptions - Whisper will be primed for Dutch when it should be expecting German.

Fix this by replacing with proper German content:

-Hier is een transcriptie van een technische presentatie. De spreker gebruikt natuurlijk Nederlands met enkele Engelse technical terms zoals gebruikelijk. De presentator zegt: "In een vrij gemakkelijk network is het mogelijk om de requirements voor een change te evalueren. We gebruiken modern tooling voor ons system management. Het is quite possible dat een enkele wijziging impact heeft, maar we blijven altijd in het Nederlands praten over onze processes en workflows. De change management procedure helpt ons om disruptions te voorkomen en stability te behouden in ons environment.
+Hier ist eine Transkription einer technischen Präsentation. Der Sprecher verwendet natürlich Deutsch mit einigen englischen technical terms wie üblich. Der Präsentator sagt: "In einem ziemlich einfachen network ist es möglich, die requirements für eine change zu evaluieren. Wir verwenden moderne tooling für unser system management. Es ist quite possible, dass eine einzelne Änderung impact hat, aber wir sprechen immer auf Deutsch über unsere processes und workflows. Das change management procedure hilft uns, disruptions zu vermeiden und stability in unserem environment zu behalten.
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
Hier is een transcriptie van een technische presentatie. De spreker gebruikt natuurlijk Nederlands met enkele Engelse technical terms zoals gebruikelijk. De presentator zegt: "In een vrij gemakkelijk network is het mogelijk om de requirements voor een change te evalueren. We gebruiken modern tooling voor ons system management. Het is quite possible dat een enkele wijziging impact heeft, maar we blijven altijd in het Nederlands praten over onze processes en workflows. De change management procedure helpt ons om disruptions te voorkomen en stability te behouden in ons environment.
Hier ist eine Transkription einer technischen Präsentation. Der Sprecher verwendet natürlich Deutsch mit einigen englischen technical terms wie üblich. Der Präsentator sagt: "In einem ziemlich einfachen network ist es möglich, die requirements für eine change zu evaluieren. Wir verwenden moderne tooling für unser system management. Es ist quite possible, dass eine einzelne Änderung impact hat, aber wir sprechen immer auf Deutsch über unsere processes und workflows. Das change management procedure hilft uns, disruptions zu vermeiden und stability in unserem environment.
🤖 Prompt for AI Agents
In echo/server/prompt_templates/default_whisper_prompt.de.jinja at line 1, the
file contains Dutch text instead of German, which is incorrect for a German
prompt template. Replace the entire Dutch text with an accurate German
transcription prompt that matches the intended use for German language content
to ensure Whisper processes German transcriptions correctly.

@@ -1 +1 @@
-System: You are a helpful transcriptionist. Please transcribe the following audio *IN ENGLISH*.
\ No newline at end of file
+Here is a transcript from a business presentation about technology and change management. The speaker maintains English throughout while occasionally using technical terms. The presenter explains: "In our network environment, it's essential to carefully plan any changes. Our system administrators use modern tools for network management. The change management process helps us prevent disruptions and maintain system stability. We've implemented quality assurance procedures and risk assessment protocols. Even when discussing complex technical concepts, we continue speaking in English.
\ No newline at end of file

🧹 Nitpick (assertive)

Strong contextual prompt but might be too domain-specific. LGTM with consideration.

The business/technical presentation context is solid engineering, but this template is quite specific to enterprise scenarios. Consider if this fits all your audio content types or if you need more generic fallbacks.

🤖 Prompt for AI Agents
In echo/server/prompt_templates/default_whisper_prompt.en.jinja at line 1, the
prompt is very specific to business and technical presentations, which may not
suit all audio content types. Review your use cases and consider creating
additional, more generic prompt templates as fallbacks to ensure broader
applicability beyond enterprise scenarios.

Comment on lines +156 to +173
    try:
        conversation = directus.get_items(
            "conversation",
            {
                "query": {
                    "filter": {"id": {"_eq": chunk["conversation_id"]}},
                    "fields": [
                        "id",
                        "project_id",
                        "project_id.language",
                        "project_id.default_conversation_transcript_prompt",
                    ],
                },
            },
        )
    except Exception as e:
        logger.error(f"Failed to get conversation for {conversation_chunk_id}: {e}")
        raise ValueError(f"Failed to get conversation for {conversation_chunk_id}: {e}") from e

🧹 Nitpick (assertive)

Consistent error handling pattern. LGTM.

Good consistency with the error handling pattern established in the previous block.

Same suggestion about exception type consistency - consider using TranscriptionError instead of ValueError for consistency with the module's exception hierarchy.

🤖 Prompt for AI Agents
In echo/server/dembrane/transcribe.py around lines 156 to 173, the exception
raised on failure to get the conversation uses ValueError, which is inconsistent
with the module's exception hierarchy. Replace the ValueError with
TranscriptionError to maintain consistent error handling across the module.
Ensure to import TranscriptionError if not already imported.

Comment on lines +111 to +123
    try:
        chunks = directus.get_items(
            "conversation_chunk",
            {
                "query": {
                    "filter": {"id": {"_eq": conversation_chunk_id}},
                    "fields": ["id", "path", "conversation_id", "timestamp"],
                },
            },
        )
    except Exception as e:
        logger.error(f"Failed to get chunks for {conversation_chunk_id}: {e}")
        raise ValueError(f"Failed to get chunks for {conversation_chunk_id}: {e}") from e

🧹 Nitpick (assertive)

Solid error handling improvement. LGTM.

Adding try-catch blocks around external API calls is proper defensive programming. The error logging provides good debugging context.

Consider using a more specific exception type instead of generic ValueError:

-        raise ValueError(f"Failed to get chunks for {conversation_chunk_id}: {e}") from e
+        raise TranscriptionError(f"Failed to get chunks for {conversation_chunk_id}: {e}") from e
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
    try:
        chunks = directus.get_items(
            "conversation_chunk",
            {
                "query": {
                    "filter": {"id": {"_eq": conversation_chunk_id}},
                    "fields": ["id", "path", "conversation_id", "timestamp"],
                },
            },
        )
    except Exception as e:
        logger.error(f"Failed to get chunks for {conversation_chunk_id}: {e}")
        raise ValueError(f"Failed to get chunks for {conversation_chunk_id}: {e}") from e
    try:
        chunks = directus.get_items(
            "conversation_chunk",
            {
                "query": {
                    "filter": {"id": {"_eq": conversation_chunk_id}},
                    "fields": ["id", "path", "conversation_id", "timestamp"],
                },
            },
        )
    except Exception as e:
        logger.error(f"Failed to get chunks for {conversation_chunk_id}: {e}")
        raise TranscriptionError(f"Failed to get chunks for {conversation_chunk_id}: {e}") from e
🤖 Prompt for AI Agents
In echo/server/dembrane/transcribe.py around lines 111 to 123, replace the
generic ValueError raised in the except block with a more specific custom
exception or a relevant built-in exception that better describes the failure
context of the external API call. Define a custom exception class if needed, and
raise that instead to improve error clarity and handling downstream.


data = {"input": input_payload}

logger.debug(f"***data: {data}")

⚠️ Potential issue

Debug logging exposes sensitive data. Critical issue.

The debug log outputs the entire request payload which likely contains signed URLs and API keys. This is a security risk in production logs.

Apply this diff to sanitize the debug output:

-        logger.debug(f"***data: {data}")
+        # Log payload structure without sensitive data
+        safe_data = {
+            "input": {
+                "audio": "[SIGNED_URL_REDACTED]", 
+                "initial_prompt": data["input"].get("initial_prompt", "")[:100] + "..." if len(data["input"].get("initial_prompt", "")) > 100 else data["input"].get("initial_prompt", ""),
+                "language": data["input"].get("language")
+            }
+        }
+        logger.debug(f"Transcription request payload: {safe_data}")
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
logger.debug(f"***data: {data}")
# Log payload structure without sensitive data
safe_data = {
"input": {
"audio": "[SIGNED_URL_REDACTED]",
"initial_prompt": data["input"].get("initial_prompt", "")[:100] + "..." if len(data["input"].get("initial_prompt", "")) > 100 else data["input"].get("initial_prompt", ""),
"language": data["input"].get("language")
}
}
logger.debug(f"Transcription request payload: {safe_data}")
🤖 Prompt for AI Agents
In echo/server/dembrane/transcribe.py at line 56, the debug log prints the
entire request payload including sensitive information like signed URLs and API
keys. To fix this, modify the debug statement to sanitize or redact sensitive
fields before logging, ensuring no secrets are exposed in production logs.

spashii deleted the feature/echo-258-switch-to-custom-runpod-deployment-for-whisper branch May 29, 2025 14:53
coderabbitai bot mentioned this pull request Sep 16, 2025
spashii pushed a commit that referenced this pull request Nov 18, 2025
